Article

Prediction of Drug Binding Affinities by Comparative Binding Energy Analysis

Abstract

A new computational method for deducing quantitative structure-activity relationships (QSARs) using structural data from ligand-macromolecule complexes is presented. First, the ligand-macromolecule interaction energy is computed for a set of ligands using molecular mechanics calculations. Then, by selecting and scaling components of the ligand-macromolecule interaction energy that show good predictive ability, a regression equation is obtained in which activity is correlated with the interaction energies of parts of the ligands and key regions of the macromolecule. Application to the interaction of the human synovial fluid phospholipase A2 with 26 inhibitors indicates that the derived QSAR has good predictive ability and provides insight into the mechanism of enzyme inhibition. The method, which we term comparative binding energy (COMBINE) analysis, is expected to be applicable to ligand-receptor interactions in a range of contexts including rational drug design, host-guest systems, and protein engineering.
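The workflow in the abstract (compute interaction energy components, select and scale them, then regress activity on them) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation. The variance threshold `min_std`, the toy energy matrix, and the choice of partial least squares (PLS1, NIPALS variant) as the regression step are assumptions made here; the abstract itself only states that a regression equation is obtained.

```python
import numpy as np


def select_and_scale(X, min_std=1e-3):
    """Drop energy terms that barely vary across ligands, then autoscale the rest."""
    std = X.std(axis=0)
    keep = std > min_std                 # near-constant terms carry no signal
    Xk = X[:, keep]
    return (Xk - Xk.mean(axis=0)) / std[keep], keep


def pls1_coefficients(X, y, n_components):
    """Regression coefficients from PLS1 (NIPALS) on mean-centered X and y."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)           # weight vector for this latent variable
        t = Xr @ w                       # latent scores
        tt = t @ t
        p = Xr.T @ t / tt                # X loadings
        c = yr @ t / tt                  # y loading
        Xr = Xr - np.outer(t, p)         # deflate X
        yr = yr - c * t                  # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)


# Synthetic stand-in for a ligand x energy-term matrix (hypothetical values).
rng = np.random.default_rng(0)
n_ligands, n_terms = 26, 6
X = rng.normal(size=(n_ligands, n_terms))
X[:, 5] = -0.2                           # a term that is identical for every ligand
y = 5.0 - 0.8 * X[:, 0] - 0.5 * X[:, 3] + rng.normal(scale=0.05, size=n_ligands)

Xs, keep = select_and_scale(X)
coef = pls1_coefficients(Xs, y, n_components=2)
y_fit = y.mean() + Xs @ coef
r2 = 1.0 - np.sum((y - y_fit) ** 2) / np.sum((y - y.mean()) ** 2)
print("retained terms:", np.flatnonzero(keep))
print("fitted r2:", round(float(r2), 3))
```

The fitted r2 only measures how well the training data are reproduced. In practice the number of latent variables would normally be chosen by cross-validation, and predictive ability judged on held-out ligands.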

... Through the Py-CoMFA and Py-ComBinE applications available on the 3d-qsar.com portal [41], robust and predictive comparative molecular field analysis (CoMFA) [42] and comparative molecular binding energy analysis (COMBINE) [43][44][45][46][47], as ligand-based (LB) and structure-based (SB) three-dimensional quantitative structure-activity relationship (3-D QSAR) [48] models, were built to shed light on the structural molecular determinants and inhibitor/protein residue interactions mainly responsible for Mpro inhibitory potency. As most of the calculations were run on 3d-qsar.com ...
... The prepared topology and parameter files were used to run 500 gradient descent minimization steps through the OpenMM [68] Python library. In agreement with the original COMBINE protocol [43], the Py-ComBinE web app requires an equal number of residues for each protein; therefore, all extra residues were removed from the longer protein sequences by means of UCSF Chimera [69] to match the shortest one (6XMK). The minimized and adjusted complexes, separated into keys and locks, were uploaded to the web portal 3d-qsar.com ...
... Four types of ligand/protein interactions are implemented in the Py-ComBinE app: steric (STE), electrostatic (ELE), desolvation (DRY) and hydrogen bond (HB). Therefore, with the key/lock pairs dataset, all 15 possible combinations of ligand/per-residue energetic interactions were considered: STE, ELE, DRY, HB and all their possible combinations (STE + ELE, STE + DRY, STE + HB, ELE + DRY, ELE + HB, DRY + HB, STE + ELE + DRY, STE + ELE + HB, STE + DRY + HB, ELE + DRY + HB, STE + ELE + DRY + HB). Differently from the original COMBINE method, the STE, ELE, DRY and HB interaction energies were calculated by means of the AutoDockTools Python utilities using the AutoDock 4.2 force field [76] directly on the Mpro-inhibitor complexes [43]. The combined interactions were block scaled similarly to the procedure described by Ortiz et al. [77]. The combination that led to the model with the highest statistical coefficients was then optimized by means of a simulated annealing feature selection (SAFS) algorithm as implemented in the Py-ComBinE web app. ...
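The count of 15 in the excerpt above follows from taking every non-empty subset of the four interaction types (2^4 - 1 = 15). A short sketch that enumerates them, using only the labels quoted in the excerpt; the function name `term_combinations` is a hypothetical helper and is not part of the Py-ComBinE app:

```python
from itertools import combinations

TERMS = ["STE", "ELE", "DRY", "HB"]


def term_combinations(terms):
    """Return every non-empty subset of the terms, smallest subsets first."""
    return [
        combo
        for size in range(1, len(terms) + 1)
        for combo in combinations(terms, size)
    ]


combos = term_combinations(TERMS)
labels = [" + ".join(combo) for combo in combos]

for label in labels:
    print(label)
```

Each label corresponds to one candidate model built from the listed energy blocks.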
Article
The main protease (Mpro) of SARS-CoV-2 is the essential enzyme for maturation of functional proteins implicated in viral replication and transcription. The peculiarity of its specific cleavage site, together with its high degree of conservation among all coronaviruses, makes it an attractive target for developing broad-spectrum inhibitors with high selectivity and a tolerable safety profile. Herein is reported a combination of three-dimensional quantitative structure-activity relationships (3-D QSAR) and comparative molecular binding energy (COMBINE) analysis to build robust and predictive ligand-based and structure-based statistical models, respectively. Models were trained on experimental binding poses of co-crystallized Mpro inhibitors and validated on available literature data. By means of deep optimization, the models' goodness and robustness reached final r2/q2 values of 0.97/0.79 and 0.93/0.79 for the 3-D QSAR and COMBINE approaches, respectively, and overall predictiveness values of 0.68 and 0.57 for the SDEPPRED and AAEP metrics after application to a test set of 60 compounds covered by the training set applicability domain. Despite the different nature (ligand-based and structure-based) of the employed methods, their outcomes fully converged. Furthermore, joint ligand- and structure-based structure-activity relationships were found to be in good agreement with the chemical features of nirmatrelvir, a novel oral Mpro inhibitor that has recently received U.S. FDA emergency use authorization (EUA) for the oral treatment of patients with mild-to-moderate COVID-19. The obtained results will guide future rational design and/or virtual screening campaigns with the aim of discovering new potential anti-coronavirus lead candidates, minimizing both time and financial resources. Moreover, as most of the calculations were performed through the well-established web portal 3d-qsar.com, the results confirm the portal as a useful tool for drug design.
... Here, we used COMBINE analysis 11 to predict koff values for ligand−protein complexes. Among chemometric methods, COMBINE analysis has the advantages of using information available from experimental structures of ligand−protein complexes and identifying specific protein residues that affect the computed parameters. ...
... COMBINE analysis was used originally to predict inhibitory activities for complexes of the human synovial fluid phospholipase A2 with small molecules. 11 Moreover, it has already been applied to predict affinities or inhibitory activities for complexes of proteins with small molecules, 12−14 peptides, 15 and other proteins. 16 One of the potential drawbacks of COMBINE analysis is that the ligand−protein complex is represented by a single energy-minimized structure, which limits the representation of protein flexibility and ligand-binding modes. ...
... Further terms can be included in COMBINE analysis, such as energy penalties for changes in the protein conformation or differences in the intramolecular energy terms in bound and unbound protein conformations. 11 One of the motivations for performing COMBINE analysis to predict koff values or other biological activities is the speed of the parametrization and application of the method, especially for large data sets with tens or hundreds of inhibitors. One of the advantages of using one structure over multiple structures in COMBINE analysis is the lower computational cost, as usually only a single crystal structure and no simulations, only energy minimizations, are required when one structure is used to represent the ligand−protein complex. ...
... COMBINE analysis [46] is an approach for deriving quantitative structure-activity relationships (QSAR) by exploiting the information contained in the 3D structures of receptor-ligand complexes. In COMBINE analysis, the binding free energy, ∆G, or a related property (such as Kd, koff, kon, pKi, pIC50) is correlated with a subset of weighted interaction energy components determined from the structures of energy-minimized receptor-ligand complexes. ...
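The correlation described in the excerpt above can be written compactly as a weighted sum. The symbols below are assumptions introduced here for illustration: u_i is the i-th selected interaction energy component, w_i its regression weight, n the number of selected components, and C a constant offset.

```latex
\Delta G \;\approx\; \sum_{i=1}^{n} w_i \, u_i \;+\; C
```

The same form applies when a related property such as pKi or pIC50 replaces ∆G on the left-hand side; only the fitted weights and the constant change.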
... As more three-dimensional (3D) structures of ligand-protein complexes become available, these QSAR approaches have been extended to three dimensions to derive 3D-QSARs by incorporating information on ligand and protein interactions into the models [86,87,88,89]. COMparative BINding Energy (COMBINE) analysis is one such medium-throughput approach that has been successfully applied to a number of protein targets to derive target-specific scoring functions for the prediction of binding affinity and target selectivity [46,90,91,92,93,94,95]. ...
... The student clicks on each entry in this list, and he is taken to pages that give an overview of the methods, along with a curated list of examples of previous applications of each method, with links to the relevant journal articles. The student selects COMBINE analysis [46] as the method he is interested in, and he then follows the link to the tutorial that describes how to perform COMBINE analysis on his data (Figure 6.4, red boxes, anticlockwise). ...
Thesis
Drug-receptor binding kinetics are defined by the rates at which a given drug associates with and dissociates from its binding site on its macromolecular receptor. The lead optimization stage of drug discovery programs usually emphasizes optimizing the affinity (as described by the equilibrium dissociation constant, Kd) of a drug, which depends on the strength of its binding to a specific target. Since affinity is optimized under equilibrium conditions, it does not always ensure higher potency in vivo. There has been a growing consensus that, in addition to Kd, kinetic parameters (kon and koff) should be optimized to improve the chances of a good clinical outcome. However, current understanding of the physicochemical features that contribute to differences in binding kinetics is limited. Experimental methods that are used to determine kinetic parameters for drug binding and unbinding are often time-consuming and labor-intensive. Therefore, robust, high-throughput in silico methods are needed to predict binding kinetic parameters and to explore the mechanistic determinants of drug-protein binding. As the experimental data on drug-binding kinetics are continuously growing and the number of crystallographic structures of ligand-receptor complexes is also increasing, methods to compute three-dimensional (3D) Quantitative Structure-Kinetics Relationships (QSKRs) offer great potential for predicting kinetic rate constants for new compounds. COMparative BINding Energy (COMBINE) analysis is one example of such an approach that was developed to derive target-specific scoring functions based on molecular mechanics calculations. It has been used extensively to predict properties such as binding affinity, target selectivity, and substrate specificity. In this thesis, I made the first application of COMBINE analysis to derive Quantitative Structure-Kinetics Relationships (QSKRs) for the dissociation rates.
I obtained models for koff of inhibitors of HIV-1 protease and heat shock protein 90 (HSP90) with very good predictive power and identified the key ligand-receptor interactions that contribute to the variance in binding kinetics. With technological and methodological advances, the use of all-atom unbiased Molecular Dynamics (MD) simulations can allow sampling up to the millisecond timescale and investigation of the kinetic profile of drug binding and unbinding to a receptor. However, the residence times of drug-receptor complexes are usually longer than the timescales that are feasible to simulate using conventional molecular dynamics techniques. Enhanced sampling methods can allow faster sampling of protein and ligand dynamics, thereby allowing MD techniques to be applied to study longer timescale processes. I have evaluated the application of Tau-Random Acceleration Molecular Dynamics (Tau-RAMD), an enhanced sampling method based on MD, to compute the relative residence times of a series of compounds binding to Haspin kinase. A good correlation (R2 = 0.86) was observed between the computed residence times and the experimental residence times of these compounds. I also performed interaction energy calculations, both at the quantum chemical level and at the molecular mechanics level, to explain the experimental observation that the residence times of kinase inhibitors can be prolonged by introducing halogen-aromatic pi interactions between halogen atoms of inhibitors and aromatic residues at the binding site of kinases. I determined different energetic contributions to this highly polar and directional halogen-bonding interaction by partitioning the total interaction energy calculated at the quantum-chemical level into its constituent energy components.
It was observed that the major contribution to this interaction energy comes from the correlation energy which describes second-order intermolecular dispersion interactions and the correlation corrections to the Hartree-Fock energy. In addition, a protocol to determine diffusional kon rates of low molecular weight compounds from Brownian Dynamics (BD) simulations of protein-ligand association was established using SDA 7 software. The widely studied test case of benzamidine binding to trypsin was used to evaluate a set of parameters and a robust set of optimal parameters was determined that should be generally applicable for computing the diffusional association rate constants of a wide range of protein-ligand binding pairs. I validated this protocol on inhibitors of several targets with varying complexity such as Human Coagulation Factor Xa, Haspin kinase and N1 Neuraminidase, and the computed diffusional association rate constants were compared with the experiments. I contributed to the development of a toolbox of computational methods: KBbox (http://kbbox.h-its.org/toolbox/), which provides information about various computational methods to study molecular binding kinetics, and different computational tools that employ them. It was developed to guide researchers on the use of the different computational and simulation approaches available to compute the kinetic parameters of drug-protein binding.
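The thesis abstract above distinguishes affinity (Kd) from the kinetic parameters kon and koff. For a simple one-step binding model these are linked by Kd = koff / kon, and the residence time is 1 / koff. The sketch below uses two hypothetical compounds (the rate constants are invented for illustration) to show that equal affinity does not imply equal residence time:

```python
import math


def dissociation_constant(k_on, k_off):
    """Kd in M from k_on (M^-1 s^-1) and k_off (s^-1), assuming one-step binding."""
    return k_off / k_on


def residence_time(k_off):
    """Mean lifetime of the complex in seconds, defined as 1 / k_off."""
    return 1.0 / k_off


# Hypothetical compounds: same Kd, 100-fold different dissociation rates.
compounds = {
    "fast_binder": {"k_on": 1.0e6, "k_off": 1.0e-2},
    "slow_binder": {"k_on": 1.0e4, "k_off": 1.0e-4},
}

summary = {
    name: (dissociation_constant(**rates), residence_time(rates["k_off"]))
    for name, rates in compounds.items()
}

for name, (kd, tau) in summary.items():
    print(f"{name}: Kd = {kd:.1e} M, residence time = {tau:.0f} s")
```

This is the motivation given in the abstract for modelling koff directly rather than relying on Kd alone.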
... Scoring functions are commonly used in structure-based drug discovery techniques for evaluating the affinity of protein-ligand complexes. The first scoring function was made available in the early 1990s, and presently, there are over a hundred scoring functions published in literature [65][66][67][68][69][70]. Scoring functions are not the most accurate methods to find an estimate of binding affinity of a protein-ligand complex as they make various approximations in order to compensate for speed, time and computational resources. ...
... For the docking and scoring algorithms to work well, certain parameters need to be taken into account. The location of the binding site is one of them and plays an important role in estimating the protein-ligand binding [66,71,78,93,101]. Certain cases have been reported where the presence of allosteric sites within the receptor molecules can influence the protein-ligand binding [102][103][104] to a substantial extent. ...
... scoring functions are DOCK [2], AutoDock [60], BAPPL [72], BAPPL-Z [73], AADS [37], ParDOCK [45], COMBINE [66], GoldScore [74], MedusaScore [75] and so on. ...
... The possible binding energy of azoxystrobin and Cytb was broken down into the contributions of each amino acid residue by using comparative binding energy analysis (Ortiz et al., 1995). Binding energy was calculated using MOE with the AMBER10:EHT force field, and the implicit solvent was the reaction field (R-Field) model. ...
... Residues associated with QoI resistance are shown in red. The possible binding energy of azoxystrobin and cytochrome b was broken down into the contributions of each amino acid residue by using comparative binding energy analysis (Ortiz et al., 1995). Binding energy was calculated in Molecular Operating Environment, version 2019.0102 ...
Article
Rust fungi are generally considered low-risk pathogens in terms of resistance to fungicides. However, Puccinia horiana, which causes white rust in chrysanthemum, has developed resistance to various fungicides via mutations in genes encoding the target proteins of fungicides. We investigated cytochrome b haplotypes of 15 Japanese isolates of P. horiana. Among them, two were wild-type, and the others carried L299F, L275F + L299F, or N256S + L299F amino acid substitutions. To the best of our knowledge, L299F and N256S + L299F haplotypes were found for the first time in this study, and isolates of each haplotype showed 22- and 222-fold higher azoxystrobin EC50 values, respectively, than their wild-type counterparts in the basidiospore germination test. Further tests confirmed cross-resistance among some quinone outside inhibitor (QoI) fungicides in N256S + L299F-harboring isolates, whereas L275F-harboring isolates remained sensitive to other QoIs. Our in planta tests revealed that the performance of azoxystrobin against mutant isolates was reduced even at the labeled concentration. The homology model predicted the involvement of L275 and L299 in the interaction between azoxystrobin and cytochrome b, while it was unlikely for N256. In some model organisms such as Saccharomyces cerevisiae, asparagine 256 is known to interact with glycine 137. Considering that G137R substitution confers azoxystrobin resistance in other phytopathogenic fungi, we suggest that N256S might also be responsible for azoxystrobin resistance in P. horiana. This is the first record of a substitution at cytochrome b N256 in azoxystrobin-resistant phytopathogenic fungi.
... The binding energy, an important indicator to measure the binding affinities, is defined as the smallest amount of energy required to remove a particle from a system of particles or to disassemble a system of particles into individual parts. 53,54 AutoDock is a suite of automated docking tools designed to predict how small molecules, such as substance or drug candidates, bind to a receptor of a known framework structure. ...
Article
The potential of chiral metal-organic frameworks (MOFs) for circularly polarized (CP) optics has been largely unexplored. Herein, we have successfully deposited monolithic and highly oriented chiral MOF thin films prepared by a layer-by-layer method (referred to as surface-coordinated MOF thin films, SURMOF) to fabricate CP photodetection devices and distinguish enantiomers. The helicity-sensitive absorption induced by a pair of enantiopure oriented SURMOF was found to be excellent, with an anisotropy factor reaching 0.41. Moreover, the chiral SURMOFs exhibited a pronounced difference in the uptake of the l- and d-tryptophan enantiomers. To demonstrate the potential of these novel MOF thin films for chirality analysis, we fabricated a portable sensor device that allows for chiral recognition by monitoring the photocurrent signals. Our findings not only introduce a new concept of using chiral building blocks for realizing direct CP photodetectors but also provide a blueprint for novel devices in chiral optics.
... (1), but broadly speaking they can be organized into a few distinct categories [12]. The first of these are the physics-based scoring functions that attempt to model the terms on the right-hand side of Eq. (1) explicitly [13][14][15][16][17][18][19]. They typically have a force field-like functional form and assume additivity among the terms. ...
Article
The advent of computational drug discovery holds the promise of significantly reducing the effort of experimentalists, along with monetary cost. More generally, predicting the binding of small organic molecules to biological macromolecules has far-reaching implications for a range of problems, including metabolomics. However, problems such as predicting the bound structure of a protein–ligand complex along with its affinity have proven to be an enormous challenge. In recent years, machine learning-based methods have proven to be more accurate than older methods, many based on simple linear regression. Nonetheless, there remains room for improvement, as these methods are often trained on a small set of features, with a single functional form for any given physical effect, and often with little mention of the rationale behind choosing one functional form over another. Moreover, it is not entirely clear why one machine learning method is favored over another. In this work, we endeavor to undertake a comprehensive effort towards developing high-accuracy, machine-learned scoring functions, systematically investigating the effects of machine learning method and choice of features, and, when possible, providing insights into the relevant physics using methods that assess feature importance. Here, we show synergism among disparate features, yielding adjusted R2 with experimental binding affinities of up to 0.871 on an independent test set and enrichment for native bound structures of up to 0.913. When purely physical terms that model enthalpic and entropic effects are used in the training, we use feature importance assessments to probe the relevant physics and hopefully guide future investigators working on this and other computational chemistry problems.
... Figure 3 illustrates a docking interaction of the Leu-APDS analog with the binding pocket on IDE. Since binding energy is the amount of energy required for a stable ligand-target interaction (Ortiz et al., 1995), a stable interaction with therapeutic benefits is implied when the binding energy is low (Wang and Wade, 2002). RMSD refers to the distance between a ligand and a target and is measured in Å. ...
Thesis
Application of computer-aided drug design tools in the discovery of novel anti-glycemic agents against type 2 diabetes mellitus
... The computational screening approach enables the rapid discovery of promising compounds for developing effective therapeutics against SARS-CoV-2. It has been shown that less binding energy denotes more affinity of a compound for binding to its target (Ortiz et al., 1995). To elucidate the binding affinity to RdRp, the library of compounds was docked against RdRp using the PyRx tool (Dallakyan and Olson, 2015). ...
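The statement in the excerpt above that lower binding energy denotes higher affinity follows from the standard thermodynamic relation ΔG = RT ln Kd: a more negative binding free energy corresponds to a smaller dissociation constant. A minimal sketch of the conversion, assuming a temperature of 298.15 K and energies in kcal/mol. Docking scores are only rough estimates of ΔG, so Kd values derived from them should be treated as indicative at best.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_delta_g(delta_g, temperature=298.15):
    """Dissociation constant (M) from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature))


def delta_g_from_kd(kd, temperature=298.15):
    """Binding free energy in kcal/mol from a dissociation constant in M."""
    return R_KCAL * temperature * math.log(kd)


# Hypothetical binding energies for two compounds against the same target.
for delta_g in (-8.0, -10.0):
    print(f"dG = {delta_g:5.1f} kcal/mol -> Kd = {kd_from_delta_g(delta_g):.2e} M")
```

At this temperature, each 1.36 kcal/mol decrease in ΔG lowers Kd by roughly a factor of ten.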
Article
The Coronavirus Disease 2019 (COVID-19) caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) became a pandemic, resulting in exponentially increased mortality globally, and scientists all over the world are struggling to find suitable solutions to combat it. Multiple repurposed drugs are in, or have recently completed, several clinical trials. However, none of them shows any promising effect in combating COVID-19. Therefore, developing an effective drug is an unmet global need. RdRp (RNA dependent RNA polymerase) plays a pivotal role in viral replication. Therefore, it is considered a prime target of drugs that may treat COVID-19. In this study, we have screened a library of compounds, containing approved RdRp inhibitor drugs that were or are in use to treat other viruses (favipiravir, sofosbuvir, ribavirin, lopinavir, tenofovir, ritonavir, galidesivir and remdesivir) and their structural analogues, in order to identify potential inhibitors of SARS-CoV-2 RdRp. Extensive screening, molecular docking and molecular dynamics show that five structural analogues have notable inhibitory effects against RdRp of SARS-CoV-2. Importantly, comparative protein-antagonist interaction analysis revealed that these compounds fit well in the pocket of RdRp. ADMET analysis of these compounds suggests their potency as drug candidates. Our identified compounds may serve as potential therapeutics for COVID-19.
... 31 The test set results demonstrated underestimation of the off-rates and predominantly overestimation of the on-rates. Also, for the HSP90 and HIV-1 protease inhibitors, comparative binding energy (COMBINE) analysis 32,33 has been applied and demonstrated slightly better results for off-rates. 34 In the other approach, VolSurf descriptors were analyzed by PLS regression for HIV-1 protease inhibitors, showing comparable results. ...
Article
Derivation of structure-kinetics relationships can help rational design and development of new small-molecule drug candidates with desired residence times. Efforts are now being directed toward the development of efficient computational methods. Currently, there is a lack of solid, high-throughput binding kinetics prediction approaches on bigger datasets. We present a prediction method for binding kinetics based on the machine learning analysis of protein-ligand structural features, which can serve as a baseline for more sophisticated methods utilizing molecular dynamics (MD). We showed that the random forest algorithm is capable of learning the protein binding site secondary structure and backbone/side-chain features to predict the binding kinetics of protein-ligand complexes but still with inferior performance to that of MD-based descriptor analysis. MD simulations had been applied to a limited number of targets and a series of ligands in terms of kinetics analysis, and we believe that the developed approach may guide new studies. The method was trained on a newly curated database of 501 protein-ligand unbinding rate constants, which can also be used for testing and training the binding kinetics prediction models.
... The computational screening approach enables the rapid discovery of promising compounds for developing effective therapeutics against SARS-CoV-2. It has been shown that less binding energy denotes more affinity of a compound for binding to its target (Ortiz et al., 1995). To elucidate the binding affinity to RdRP, the library of compounds was docked against RdRP using the PyRx tool (Dallakyan and Olson, 2015). ...
Preprint
It’s been more than 8 months since COVID-19 became a pandemic, and scientists all over the world are struggling to find suitable solutions to combat it. Multiple repurposed drugs are in, or have recently completed, several trials. However, none of them shows any promising effect in combating COVID-19. Therefore, developing an effective drug is an unmet global need. RdRP (RNA dependent RNA polymerase) plays a pivotal role in viral replication; therefore, it is considered a prime target of drugs that may treat COVID-19. In this study, we have screened a library of compounds, containing approved RdRP inhibitor drugs in use to treat other viruses (Favipiravir, Sofosbuvir, Ribavirin, Lopinavir, Tenofovir, Ritonavir, Galidesivir and Remdesivir) and their structural homologues, in order to identify potential inhibitors of SARS-CoV-2 RdRP. Extensive screening, molecular docking and molecular dynamics show that five structural analogues have notable inhibitory effects against RdRP of SARS-CoV-2. Importantly, comparative protein-antagonist interaction analysis revealed that these compounds fit well in the pocket of RdRP. ADMET analysis of these compounds suggests their potency as drug candidates. Our identified compounds may serve as potential therapeutics for COVID-19.
... Ortiz et al. [44] in 1995 developed a technique called COMBINE (Comparative Binding Energy Analysis) to make use of the structural data from ligand-protein complexes within a 3D-QSAR methodology. The authors developed this method based on the hypothesis that the free energy of binding can be correlated with a subset of energy components calculated from the structures of receptors and ligands in bound and unbound forms. ...
Chapter
One of the main approaches in cheminformatics, the so-called quantitative structure-activity relationship (QSAR) approach, nowadays plays an important role in lead structure optimization, as well as in the prediction of various physicochemical properties, biological activity, and environmental toxicology. One of the recent developments in QSAR approaches for nanostructures is three-dimensional QSAR. Over the last two decades, 3D-QSAR has been successfully applied to various datasets, especially of enzyme and receptor ligands. The application of 3D-QSAR to nanostructured materials is still at an early stage. Often, 3D-QSAR studies go together with protein-ligand docking studies, and this combination works synergistically, improving the accuracy of prediction. Carbon nanostructures, such as fullerenes and carbon nanotubes, are nanomaterials with specific properties that make them useful in pharmacological applications. In this methodological review, we outline recent advances in the development and application of 3D-QSAR and protein-ligand docking approaches in studies of nanostructured materials, such as fullerenes and carbon nanotubes.
... The Schrödinger software suite offers AutoQSAR for 3D-QSAR modeling [79]. In order to refine ligand-based 3D QSAR models, receptor-based 3D-QSAR emerged, including COMBINE [80] and AFMoC [81]. ...
Article
Natural products (NPs) are an indispensable source of drugs and they have a better coverage of the pharmacological space than synthetic compounds, owing to their high structural diversity. The prediction of their interaction profiles with druggable protein targets remains a major challenge in modern drug discovery. Experimental (off-)target predictions of NPs are cost- and time-consuming, whereas computational methods are much faster and cheaper. As a result, computational predictions are preferentially used in the first instance for NP profiling, prior to experimental validations. This review covers recent advances in computational approaches which have been developed to aid the annotation of unknown drug-target interactions (DTIs), by focusing on three broad classes, namely: ligand-based, target-based, and target-ligand-based (hybrid) approaches. Computational DTI prediction methods have the potential to significantly advance the discovery and development of novel selective drugs exhibiting minimal side effects. We highlight some inherent caveats of these methods which must be overcome to enable them to realize their full potential, and a future outlook is given.
... Basically, one can classify SF methods into four different types, namely force-field-based SF, knowledge-based SF, empirical-based SF, and machine learning-based SF. 8 The force-field-based SFs commonly emphasize van der Waals (vdW) interactions, electrostatic energy, hydrogen bonding descriptions, solvation effects, and so on. The well-known SFs for this category are COMBINE 9 and MedusaScore, 10 to name only a few. Typical examples of knowledge-based SFs are, 11 DrugScore, 12 KECSA, 13 and IT-Score, 14 which utilize protein-ligand pairwise statistical potentials in an additive manner to predict binding affinities. ...
Preprint
We present the performance of our mathematical deep learning (MathDL) models for D3R Grand Challenge 4 (GC4). This challenge involves pose prediction, affinity ranking, and free energy estimation for beta secretase 1 (BACE) as well as affinity ranking and free energy estimation for Cathepsin S (CatS). We have developed advanced mathematics, namely differential geometry, algebraic graph, and/or algebraic topology, to accurately and efficiently encode high dimensional physical/chemical interactions into scalable low-dimensional rotational and translational invariant representations. These representations are integrated with deep learning models, such as generative adversarial networks (GAN) and convolutional neural networks (CNN) for pose prediction and energy evaluation, respectively. Overall, our MathDL models achieved the top place in pose prediction for BACE ligands in Stage 1a. Moreover, our submissions obtained the highest Spearman correlation coefficient on the affinity ranking of 460 CatS compounds, and the smallest centered root mean square error on the free energy set of 39 CatS molecules. It is worth mentioning that our method for docking pose prediction has improved significantly over our previous ones.
... The drug binding affinity can be estimated in two ways: from the fitness score of the drug bound to the target molecule during the docking process, or from Gibbs free energy calculations. The more negative the value, the more effective the drug is considered to be [26]. In our calculation, the values for formate and acetate are -3078.01 ...
Article
Morphine is a painkiller drug that is taken either by mouth or by injection. In this context, morpholinium ionic liquids (ILs) are the most applicable molecules because of their liquid range, so the thermochemistry, chemical reactivity, and biological interactions of morphonium formate and acetate ILs were studied theoretically using HyperChem 8.0.10. Thermodynamic parameters (free energy, entropy, dipole moment, binding energy, nuclear energy, electronic energy, and heat of formation), QSAR properties (charge density, surface area grid, volume, LogP, polarizability, refractivity, and molecular mass), and reactivity properties (HOMO, LUMO, HOMO-LUMO gap, ionization potential, and electron affinity) were determined with the same program. Morphonium formate is less biologically active than morphonium acetate because their LogP values are 1.19 and -0.66, respectively. On the other hand, the HOMO-LUMO gap is almost the same at all transition levels, which indicates similar chemical reactivity. The binding energies of the two molecules are -3078.01 and -3351.25 kcal/mol, respectively. Vibrational spectroscopy data support their identification and characterization.
... Chiu and Xie (2016) went beyond a static model by accounting for flexibility with a coarse-grained normal mode analysis to classify HIV-1 protease inhibitors into binding kinetics classes using a multi-target ML approach. Comparative Binding Energy (COMBINE) analysis (Ortiz et al., 1995; Perez et al., 1998), in which PLS (Partial Least Squares regression, also known as Projection to Latent Structures) is used to reweight components of the bound protein-ligand interaction energies to predict binding properties, has recently been applied to datasets of HSP90 and HIV-1 protease inhibitors and was found to give models with good predictive ability for residence time. It should be noted that the COMBINE analysis method was originally developed for the prediction of binding affinity for congeneric series of compounds. ...
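The PLS reweighting of interaction-energy components mentioned in this excerpt can be sketched as follows. This is a minimal illustration under stated assumptions, not the published COMBINE implementation: it assumes a single-response PLS (PLS1) fitted with the NIPALS algorithm, and the energy matrix `X`, the activities `y`, and the helper names `fit_pls1` and `predict_pls1` are all hypothetical.

```python
from math import sqrt


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model with the NIPALS algorithm.

    X: list of rows, one per complex; each column is one interaction-energy term.
    y: list of measured activities (one per complex).
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Work on mean-centered copies; E and f are deflated at every step.
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    f = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict the activity of one complex from its energy terms."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat


# Hypothetical data: two energy terms (kcal/mol) for five complexes, with
# activities generated from an exact linear relation for illustration only.
X = [[-1.0, 0.5], [-2.0, 1.5], [-0.5, 2.0], [-3.0, 0.2], [-1.5, 1.0]]
y = [2.0 * a - 1.0 * b + 3.0 for a, b in X]
model = fit_pls1(X, y, n_components=2)
new_prediction = predict_pls1(model, [-1.2, 0.7])
```

With as many latent components as independent columns, PLS reproduces the ordinary least-squares fit, which is why this toy data set is recovered exactly. In a real application the number of components would typically be chosen by cross-validation.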
Article
Drug-target residence times can impact drug efficacy and safety, and are therefore increasingly being considered during lead optimization. For this purpose, computational methods to predict residence times, τ, for drug-like compounds and to derive structure-kinetic relationships are desirable. A challenge for approaches based on molecular dynamics (MD) simulation is the fact that drug residence times are typically orders of magnitude longer than computationally feasible simulation times. Therefore, enhanced sampling methods are required. We recently reported one such approach: the τRAMD procedure for estimating relative residence times by performing a large number of random acceleration MD (RAMD) simulations in which ligand dissociation occurs in times of about a nanosecond due to the application of an additional randomly oriented force to the ligand. The length of the RAMD simulations is used to deduce τ. The RAMD simulations also provide information on ligand egress pathways and dissociation mechanisms. Here, we describe a machine learning approach to systematically analyze protein-ligand binding contacts in the RAMD trajectories in order to derive regression models for estimating τ and to decipher the molecular features leading to longer τ values. We demonstrate that the regression models built on the protein-ligand interaction fingerprints of the dissociation trajectories result in robust estimates of τ for a set of 94 drug-like inhibitors of heat shock protein 90 (HSP90), even for the compounds for which the length of the RAMD trajectories does not provide a good estimation of τ. Thus, we find that machine learning helps to overcome inaccuracies in the modeling of protein-ligand complexes due to incomplete sampling or force field deficiencies. Moreover, the approach facilitates the identification of features important for residence time. 
In particular, we observed that interactions of the ligand with the sidechain of F138, which is located on the border between the ATP binding pocket and a hydrophobic transient sub-pocket, play a key role in slowing compound dissociation. We expect that the combination of the τRAMD simulation procedure with machine learning analysis will be generally applicable as an aid to target-based lead optimization.
Article
Binding kinetic properties of protein–ligand complexes are crucial factors affecting drug potency. Nevertheless, current in silico techniques are insufficient for providing accurate and robust predictions of binding kinetic properties. To this end, this work develops a variety of binding kinetic models for predicting a critical binding kinetic property, the dissociation rate constant, using eight machine learning (ML) methods (Bayesian Neural Network (BNN), partial least squares regression, Bayesian ridge, Gaussian process regression, principal component regression, random forest, support vector machine, extreme gradient boosting) and descriptors based on the van der Waals/electrostatic interaction energies. These eight models are applied to two case studies involving HSP90 and RIP1 kinase inhibitors. The regression results of both case studies indicate that the BNN model achieves state-of-the-art prediction accuracy (HSP90: $R_{\text{test}}^{2}=0.947$, $\mathrm{MAE}_{\text{test}}=0.184$, $r_{\text{test}}=0.976$, $\mathrm{RMSE}_{\text{test}}=0.220$; RIP1 kinase: $R_{\text{test}}^{2}=0.745$, $\mathrm{MAE}_{\text{test}}=0.188$, $r_{\text{test}}=0.961$, $\mathrm{RMSE}_{\text{test}}=0.290$) in comparison with the other seven ML models.
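The test-set statistics quoted in this abstract (R², MAE, Pearson r, RMSE) can be computed as in the sketch below. It assumes that R² is the coefficient of determination, 1 − SS_res/SS_tot, which may differ from the definition the authors used; the function name `regression_metrics` and the example values are hypothetical.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE and Pearson r for paired observations."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    return {
        "r2": 1.0 - ss_res / ss_tot,  # coefficient of determination (assumed)
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "rmse": sqrt(ss_res / n),
        "pearson_r": cov / sqrt(ss_tot * ss_pred),
    }


# Hypothetical measured and predicted values, for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

Note that R² defined this way and the squared Pearson r need not agree on a test set, which is consistent with the abstract reporting both.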
Article
Reliable target-ligand binding thermodynamics data are essential for successful drug design and molecular engineering projects. Besides experimental methods, a number of theoretical approaches have been introduced for the generation of...
Article
Bile salt hydrolases (BSHs) are currently being investigated as target enzymes for metabolic regulators in humans and as growth promoters in farm animals. Understanding structural features underlying substrate specificity is necessary for inhibitor design. Here, we used a multidisciplinary workflow including mass spectrometry, mutagenesis, molecular dynamic simulations, machine learning, and crystallography to demonstrate substrate specificity in Lactobacillus salivarius BSH, the most abundant enzyme in human and farm animal intestines. We show the preference of substrates with a taurine head and a dehydroxylated sterol ring for hydrolysis. A regression model that correlates the relative rates of hydrolysis of various substrates in various enzyme mutants with the residue-substrate interaction energies guided the identification of structural determinants of substrate binding and specificity. In addition, we found T208 from another BSH protomer regulating the hydrolysis. The designed workflow can be used for fast and comprehensive characterization of enzymes with a broad range of substrates.
Article
G protein-coupled receptors (GPCR) are integral membrane proteins of considerable interest as targets for drug development due to their role in transmitting cellular signals in a multitude of biological processes. Of the six classes categorizing GPCR (A, B, C, D, E, and F), class A contains the largest number of therapeutically relevant GPCR. Despite their importance as drug targets, many challenges exist for the discovery of novel class A GPCR ligands serving as drug precursors. Though knowledge of the structural and functional characteristics of GPCR has grown significantly over the past 20 years, a large portion of GPCR lack reported, experimentally determined structures. Furthermore, many GPCR have no known endogenous and/or synthetic ligands, limiting further exploration of their biochemical, cellular, and physiological roles. While many successes in GPCR ligand discovery have resulted from experimental high-throughput screening, computational methods have played an increasingly important role in GPCR ligand identification in the past decade. Here we discuss computational techniques applied to GPCR ligand discovery. This review summarizes class A GPCR structure/function and provides an overview of many obstacles currently faced in GPCR ligand discovery. Furthermore, we discuss applications and recent successes of computational techniques used to predict GPCR structure as well as present a summary of ligand- and structure-based methods used to identify potential GPCR ligands. Finally, we discuss computational hit list generation and refinement and provide comprehensive workflows for GPCR ligand identification.
Thesis
Towards reducing the timeframe and the high attrition rate in small-molecule drug discovery, there is growing interest in integrating experimental data and computational methods to decipher the molecular mechanisms through which bioactive compounds interact with their target proteins. The goal of this dissertation is to develop and apply several of these data-intensive integrative approaches. In the first study, an update was made for StreptomeDB, a chemogenomics database describing the physicochemical and biological properties of metabolites originating from bacteria of the genus Streptomyces. Substantial improvements were made over its forerunners, especially in terms of data content (~2500 new metabolites added) and interoperability (hyperlinks to several spectral, (bio)chemical and chemical vendor databases, and to a genome-based metabolite prediction server). Next, a novel pharmacophore-based target prediction tool was developed, named ePharmaLib. It was retrospectively validated using StreptomeDB metabolites. As proof-of-concept, ePharmaLib predictions were complemented with bioassay experiments to identify the human purine nucleoside phosphorylase as a hitherto unknown protein target of the metabolite called neopterin. In another study, an in-depth structural and statistical analysis was carried out using the solved 3D structures of aromatic-cage-containing proteins complexed with their cationic ligands. As a follow-up, the scope of the aforementioned study was expanded to include ligands forming π-π or hydrophobic contacts with aromatic cages. Ultimately, the collected data set was integrated into a web database named AroCageDB. In the fifth study, the solved 3D structures of covalent protein–ligand complexes were manually and expertly annotated from the Protein Data Bank and assimilated into a dedicated web database named CovPDB.
Lastly, in the sixth study, an integrative drug repurposing approach based on computational modeling and in vitro enzymatic assays was carried out to repurpose CovPDB serine-targeted covalent inhibitors. This led to the identification of the phenylboronic acid BC-11 as a nanomolar covalent inhibitor of the human transmembrane protease serine 2, with a unique selectivity profile for serine proteases ascribable to its boronic acid warhead. Moreover, BC-11 showed significant inhibition of SARS-CoV-2 (Omicron variant) spike pseudotyped particles in a cell-based entry assay, thus serving as a good starting point for further structural optimization to develop novel COVID-19 antivirals.
Chapter
Structure-based (SBDD) and ligand-based (LBDD) drug design are extremely important and active areas of research in both the academic and commercial realms. This book provides a complete snapshot of the field of computer-aided drug design and associated experimental approaches. Topics covered include X-ray crystallography, NMR, fragment-based drug design, free energy methods, docking and scoring, linear-scaling quantum calculations, QSAR, pharmacophore methods, computational ADME-Tox, and drug discovery case studies. A variety of authors from academic and commercial institutions all over the world have contributed to this book, which is illustrated with more than 200 images. This is the only book to cover the subject of structure and ligand-based drug design, and it provides the most up-to-date information on a wide range of topics for the practising computational chemist, medicinal chemist, or structural biologist. Professor Kenneth Merz has been selected as the recipient of the 2010 ACS Award for Computers in Chemical & Pharmaceutical Research that recognizes the advances he has made in the use of quantum mechanics to solve biological and drug discovery problems.
Chapter
Since their introduction about four decades ago, docking and scoring have become the very heart of structure-based drug design. The past 10–20 years have seen a plethora of docking and scoring tools successfully integrated into drug discovery pipelines. Interestingly, while artificial intelligence is now receiving significant attention in the computational modeling arena, docking and scoring have long utilized these techniques. This comprehensive summary highlights some of the most significant achievements of docking and scoring, reviews the current status, provides descriptions of many of the tools available, comments on some of the outstanding challenges facing the paradigm, and offers perspectives and advice on best practices for users. While significant development is still needed in docking of flexible molecules and accurate Gibbs free energy predictions, docking and scoring are very useful when handled by experienced practitioners, but less so if treated as a “black box.” Lastly, we present a hypothetical case so beginners may appreciate the nuances of setting up a docking study. Focus in docking is now shifting toward parallel applications, i.e. protein–protein, protein–oligosaccharide, protein–DNA, or protein–RNA docking and polypharmacology. In summary, this article is intended to elucidate the nuances of the subject, while providing guidelines for practical implementation of effective workflows in drug discovery and structural biology.
Chapter
Chemoinformatics is broadly a scientific discipline encompassing the design, creation, organization, management, retrieval, analysis, dissemination, visualization and use of chemical information. It is distinct from other computational molecular modeling approaches in that it uses unique representations of chemical structures in the form of multiple chemical descriptors; has its own metrics for defining similarity and diversity of chemical compound libraries; and applies a wide array of statistical, data mining and machine learning techniques to very large collections of chemical compounds in order to establish robust relationships between chemical structure and its physical or biological properties. Chemoinformatics addresses a broad range of problems in chemistry and biology; however, the most commonly known applications of chemoinformatics approaches have been arguably in the area of drug discovery where chemoinformatics tools have played a central role in the analysis and interpretation of structure-property data collected by the means of modern high throughput screening. Early stages in modern drug discovery often involved screening small molecules for their effects on a selected protein target or a model of a biological pathway. In the past fifteen years, innovative technologies that enable rapid synthesis and high throughput screening of large libraries of compounds have been adopted in almost all major pharmaceutical and biotech companies. As a result, there has been a huge increase in the number of compounds available on a routine basis to quickly screen for novel drug candidates against new targets/pathways. In contrast, such technologies have rarely become available to the academic research community, thus limiting its ability to conduct large scale chemical genetics or chemical genomics research. However, the landscape of publicly available experimental data collection methods for chemoinformatics has changed dramatically in very recent years. 
The term "virtual screening" is commonly associated with methodologies that rely on the explicit knowledge of three-dimensional structure of the target protein to identify potential bioactive compounds. Traditional docking protocols and scoring functions rely on explicitly defined three dimensional coordinates and standard definitions of atom types of both receptors and ligands. Albeit reasonably accurate in many cases, conventional structure based virtual screening approaches are relatively computationally inefficient, which has precluded them from screening really large compound collections. Significant progress has been achieved over many years of research in developing many structure based virtual screening approaches. This book is the first monograph that summarizes innovative applications of efficient chemoinformatics approaches towards the goal of screening large chemical libraries. The focus on virtual screening expands chemoinformatics beyond its traditional boundaries as a synthetic and data-analytical area of research towards its recognition as a predictive and decision support scientific discipline. 
The approaches discussed by the contributors to the monograph rely on chemoinformatics concepts such as:
- representation of molecules using multiple descriptors of chemical structures
- advanced chemical similarity calculations in multidimensional descriptor spaces
- the use of advanced machine learning and data mining approaches for building quantitative and predictive structure-activity models
- the use of chemoinformatics methodologies for the analysis of drug-likeness and property prediction
- the emerging trend of combining chemoinformatics and bioinformatics concepts in structure-based drug discovery
The chapters of the book are organized in a logical flow that a typical chemoinformatics project would follow, from structure representation and comparison to data analysis and model building to applications of structure-property relationship models for hit identification and chemical library design. It opens with an overview of modern methods of compound library design, followed by a chapter devoted to molecular similarity analysis. Four sections describe virtual screening based on the use of molecular fragments, 2D pharmacophores and 3D pharmacophores. Application of fuzzy pharmacophores for library design is the subject of the next chapter, followed by a chapter dealing with QSAR studies based on local molecular parameters. Probabilistic approaches based on 2D descriptors in the assessment of biological activities are also described, with an overview of the modern methods and software for ADME prediction. The book ends with a chapter describing a new approach of coding the receptor binding sites and their respective ligands in multidimensional chemical descriptor space that affords an interesting and efficient alternative to traditional docking and screening techniques.
Ligand-based approaches, which are the focus of this work, are more computationally efficient than structure-based virtual screening, and there are very few books related to modern developments in this field. The focus on extending the experience accumulated in traditional areas of chemoinformatics research, such as Quantitative Structure Activity Relationships (QSAR) or chemical similarity searching, towards virtual screening makes the theme of this monograph essential reading for researchers in the area of computer-aided drug discovery. However, due to its generic data-analytical focus there will be a growing application of chemoinformatics approaches in multiple areas of chemical and biological research such as synthesis planning, nanotechnology, proteomics, physical and analytical chemistry and chemical genomics.
Chapter
Comparative Binding Energy (COMBINE) analysis is an approach for deriving a target-specific scoring function to compute binding free energy, drug-binding kinetics, or a related property by exploiting the information contained in the three-dimensional structures of receptor–ligand complexes. Here, we describe the process of setting up and running COMBINE analysis to derive a Quantitative Structure-Kinetics Relationship (QSKR) for the dissociation rate constants (koff) of inhibitors of a drug target. The derived QSKR model can be used to estimate residence times (τ, τ=1/koff) for similar inhibitors binding to the same target, and it can also help to identify key receptor–ligand interactions that distinguish inhibitors with short and long residence times. Herein, we demonstrate the protocol for the application of COMBINE analysis on a dataset of 70 inhibitors of heat shock protein 90 (HSP90) belonging to 11 different chemical classes. The procedure is generally applicable to any drug target with known structural information on its complexes with inhibitors.
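The relation τ = 1/koff stated in this abstract can be illustrated with a short sketch. The function name `residence_time`, the compound labels, and the koff values below are hypothetical and serve only to show the conversion and the resulting ranking.

```python
def residence_time(k_off):
    """Convert a dissociation rate constant (1/s) into a residence time (s)."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off


# Hypothetical dissociation rate constants in 1/s, for illustration only.
k_off_values = {"inhibitor_A": 0.5, "inhibitor_B": 0.01, "inhibitor_C": 0.002}

# Residence times in seconds, and compounds ranked from longest to shortest.
tau = {name: residence_time(k) for name, k in k_off_values.items()}
ranking = sorted(tau, key=tau.get, reverse=True)
```

Because τ is the reciprocal of koff, the compound with the smallest dissociation rate constant has the longest residence time.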
Article
The review aims to present a classification and applicability analysis of methods for preliminary molecular modelling for targeted organic, catalytic and biocatalytic synthesis. The following three main approaches are considered as a primary classification of the methods: modelling of the target – ligand coordination without structural information on both the target and the resulting complex; calculations based on experimentally obtained structural information about the target; and dynamic simulation of the target – ligand complex and the reaction mechanism with calculation of the free energy of the reaction. The review is meant for synthetic chemists to be used as a guide for building an algorithm for preliminary modelling and synthesis of structures with specified properties. The bibliography includes 353 references.
Chapter
Molecular modeling and simulation play a central role in academic and industrial research focused on physico-chemical properties and processes. The efforts carried out in this field have crystallized in a variety of models, simulation methods, and computational techniques that are examining the relationship between the structure, dynamics and functional role of biomolecules and their interactions. In particular, there has been a huge advance in the understanding of the molecular determinants that mediate the interaction between small compounds acting as ligands and their macromolecular targets. This book provides an updated description of the advances experienced in recent years in the field of molecular modeling and simulation of biomolecular recognition, with particular emphasis towards the development of efficient strategies in structure-based drug design.
Article
In three-dimensional (3D)-quantitative structure-activity relationship (QSAR) analysis, the chemical structure of a studied molecule is typically optimized assuming its presence in a vacuum environment. However, in practical scenarios, the environment of even the most stable molecules contains water, proteins, and other species; therefore, their actual structures significantly differ from those in vacuum and have multiple structures. Herein, both two-dimensional and 3D molecular descriptors, which accepted the existence of multiple conformers, were calculated, and a conformer-based 3D-QSAR model (C3D-QSAR) that considered the chemical structures of conformers was developed. The prediction accuracy of the C3D-QSAR method determined by analyzing the data sets obtained for the angiotensin-converting enzyme and dihydrofolate reductase inhibitors was found to be higher than those of the existing QSAR models.
Article
Introduction: Despite the availability of FDA approved inhibitors of HIV protease, numerous efforts are still ongoing to achieve ‘near-perfect’ drugs devoid of characteristic adverse side effects, toxicities, and mutational resistance. While experimental methods have been plagued with huge consumption of time and resources, there has been an incessant shift towards the use of computational simulations in HIV protease inhibitor drug discovery. Areas covered: Herein, the authors review the numerous applications of 3D-QSAR modeling methods over recent years relative to the design of new HIV protease inhibitors from a series of experimentally derived compounds. Also, the augmentative contributions of molecular docking are discussed. Expert opinion: Efforts to optimize 3D QSAR and molecular docking for HIV-1 drug discovery are ongoing, which could further incorporate inhibitor motions at the active site using molecular dynamics parameters. Also, highly predictive machine learning algorithms such as random forest, K-means, decision trees, linear regression, hierarchical clustering, and Bayesian classifiers could be employed.
Chapter
With the increasing price tag for bringing a drug molecule to the market, there is an increased desire for pharmaceutical companies to incorporate sustainable and environmentally benign approaches. Traditionally, medicinal chemists have tried to utilize several green techniques to improve the process of drug discovery. Sometimes, this requires modifying existing processes or substituting them with new synthetic routes that are more environmentally benign. Methods have included the selection of solvents and/or catalysts, or the replacement of reaction steps with sustainable reaction conditions. The advent of powerful computer technologies offers opportunities to develop more sustainable ways to do medicinal chemistry with predictive algorithms. Algorithms used to calculate the process mass intensity (PMI) in any pharmaceutical process chemistry help attenuate product sustainability. Cheminformatics tools like quantitative structure-activity relationship (QSAR) or computational tools like structure- and ligand-based drug design and virtual screening help design more active molecules in a greener way. Predictive tools for molecular properties like absorption, distribution, metabolism, and excretion (ADME), solubility, and lipophilicity are essential in modern drug discovery processes to create more drug-like molecules. These green and sustainable tools have made modern medicinal chemistry more robust and efficient.
Article
Receptor-based QSAR approaches can enumerate the energetic contributions of amino acid residues towards ligand binding only when experimental binding affinity is associated. The structural data of protein-ligand complexes is witnessing a tremendous growth in the Protein Data Bank deposited with few entries on binding affinity. We present here a new approach to compute the Energetic CONTributions of Amino acid residues and its possible Cross-Talk (ECONTACT) to study ligand binding using per-residue energy decomposition, molecular dynamics simulations and rescoring method without the need for experimental binding affinity. This approach recognizes potential cross-talks amongst amino acid residues imparting non-additive effect to the binding affinity with evidence of correlative motions in the dynamics simulations. The protein-ligand interaction energies deduced from multiple structures are decomposed into per-residue energy terms which are employed as variables to principal component analysis and generated cross-terms. Out of sixteen cross-talks derived from eight datasets of protein-ligand systems, the ECONTACT approach is able to associate ten potential cross-talks with site-directed mutagenesis, free energy and dynamics simulations data strongly. We modeled these key determinants of ligand binding using joint probability density function (jPDF) to identify cross-talks in protein structures. The top two cross-talks identified by ECONTACT approach corroborated with the experimental findings. Furthermore, virtual screening exercise using ECONTACT models better discriminated known inhibitors from decoy molecules. This approach proposes the jPDF metric to estimate the probability of observing cross-talks in any protein-ligand complex. The source code and related resources to perform ECONTACT modeling is available freely at https://www.gujaratuniversity.ac.in/econtact/
Thesis
Virtual screening is used in drug discovery and in building toxicity prediction models. Before a screening protocol is applied, it is evaluated on a reference database. The composition of these evaluation databases is a critical point: they generally set active molecules against molecules that are presumed inactive, because inactivity data are not published. Inactive molecules nevertheless carry information. We therefore created the NR-DBIND database, which is composed solely of experimentally validated active and inactive molecules and is dedicated to nuclear receptors. Using the NR-DBIND, we studied the importance of inactive molecules in the evaluation of docking models and in the construction of pharmacophore models. Applying screening protocols made it possible to elucidate potential binding modes of small molecules on FXR, NRP-1 and TNFα.
Article
We present the performances of our mathematical deep learning (MathDL) models for D3R Grand Challenge 4 (GC4). This challenge involves pose prediction, affinity ranking, and free energy estimation for beta secretase 1 (BACE) as well as affinity ranking and free energy estimation for Cathepsin S (CatS). We have developed advanced mathematics, namely differential geometry, algebraic graph, and/or algebraic topology, to accurately and efficiently encode high dimensional physical/chemical interactions into scalable low-dimensional rotational and translational invariant representations. These representations are integrated with deep learning models, such as generative adversarial networks (GAN) and convolutional neural networks (CNN) for pose prediction and energy evaluation, respectively. Overall, our MathDL models achieved the top place in pose prediction for BACE ligands in Stage 1a. Moreover, our submissions obtained the highest Spearman correlation coefficient on the affinity ranking of 460 CatS compounds, and the smallest centered root mean square error on the free energy set of 39 CatS molecules. It is worthy to mention that our method on docking pose predictions has significantly improved from our previous ones.
Article
An accurate scoring function is expected to correctly select the most stable structure from a set of pose candidates. One can hypothesize that a scoring function’s ability to identify the most stable structure might be improved by emphasizing the most relevant atom pairwise interactions. However, it is hard to evaluate the relevant importance for each atom pair using traditional means. With the introduction of machine learning methods, it has become possible to determine the relative importance for each atom pair present in a scoring function. In this work, we use the Random Forest (RF) method to refine a pair potential developed by our laboratory (GARF6) by identifying relevant atom pairs that optimize the performance of the potential on our given task. Our goal is to construct a machine learning (ML) model that can accurately differentiate the native ligand binding pose from candidate poses using a potential refined by RF optimization. We successfully constructed RF models on an unbalanced data set with the ‘comparison’ concept, and the resultant RF models were tested on CASF-2013. In a comparison of the performance of our RF models against 29 scoring functions, we found our models outperformed the other scoring functions in predicting the native pose. In addition, we created two artificially designed potential function sets to address the importance of the GARF potential in the RF models: (1) a scrambled probability function set, which was obtained by mixing up atom pairs and probability functions in GARF, and (2) a uniform probability function set, which shares the same peak positions with GARF but has fixed peak heights. The results of accuracy comparison from RF models based on the scrambled, uniform, and original GARF potential clearly showed that the peak positions in the GARF potential are important while the well depths are not. All code and data used in this work are available at: https://github.com/JunPei000/random_forest_protein_ligand_decoy_detection.
Article
A new parametric quantum mechanical molecular model, AM1 (Austin Model 1), based on the NDDO approximation, is described. In it the major weaknesses of MNDO, in particular failure to reproduce hydrogen bonds, have been overcome without any increase in computing time. Results for 167 molecules are reported. Parameters are currently available for C, H, O, and N.
Article
Classical Monte Carlo simulations have been carried out for liquid water in the NPT ensemble at 25 °C and 1 atm using six of the simpler intermolecular potential functions for the water dimer: Bernal–Fowler (BF), SPC, ST2, TIPS2, TIP3P, and TIP4P. Comparisons are made with experimental thermodynamic and structural data including the recent neutron diffraction results of Thiessen and Narten. The computed densities and potential energies are in reasonable accord with experiment except for the original BF model, which yields an 18% overestimate of the density and poor structural results. The TIPS2 and TIP4P potentials yield oxygen–oxygen partial structure functions in good agreement with the neutron diffraction results. The accord with the experimental OH and HH partial structure functions is poorer; however, the computed results for these functions are similar for all the potential functions. Consequently, the discrepancy may be due to the correction terms needed in processing the neutron data or to an effect uniformly neglected in the computations. Comparisons are also made for self‐diffusion coefficients obtained from molecular dynamics simulations. Overall, the SPC, ST2, TIPS2, and TIP4P models give reasonable structural and thermodynamic descriptions of liquid water and they should be useful in simulations of aqueous solutions. The simplicity of the SPC, TIPS2, and TIP4P functions is also attractive from a computational standpoint.
Article
In the introductory section of this review, we alluded to the difficulty of applying electrostatic theory to problems in molecular biophysics. As an extreme example of this point, most experiments on biophysical systems are carried out in water, usually at some nonzero ionic strength. However until recently there have been no adequate methods for treating solvent effects on anything but highly simplified models of biological macromolecules. Advances in theoretical and computational methods have changed this situation dramatically. Indeed, as summarized in the applications section of this article, it is now possible to account for a wide range of phenomena that are primarily electrostatic in origin and that depend critically on the details of the three-dimensional structure of a macromolecule and on the properties of the surrounding solvent. Moreover, with the availability of accurate experimental probes of these phenomena as well as detailed structural information, critical testing and refinement of theoretical methods have become possible. We have attempted to summarize the basis for this specific progress and to identify general issues that arise in applications of electrostatic theory to problems associated with the structure and function of proteins and nucleic acids.
Article
The proinflammatory effects of intra-articular injection of purified phospholipase A2 from snake venom and rheumatoid synovial fluid were studied in rats. Purified soluble phospholipase A2 (PLA2) in concentrations ranging from 1000 to 20,000 units/ml, was injected intra-articularly. Histologic parameters examined were cell and protein content of synovial fluid, subsynovial cellular infiltration, synovial lining cell hyperplasia, bone erosion, and peri-articular soft tissue infiltration. Single intra-articular injections of PLA2 resulted in an acute inflammatory infiltrate of the subsynovium with maximal changes seen 2 to 6 hours after injection. Acute inflammatory changes were dose-dependent. Joints injected repeatedly at 24-hour intervals showed prominent synovial lining cell hyperplasia, maximal at 96 hours. Human synovial and snake venom PLA2s were equipotent at inducing both the acute and chronic articular changes. These changes were not seen in joints injected with inactivated PLA2. It is concluded that soluble PLA2 causes time- and dose-dependent acute inflammatory changes after a single intra-articular injection and synovial lining cell hyperplasia in response to repeated exposure to PLA2. The experimental proliferative synovitis in this model may correlate with features of acutely inflammed joints bathed in synovial fluids containing high levels of PLA2 in patients with rheumatoid arthritis.
Book
The third volume in the series on Computer Simulation of Biomolecular Systems continues with the format introduced in the first volume [1] and elaborated in the second volume [2]. The primary emphasis is on the methodological aspects of simulations, although there are some chapters that present the results obtained for specific systems of biological interest. The focus of this volume has changed somewhat since there are several chapters devoted to structure-based ligand design, which had only a single chapter in the second volume. It seems useful to set the stage for this volume by quoting from my preface to Volume 2 [2]. "The long-range goal of molecular approaches to biology is to describe living systems in terms of chemistry and physics. Over the last fifty years great progress has been made in applying the equations representing the underlying physical laws to chemical problems involving the structures and reactions of small molecules. Corresponding studies of mesoscopic systems have been undertaken much more recently. Molecular dynamics simulations, which are the primary focus of this volume, represent the most important theoretical approach to macromolecules of biological interest." ...
Chapter
This chapter discusses the recent advances in the design and evaluation of inhibitors of phospholipase A2 (PLA2). PLA2 refers to a large class of acylhydrolytic enzymes that specifically act at the sn-2 position of phospholipid substrate. Substrate sources are derived from two broad categories: 1) synthetic PL or 2) natural “membrane” lipid forms. The use of PLA2 administration as the initiating inflammatory insult are the most documented and well characterized models in terms of evaluating the proinflammatory activity of PLA2. The activity of PLA2 is greatly enhanced when the substrate is above its critical micelle concentration (CMC). The importance of comparing inhibitor activity against the same enzyme can be illustrated with manoalide. Its IC50 value for inhibition of PLA2 hydrolysis has been measured against a number of enzymes and varies over almost three log units. The inhibitory effects of these retinoids on the release and metabolism of AA from rat peritoneal macrophages challenged with A23187, zymosan, and 12-O-tetradecanoate phorbol-13-acetate (TPA) have also been studied. Several other natural products have also been reported to inhibit PLA2. Gossypol, a male non-steroidal antifertility agent, has been shown, at 100 μM, to completely inhibit intact human spermatozoa PLA2 hydrolysis of monolayers of phosphatidylglycerol. A three-dimensional study of the bovine pancreatic PLA2 X-ray structure, with the goal of identifying novel compounds that fit both sterically and electronically into the active site, has resulted in the identification of a series of potent in vitro inhibitors. Another study of the same X-ray structure has resulted in the synthesis of long chain alkylamine inhibitors. Although there has been considerable activity in this field over the last decade, much work remains before a clinical candidate can emerge.
Article
This paper presents the algorithm "DETMAX" whose purpose is to construct experimental designs that are "D-optimal." These are designs for which the determinant of X′X is maximum, where X is the "matrix of independent variables" in the usual linear model y = Xβ + ε. Although the algorithm does not guarantee D-optimality, it has performed well in many cases where D-optimal designs are known. Five examples are given, illustrating the use of DETMAX to construct designs "from scratch" and to augment existing data. A FORTRAN listing is available on request.
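The D-optimality criterion described above can be illustrated with a small search that maximizes det(X′X) over a candidate set. The sketch below is a plain greedy exchange heuristic written for illustration; it is not the DETMAX algorithm itself, and like DETMAX it does not guarantee a D-optimal result. All function names are hypothetical.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def information_det(rows):
    """det(X'X) for a design whose rows are the rows of X."""
    m = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(m)] for i in range(m)]
    return determinant(xtx)


def exchange_design(candidates, start):
    """Repeatedly apply the single row swap that most increases det(X'X)."""
    design = [list(row) for row in start]
    best = information_det(design)
    while True:
        best_swap = None
        for i in range(len(design)):
            for cand in candidates:
                trial = design[:i] + [list(cand)] + design[i + 1:]
                d = information_det(trial)
                if d > best + 1e-12:
                    best, best_swap = d, (i, list(cand))
        if best_swap is None:
            return design, best
        design[best_swap[0]] = best_swap[1]


# Straight-line model y = b0 + b1*x: each row of X is [1, x].
candidates = [[1.0, x] for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
design, det_value = exchange_design(candidates, start=[[1.0, 0.0], [1.0, 0.5]])
```

For this two-point straight-line example the search ends at x = -1 and x = 1, the extremes of the candidate range, where det(X′X) = 4.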
Article
The use of partial least squares (PLS) for handling collinearities among the independent variables X in multiple regression is discussed. Consecutive estimates (rank 1, 2, ...) are obtained using the residuals from previous rank as a new dependent variable y. The PLS method is equivalent to the conjugate gradient method used in Numerical Analysis for related problems. To estimate the “optimal” rank, cross validation is used. Jackknife estimates of the standard errors are thereby obtained with no extra computation. The PLS method is compared with ridge regression and principal components regression on a chemical example of modelling the relation between the measured biological activity and variables describing the chemical structure of a set of substituted phenethylamines.
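The procedure summarized above, in which each rank is fitted to the residuals of the previous one and the rank is chosen by cross validation, can be sketched as follows. This is a minimal single-response PLS in the common NIPALS formulation with leave-one-out cross validation, assumed here for illustration; the function names and the toy data are hypothetical, and the paper's own formulation may differ in detail.

```python
def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1: each component is extracted from the current residuals."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                 # y residuals
    comps = []
    for _ in range(n_components):
        w = [_dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = _dot(w, w) ** 0.5
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(E[i], w) for i in range(n)]                   # scores
        tt = _dot(t, t)
        if tt < 1e-12:
            break
        p = [_dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = _dot(f, t) / tt
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x):
    """Predict by deflating the centred sample component by component."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = _dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


def loo_press(X, y, n_components):
    """Leave-one-out predictive residual sum of squares for a given rank."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    return press


# Toy data generated from y = 1 + 2*x1 - x2 (no noise).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [1.0, 4.0, 3.0, 6.0, 6.0]
model = pls1_fit(X, y, n_components=2)
```

Comparing `loo_press(X, y, k)` for k = 1, 2, ... and keeping the rank with the lowest value is one simple way to pick the "optimal" rank.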
Article
Modern drug design increasingly relies on structural information for the improvement of in vitro potency. Recently, these methods have been used to optimize in vitro properties as well. Barriers to the successful integration of these technologies within the drug industry seem to be largely cultural rather than scientific.
Article
Absolute free energies of hydration have been computed for 13 diverse organic molecules using partial charges derived from ab initio 6-31G* wave functions. Both Mulliken charges and charges fit to the electrostatic potential surface (EPS) were considered in conjunction with OPLS Lennard–Jones parameters for the organic molecules and the TIP4P model of water. Monte Carlo simulations with statistical perturbation theory yielded relative free energies of hydration. These were converted to absolute quantities through perturbations to reference molecules for which absolute free energies of hydration had been obtained previously in TIP4P water. The average errors in the computed absolute free energies of hydration are 1.1 kcal/mol for the 6-31G* EPS charges and 4.0 kcal/mol for the Mulliken charges. For the EPS charges, the largest individual errors are under 2 kcal/mol except for acetamide, in which case the error is 3.7 kcal/mol. The hydrogen bonding between the organic solutes and water has also been characterized. © John Wiley & Sons, Inc.
Article
The control of prostaglandin and leukotriene biosynthesis in inflammatory and other cells depends on the enzymatic release of free arachidonic acid from the sn-2 position of membrane phospholipids. Although many types of phospholipases have been implicated, a phospholipase A2 is the simplest and most obvious candidate for the responsible enzyme. Studies on the phospholipase A2 from snake venom and mammalian pancreas provide a paradigm for the phospholipases responsible for arachidonic acid release. Additionally, they provide the best source of a readily available, pure, stable phospholipase to test potential inhibitors. Different kinds of inhibitors require different analytical strategies. Reversible inhibitors, including polycyclic aromatic dyes, fatty acids and amide ether analogues of phospholipids will be considered as well as irreversible inhibitors such as p-bromophenacyl-bromide and manoalide, an unusual natural product obtained from sponge and which may act by a novel mechanism. Protein inhibitors of the lipocortin type may inhibit by a "substrate depletion model". Meaningful interpretation of inhibitor studies can only be accomplished within the framework of an understanding of the enzyme's kinetics and mechanism of action at the lipid-water interface.
Article
We have simulated the interaction of L-thyroxine (1), D-thyroxine (2), and their deamino (3) and decarboxy (4) analogues with the human plasma protein prealbumin by using molecular mechanics calculations. Starting geometries were taken from the high-resolution X-ray structure of prealbumin and difference electron density maps of the prealbumin-thyroxine complex. We model the interactions by using the atoms of the thyroxine analogue and approximately 250 atoms within the binding site of prealbumin, minimizing the total energy with respect to all geometric degrees of freedom. Using the molecular mechanics calculated interaction energies and a simple empirical method to estimate the solvation energy differences of 1-4, we qualitatively reproduce the experimentally observed relative free energies of association of these analogues to prealbumin and offer a structural and energetic model to account for the different binding affinities of analogues 1-4 to the protein.
Article
Molecular mechanics valence force field parameters for the sulfonamide group, SO2NH, have been derived from ab initio calculations at the RHF/6-31G* level of theory. The force field parameters were designed to be used in conjunction with existing parameters from the MM2/MMP2 force field. The new parameters are demonstrated to accurately reproduce the ab initio optimized geometries of four molecules that contain the sulfonamide group. The strategy used in force field parametrization is discussed. The conformational flexibility of the sulfonamide group has been investigated. Calculations at the RHF/6-31G* level reveal the existence of two stable conformers and that interconversion is achieved by nitrogen inversion rather than rotation about the S-N bond. The energetic effects of expanding the basis set to 6-31G** and of including MP2 and MP3 corrections for electron correlation are discussed. The geometries and Mulliken charges for the ab initio optimized structures are also reported.
Article
Quantitative determinations of multicomponent fluorescent mixtures have been made. The test substances used were humic acid, ligninsulfonate, and an optical whitener from a detergent. The fluorescence spectra from these substances have similar features with severe overlap in the whole wavelength region. For resolving these spectra and quantifying the substances, we used a numerical method, the partial least squares in latent variables (PLS). The results of its application and the theory of the method are presented and a comparison is made with other numerical methods.
Article
A review is presented of drugs that affect human proteins, inhibition of proteins that are unique to infectious organisms, selective inhibition of proteins from infectious organisms, and protein crystallography. It is shown that a new drug design strategy has emerged during the last decade based upon accurate three-dimensional structures of biomacromolecules that have to be specifically affected by pharmaceutically active compounds. In view of the great advantage of this strategy over existing ones, it is not hard to predict that a considerable percentage of new drugs coming on the market by the turn of the century will be based on approaches outlined in this article.
Article
It is demonstrated that semiempirical methods give electrostatic potential (ESP) derived atomic point charges that are in reasonable agreement with ab initio ESP charges. Furthermore, we find that MNDO ESP charges are superior to AM1 ESP charges in correlating with ESP charges derived from the 6-31G* basis set. Thus, it is possible to obtain 6-31G* quality point charges by simply scaling MNDO ESP charges. The charges are scaled in a linear (y = Mx) manner to conserve charge. In this way researchers desiring to carry out force field simulations or minimizations can obtain charges by using MNDO, which requires much less computer time than the corresponding 6-31G* calculation.
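The linear scaling step (y = Mx) can be sketched as a least-squares fit of a single slope through the origin, followed by multiplication of every charge by that slope. The function names and the charge values below are invented for illustration and are not the paper's data or its fitted factor. Scaling without an intercept keeps the total charge of a neutral molecule at zero.

```python
def fit_scale_factor(x, y):
    """Least-squares slope M for y = M*x (no intercept): M = sum(xy) / sum(xx)."""
    sxx = sum(a * a for a in x)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(a * b for a, b in zip(x, y)) / sxx


def scale_charges(charges, m):
    """Multiply every atomic charge by the same factor M."""
    return [m * q for q in charges]


# Made-up charges for a neutral four-atom fragment (illustration only).
semiempirical = [-0.30, 0.10, 0.10, 0.10]   # e.g. charges to be scaled
reference = [-0.42, 0.14, 0.14, 0.14]       # e.g. target higher-level charges
m = fit_scale_factor(semiempirical, reference)
scaled = scale_charges(semiempirical, m)
```

Because every charge is multiplied by the same factor, the scaled charges of the neutral fragment still sum to zero within rounding.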
Article
CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a highly flexible computer program which uses empirical energy functions to model macromolecular systems. The program can read or model build structures, energy minimize them by first- or second-derivative techniques, perform a normal mode or molecular dynamics simulation, and analyze the structural, equilibrium, and dynamic properties determined in these calculations. The operations that CHARMM can perform are described, and some implementation details are given. A set of parameters for the empirical energy function and a sample run are included.
Article
An advanced variable selection procedure, called GOLPE, aimed at obtaining PLS regression models with the highest prediction ability is presented and illustrated with an application in 3D‐QSAR. Key steps in the procedure are a preliminary variable selection by means of a D‐optimal design in the loading space, and an iterative evaluation of the effects of individual variables on the model predictivity based on the validation of a number of reduced models, on variable combinations selected according to a FFD strategy. The procedure is successfully applied to a real 3D‐QSAR case study: the results obtained by GOLPE are compared with those obtained by CoMFA and found to be in good agreement in terms of variable importance, but with a much higher prediction ability. Accordingly, the results suggest that it might be used within the CoMFA framework in the place of the present PLS version there, or in CoMFA‐like studies on the structures generated by GRID probes.
Article
A set of procedures and guidelines are presented for the estimation of bond length, bond angle, and torsional potential constants for molecular mechanics force fields. The force field constants are ultimately derived by “subtracting” nonbonded molecular mechanics energies from corresponding molecular orbital energies using a model compound containing the chemical structure to be parameterized. Case study examples of bond length, bond angle, and torsional rotation force field parameterizations are presented. A general discussion of molecular mechanics force field parameterization strategy is included for reference and completeness. Finally, a curve-fitting program to generate force field parameters from raw data is given in Appendix I.
Article
The frequency of chance correlation using partial least squares (PLS) has been measured experimentally for variously dimensioned data, comprising either completely random numbers, random numbers containing a perfect correlation within, and CoMFA field descriptors. This frequency, much lower than that for stepwise multiple regression, is maximal for datasets in which the number of descriptors equals the number of compounds, and surprisingly decreases indefinitely as the number of descriptors becomes much greater than the number of compounds. However, perfect correlations involving descriptor subsets are not detected by PLS if the number of irrelevant descriptors is excessive. In CoMFA applications, the probability of chance correlation is usually negligible. For example with 21 compounds a crossvalidated r2 value greater than 0.25 will occur by chance in less than 5% of trials.
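For reference, the cross-validated r2 quoted above is commonly defined as 1 - PRESS/SS, where PRESS is the sum of squared differences between observed values and their cross-validated predictions, and SS is the sum of squared deviations of the observed values from their mean. The sketch below assumes that common definition; the paper may use a variant, and the function name is hypothetical.

```python
def cross_validated_r2(observed, cv_predicted):
    """q2 = 1 - PRESS / SS, using the overall mean of the observed values."""
    if len(observed) != len(cv_predicted):
        raise ValueError("inputs must have the same length")
    mean = sum(observed) / len(observed)
    ss = sum((y - mean) ** 2 for y in observed)
    if ss == 0:
        raise ValueError("observed values have zero variance")
    press = sum((y - p) ** 2 for y, p in zip(observed, cv_predicted))
    return 1.0 - press / ss
```

A model whose cross-validated predictions are no better than the mean gives a value of 0, and worse predictions give negative values, which is why a threshold such as 0.25 is meaningful.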
Article
In this article we report how protein-ligand X-ray structures can be used to identify and develop new drug candidates. The structure-based drug design approach (SBDD) is described, illustrating the way crystallographic information is utilized in an iterative manner to identify and optimize novel lead compounds, thereby accelerating the overall drug discovery process. The requirements for the effective use of SBDD and its implementation, drawing upon case histories from our Thymidylate Synthase (TS) research program, are described.
Article
In a previous aqueous protein dynamics study, we compared the rms deviation relative to the crystal structure for distance-dependent and constant dielectric models with and without a nonbonded cutoff. The structures obtained from a constant dielectric simulation with a cutoff were substantially different from the structures obtained from a distance-dependent dielectric simulation, with and without cutoff, and a constant dielectric model without a cutoff. In fact, structures from the distance-dependent dielectric simulations were insensitive to the nonbonded cutoff and in good agreement with the structures generated from the constant dielectric simulation without a cutoff. In addition, the solute-solvent temperature differential and solvent evaporation artifacts, characteristic of the constant dielectric simulation with a cutoff, were not present for the distance-dependent dielectric simulations. In this current work, we explore whether this dielectric-dependent cutoff-sensitive behavior for a constant dielectric model arises from the discontinuities in the forces at the nonbonded cutoff or from neglecting the structure-stabilizing interactions beyond the nonbonded cutoff. We also examine the origin of the dielectric-dependent artifacts, and its potential influence on the structural disparity. Several protocols for protein dynamics simulations are compared using both constant and distance-dependent dielectric models, including implementation of a switching function and a nonbonded cutoff and two different temperature coupling algorithms. We show that the distance-dependent dielectric model conserves energy in the SPASMS molecular mechanics and dynamics software for the time steps and nonbonded cutoffs commonly used in macromolecule simulations. 
Although the switching function simulation also conserved energy over a range of commonly used cutoffs, the constant dielectric model with a switching function yielded conformational results more similar to a constant dielectric simulation without a switching function than to a constant dielectric model without a nonbonded cutoff. Therefore, the conformational disparity between the dielectric models arises from neglecting important structure-stabilizing interactions beyond the cutoff, rather than differences in energy conservation.
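The two dielectric treatments and the role of a switching function contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the SPASMS implementation: the function names, the cubic (smoothstep) form of the switch, and the conversion factor of 332.06 kcal·Å/(mol·e²) are assumptions chosen for illustration.

```python
# Illustrative sketch only; names and the switch form are assumptions,
# not taken from SPASMS or from the abstract above.
COULOMB_FACTOR = 332.06  # kcal*Angstrom/(mol*e^2), a commonly used conversion factor


def coulomb_constant_dielectric(q1, q2, r, eps=1.0):
    """Pair energy with a constant dielectric: decays as 1/r."""
    return COULOMB_FACTOR * q1 * q2 / (eps * r)


def coulomb_distance_dependent(q1, q2, r, scale=1.0):
    """Pair energy with a linear dielectric eps(r) = scale * r: decays as 1/r^2."""
    return COULOMB_FACTOR * q1 * q2 / (scale * r * r)


def switch(r, r_on, r_off):
    """Smoothly scale an interaction from 1 at r_on down to 0 at r_off."""
    if r <= r_on:
        return 1.0
    if r >= r_off:
        return 0.0
    x = (r - r_on) / (r_off - r_on)
    return 1.0 - x * x * (3.0 - 2.0 * x)  # cubic smoothstep, one possible choice


def switched_coulomb(q1, q2, r, r_on, r_off, eps=1.0):
    """Constant-dielectric pair energy with the switch applied near the cutoff."""
    return coulomb_constant_dielectric(q1, q2, r, eps) * switch(r, r_on, r_off)
```

Because the distance-dependent form decays as 1/r², truncating it at a cutoff discards much less energy than truncating the 1/r form, which is consistent with the cutoff insensitivity reported above. The switch removes the force discontinuity at the cutoff but still discards every interaction beyond `r_off`.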
Article
The Protein Data Bank is a computer-based archival file for macromolecular structures. The Bank stores in a uniform format atomic co-ordinates and partial bond connectivities, as derived from crystallographic studies. Text included in each data entry gives pertinent information for the structure at hand (e.g. species from which the molecule has been obtained, resolution of diffraction data, literature citations and specifications of secondary structure). In addition to atomic co-ordinates and connectivities, the Protein Data Bank stores structure factors and phases, although these latter data are not placed in any uniform format. Input of data to the Bank and general maintenance functions are carried out at Brookhaven National Laboratory. All data stored in the Bank are available on magnetic tape for public distribution, from Brookhaven (to laboratories in the Americas), Tokyo (Japan), and Cambridge (Europe and worldwide). A master file is maintained at Brookhaven and duplicate copies are stored in Cambridge and Tokyo. In the future, it is hoped to expand the scope of the Protein Data Bank to make available co-ordinates for standard structural types (e.g. alpha-helix, RNA double-stranded helix) and representative computer programs of utility in the study and interpretation of macromolecular structures.
Article
Although aqueous simulations with periodic boundary conditions more accurately describe protein dynamics than in vacuo simulations, these are computationally intensive for most proteins. Trp repressor dynamic simulations with a small water shell surrounding the starting model yield protein trajectories that are markedly improved over gas phase, yet computationally efficient. Explicit water in molecular dynamics simulations maintains surface exposure of protein hydrophilic atoms and burial of hydrophobic atoms by opposing the otherwise asymmetric protein-protein forces. This properly orients protein surface side chains, reduces protein fluctuations, and lowers the overall root mean square deviation from the crystal structure. For simulations with crystallographic waters only, a linear or sigmoidal distance-dependent dielectric yields a much better trajectory than does a constant dielectric model. As more water is added to the starting model, the differences between using distance-dependent and constant dielectric models become smaller, although the linear distance-dependent dielectric yields an average structure closer to the crystal structure than does a constant dielectric model. Multiplicative constants greater than one, for the linear distance-dependent dielectric simulations, produced trajectories that are progressively worse in describing trp repressor dynamics. Simulations of bovine pancreatic trypsin inhibitor were used to ensure that the trp repressor results were not protein dependent and to explore the effect of the nonbonded cutoff on the distance-dependent and constant dielectric simulation models. The nonbonded cutoff markedly affected the constant but not distance-dependent dielectric bovine pancreatic trypsin inhibitor simulations.
As with trp repressor, the distance-dependent dielectric model with a shell of water surrounding the protein produced a trajectory in better agreement with the crystal structure than a constant dielectric model, and the physical properties of the trajectory average structure, both with and without a nonbonded cutoff, were comparable.
Article
Most drugs have been discovered in random screens or by exploiting information about macromolecular receptors. One source of this information is in the structures of critical proteins and nucleic acids. The structure-based approach to design couples this information with specialized computer programs to propose novel enzyme inhibitors and other therapeutic agents. Iterated design cycles have produced compounds now in clinical trials. The combination of molecular structure determination and computation is emerging as an important tool for drug development. These ideas will be applied to acquired immunodeficiency syndrome (AIDS) and bacterial drug resistance.
Article
A new computer program is described, which positions small molecules into clefts of protein structures (e.g. an active site of an enzyme) in such a way that hydrogen bonds can be formed with the enzyme and hydrophobic pockets are filled with hydrophobic groups. The program works in three steps. First it calculates interaction sites, which are discrete positions in space suitable to form hydrogen bonds or to fill a hydrophobic pocket. The interaction sites are derived from distributions of nonbonded contacts generated by a search through the Cambridge Structural Database. An alternative route to generate the interaction sites is the use of rules. The second step is the fit of molecular fragments onto the interaction sites. Currently we use a library of 600 fragments for the fitting. The final step in the present program is the connection of some or all of the fitted fragments to a single molecule. This is done by bridge fragments. Applications are presented for the crystal packing of benzoic acid and the enzymes dihydrofolate reductase and trypsin.
Article
Hammett's breakthrough in defining σ constants for the electronic effect of substituents opened up the use of numbers and statistics so that it became clear just how good the relationship between structural modification and chemical reactivity in a set of congeners might be. This quantitative approach to structure–activity relationships enabled researchers to see anomalous behavior of substituents as well as examples where linearity between the electronic effect and rate or equilibrium constants did not hold over a wide range, and thus to spot changes in reaction mechanisms. Using regression analysis, Taft was able to separate the electronic and steric effects of structural changes on reaction rates as well as to factor field/inductive and resonance effects of substituents. This chapter discusses the role of quantitative structure–activity relationships (QSAR) and molecular graphics in the evaluation of enzyme–ligand interactions. The success rate for developing enzymatic structure–activity relationships using the simple Hammett equation or even the Taft equation, where σ is augmented with the steric parameter, is low. However, when the hydrophobic effect was taken into account for studying enzymatic processes, the result showed that there is a statistically significant QSAR from rate or equilibrium enzymatic processes, provided there is a reasonable variation in the rate or equilibrium constants and that the structural changes are not too great. Too great means that suitable electronic, hydrophobic, and steric parameters must be made available to account for structural changes.
Article
Phospholipase A2 (PLA2) participates in a wide range of cellular processes including inflammation and transmembrane signaling. A human nonpancreatic secretory PLA2 (hnps-PLA2) has been identified that is found in high concentrations in the synovial fluid of patients with rheumatoid arthritis and in the plasma of patients with septic shock. This enzyme is secreted from certain cell types in response to the proinflammatory cytokines, tumor necrosis factor or interleukin-1. The crystal structures of the calcium-bound form of this enzyme have been determined at physiological pH both in the presence (2.1 Å resolution) and absence (2.2 Å resolution) of a transition-state analogue. Although the critical features that suggest the chemistry of catalysis are identical to those inferred from the crystal structures of other extracellular PLA2s, the shape of the hydrophobic channel of hnps-PLA2 is uniquely modulated by substrate binding.
Article
Phospholipases A2 (PLA2s) may be grouped into distinct families of proteins that catalyse the hydrolysis of the 2-acyl bond of phospholipids and perform a variety of biological functions. The best characterized are the small (relative molecular mass approximately 14,000) calcium-dependent, secretory enzymes of diverse origin, such as pancreatic and venom PLA2s. The structures and functions of several PLA2s are known. Recently, high-resolution crystal structures of complexes of secretory PLA2s with phosphonate phospholipid analogues have provided information about the detailed stereochemistry of transition-state binding, confirming the proposed catalytic mechanism of esterolysis. By contrast, studies on mammalian nonpancreatic secretory PLA2s (s-PLA2s) have only recently begun; s-PLA2s are scarce in normal cells and tissues but large amounts are found in association with local and systemic inflammatory processes and tissue injury in animals and man. Such s-PLA2s have been purified from rabbit and rat inflammatory exudate, from synovial fluid from patients with rheumatoid arthritis and from human platelets. Cloning and sequencing shows that the primary structure of the human s-PLA2 has about 37% homology with that of bovine pancreatic PLA2 and 44% homology with that of Crotalus atrox PLA2. The human s-PLA2 is an unusually basic protein, yet contains most of the highly conserved amino-acid residues and sequences characteristic of the PLA2s sequenced so far. Here we report the refined, three-dimensional crystal structure at 2.2 Å resolution of recombinant human rheumatoid arthritic synovial fluid PLA2. This may aid the development of potent and specific inhibitors of this enzyme using structure-based design.
Article
Phospholipases A2 play a part in a number of physiologically important cellular processes such as inflammation, blood platelet aggregation and acute hypersensitivity. These processes are all initiated by the release of arachidonic acid from cell membranes which is catalysed by intracellular phospholipases A2 and followed by conversion of arachidonic acid to prostaglandins, leukotrienes or thromboxanes. An imbalance in the production of these compounds can lead to chronic inflammatory diseases such as rheumatoid arthritis and asthma. Inhibitors of phospholipase A2 might therefore act to reduce the effects of inflammation, so structural information about the binding of phospholipase A2 to its substrates could be helpful in the design of therapeutic drugs. The three-dimensional structure is not known for any intracellular phospholipase A2, but these enzymes share significant sequence homology with secreted phospholipases, for which some of the structures have been determined. Here we report the structure of a complex between an extracellular phospholipase A2 and a competitively inhibiting substrate analogue, which reveals considerable detail about the interaction and suggests a mechanism for catalysis by this enzyme.
Article
A chemical description of the action of phospholipase A2 (PLA2) can now be inferred with confidence from three high-resolution x-ray crystal structures. The first is the structure of the PLA2 from the venom of the Chinese cobra (Naja naja atra) in a complex with a phosphonate transition-state analogue. This enzyme is typical of a large, well-studied homologous family of PLA2s. The second is a similar complex with the evolutionarily distant bee-venom PLA2. The third structure is the uninhibited PLA2 from Chinese cobra venom. Despite the different molecular architectures of the cobra and bee-venom PLA2s, the transition-state analogue interacts in a nearly identical way with the catalytic machinery of both enzymes. The disposition of the fatty-acid side chains suggests a common access route of the substrate from its position in the lipid aggregate to its productive interaction with the active site. Comparison of the cobra-venom complex with the uninhibited enzyme indicates that optimal binding and catalysis at the lipid-water interface is due to facilitated substrate diffusion from the interfacial binding surface to the catalytic site rather than an allosteric change in the enzyme's structure. However, a second bound calcium ion changes its position upon the binding of the transition-state analogue, suggesting a mechanism for augmenting the critical electrophile.
Article
Molecular mechanics methods have been applied to study the interaction between a series of 20 deprotonated benzenesulfonamides and the enzyme carbonic anhydrase. The different contributions to the binding energy have been evaluated and correlated with experimental inhibition data and molecular orbital indices of the sulfonamides in their bound conformation. The results suggest that the discrimination shown by the enzyme toward these inhibitors is dominated by the short-range van der Waals forces.
Article
An empirical energy function designed to calculate the interaction energy of a chemical probe group, such as a carbonyl oxygen or an amine nitrogen atom, with a target molecule has been developed. This function is used to determine the sites where ligands, such as drugs, may bind to a chosen target molecule which may be a protein, a nucleic acid, a polysaccharide, or a small organic molecule. The energy function is composed of a Lennard-Jones, an electrostatic and a hydrogen-bonding term. The latter is dependent on the length and orientation of the hydrogen bond and also on the chemical nature of the hydrogen-bonding atoms. These terms have been formulated by fitting to experimental observations of hydrogen bonds in crystal structures. In the calculations, thermal motion of the hydrogen-bonding hydrogen atoms and lone-pair electrons may be taken into account. For example, in an alcoholic hydroxyl group, the hydrogen may rotate around the C-O bond at the observed tetrahedral angle. In a histidine residue, a hydrogen atom may be bonded to either of the two imidazole nitrogens and movement of this hydrogen will cause a redistribution of charge which is dependent on the nature of the probe group and the surrounding environment. The shape of some of the energy functions is demonstrated on molecules of pharmacological interest.
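A three-term energy function of the kind described in this abstract might be sketched in Python as follows. The functional forms are simplified assumptions: a 12-6 Lennard-Jones term, a Coulomb term with a constant dielectric, and a hydrogen-bond term with a cosine angular factor. The parameters `a`, `b`, `c`, `d` and `m` are hypothetical placeholders, not the published fitted values, and `theta` is assumed here to measure the deviation from the ideal hydrogen-bond direction.

```python
import math

# Simplified, hypothetical sketch of a probe-atom pair energy.
# Forms and parameters are illustrative, not the published parameterization.


def lennard_jones(r, a, b):
    """12-6 term: repulsive at short range, attractive at longer range."""
    return a / r**12 - b / r**6


def electrostatic(r, q_probe, q_atom, dielectric=4.0):
    """Coulomb term in kcal/mol for charges in e and r in Angstrom."""
    return 332.06 * q_probe * q_atom / (dielectric * r)


def hydrogen_bond(r, theta, c, d, m=2):
    """Distance term scaled by orientation; theta (radians) is the deviation
    from the ideal direction, and the term vanishes for unfavourable angles."""
    cos_t = math.cos(theta)
    if cos_t <= 0.0:
        return 0.0
    return (c / r**6 - d / r**4) * cos_t**m


def pair_energy(r, theta, q_probe, q_atom, a, b, c, d):
    """Sum of the three terms for one probe-atom pair."""
    return (lennard_jones(r, a, b)
            + electrostatic(r, q_probe, q_atom)
            + hydrogen_bond(r, theta, c, d))
```

In this sketch the orientation dependence enters only through the cosine factor. The dependence on the chemical nature of the hydrogen-bonding atoms, which the abstract also mentions, would enter through pair-specific values of `c` and `d`.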
Article
Extracellular phospholipase A2 was purified about 1.7 × 10^5-fold to near homogeneity from the synovial fluid of patients with rheumatoid arthritis by sequential column chromatography on heparin-Sepharose and butyl-Toyopearl, followed by reversed-phase HPLC. The final preparation showed a single band on SDS-polyacrylamide gel electrophoresis, and its molecular mass was estimated to be approximately 13,700 daltons. The purified enzyme had a pH optimum of 9.0 and required Ca2+ for maximum activity. It hydrolyzed phosphatidylethanolamine more effectively than phosphatidylserine and phosphatidylcholine. These properties were similar to those of an extracellular phospholipase A2 detected in the peritoneal cavity of caseinate-treated rats.
Article
Finding novel leads from which to design drug molecules has traditionally been a matter of screening and serendipity. We present a method for finding a wide assortment of chemical structures that are complementary to the shape of a macromolecular receptor site whose X-ray crystallographic structure is known. Each of a set of small molecules from the Cambridge Crystallographic Database (Allen; et al. J. Chem. Doc. 1973, 13, 119) is individually docked to the receptor in a number of geometrically permissible orientations with use of the docking algorithm developed by Kuntz et al. (J. Mol. Biol. 1982, 161, 269). The orientations are evaluated for goodness-of-fit, and the best are kept for further examination using the molecular mechanics program AMBER (Weiner; Kollman J. Comput. Chem. 1981, 106, 765). The shape-search algorithm finds known ligands as well as novel molecules that fit the binding site being studied. The highest scoring orientations of known ligands resemble binding modes generated by interactive modeling or determined crystallographically. We describe the application of this procedure to the binding sites of papain and carbonic anhydrase. While the compounds recovered from the Cambridge Crystallographic Database are not, themselves, likely to be inhibitors or substrates of these enzymes, we expect that the structures from such searches will be useful in the design of active compounds.
Article
A molecular dynamics simulation of myoglobin provides the first direct demonstration that the potential energy surface of a protein is characterized by a large number of thermally accessible minima in the neighborhood of the native structure (for example, approximately 2000 minima were sampled in a 300-picosecond trajectory). This is expected to have important consequences for the interpretation of the activity of transport proteins and enzymes. Different minima correspond to changes in the relative orientation of the helices coupled with side-chain rearrangements that preserve the close packing of the protein interior. The conformational space sampled by the simulation is similar to that found in the evolutionary development of the globins. Glasslike behavior is expected at low temperatures. The minima obtained from the trajectory do not satisfy certain criteria for ultrametricity.
Article
The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values. The probes include water, the methyl group, amine nitrogen, carboxy oxygen, and hydroxyl. Contour surfaces at appropriate energy levels are calculated for each probe and displayed by computer graphics together with the protein structure. Contours at negative energy levels delineate regions of attraction between probe and protein and are found at known ligand binding clefts in particular. The contours also enable other regions of attraction to be identified and facilitate the interpretation of protein-ligand energetics. They may, therefore, be of value for drug design.
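The grid calculation described in this abstract can be sketched in a few lines of Python. The sketch assumes a bare 12-6 Lennard-Jones probe energy with hypothetical parameters `a` and `b`, a cubic grid, and a short-range clamp `r_min` to avoid division by zero. Real probes would add electrostatic and hydrogen-bonding terms, and contouring is reduced here to selecting grid points at or below a chosen energy level.

```python
import math

# Minimal sketch with assumed names and parameters; not the published program.


def probe_energy(point, atoms, a=1.0, b=2.0, r_min=0.5):
    """Sum a 12-6 energy between a probe at `point` and every target atom."""
    total = 0.0
    for atom in atoms:
        r = max(math.dist(point, atom), r_min)  # clamp to avoid r = 0
        total += a / r**12 - b / r**6
    return total


def energy_grid(atoms, origin, n, spacing):
    """Evaluate the probe energy on an n x n x n cubic grid."""
    grid = {}
    for i in range(n):
        for j in range(n):
            for k in range(n):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = probe_energy(point, atoms)
    return grid


def attractive_points(grid, level):
    """Grid indices whose energy is at or below `level` (negative = attractive)."""
    return sorted(idx for idx, energy in grid.items() if energy <= level)


# Usage: one atom at the origin, probed on a 5 x 5 x 5 grid with 1 Angstrom spacing.
grid = energy_grid([(0.0, 0.0, 0.0)], origin=(-2.0, -2.0, -2.0), n=5, spacing=1.0)
favourable = attractive_points(grid, level=-0.9)
```

With these illustrative parameters the energy minimum lies 1 Å from the atom, so `favourable` picks out the grid points at that distance.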
Article
Phospholipase A2 (PLA2) activity was found in the sera and synovial fluids (SF) in rheumatoid arthritis (RA) and osteoarthritis (OA). PLA2 activity in RA SF was 6158 ± 549 (SEM) U/ml (n = 48) and in RA sera 554 ± 175 U/ml (normal sera: 115 ± 12 U/ml). In OA SF PLA2 activity was 5069 ± 542 U/ml (n = 28), and in OA sera 268 ± 55 U/ml. There was no significant difference between SF PLA2 activity in RA and OA. PLA2 activity in SF did not correlate with muramidase (lysozyme), beta-glucuronidase, total protein or white cell count, which were all significantly higher in RA SF than OA. A positive correlation between PLA2 in SF and matched sera was found in both RA and OA. It may be concluded that significant elevation of extracellular PLA2 occurs in both RA and OA, especially in the SF. The fact that high PLA2 did not correlate with other enzymes such as lysozyme and beta-glucuronidase, which are usually high in RA and low in OA SF, may mean that the handling of PLA2 in the joint space is different from other enzymes.