ABSTRACT: Among all tools available to design new drugs, molecular dynamics (MD) simulations have become an essential technique. Initially developed to investigate molecular models with a limited number of atoms, computers now enable investigations of large macromolecular systems with a simulation time reaching the microsecond range. The reviewed articles cover four years of research to give an overview of the actual impact of MD on the current medicinal chemistry landscape with a particular emphasis on studies of ligand–protein interactions. With a special focus on studies combining computational approaches with data gained from other techniques, this review shows how deeply embedded MD simulations are in drug design strategies and articulates what the future of this technique could be.
Drug Discovery Today 01/2015; 20(6). DOI:10.1016/j.drudis.2015.01.003 · 6.69 Impact Factor
ABSTRACT: The effect of removing a hydrogen-bond donor from the backbone of the 34-residue WW domain of the protein Pin1 is investigated for 20 residues that are part of the three-stranded β-sheet fold of this protein in aqueous solution. Forty-eight molecular dynamics (MD) simulations of the wild-type protein and 20 amide-to-ester mutants started from the X-ray crystal structure and the NMR solution structure are analyzed in terms of backbone-backbone hydrogen bonding and differences in free enthalpies of folding in order to provide a structural interpretation of the experimental chaotrope and thermal denaturation data available [1] for this protein and the 20 mutants. The forty enveloping distribution sampling (EDS) [2-5] simulations of the 20 mutants link the structural Boltzmann ensembles to relative free enthalpies of folding between mutants and wild-type protein. The contribution of the different β-sheet hydrogen bonds to the relative stability of the mutants with respect to wild type cannot be directly inferred from thermal denaturation temperatures or free enthalpies of chaotrope denaturation for the different mutants, because some β-sheet hydrogen bonds show sizeable variation in occurrence between the different mutants. A proper representation of unfolded state conformations appears to be essential for an adequate description of relative stabilities of protein mutants.
[1] S. Deechongkit, P. Dawson, J. Kelly, J. Am. Chem. Soc., 2004, 126, 16762-16771.
[2] C.D. Christ, W.F. van Gunsteren, J. Chem. Phys., 2007, 126, 184110.
[3] C.D. Christ, W.F. van Gunsteren, J. Chem. Theory Comput., 2009, 5, 276-286.
[4] S. Riniker, C.D. Christ, N. Hansen, A.E. Mark, P.C. Nair, W.F. van Gunsteren, J. Chem. Phys., 2011, 135, 024105.
[5] N. Hansen, J. Dolenc, M. Knecht, S. Riniker, W.F. van Gunsteren, J. Comput. Chem., 2012, 33, 640-651.
ABSTRACT: Background: The contribution of particular hydrogen bonds to the stability of a protein fold can be investigated experimentally as well as computationally by the construction of protein mutants which lack particular hydrogen-bond donors or acceptors with a subsequent determination of their structural stability. However, the comparison of experimental data with computational results is not straightforward. One of the difficulties is related to the representation of the unfolded state conformation. Methods: A series of molecular dynamics simulations of the 34-residue WW domain of protein Pin1 and 20 amide-to-ester mutants started from the X-ray crystal structure and the NMR solution structure are analysed in terms of backbone-backbone hydrogen bonding and differences in free enthalpies of folding in order to provide a structural interpretation of the experimental data available. Results: The contribution of the different beta-sheet hydrogen bonds to the relative stability of the mutants with respect to wild type cannot be directly inferred from experimental thermal denaturation temperatures or free enthalpies of chaotrope denaturation for the different mutants, because some beta-sheet hydrogen bonds show sizeable variation in occurrence between the different mutants. Conclusions: A proper representation of unfolded state conformations appears to be essential for an adequate description of relative stabilities of protein mutants. General significance: The simulations may be used to link the structural Boltzmann ensembles to relative free enthalpies of folding between mutants and wild-type protein and show that unfolded conformations have to be treated with a sufficient level of detail in free energy calculations of protein stability. This article is part of a Special Issue entitled Recent developments of molecular dynamics.
Biochimica et Biophysica Acta (BBA) - General Subjects 09/2014; 1850(5). DOI:10.1016/j.bbagen.2014.09.014 · 4.38 Impact Factor
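The role of the unfolded state in the abstract above can be illustrated with the standard thermodynamic cycle for mutant stability. The sketch below is not taken from the article: the function name and the numbers are hypothetical, and it only assumes that a wild-type-to-mutant free enthalpy change is available for both the folded and the unfolded ensemble.

```python
def relative_folding_free_enthalpy(dg_mutation_folded, dg_mutation_unfolded):
    """Relative free enthalpy of folding of a mutant versus wild type.

    Closing the cycle wt(unfolded) -> wt(folded) -> mut(folded) and
    wt(unfolded) -> mut(unfolded) -> mut(folded) gives
        ddG_fold = dG(wt->mut, folded) - dG(wt->mut, unfolded).
    Inputs and result share the same energy unit (e.g. kJ/mol).
    A positive result means the mutant fold is less stable than wild type.
    """
    return dg_mutation_folded - dg_mutation_unfolded


# Hypothetical values in kJ/mol, for illustration only.
ddg = relative_folding_free_enthalpy(12.0, 8.5)
print(ddg)
```

The second argument carries the point of the abstract: if the unfolded ensemble is poorly represented, the error enters the relative stability directly.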
ABSTRACT: The research in the group for computational chemistry at the ETH Zurich focuses on the development of methods and software for classical molecular dynamics simulations and cheminformatics, and their application to biological and chemical questions. Here, important advances and challenges in these subfields of computational chemistry are reviewed and potential opportunities for cross-fertilization are outlined.
CHIMIA International Journal for Chemistry 09/2014; 68(9). DOI:10.2533/chimia.2014.620 · 1.35 Impact Factor
ABSTRACT: Modern high-throughput screening (HTS) is a well-established approach for hit finding in drug discovery that is routinely employed in the pharmaceutical industry to screen more than a million compounds within a few weeks. However, as the industry shifts to more disease-relevant but more complex phenotypic screens, the focus has moved to piloting smaller but smarter chemically/biologically diverse subsets followed by an expansion around hit compounds. One standard method for doing this is to train a machine-learning (ML) model with the chemical fingerprints of the tested subset of molecules and then select the next compounds based on the predictions of this model. An alternative approach would be to take advantage of the wealth of bioactivity information contained in older (full-deck) screens using so-called HTS fingerprints, where each element of the fingerprint corresponds to the outcome of a particular assay, as input to machine-learning algorithms. We constructed HTS fingerprints using two collections of data: 93 in-house assays and 95 publicly available assays from PubChem. For each source, an additional set of 51 and 46 assays, respectively, was collected for testing. Three different ML methods, random forest (RF), logistic regression (LR) and naïve Bayes (NB), were investigated for both the HTS fingerprint and a chemical fingerprint, Morgan2. The RF was found to be best suited for learning from HTS fingerprints yielding AUC values > 0.8 for 78 % of the internal assays and enrichment factors at 5 % (EF(5%)) > 10 for 55 % of the assays. The RF(HTS-fp) generally outperformed the LR trained with Morgan2, which was the best ML method for the chemical fingerprint, for the majority of assays. In addition, HTS fingerprints were found to retrieve more diverse chemotypes. Combining the two models through heterogeneous classifier fusion led to a similar or better performance than the best individual model for all assays. Further validation using a pair of in-house assays and data from a confirmatory screen - including a prospective set of around 2000 compounds selected based on our approach - confirmed the good performance. Thus, the combination of machine-learning with HTS fingerprints and chemical fingerprints utilizes information from both domains and presents a very promising approach for hit expansion, leading to more hits. The source code used with the public data is provided.
Journal of Chemical Information and Modeling 06/2014; 54(7). DOI:10.1021/ci500190p · 3.74 Impact Factor
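The two performance measures quoted in this abstract, AUC and the enrichment factor at 5 %, can be sketched in plain Python as follows. This is a minimal illustration under common textbook definitions, not the code released with the article; the function names and the toy scores are assumptions.

```python
def roc_auc(scores, labels):
    """AUC as the probability that an active outscores an inactive (ties count 0.5)."""
    actives = [s for s, y in zip(scores, labels) if y == 1]
    inactives = [s for s, y in zip(scores, labels) if y == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = sum((a > i) + 0.5 * (a == i) for a in actives for i in inactives)
    return wins / (len(actives) * len(inactives))


def enrichment_factor(scores, labels, fraction=0.05):
    """Hit rate among the top-scored fraction divided by the overall hit rate."""
    total_hits = sum(labels)
    if total_hits == 0:
        raise ValueError("need at least one active")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    hits_top = sum(y for _, y in ranked[:n_top])
    return (hits_top / n_top) / (total_hits / len(ranked))


# Toy data (hypothetical): 2 actives among 20 compounds.
labels = [1, 1] + [0] * 18
scores = [0.95, 0.42] + [i / 20 for i in range(18)]
print(roc_auc(scores, labels), enrichment_factor(scores, labels))
```

With 2 actives in 20 compounds the largest possible EF(5%) is 10, which is reached here because the single top-ranked compound is an active.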
ABSTRACT: Molecular dynamics simulation of biomolecules in solvent using an atomic model for both the biomolecules and the solvent molecules is still computationally rather demanding considering the time scale of the biomolecular motions. The use of a supramolecular coarse-grained (CG) model can speed up the simulation considerably, but it inevitably reduces the accuracy. Combining an atomic fine-grained (FG) level of modeling for the biomolecules and a supramolecular CG level for the solvent into a hybrid system, the increased computational efficiency may outweigh the loss of accuracy with respect to the biomolecular properties in the hybrid FG/CG simulation. Here, a previously published CG methanol model is reparametrized, and then a 1:1 mixture of FG and CG methanol is used to calibrate the FG-CG interactions using thermodynamic and dielectric screening data for liquid methanol. The FG-CG interaction parameter set is applied in hybrid FG/CG solute/solvent simulations of the folding equilibria of three β-peptides that adopt different folds. The properties of the peptides are compared with those obtained in FG solvent simulations and with experimental NMR data. The comparison shows that the folding equilibria in the pure CG solvent simulations are different from those in the FG solvent simulations because of the lack of hydrogen-bonding partners in the supramolecular CG solvent. Next, we introduced an FG methanol layer around the peptides in CG solvent to recover the hydrogen-bonding pattern of the FG solvent simulations. The result shows that with the FG methanol layer, the folding equilibria of the three β-peptides are very similar to those in the FG solvent simulations, while the computational efficiency is at least 3 times higher and the cutoff radius for nonbonded interactions could be increased from 1.4 to 2.0 nm.
Journal of Chemical Theory and Computation 05/2014; 10(6):2213–2223. DOI:10.1021/ct500048c · 5.50 Impact Factor
ABSTRACT: The concept of data fusion - the combination of information from different sources describing the same object with the expectation to generate a more accurate representation - has found application in a very broad range of disciplines. In the context of ligand-based virtual screening (VS), data fusion has been applied to combine knowledge from either different active molecules or different fingerprints to improve similarity search performance. Machine-learning (ML) methods based on fusion of multiple homogeneous classifiers, in particular random forests (RF), have also been widely applied in the ML literature. The heterogeneous version of classifier fusion - fusing the predictions from different model types - has been less explored. Here, we investigate heterogeneous classifier fusion for ligand-based VS using three different ML methods, RF, naïve Bayes (NB) and logistic regression (LR), with four 2D fingerprints: atom pairs, topological torsions, RDKit fingerprint and circular fingerprint. The methods are compared using a previously developed benchmarking platform for 2D fingerprints which is extended to ML methods in this article. The original data sets are filtered for difficulty and a new set of challenging data sets from ChEMBL is added. Data sets were also generated for a second use case: starting from a small set of related actives instead of diverse actives. The final fused model consistently outperforms the other approaches across the broad variety of targets studied, indicating that heterogeneous classifier fusion is a very promising approach for ligand-based VS. The new data sets together with the adapted source code for ML methods are provided in the supplementary material.
Journal of Chemical Information and Modeling 10/2013; 53(11). DOI:10.1021/ci400466r · 3.74 Impact Factor
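Heterogeneous classifier fusion, as described in this abstract, combines predictions from different model types. The abstract does not state which fusion rule is used, so the sketch below assumes one simple possibility: each compound receives the best (smallest) rank it obtains in any model. The function names and scores are hypothetical.

```python
def ranks(scores):
    """Return the rank of each item (1 = highest score); ties keep input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def fuse_by_best_rank(*score_lists):
    """Fuse several models by taking, per compound, its best rank across models."""
    per_model_ranks = [ranks(scores) for scores in score_lists]
    return [min(compound_ranks) for compound_ranks in zip(*per_model_ranks)]


# Hypothetical predicted probabilities from two different model types.
rf_scores = [0.9, 0.2, 0.6, 0.1]
lr_scores = [0.3, 0.8, 0.7, 0.2]
fused = fuse_by_best_rank(rf_scores, lr_scores)
print(fused)  # [1, 1, 2, 4]; a smaller value means a higher priority
```

Fusing ranks rather than raw probabilities sidesteps the fact that different model types produce scores on scales that are not directly comparable.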
ABSTRACT: Fingerprint similarity is a common method for comparing chemical structures. Similarity is an appealing approach because, with many fingerprint types, it provides intuitive results: a chemist looking at two molecules can understand why they have been determined to be similar. This transparency is partially lost with the fuzzier similarity methods that are often used for scaffold hopping and tends to vanish completely when molecular fingerprints are used as inputs to machine-learning (ML) models. Here we present similarity maps, a straightforward and general strategy to visualize the atomic contributions to the similarity between two molecules or the predicted probability of an ML model. We show the application of similarity maps to a set of dopamine D3 receptor ligands using atom-pair and circular fingerprints as well as two popular ML methods: random forests and naïve Bayes. An open-source implementation of the method is provided.
Journal of Cheminformatics 09/2013; 5(1):43. DOI:10.1186/1758-2946-5-43 · 4.55 Impact Factor
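The idea of an atomic contribution to similarity can be illustrated with a toy calculation: remove the fingerprint bits an atom takes part in and record how much the similarity to a reference drops. The sketch below is not the open-source implementation mentioned in the abstract. It represents fingerprints as plain sets of bit indices, uses Tanimoto similarity as an assumed metric, and relies on a hypothetical atom-to-bits mapping.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def atom_weights(atom_bits, reference_fp):
    """Weight of each atom = similarity with all bits minus similarity without its bits.

    atom_bits maps an atom index to the set of bits that atom contributes to.
    A positive weight means the atom increases similarity to the reference.
    """
    full_fp = set().union(*atom_bits.values())
    base = tanimoto(full_fp, reference_fp)
    weights = {}
    for atom, bits in atom_bits.items():
        reduced_fp = full_fp - bits  # drop every bit this atom takes part in
        weights[atom] = base - tanimoto(reduced_fp, reference_fp)
    return weights


# Hypothetical three-atom molecule and reference fingerprint.
atom_bits = {0: {1, 2}, 1: {2, 3}, 2: {4}}
reference_fp = {1, 2, 3}
weights = atom_weights(atom_bits, reference_fp)
print(weights)
```

In this toy case atoms 0 and 1 get positive weights and atom 2 a negative one, because its only bit is absent from the reference.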
ABSTRACT: Similarity-search methods using molecular fingerprints are an important tool for ligand-based virtual screening. A huge variety of fingerprints exist and their performance, usually assessed in retrospective benchmarking studies using data sets with known actives and known or assumed inactives, depends largely on the validation data sets used and the similarity measure used. Comparing new methods to existing ones in any systematic way is rather difficult due to the lack of standard data sets and evaluation procedures. Here, we present a standard platform for the benchmarking of 2D fingerprints. The open-source platform contains all source code, structural data for the actives and inactives used (drawn from three publicly available collections of data sets), and lists of randomly selected query molecules to be used for statistically valid comparisons of methods. This allows the exact reproduction and comparison of results for future studies. The results for 12 standard fingerprints together with two simple baseline fingerprints assessed by seven evaluation methods are shown together with the correlations between methods. High correlations were found between the 12 fingerprints and a careful statistical analysis showed that only the two baseline fingerprints were different from the others in a statistically significant way. High correlations were also found between six of the seven evaluation methods, indicating that despite their seeming differences, many of these methods are similar to each other.
Journal of Cheminformatics 05/2013; 5(1):26. DOI:10.1186/1758-2946-5-26 · 4.55 Impact Factor
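A fingerprint similarity search of the kind benchmarked here ranks a library by its similarity to a query molecule. The following is a minimal sketch with fingerprints as sets of on-bits and Tanimoto as the similarity measure. The abstract notes that results depend on the measure chosen, so Tanimoto is an assumption, as are the compound names and bits.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def rank_by_similarity(query_fp, library):
    """Return (name, similarity) pairs sorted by decreasing similarity to the query.

    library maps a compound name to its fingerprint; ties are broken by name.
    """
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical query and library.
query_fp = {1, 2, 3, 4}
library = {
    "cmpd_A": {1, 2, 3, 4, 5},
    "cmpd_B": {1, 2},
    "cmpd_C": {7, 8},
}
ranking = rank_by_similarity(query_fp, library)
print(ranking)
```

The ranked list is what evaluation methods such as AUC or enrichment factors are then computed from, using the known active and inactive labels.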
ABSTRACT: Theoretical-computational modeling with an eye to explaining experimental observations in regard to a particular chemical phenomenon or process requires choices concerning essential degrees of freedom and types of interactions and the generation of a Boltzmann ensemble or trajectories of configurations. Depending on the degrees of freedom that are essential to the process of interest, for example, electronic or nuclear versus atomic, molecular or supra-molecular, quantum- or classical-mechanical equations of motion are to be used. In multi-resolution simulation, various levels of resolution, for example, electronic, atomic, supra-atomic or supra-molecular, are combined in one model. This allows an enhancement of the computational efficiency, while maintaining sufficient detail with respect to particular degrees of freedom. The basic challenges and choices with respect to multi-resolution modeling are reviewed and as an illustration the differential catalytic properties of two enzymes with similar folds but different substrates with respect to these substrates are explored using multi-resolution simulation at the electronic, atomic and supra-molecular levels of resolution.
ABSTRACT: Theoretical and computer-aided modeling aimed at explaining experimental observations regarding a particular chemical phenomenon or process requires a series of assumptions. These assumptions concern the essential degrees of freedom, the type of interactions, and the generation of a Boltzmann ensemble or a configurational trajectory. Depending on the degrees of freedom that are indispensable to the process of interest, for example electronic, nuclear or atomic, molecular or supramolecular, quantum-mechanical or classical-mechanical equations of motion must be applied. In simulations with different levels of resolution, various levels such as electronic, atomic, supra-atomic or supramolecular are combined in a single model. This allows an increase in computational efficiency while maintaining sufficient accuracy with respect to the particular degrees of freedom. In the following, an overview is given of the basic challenges and assumptions relating to modeling at different levels of resolution. As an illustration, the differing catalytic properties of two enzymes that resemble each other in structure but bind different substrates are examined with respect to these substrates using simulations at electronic, atomic and supramolecular levels of resolution.
ABSTRACT: Atomistic molecular dynamics simulations of peptides or proteins in aqueous solution are still limited to the multi-nanosecond time scale and multi-nanometer range by computational cost. Combining atomic solutes with a supramolecular solvent model in hybrid fine-grained/coarse-grained (FG/CG) simulations allows atomic detail in the region of interest while being computationally more efficient. We used enveloping distribution sampling (EDS) to calculate the free enthalpy differences between different helical conformations, i.e., α-, π-, and 3₁₀-helices, of an atomic level FG alanine deca-peptide solvated in a supramolecular CG water solvent. The free enthalpy differences obtained show that by replacing the FG solvent by the CG solvent, the π-helix is destabilized with respect to the α-helix by about 2.5 kJ mol⁻¹, and the 3₁₀-helix is stabilized with respect to the α-helix by about 9 kJ mol⁻¹. In addition, the dynamics of the peptide becomes faster. By introducing an FG water layer of 0.8 nm around the peptide, both thermodynamic and dynamic properties are recovered, while the hybrid FG/CG simulations are still four times more efficient than the atomistic simulations, even when the cutoff radius for the nonbonded interactions is increased from 1.4 to 2.0 nm. Hence, the hybrid FG/CG model, which yields an appropriate balance between reduced accuracy and enhanced computational speed, is very suitable for molecular dynamics simulation investigations of biomolecules.
Journal of Chemical Theory and Computation 02/2013; 9(3):1328–1333. DOI:10.1021/ct3010497 · 5.50 Impact Factor
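To give a feeling for the size of the free enthalpy shifts quoted in this abstract, a shift can be converted into the factor by which a relative population changes, using the Boltzmann relation exp(-ΔΔG/RT). The sketch below assumes a temperature of 298.15 K, which the abstract does not state; the function name is hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ mol^-1 K^-1


def population_ratio_factor(ddg_kj_per_mol, temperature_k=298.15):
    """Factor by which a population ratio changes for a free enthalpy shift ddG.

    A positive ddG (destabilization) gives a factor below 1,
    a negative ddG (stabilization) a factor above 1.
    The default temperature is an assumption, not a value from the abstract.
    """
    return math.exp(-ddg_kj_per_mol / (GAS_CONSTANT * temperature_k))


# Shifts relative to the α-helix quoted in the abstract, in kJ/mol.
pi_helix_factor = population_ratio_factor(2.5)     # destabilized
helix_310_factor = population_ratio_factor(-9.0)   # stabilized
print(pi_helix_factor, helix_310_factor)
```

At the assumed temperature the π-helix population relative to the α-helix drops to roughly a third, while the 3₁₀-helix population rises by well over an order of magnitude, which shows why the 9 kJ/mol shift is the more consequential of the two.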
ABSTRACT: Water molecules in the binding pocket of a protein and their role in ligand binding have increasingly raised interest in recent years. Displacement of such water molecules by ligand atoms can be either favourable or unfavourable for ligand binding depending on the change in free enthalpy. In this study, we investigate the displacement of water molecules by an apolar probe in the binding pocket of two proteins, cyclin-dependent kinase 2 and tRNA-guanine transglycosylase, using the method of enveloping distribution sampling (EDS) to obtain free enthalpy differences. In both cases, a ligand core is placed inside the respective pocket and the remaining water molecules are converted to apolar probes, both individually and in pairs. The free enthalpy difference between a water molecule and a CH₃ group at the same location in the pocket in comparison to their presence in bulk solution calculated from EDS molecular dynamics simulations corresponds to the binding free enthalpy of CH₃ at this location. From the free enthalpy difference and the enthalpy difference, the entropic contribution of the displacement can be obtained too. The overlay of the resulting occupancy volumes of the water molecules with crystal structures of analogous ligands shows qualitative correlation between experimentally measured inhibition constants and the calculated free enthalpy differences. Thus, such an EDS analysis of the water molecules in the binding pocket may give valuable insight for potency optimization in drug design.
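The step from free enthalpy and enthalpy differences to the entropic contribution follows from ΔG = ΔH - TΔS. A minimal sketch, with a hypothetical function name and made-up numbers that are not results from the study:

```python
def entropic_contribution(delta_g, delta_h):
    """Return -T*ΔS, obtained from ΔG = ΔH - T*ΔS as ΔG - ΔH.

    Inputs and result share the same energy unit (e.g. kJ/mol).
    A positive result means entropy opposes the displacement.
    """
    return delta_g - delta_h


# Hypothetical values in kJ/mol, for illustration only.
minus_t_delta_s = entropic_contribution(delta_g=-4.0, delta_h=-10.0)
print(minus_t_delta_s)
```

In this made-up case the displacement is enthalpically favourable but entropically opposed, with the net free enthalpy change remaining favourable.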
ABSTRACT: Considering N-methylacetamide (NMA) as a model compound, new interaction parameters are developed for the amide function in the GROMOS force field that are compatible with the recently derived 53A6(OXY) parameter set for oxygen-containing chemical functions. The resulting set, referred to as 53A6(OXY+A), represents an improvement over earlier GROMOS force-field versions in the context of the pure-liquid properties of NMA, including the density, heat of vaporization, dielectric permittivity, self-diffusion constant and viscosity, as well as in terms of the Gibbs hydration free energy of this molecule. Assuming that NMA represents an adequate model compound for the backbone of peptides, 53A6(OXY+A) may be expected to also provide an improved description of polypeptide chains. As an initial test, simulations are reported for two β-peptides characterized by very different folding properties in methanol. For these systems, earlier force-field versions provided good agreement with experimental NMR data, and the test shows that the improved description achieved in the context of NMA is not accompanied by any deterioration in the representation of the conformational properties of these peptides.
ABSTRACT: The use of a supra-molecular coarse-grained (CG) model for liquid water as solvent in molecular dynamics simulations of biomolecules represented at the fine-grained (FG) atomic level of modelling may reduce the computational effort by one or two orders of magnitude. However, even if the pure FG model and the pure CG model represent the properties of the particular substance of interest rather well, their application in a hybrid FG/CG system containing varying ratios of FG versus CG particles is highly non-trivial, because it requires an appropriate balance between FG-FG, FG-CG, and CG-CG energies, and FG and CG entropies. Here, the properties of liquid water are used to calibrate the FG-CG interactions for the simple-point-charge water model at the FG level and a recently proposed supra-molecular water model at the CG level that represents five water molecules by one CG bead containing two interaction sites. Only two parameters are needed to reproduce different thermodynamic and dielectric properties of liquid water at physiological temperature and pressure for various mole fractions of CG water in FG water. The parametrisation strategy for the FG-CG interactions is simple and can be easily transferred to interactions between atomistic biomolecules and CG water.
The Journal of Chemical Physics 07/2012; 137(4):044120. DOI:10.1063/1.4739068 · 2.95 Impact Factor
ABSTRACT: Atomistic molecular dynamics simulations of proteins in aqueous solution are still limited to the multinanosecond time scale and multinanometer range by computational cost. Combining atomic solutes with a supra-molecular solvent model in hybrid fine-grained/coarse-grained (FG/CG) simulations allows atomic detail in the region of interest while being computationally more efficient. A recent comparison of the properties of four proteins in CG water versus FG water showed the preservation of the secondary and tertiary structure with a computational speed-up of at least an order of magnitude. However, an increased occurrence of hydrogen bonds between side chains was observed due to a lack of hydrogen-bonding partners in the supra-molecular solvent. Here, the introduction of an FG water layer around the protein to recover the hydrogen-bonding pattern of the atomistic simulations is studied. Three layer thicknesses of 0.2, 0.4, and 0.8 nm are considered. A layer thickness of 0.8 nm is found sufficient to recover the behavior of the proteins in the atomistic simulations, whereas the hybrid simulation is still three times more efficient than the atomistic one and the cutoff radius for nonbonded interactions could be increased from 1.4 to 2.0 nm.
The Journal of Physical Chemistry B 07/2012; 116(30):8873-9. DOI:10.1021/jp304188z · 3.30 Impact Factor
ABSTRACT: Simulation of the dynamics of a protein in aqueous solution using an atomic model for both the protein and the many water molecules is still computationally extremely demanding considering the time scale of protein motions. The use of supra-atomic or supra-molecular coarse-grained (CG) models may enhance the computational efficiency, but inevitably at the cost of reduced accuracy. Coarse-graining solvent degrees of freedom is likely to yield a favourable balance between reduced accuracy and enhanced computational speed. Here, the use of a supra-molecular coarse-grained water model that largely preserves the thermodynamic and dielectric properties of atomic level fine-grained (FG) water in molecular dynamics simulations of an atomic model for four proteins is investigated. The results of using an FG, a CG, an implicit, or a vacuum solvent environment of the four proteins are compared, and for hen egg-white lysozyme a comparison to NMR data is made. The mixed-grained simulations do not show large differences compared to the FG atomic level simulations, apart from an increased tendency to form hydrogen bonds between long side chains, which is due to the reduced ability of the supra-molecular CG beads that represent five FG water molecules to make solvent-protein hydrogen bonds. However, the mixed-grained simulations are at least an order of magnitude faster than the atomic level ones.
Biophysics of Structure and Mechanism 07/2012; 41(8):647-61. DOI:10.1007/s00249-012-0837-1 · 2.22 Impact Factor
ABSTRACT: So-called coarse-grained models are a popular type of model for accessing long time scales in simulations of biomolecular processes. Such models are coarse-grained with respect to atomic models. But any modelling of processes or substances involves coarse-graining, i.e. the elimination of non-essential degrees of freedom and interactions from a more fine-grained level of modelling. The basic ingredients of developing coarse-grained models based on the properties of fine-grained models are reviewed, together with the conditions that must be satisfied in order to preserve the correct physical mechanisms in the coarse-graining process. This overview should help the reader to determine how realistic a coarse-grained model of a biomolecular system is, i.e. whether it reflects the underlying physical mechanisms or merely provides a set of pretty pictures of the process or substances of interest.
ABSTRACT: For most liquids, the static relative dielectric permittivity is a decreasing function of temperature, because enhanced thermal motion reduces the ability of the molecular dipoles to orient under the effect of an external electric field. Monocarboxylic fatty acids ranging from acetic to octanoic acid represent an exception to this general rule. Close to room temperature, their dielectric permittivity increases slightly with increasing temperature. Herein, the causes for this anomaly are investigated based on molecular dynamics simulations of acetic and propionic acids at different temperatures in the interval 283-363 K, using the GROMOS 53A6(OXY) force field. The corresponding methyl esters are also considered for comparison. The dielectric permittivity is calculated using either the box-dipole fluctuation (BDF) or the external electric field (EEF) methods. The normal and anomalous temperature dependences of the permittivity for the esters and acids, respectively, are reproduced. Furthermore, in the EEF approach, the response of the acids to an applied field of increasing strength is found to present two successive linear regimes before reaching saturation. The low-field permittivity ε, comparable to that obtained using the BDF approach, increases with increasing temperature. The higher-field permittivity ε' is slightly larger, and decreases with increasing temperature. Further analyses of the simulations in terms of radial distribution functions, hydrogen-bonded structures, and diffusion properties suggest that increasing the temperature or the applied field strength both promote a relative population shift from cyclic (mainly dimeric) to extended (chain-like) hydrogen-bonded structures. The lower effective dipole moment associated with the former structures compared to the latter ones provides an explanation for the peculiar dielectric properties of the two acids compared to their methyl esters.
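The box-dipole fluctuation route to the permittivity can be sketched as follows. The abstract does not say which electrostatic boundary conditions were used, so this sketch assumes conducting ("tin-foil") boundaries, for which ε = 1 + (⟨M²⟩ - ⟨M⟩²) / (3 ε₀ V k_B T). With a reaction-field treatment the left-hand side takes a different form. The function name, units and sample dipoles are assumptions.

```python
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
BOLTZMANN = 1.380649e-23                # J/K


def permittivity_from_box_dipoles(dipoles, volume_m3, temperature_k):
    """Static relative permittivity from fluctuations of the total box dipole.

    dipoles: sequence of (Mx, My, Mz) box dipole moments in C*m, one per frame.
    Assumes conducting boundary conditions (an assumption of this sketch).
    """
    n_frames = len(dipoles)
    mean = [sum(m[k] for m in dipoles) / n_frames for k in range(3)]
    mean_square = sum(m[0] ** 2 + m[1] ** 2 + m[2] ** 2 for m in dipoles) / n_frames
    fluctuation = mean_square - sum(component ** 2 for component in mean)
    denominator = 3.0 * VACUUM_PERMITTIVITY * volume_m3 * BOLTZMANN * temperature_k
    return 1.0 + fluctuation / denominator


# Hypothetical two-frame trajectory in a cubic box with 3 nm edges.
frames = [(1.0e-28, 0.0, 0.0), (-1.0e-28, 0.0, 0.0)]
epsilon = permittivity_from_box_dipoles(frames, volume_m3=27.0e-27, temperature_k=298.15)
print(epsilon)
```

Because the fluctuation term is divided by T, a permittivity that rises with temperature, as reported for the acids, requires the dipole fluctuations themselves to grow faster than linearly in T, consistent with the shift towards extended hydrogen-bonded structures described in the abstract.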