ABSTRACT: As part of the SAMPL4 blind challenge, filtered AutoDock Vina ligand docking predictions and large-scale Binding Energy Distribution Analysis Method (BEDAM) binding free energy calculations have been applied to the virtual screening of a focused library of candidate binders to the LEDGF site of the HIV integrase protein. The computational protocol leveraged docking and high-level atomistic models to improve enrichment. The enrichment factor of our blind predictions ranked best among all of the computational submissions, and second best overall. This work represents, to our knowledge, the first example of the application of an all-atom physics-based binding free energy model to large-scale virtual screening. A total of 285 parallel Hamiltonian replica exchange molecular dynamics absolute protein-ligand binding free energy simulations were conducted starting from docked poses. The setup of the simulations was fully automated, calculations were distributed on multiple computing resources, and they were completed in a 6-week period. The accuracy of the docked poses and the inclusion of intramolecular strain and entropic losses in the binding free energy estimates were the major factors behind the success of the method. Lack of sufficient time and computing resources to investigate additional protonation states of the ligands was a major cause of mispredictions. The experiment demonstrated the applicability of binding free energy modeling to improve hit rates in challenging virtual screening of focused ligand libraries during lead optimization.
Journal of Computer-Aided Molecular Design 02/2014; · 3.17 Impact Factor
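The enrichment factor mentioned in the abstract above can be illustrated with a minimal sketch. The function name, the `top_fraction` parameter, and the toy library are assumptions made for illustration; the abstract does not state which cutoff or ranking convention the SAMPL4 assessment used.

```python
def enrichment_factor(scores, is_active, top_fraction=0.1):
    """Hit rate among the top-ranked compounds divided by the library-wide hit rate.

    scores: predicted binding free energies (lower = predicted tighter binder).
    is_active: 1/0 (or True/False) flags for experimentally confirmed binders.
    """
    if not scores or len(scores) != len(is_active):
        raise ValueError("scores and is_active must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_actives == 0:
        raise ValueError("the library contains no actives")
    n_top = max(1, round(top_fraction * n_total))
    # Rank compounds from most to least favorable predicted binding free energy.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    hits_in_top = sum(1 for _, flag in ranked[:n_top] if flag)
    return (hits_in_top / n_top) / (n_actives / n_total)


# Hypothetical 10-compound library in which the two best-scored compounds are the only binders.
toy_scores = [-10.0, -9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0]
toy_actives = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_ef = enrichment_factor(toy_scores, toy_actives, top_fraction=0.2)
```

A value of 1 corresponds to random selection; larger values mean that binders are concentrated near the top of the ranked list.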
ABSTRACT: Replica exchange represents a powerful class of algorithms used for enhanced configurational and energetic sampling in a range of physical systems. Computationally it represents a type of application with multiple scales of communication. At a fine-grained level there is often communication with a replica, typically an MPI process. At a coarse-grained level, the replicas communicate with other replicas, both temporally and in the amount of data exchanged. This paper outlines a novel framework developed to support the flexible execution of large-scale replica exchange. The framework is flexible in the sense that it supports different coupling schemes between replicas and is agnostic to the specific underlying simulation: classical or quantum, serial or parallel. The scalability of the framework is assessed using standard simulation benchmarks. In spite of the increasing communication and coordination requirements as a function of the number of replicas, our framework supports the execution of hundreds of replicas without significant overhead. Although there are several specific aspects that will benefit from further optimization, a first working prototype has the ability to fundamentally change the scale of replica exchange simulations possible on production distributed cyberinfrastructure such as XSEDE, as well as support novel usage modes. This paper also represents the release of the framework to the broader biophysical simulation community and provides details on its usage.
Proceedings of the Conference on Extreme Science and Engineering Discovery Environment: Gateway to Discovery; 07/2013
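At the coarse-grained level, an exchange between two replicas reduces to a Metropolis test on a proposed swap. The sketch below shows the standard temperature-exchange criterion as one example of a coupling scheme. It is not taken from the framework described above, and the function names are hypothetical.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis probability of swapping configurations between replicas i and j.

    beta_*: inverse temperatures 1/(kT) of the two replicas.
    energy_*: potential energies of the configurations each replica currently holds.
    """
    log_ratio = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)
```

A swap that moves the lower-energy configuration to the colder replica is always accepted; the reverse move is accepted with an exponentially decreasing probability.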
ABSTRACT: The development of an effective AIDS vaccine has been a formidable task, but remains a critical necessity. The well conserved membrane-proximal external region (MPER) of the HIV-1 gp41 glycoprotein is one of the crucial targets for AIDS vaccine development, as it has the necessary attribute of being able to elicit antibodies capable of neutralizing diverse isolates of HIV.
Guided by X-ray crystallography, molecular modeling, combinatorial chemistry, and powerful selection techniques, we designed and produced six combinatorial libraries of chimeric human rhinoviruses (HRV) displaying the MPER epitopes corresponding to mAbs 2F5, 4E10, and/or Z13e1, connected to an immunogenic surface loop of HRV via linkers of varying lengths and sequences. Not all libraries led to viable chimeric viruses with the desired sequences, but the combinatorial approach allowed us to examine large numbers of MPER-displaying chimeras. Among the chimeras were five that elicited antibodies capable of significantly neutralizing HIV-1 pseudoviruses from at least three subtypes, in one case leading to neutralization of 10 pseudoviruses from all six subtypes tested.
Optimization of these chimeras or closely related chimeras could conceivably lead to useful components of an effective AIDS vaccine. While the MPER of HIV may not be immunodominant in natural infection by HIV-1, its presence in a vaccine cocktail could provide critical breadth of protection.
PLoS ONE 01/2013; 8(9):e72205. · 3.73 Impact Factor
ABSTRACT: The weighted histogram analysis method (WHAM) is routinely used for computing free energies and expectations from multiple ensembles. Existing derivations of WHAM require observations to be discretized into a finite number of bins. Yet, WHAM formulas seem to hold even if the bin sizes are made arbitrarily small. The purpose of this article is to demonstrate both the validity and value of the multi-state Bennett acceptance ratio (MBAR) method seen as a binless extension of WHAM. We discuss two statistical arguments to derive the MBAR equations, in parallel to the self-consistency and maximum likelihood derivations already known for WHAM. We show that the binless method, like WHAM, can be used not only to estimate free energies and equilibrium expectations, but also to estimate equilibrium distributions. We also provide a number of useful results from the statistical literature, including the determination of MBAR estimators by minimization of a convex function. This leads to an approach to the computation of MBAR free energies by optimization algorithms, which can be more effective than existing algorithms. The advantages of MBAR are illustrated numerically for the calculation of absolute protein-ligand binding free energies by alchemical transformations with and without soft-core potentials. We show that binless statistical analysis can accurately treat sparsely distributed interaction energy samples as obtained from unmodified interaction potentials that cannot be properly analyzed using standard binning methods. This suggests that binless multi-state analysis of binding free energy simulations with unmodified potentials offers a straightforward alternative to the use of soft-core potentials for these alchemical transformations.
The Journal of Chemical Physics 04/2012; 136(14):144102. · 3.12 Impact Factor
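The self-consistency route to the MBAR equations mentioned in the abstract above can be sketched in a few lines. This is a plain fixed-point iteration, not the convex-minimization approach the article advocates, and the function names and the two-state harmonic test system are assumptions chosen for illustration.

```python
import math
import random


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def mbar_free_energies(u_kn, n_k, tol=1e-8, max_iter=1000):
    """Solve the MBAR self-consistent equations by fixed-point iteration.

    u_kn[k][n]: reduced energy of pooled sample n evaluated in state k.
    n_k[k]: number of samples drawn from state k (all assumed > 0).
    Returns dimensionless free energies f_k with f_0 fixed at 0.
    """
    n_states, n_samples = len(u_kn), len(u_kn[0])
    log_n = [math.log(count) for count in n_k]
    f = [0.0] * n_states
    for _ in range(max_iter):
        # Log of the mixture density (up to a constant) at every pooled sample.
        log_denom = [
            logsumexp([log_n[k] + f[k] - u_kn[k][n] for k in range(n_states)])
            for n in range(n_samples)
        ]
        f_new = [
            -logsumexp([-u_kn[i][n] - log_denom[n] for n in range(n_samples)])
            for i in range(n_states)
        ]
        f_new = [value - f_new[0] for value in f_new]
        converged = max(abs(a - b) for a, b in zip(f, f_new)) < tol
        f = f_new
        if converged:
            break
    return f


def harmonic_demo(seed=0, n_per_state=1000):
    """Two harmonic wells u_k(x) = 0.5 * k * x**2; exact f_1 - f_0 = 0.5 * ln(k_1 / k_0)."""
    rng = random.Random(seed)
    springs = [1.0, 4.0]
    samples = []
    for k in springs:
        samples += [rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(n_per_state)]
    u_kn = [[0.5 * k * x * x for x in samples] for k in springs]
    return mbar_free_energies(u_kn, [n_per_state] * len(springs))
```

No binning of the samples is needed at any point, which is the property the abstract emphasizes.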
ABSTRACT: The results of computer simulations of the binding of etravirine (TMC125) and rilpivirine (TMC278) to HIV reverse transcriptase are reported. It is confirmed that consistent binding free energy estimates are obtained with or without the application of torsional restraints when the free energies of imposing the restraints are taken into account. The restraints have a smaller influence on the thermodynamics and apparent kinetics of binding of TMC125 compared to the more flexible TMC278 inhibitor. The concept of the reorganization free energy of binding is useful to understand and categorize these effects. Contrary to expectations, the use of conformational restraints did not consistently enhance convergence of binding free energy estimates due to suppression of binding/unbinding pathways and due to the influence of rotational degrees of freedom not directly controlled by the restraints. Physical insights concerning the thermodynamic driving forces for binding and the role of "jiggling" and "wiggling" motion of the ligands are discussed. Based on these insights we conclude that an ideal inhibitor, if chemically realizable, would possess the electrostatic charge distribution of TMC125, so as to form strong interactions with the receptor, and the larger and more flexible substituents of TMC278, so as to minimize reorganization free energy penalties and the effects of resistance mutations, suitably modified, as in TMC125, so as to disfavor the formation of non-binding competent extended conformations when free in solution.
ABSTRACT: BEDAM calculations are described to predict the free energies of binding of a series of anaesthetic drugs to a recently characterized acyclic cucurbituril host. The modeling predictions, conducted as part of the SAMPL3 host-guest affinity blind challenge, are generally in good quantitative agreement with the experimental measurements. The correlation coefficient between computed and measured binding free energies is 70% with high statistical significance. Multiple conformational stereoisomers and protonation states of the guests have been considered. Better agreement is obtained with high statistical confidence under acidic modeling conditions. It is shown that this level of quantitative agreement could not have been reached without taking into account reorganization energy and configurational entropy effects. Extensive conformational variability of the host, the guests and their complexes is observed in the simulations, affecting binding free energy estimates and structural predictions. A conformational reservoir technique is introduced as part of the parallel Hamiltonian replica exchange molecular dynamics BEDAM protocol to fully capture conformational variability. It is shown that these advanced computational strategies lead to converged free energy estimates for these systems, offering the prospect of utilizing host-guest binding free energy data for force field validation and development.
ABSTRACT: The Binding Energy Distribution Analysis Method (BEDAM) is employed to compute the standard binding free energies of a series of ligands to a FK506 binding protein (FKBP12) with implicit solvation. Binding free energy estimates are in reasonably good agreement with experimental affinities. The conformations of the complexes identified by the simulations are in good agreement with crystallographic data, which was not used to restrain ligand orientations. The BEDAM method is based on λ-hopping Hamiltonian parallel Replica Exchange (HREM) molecular dynamics conformational sampling, the OPLS-AA/AGBNP2 effective potential, and multi-state free energy estimators (MBAR). Achieving converged and accurate results depends on all of these elements of the calculation. Convergence of the binding free energy is tied to the level of convergence of binding energy distributions at critical intermediate states where bound and unbound states are at equilibrium, and where the rate of binding/unbinding conformational transitions is maximal. This finding mirrors similar observations in the context of order/disorder transitions as for example in protein folding. Insights concerning the physical mechanism of ligand binding and unbinding are obtained. Convergence for the largest FK506 ligand is achieved only after imposing strict conformational restraints, which however require accurate prior knowledge of the structure of the complex. The analytical AGBNP2 model is found to underestimate the magnitude of the hydrophobic driving force towards binding in these systems characterized by loosely packed protein-ligand binding interfaces. Rescoring of the binding energies using a numerical surface area model corrects this deficiency. This study illustrates the complex interplay between energy models, exploration of conformational space, and free energy estimators needed to obtain robust estimates from binding free energy calculations.
Journal of Chemical Theory and Computation 01/2012; 8(1):47-60. · 5.39 Impact Factor
ABSTRACT: The coupling of protein energetics and sequence changes is a critical aspect of computational protein design, as well as for the understanding of protein evolution, human disease, and drug resistance. To study the molecular basis for this coupling, computational tools must be sufficiently accurate and computationally inexpensive enough to handle large amounts of sequence data. We have developed a computational approach based on the linear interaction energy (LIE) approximation to predict the changes in the free-energy of the native state induced by a single mutation. This approach was applied to a set of 822 mutations in 10 proteins which resulted in an average unsigned error of 0.82 kcal/mol and a correlation coefficient of 0.72 between the calculated and experimental ΔΔG values. The method is able to accurately identify destabilizing hot spot mutations; however, it has difficulty in distinguishing between stabilizing and destabilizing mutations because of the distribution of stability changes for the set of mutations used to parameterize the model. In addition, the model also performs quite well in initial tests on a small set of double mutations. On the basis of these promising results, we can begin to examine the relationship between protein stability and fitness, correlated mutations, and drug resistance.
Proteins Structure Function and Bioinformatics 01/2012; 80(1):111-25. · 3.34 Impact Factor
ABSTRACT: Conformational dynamics plays a fundamental role in the regulation of molecular recognition processes. Conformational heterogeneity and entropy variations upon binding, although not always evident from the analysis of structural data, can substantially affect affinity and specificity. Computer modeling is able to provide some of the most direct insights into these aspects of molecular recognition. We review recent physics-based computational studies that employ advanced conformational sampling algorithms and effective potentials to model the three main classes of degrees of freedom relevant to the binding process: ligand positioning relative to the receptor, ligand and receptor internal reorganization, and hydration. Collectively these studies show that all of these elements are important for proper modeling of protein-ligand interactions.
Current Opinion in Structural Biology 02/2011; 21(2):161-6. · 8.74 Impact Factor
ABSTRACT: We present a new approach to study a multitude of folding pathways and different folding mechanisms for the 20-residue mini-protein Trp-Cage using the combined power of replica exchange molecular dynamics (REMD) simulations for conformational sampling, transition path theory (TPT) for constructing folding pathways, and stochastic simulations for sampling the pathways in a high dimensional structure space. REMD simulations of Trp-Cage with 16 replicas at temperatures between 270 and 566 K are carried out with an all-atom force field (OPLSAA) and an implicit solvent model (AGBNP). The conformations sampled from all temperatures are collected. They form a discretized state space that can be used to model the folding process. The equilibrium population for each state at a target temperature can be calculated using the weighted-histogram-analysis method (WHAM). By connecting states with similar structures and creating edges satisfying detailed balance conditions, we construct a kinetic network that preserves the equilibrium population distribution of the state space. After defining the folded and unfolded macrostates, committor probabilities (P(fold)) are calculated by solving a set of linear equations for each node in the network and pathways are extracted together with their fluxes using the TPT algorithm. By clustering the pathways into folding "tubes", a more physically meaningful picture of the diversity of folding routes emerges. Stochastic simulations are carried out on the network, and a procedure is developed to project sampled trajectories onto the folding tubes. The fluxes through the folding tubes calculated from the stochastic trajectories are in good agreement with the corresponding values obtained from the TPT analysis. The temperature dependence of the ensemble of Trp-Cage folding pathways is investigated.
Above the folding temperature, a large number of diverse folding pathways with comparable fluxes flood the energy landscape. At low temperature, however, the folding transition is dominated by only a few localized pathways.
The Journal of Physical Chemistry B 02/2011; 115(6):1512-23. · 3.61 Impact Factor
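The committor calculation described in the abstract above amounts to a small linear solve. The sketch below assumes a rate-matrix formulation in which each intermediate state i satisfies sum_j k_ij (q_j - q_i) = 0, with q = 0 in the unfolded set and q = 1 in the folded set. The function names and the five-state chain are hypothetical and are not the Trp-Cage network.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("singular system: some state cannot reach either end set")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def committors(rates, unfolded, folded):
    """Return q[i], the probability of reaching `folded` before `unfolded` from state i.

    rates[i][j]: transition rate from node i to node j of the kinetic network.
    """
    n = len(rates)
    inter = [i for i in range(n) if i not in unfolded and i not in folded]
    index = {state: row for row, state in enumerate(inter)}
    a = [[0.0] * len(inter) for _ in inter]
    b = [0.0] * len(inter)
    for i in inter:
        row = index[i]
        for j in range(n):
            if j == i or rates[i][j] == 0.0:
                continue
            a[row][row] -= rates[i][j]
            if j in index:
                a[row][index[j]] += rates[i][j]
            elif j in folded:
                b[row] -= rates[i][j]  # q_j = 1 moves to the right-hand side
            # Neighbors in the unfolded set have q_j = 0 and contribute nothing more.
    solution = solve_linear(a, b)
    q = [1.0 if i in folded else 0.0 for i in range(n)]
    for state, row in index.items():
        q[state] = solution[row]
    return q


# Hypothetical 5-state linear chain with unit rates between neighbors.
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(5)] for i in range(5)]
chain_q = committors(chain, unfolded={0}, folded={4})
```

For this symmetric chain the committor rises linearly from the unfolded end to the folded end, which gives a quick sanity check on the solver.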
ABSTRACT: We review recent theoretical and algorithmic advances for the modeling of protein-ligand binding free energies. We first describe a statistical mechanics theory of noncovalent association, with particular focus on deriving the fundamental formulas on which computational methods are based. The second part reviews the main computational models and algorithms in current use or development, pointing out the relations with each other and with the theory developed in the first part. Particular emphasis is given to the modeling of conformational reorganization and entropic effects. The methods reviewed are free energy perturbation, double decoupling, the Binding Energy Distribution Analysis Method, the potential of mean force method, mining minima and MM/PBSA. These models have different features and limitations, and their ranges of applicability vary correspondingly. Yet their origins can all be traced back to a single fundamental theory.
Advances in protein chemistry and structural biology. 01/2011; 85:27-80.
ABSTRACT: The Binding Energy Distribution Analysis Method (BEDAM) for the computation of receptor-ligand standard binding free energies with implicit solvation is presented. The method is based on a well established statistical mechanics theory of molecular association. It is shown that, in the context of implicit solvation, the theory is homologous to the test particle method of solvation thermodynamics with the solute-solvent potential represented by the effective binding energy of the protein-ligand complex. Accordingly, in BEDAM the binding constant is computed by means of a weighted integral of the probability distribution of the binding energy obtained in the canonical ensemble in which the ligand is positioned in the binding site but the receptor and the ligand interact only with the solvent continuum. It is shown that the binding energy distribution encodes all of the physical effects of binding. The balance between binding enthalpy and entropy is seen in our formalism as a balance between favorable and unfavorable binding modes which are coupled through the normalization of the binding energy distribution function. An efficient computational protocol for the binding energy distribution based on the AGBNP2 implicit solvent model, parallel Hamiltonian replica exchange sampling and histogram reweighting is developed. Applications of the method to a set of known binders and non-binders of the L99A and L99A/M102Q mutants of T4 lysozyme receptor are illustrated. The method is able to discriminate without error binders from non-binders, and the computed standard binding free energies of the binders are found to be in good agreement with experimental measurements. Analysis of the results reveals that the binding affinities of these systems reflect the contributions from multiple conformations spanning a wide range of binding energies.
Journal of Chemical Theory and Computation 09/2010; 6(9):2961-2977. · 5.39 Impact Factor
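The weighted integral over the binding energy distribution can be sketched numerically. The abstract does not spell out the weight, so the form used here is an assumption: a Boltzmann factor exp(-u/kT) averaged over binding energies u sampled in the decoupled ensemble, plus a standard-state term -kT ln(C° V_site). The function name, the constants, and the site-volume argument are hypothetical.

```python
import math

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)
STANDARD_VOLUME_A3 = 1660.5       # volume per molecule at 1 M, in cubic angstroms


def standard_binding_free_energy(u_samples, site_volume_a3, temperature=300.0):
    """Estimate a standard binding free energy (kcal/mol) from decoupled-state samples.

    u_samples: binding energies (kcal/mol) of configurations sampled with the ligand
        in the binding site but not interacting with the receptor.
    site_volume_a3: volume of the binding-site region in cubic angstroms.
    Assumed form: dG = -kT ln(V_site / V_std) - kT ln < exp(-u / kT) >.
    """
    if not u_samples:
        raise ValueError("u_samples must not be empty")
    kt = KB_KCAL_PER_MOL_K * temperature
    exponents = [-u / kt for u in u_samples]
    peak = max(exponents)
    # Log of the sample mean of exp(-u/kT), shifted by the peak for numerical stability.
    log_mean = peak + math.log(sum(math.exp(e - peak) for e in exponents) / len(exponents))
    return -kt * math.log(site_volume_a3 / STANDARD_VOLUME_A3) - kt * log_mean
```

A direct exponential average of this kind is dominated by rare low-energy samples and converges poorly in practice, which is presumably why the protocol in the abstract relies on Hamiltonian replica exchange and histogram reweighting rather than sampling the decoupled state alone.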
ABSTRACT: The development of an effective AIDS vaccine remains the most promising long-term strategy to combat human immunodeficiency virus (HIV)/AIDS. Here, we report favorable antigenic characteristics of vaccine candidates isolated from a combinatorial library of human rhinoviruses displaying the ELDKWA epitope of the gp41 glycoprotein of HIV-1. The design principles of this library emerged from the application of molecular modeling calculations in conjunction with our knowledge of previously obtained ELDKWA-displaying chimeras, including knowledge of a chimera with one of the best 2F5-binding characteristics obtained to date. The molecular modeling calculations identified the energetic and structural factors affecting the ability of the epitope to assume conformations capable of fitting into the complementarity determining region of the ELDKWA-binding, broadly neutralizing human mAb 2F5. Individual viruses were isolated from the library following competitive immunoselection and were tested using ELISA and fluorescence quenching experiments. Dissociation constants obtained using both techniques revealed that some of the newly isolated chimeras bind 2F5 with greater affinity than previously identified chimeric rhinoviruses. Molecular dynamics simulations of two of these same chimeras confirmed that their HIV inserts were partially preorganized for binding, which is largely responsible for their corresponding gains in binding affinity. The study illustrates the utility of combining structure-based experiments with computational modeling approaches for improving the odds of selecting vaccine component designs with preferred antigenic characteristics. The results obtained also confirm the flexibility of HRV as a presentation vehicle for HIV epitopes and the potential of this platform for the development of vaccine components against AIDS.
Journal of Molecular Biology 02/2010; 397(3):752-66. · 3.91 Impact Factor
ABSTRACT: The use of the replica exchange (RE) molecular dynamics (MD) method for the efficient estimation of conformational populations of ligand-sized molecules in solution is investigated. We compare the computational efficiency of the traditional constant temperature MD technique with that of the parallel RE molecular dynamics method for a series of alkanes and rilpivirine (TMC278), an inhibitor of HIV-1 reverse transcriptase, with implicit solvation. We show that conformational populations are accurately estimated by both methods; however, replica exchange estimates converge at a faster rate, especially for rilpivirine, which is characterized by multiple stable states separated by high free energy barriers. Furthermore, convergence is enhanced when the weighted histogram analysis method (WHAM) is used to estimate populations from the data collected from multiple RE temperature replicas. For small drug-like molecules with energetic barriers separating the stable states, the use of RE with WHAM is an efficient computational approach for estimating the contribution of ligand conformational reorganization to binding affinities.
Journal of Computational Chemistry 10/2009; 31(7):1357-67. · 3.84 Impact Factor
ABSTRACT: The non-nucleoside reverse transcriptase inhibitor (NNRTI) TMC278/rilpivirine is an anti-AIDS therapeutic agent with high oral bioavailability despite its high hydrophobicity. Previous studies established a correlation between the ability of the drug molecule to form stable, homogeneous populations of spherical nanoparticles (approximately 100-120 nm in diameter) at low pH in a surfactant-independent fashion and good oral bioavailability. Here, we hypothesize that the drug is able to assume surfactant-like properties under physiologically relevant conditions, thus facilitating formation of nanostructures in the absence of other surfactants. The results of all-atom molecular dynamics simulations indeed show that protonated drug molecules behave as surfactants at the water/aggregate interface while neutral drug molecules assist aggregate packing via conformational variability. Our simulation results suggest that amphiphilic behavior at low pH and intrinsic flexibility influence drug aggregation and are believed to play critical roles in the favorable oral bioavailability of hydrophobic drugs.
Journal of Medicinal Chemistry 10/2009; 52(19):5896-905. · 5.61 Impact Factor
ABSTRACT: We present an approach to recover kinetics from a simplified protein folding model at different temperatures using the combined power of replica exchange (RE), a kinetic network, and effective stochastic dynamics. While RE simulations generate a large set of discrete states with the correct thermodynamics, kinetic information is lost due to the random exchange of temperatures. We show how we can recover the kinetics of a 2D continuous potential with an entropic barrier by using RE-generated discrete states as nodes of a kinetic network. By choosing the neighbors and the microscopic rates between the neighbors appropriately, the correct kinetics of the system can be recovered by running a kinetic simulation on the network. We fine-tune the parameters of the network by comparison with the effective drift velocities and diffusion coefficients of the system determined from short-time stochastic trajectories. One of the advantages of the kinetic network model is that the network can be built on a high-dimensional discretized state space, which can consist of multiple paths not consistent with a single reaction coordinate.
The Journal of Physical Chemistry B 09/2009; 113(34):11702-9. · 3.61 Impact Factor
ABSTRACT: The AGBNP2 implicit solvent model, an evolution of the Analytical Generalized Born plus Non-Polar (AGBNP) model we have previously reported, is presented with the aim of modeling hydration effects beyond those described by conventional continuum dielectric representations. A new empirical hydration free energy component based on a procedure to locate and score hydration sites on the solute surface is introduced to model first solvation shell effects, such as hydrogen bonding, which are poorly described by continuum dielectric models. This new component is added to the Generalized Born and non-polar AGBNP terms. Also newly introduced is an analytical Solvent Excluded Volume (SEV) model which improves the solute volume description by reducing the effect of spurious high-dielectric interstitial spaces present in conventional van der Waals representations. The AGBNP2 model is parametrized and tested with respect to experimental hydration free energies of small molecules and the results of explicit solvent simulations. Modeling the granularity of water is one of the main design principles employed for the first shell solvation function and the SEV model, by requiring that water locations have a minimum available volume based on the size of a water molecule. It is shown that the new volumetric model produces Born radii and surface areas in good agreement with accurate numerical evaluations of these quantities. The results of molecular dynamics simulations of a series of mini-proteins show that the new model produces conformational ensembles in substantially better agreement with reference explicit solvent ensembles than the original AGBNP model with respect to both structural and energetic measures.
Journal of Chemical Theory and Computation 07/2009; 5(9):2544-2564. · 5.39 Impact Factor