ABSTRACT: A new approach to the calibration of force fields is proposed, in which force-field parameters are obtained by maximum-likelihood fitting of the calculated conformational ensembles to the experimental ensembles of training system(s). The maximum-likelihood function is composed of logarithms of the Boltzmann probabilities of the experimental conformations, calculated with the current energy function. Because the theoretical distribution is given in the form of the simulated conformations only, the contributions from all simulated conformations, with Gaussian weights in the distances from a given experimental conformation, are added up to give the contribution to the target function from this conformation. In contrast to earlier methods for force-field calibration, the approach does not suffer from the arbitrariness of dividing the decoy set into native-like and non-native structures; however, if such a division is made instead of using Gaussian weights, application of the maximum-likelihood method results in the well-known energy-gap maximization. The computational procedure consists of cycles of decoy generation and maximum-likelihood-function optimization, which are iterated until convergence. The method was tested with Gaussian distributions and then applied to the physics-based coarse-grained UNRES force field for proteins. The NMR structures of the tryptophan cage, a small α-helical protein, determined at three temperatures (T = 280 K, T = 305 K, and T = 313 K) by Hałabis et al. (J. Phys. Chem. B, 2012, 116, 6898-6907), were used. Multiplexed replica-exchange molecular dynamics was used to generate the decoys. The iterative procedure exhibited steady convergence.
Three variants of optimization were tried: optimization of the energy-term weights alone, using the experimental ensemble of the folded protein only at T = 280 K (run 1); optimization of the energy-term weights, using the experimental ensembles at all three temperatures (run 2); and optimization of the energy-term weights and of the coefficients of the torsional and multibody energy terms, using the experimental ensembles at all three temperatures (run 3). The force fields were subsequently tested with a set of 14 α-helical and 2 α+β proteins. Optimization run 1 resulted in better agreement with the experimental ensemble at T = 280 K than optimization run 2 and in comparable performance on the test set, but in poorer agreement of the calculated folding temperature with the experimental folding temperature. Optimization run 3 resulted in the best fit of the calculated to the experimental ensembles of the tryptophan cage but in much poorer performance on the test set, suggesting that use of a small α-helical protein for extensive force-field calibration resulted in over-fitting the data of this protein at the expense of transferability. The optimized force field resulting from run 2 was found to fold 13 out of 14 tested α-helical proteins and one small α+β protein with the correct topology; the average structures of 10 of them were predicted with an accuracy of about 5 Å Cα root-mean-square deviation or better. Test simulations with an additional set of 12 α-helical proteins demonstrated that this force field performed better on α-helical proteins than the previous parameterizations of UNRES. The proposed approach is applicable to any problem of maximum-likelihood parameter estimation in which the contributions to the maximum-likelihood function cannot be evaluated at the experimental points and the dimension of the configurational space is too high to construct histograms of the experimental distributions.
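The target function described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an energy that is linear in the calibrated weights, decoys that count equally before Boltzmann weighting (no reweighting for the force field that generated them), and a single Gaussian width. The names `log_likelihood`, `decoy_terms`, `dist`, `sigma`, and `kT` are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_likelihood(weights, decoy_terms, dist, sigma=1.0, kT=1.0):
    """Gaussian-smoothed log-likelihood of the experimental conformations.

    weights     : K energy-term weights being calibrated
    decoy_terms : M rows of K energy-term values, one row per decoy
    dist        : N rows of M distances (experimental conformation i to decoy j)
    sigma, kT   : Gaussian width and thermal energy (illustrative units)
    """
    # Assumption: the energy of a decoy is a weighted sum of its energy terms.
    energies = [sum(w * t for w, t in zip(weights, row)) for row in decoy_terms]
    # Log Boltzmann probability of each decoy, normalized over the decoy set.
    log_boltz = [-e / kT for e in energies]
    log_z = logsumexp(log_boltz)
    log_p = [b - log_z for b in log_boltz]
    total = 0.0
    for row in dist:
        # Every decoy contributes, with a Gaussian weight in its distance
        # from this experimental conformation.
        total += logsumexp(
            [-0.5 * (d / sigma) ** 2 + lp for d, lp in zip(row, log_p)]
        )
    return total


# Toy data: one experimental conformation, two decoys, one energy term.
decoy_terms = [[1.0], [0.0]]   # decoy 0 carries the energy term
dist = [[0.0, 10.0]]           # decoy 0 coincides with the experiment
favourable = log_likelihood([-1.0], decoy_terms, dist)
unfavourable = log_likelihood([1.0], decoy_terms, dist)
print(favourable > unfavourable)
```

In this toy case a weight that lowers the energy of the decoy close to the experimental conformation gives the larger log-likelihood, which is the quantity an optimizer would maximize in each cycle before new decoys are generated.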
Journal of Chemical Information and Modeling 08/2015; 55(9). DOI:10.1021/acs.jcim.5b00395 · 3.74 Impact Factor
Biophysical Journal 01/2015; 108(2):158a. DOI:10.1016/j.bpj.2014.11.870 · 3.97 Impact Factor
ABSTRACT: The folding temperature of the trp-cage mini-protein was determined to be in the range 311-317 K, depending on the method used. Our study is focused on determining the structure and dynamics of the polypeptide chain close to its unfolding or melting temperature. At T = 305 K, Trp6-Arg16 and Trp6-Pro12 long-range interactions are observed, and at T = 313 K, only the Trp6-Arg16 interactions remain, while all of the mentioned interactions are observed in the native state of the protein. Partial (at T = 305 K) and complete (at T = 313 K) melting of the N-terminal α-helix is observed, manifested by the appearance of minor sets of signals in the NMR spectra. Our key findings are: (i) the conformational phase transition (melting point) can be described as a cooperative breaking of the Trp6-Pro12 long-range hydrophobic interaction and the melting of the N-terminal α-helix; (ii) many ROE signals corresponding to local or short-range interactions vanish rapidly with increasing temperature; however, long-range interactions such as Trp6-Arg16 remain until 313 K. The presence of the native long-range interaction at 313 K makes that conformational ensemble resemble a very diffuse native-state structure, but it is not a simple mixture of the folded and unfolded states, as could be expected on the basis of the common two-state folding mechanism.
The Journal of Physical Chemistry B 04/2012; 116(23):6898-907. DOI:10.1021/jp212630y · 3.30 Impact Factor