Sequence preference of α-helix N-terminal tetrapeptide.
ABSTRACT: The α-helix is the most abundant secondary structure in proteins. Because of its i, i+4 backbone hydrogen-bond pattern, the two termini of a helix have unsatisfied hydrogen bonds and are less constrained; to compensate, specific residues are preferred at the terminal positions. However, naively combining the statistically preferred residue for each position may not yield a stable N-terminal helical sequence. To provide a set of preferable N-terminal peptides for α-helix design, we studied the N-terminal tetrapeptide sequence motifs that favor helix formation, using statistical analysis and atomistic simulations. A set of tetrapeptide sequences, including TEEE and TPEE, were found to be favorable motifs. In addition to forming more hydrogen bonds in the helical conformation, the favorable motifs also tended to form more capping boxes. To test our predictions empirically, we obtained 10 peptides with different N-terminal motifs and measured their α-helical content by circular dichroism spectroscopy. The experimental results agreed qualitatively with the statistical and simulation results. Furthermore, some of the suggested tetrapeptide sequences have been successfully applied in de novo protein design.
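The i, i+4 pattern mentioned in the abstract can be made concrete with a small counting sketch. It assumes an idealized, isolated helix in which the backbone C=O of residue i accepts a hydrogen bond from the backbone N-H of residue i+4, and it ignores side-chain capping and solvent. The function name and the 0-based indexing are illustrative choices, not taken from the paper.

```python
def unsatisfied_backbone_groups(n_residues: int, offset: int = 4):
    """List residues whose backbone N-H or C=O has no intrahelical partner.

    Idealized model (an assumption): C=O of residue i pairs with N-H of
    residue i + offset, with offset = 4 for an alpha-helix. Indices are 0-based.
    """
    # Every (acceptor, donor) pair that fits inside the helix.
    bonds = [(i, i + offset) for i in range(n_residues - offset)]
    acceptors = {co for co, _ in bonds}
    donors = {nh for _, nh in bonds}

    # N-H groups that never donate sit at the N-terminus;
    # C=O groups that never accept sit at the C-terminus.
    free_nh = [i for i in range(n_residues) if i not in donors]
    free_co = [i for i in range(n_residues) if i not in acceptors]
    return free_nh, free_co


free_nh, free_co = unsatisfied_backbone_groups(12)
print("N-H without partner:", free_nh)  # [0, 1, 2, 3]
print("C=O without partner:", free_co)  # [8, 9, 10, 11]
```

In this simplified picture the first four N-H groups are the ones left without a backbone partner, which is consistent with the abstract's focus on the N-terminal tetrapeptide.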
ABSTRACT: The prediction of protein structural classes is beneficial to understanding the folding patterns, functions and interactions of proteins. In this study, we propose a feature selection-based method to accurately predict protein structural classes. Three datasets with sequence identity lower than 25% were used to test the prediction performance of the method. Using jackknife cross-validation, we verified that the overall accuracies on these three datasets are 92.1%, 89.7% and 84.0%, respectively. The proposed method is more efficient and accurate than other existing methods. The present study offers an excellent alternative to other methods for predicting protein structural classes.
Interdisciplinary Sciences Computational Life Sciences 09/2014; 6(3):235-40. DOI:10.1007/s12539-013-0205-6 · 0.66 Impact Factor