String kernels and high-quality data set for improved prediction of kinked helices in α-helical membrane proteins.

Johannes Gutenberg-University of Mainz, 55128 Mainz, Germany.
Journal of Chemical Information and Modeling (Impact Factor: 4.3). 11/2011; 51(11):3017-25. DOI: 10.1021/ci200278w
Source: PubMed

ABSTRACT The reasons for distortions from optimal α-helical geometry are widely unknown, but their influences on structural changes of proteins are significant. Hence, their prediction is a crucial problem in structural bioinformatics. For the particular case of kink prediction, we generated a data set of 132 membrane proteins containing 1014 manually labeled helices and examined the environment of kinks. Our sequence analysis confirms the great relevance of proline and reveals disproportionately high occurrences of glycine and serine at kink positions. The structural analysis shows significantly different solvent accessible surface area mean values for kinked and nonkinked helices. More important, we used this data set to validate string kernels for support vector machines as a new kink prediction method. Applying the new predictor, about 80% of all helices could be correctly predicted as kinked or nonkinked even when focusing on small helical fragments. The results exceed recently reported accuracies of alternative approaches and are a consequence of both the method and the data set.
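The abstract does not say which string kernel the authors used, so the following is only a minimal sketch of the general idea, assuming a normalized k-spectrum kernel (similarity of overlapping k-mer counts). The function names, the choice k = 2, and the three helix fragments are hypothetical and not taken from the paper.

```python
from collections import Counter
from math import sqrt


def spectrum(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a: str, b: str, k: int = 2) -> float:
    """Normalized k-spectrum kernel: cosine similarity of k-mer counts."""
    sa, sb = spectrum(a, k), spectrum(b, k)
    # Inner product over the k-mers of a; a Counter returns 0 for missing keys.
    dot = sum(count * sb[kmer] for kmer, count in sa.items())
    norm = sqrt(sum(c * c for c in sa.values()) * sum(c * c for c in sb.values()))
    return dot / norm if norm else 0.0


def gram_matrix(seqs, k=2):
    """Pairwise kernel values for a list of sequences."""
    return [[spectrum_kernel(s, t, k) for t in seqs] for s in seqs]


# Hypothetical helix fragments, made up for illustration only.
helices = ["LLAVPGLLSA", "LLAVPGLISA", "AAAALLLLAA"]
K = gram_matrix(helices, k=2)
```

A Gram matrix of this kind can be handed to any support vector machine implementation that accepts a precomputed kernel, together with kinked/nonkinked labels for the training helices.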

  • ABSTRACT: Helix kinks are a common feature of α-helical membrane proteins, but are thought to be rare in soluble proteins. In this study we find that kinks are a feature of long α-helices in both soluble and membrane proteins, rather than just transmembrane α-helices. The apparent rarity of kinks in soluble proteins is due to the relative infrequency of long helices (≥ 20 residues) in these proteins. We compare length-matched sets of soluble and membrane helices, and find that the frequency of kinks, the role of Proline, the patterns of other amino acids around kinks (allowing for the expected differences in amino acid distributions between the two types of protein), and the effects of hydrogen bonds are the same for the two types of helices. In both types of protein, helices that contain Proline in the second and subsequent turns are very frequently kinked. However, there is a sizeable proportion of kinked helices that do not contain a Proline in either their sequence or sequence homologues. Moreover, we observe that in soluble proteins, kinked helices have a structural preference in that they typically point into the solvent. Proteins 2014. © 2014 Wiley Periodicals, Inc.
    Proteins Structure Function and Bioinformatics 03/2014; · 3.34 Impact Factor
  • ABSTRACT: We have combined molecular dynamics simulations and fold identification procedures to investigate the structure of 696 kinked and 120 unkinked transmembrane (TM) helices in the PDBTM database. The main aim of this study is to understand the formation of helical kinks by simulating their quasi-equilibrium heating processes, which might be relevant to the prediction of their structural features. The simulated structural features of these TM helices, including the position and the angle of helical kinks, were analyzed and compared with statistical data from PDBTM. From quasi-equilibrium heating processes of TM helices with four very different relaxation time constants, we found that these processes gave comparable predictions of the structural features of TM helices. Overall, 95 % of our best kink position predictions have an error of no more than two residues and 75 % of our best angle predictions have an error of less than 15°. Various structure assessments have been carried out to assess our predicted models of TM helices in PDBTM. Our results show that, in 696 predicted kinked helices, 70 % have an RMSD less than 2 Å, 71 % have a TM-score greater than 0.5, 69 % have a MaxSub score greater than 0.8, 60 % have a GDT-TS score greater than 85, and 58 % have a GDT-HA score greater than 70. For unkinked helices, our predicted models are also highly consistent with their crystal structures. These results provide strong support for our assumption that kink formation of TM helices in quasi-equilibrium heating processes is relevant to predicting the structure of TM helices.
    Journal of Computer-Aided Molecular Design 02/2014; · 3.17 Impact Factor
  • ABSTRACT: The reasons for distortions from optimal α-helical geometry are widely unknown, but their influences on structural changes of proteins are significant. Hence, their prediction is a crucial problem in structural bioinformatics. Here, we present a new web server, called SKINK, for string kernel based kink prediction. Extending our previous study, we also annotate the most probable kink position in a given α-helix sequence. The SKINK web server is freely accessible at Moreover, SKINK is a module of the BALL software, also freely available at
    Bioinformatics 02/2014; · 5.47 Impact Factor

