Transmembrane helix prediction in proteins using hydrophobicity properties and higher-order statistics
Prediction of the transmembrane (TM) helices is important in the study of membrane proteins. A novel method to predict the location and length of both single and multiple TM helices in human proteins is presented. The proposed method is based on a combination of hydrophobicity and higher-order statistics, resulting in a TM prediction tool, namely K(4)HTM. A training dataset of 117 human single TM proteins and two test-datasets containing 499 and 484 human single and multiple TM proteins, respectively, were drawn from the SWISS-PROT public database and used for the optimisation and evaluation of K(4)HTM. Validation results showed that K(4)HTM correctly predicts the entire topology for 99.68% and 93.08% of the sequences in the single and multiple test-datasets, respectively. These results compare favourably with existing methods, such as SPLIT4, TMHMM2, WAVETM and SOSUI, constituting an alternative approach to the TM helix prediction problem.