Article

Wavelet images and Chou's pseudo amino acid composition for protein classification.

Department of Information Engineering, University of Padua, Via Gradenigo 6, 35131 Padova, Italy.
Amino Acids (impact factor: 3.25). 10/2011; 43(2):657-65. DOI: 10.1007/s00726-011-1114-9
Source: PubMed

ABSTRACT The last decade has seen an explosion in the collection of protein data. To realize the potential offered by this wealth of data, it is important to develop machine systems capable of extracting features from proteins and classifying them. Reliable machine systems for protein classification offer many benefits, including the promise of finding novel drugs and vaccines. In developing our system, we analyze and compare several feature extraction methods for protein classification that compute texture descriptors from a wavelet representation of the protein. We then feed these texture-based representations of the protein into an AdaBoost ensemble of neural networks or a support vector machine classifier. In addition, we perform experiments that combine our feature extraction methods with a standard method based on Chou's pseudo amino acid composition. Using several datasets, we show that our best approach outperforms standard methods. The MATLAB code for the proposed protein descriptors is available (http://bias.csr.unibo.it/nanni/wave.rar).
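For readers unfamiliar with the standard descriptor mentioned in the abstract, the following is a minimal Python sketch of a pseudo amino acid composition in the general form introduced by Chou: 20 normalized amino acid frequencies followed by a small number of weighted sequence-order correlation factors. It is not the authors' MATLAB code. The function name `pseaac`, the use of a single physicochemical property (the Kyte-Doolittle hydropathy scale), and the defaults `lam=3` and `weight=0.05` are assumptions made for illustration; the paper's actual property set and parameters are not stated in the abstract.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumption: one property only;
# the paper may use a different or larger set of properties).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mu = mean(scale.values())
    sd = pstdev(scale.values())
    return {aa: (value - mu) / sd for aa, value in scale.items()}


def pseaac(sequence, lam=3, weight=0.05, scale=KYTE_DOOLITTLE):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector."""
    seq = sequence.upper()
    if any(aa not in scale for aa in seq):
        raise ValueError("sequence contains a non-standard residue")
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")

    prop = _standardize(scale)

    # Conventional amino acid composition: occurrence frequency of each residue.
    freqs = [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

    # Sequence-order correlation factors: mean squared property difference
    # between residues that are k positions apart, for k = 1..lam.
    thetas = []
    for k in range(1, lam + 1):
        diffs = [(prop[seq[i]] - prop[seq[i + k]]) ** 2
                 for i in range(len(seq) - k)]
        thetas.append(sum(diffs) / len(diffs))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


if __name__ == "__main__":
    # Hypothetical example sequence, used only to show the call.
    vector = pseaac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
    print(len(vector), round(sum(vector), 6))
```

A vector of this kind could be concatenated with texture descriptors before classification, although how the paper combines the two is not described in the abstract.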


Keywords

Adaboost ensemble
approach outperforms standard methods
Chou's pseudo amino acid composition
classifying
datasets
extracting features
feature extraction methods
last decade
machine systems capable
Matlab code
neural network
protein data
proteins
Reliable machine systems
support vector machine classifier
texture descriptors
texture-based representations