Improved prediction of critical residues for protein function based on network and phylogenetic analyses
Phylogenetic approaches are commonly used to predict which amino acid residues are critical to the function of a given protein. However, such approaches display inherent limitations, such as the requirement for identification of multiple homologues of the protein under consideration. Therefore, complementary or alternative approaches for the prediction of critical residues would be desirable. Network analyses have been used in the modelling of many complex biological systems, but only very recently have they been used to predict critical residues from a protein's three-dimensional structure. Here we compare a couple of phylogenetic approaches to several different network-based methods for the prediction of critical residues, and show that a combination of one phylogenetic method and one network-based method is superior to other methods previously employed.
We associate a network with each member of a set of proteins for which the three-dimensional structure is known and the critical residues have been previously determined experimentally. We show that several network-based centrality measurements ( connectivity , 2-connectivity , closeness centrality , betweenness and cluster coefficient ) accurately detect residues critical for the protein's function. Phylogenetic approaches render predictions as reliable as the network-based measurements, although, interestingly, the two general approaches tend to predict different sets of critical residues. Hence we propose a hybrid method that is composed of one network-based calculation – the closeness centrality – and one phylogenetic approach – the Conseq server. This hybrid approach predicts critical residues more accurately than the other methods tested here.
We show that network analysis can be used to improve the prediction of amino acids critical for protein function, when utilized in combination with phylogenetic approaches. It is proposed that such improvement is due to the complementary nature of these approaches: network-based methods tend to predict as critical those residues that are highly connected and internal (i.e., non-surface), although some surface residues are indeed identified as critical by network analyses; whereas residues chosen by phylogenetic approaches display a lower overall probability of being surface inaccessible.