Article

Identity and divergence of protein domain architectures after the yeast whole-genome duplication event.

Università degli Studi di Torino, Dip. Fisica Teorica-Via Giuria 1, 10125 Torino, Italy.
Molecular BioSystems (impact factor: 3.53). 11/2010; 6(11):2305-15. DOI:10.1039/c003507f
Source: PubMed

ABSTRACT Gene duplication is a key mechanism in evolution for generating new functionality, and it is known to have produced a large proportion of genes. Duplication mechanisms include small-scale, or "local", events such as unequal crossing over and retroposition, together with global events, such as chromosomal or whole-genome duplication (WGD). In particular, several studies have confirmed that the yeast S. cerevisiae arose from a whole-genome duplication that occurred 100-150 million years ago. Detection and study of duplications are usually based on sequence alignment, synteny and phylogenetic techniques, but protein domains are also useful in assessing protein homology. We develop a simple and computationally efficient method for comparing protein domain architectures, based on the domain assignments available from public databases. We test the accuracy and reliability of this method in detecting instances of gene duplication in S. cerevisiae. In particular, we analyze the evolution of WGD and non-WGD paralogs from the domain viewpoint, in comparison with a more standard functional analysis of the genes. A large number of domains are shared by genes that underwent local and global duplications, indicating the existence of a common set of "duplicable" domains. On the other hand, WGD and non-WGD paralogs tend to have different functions. We find evidence that this comes from functional migration within similar domain superfamilies, but also from the existence of small sets of WGD-specific and non-WGD-specific domain superfamilies with largely different functions. This observation gives a novel perspective on the finding that WGD paralogs tend to be functionally different from small-scale paralogs. WGD and non-WGD superfamilies carry distinct functions. Finally, the Gene Ontology similarity of paralogs tends to decrease with duplication age, while this tendency is weaker or not observable when comparing the domain architectures of paralogs. This suggests that the set of domains composing a protein tends to be maintained, while its function, cellular process or localization diversifies. Overall, the gathered evidence gives a different viewpoint on the biological specificity of the WGD and at the same time points out the validity of domain architecture comparison as a tool for detecting homology.
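The abstract does not spell out how two domain architectures are scored against each other. As a purely illustrative sketch, and not the authors' method, one simple possibility is a Jaccard index over the multisets of domain superfamily identifiers assigned to each protein. The function name `architecture_similarity` and the identifiers `SF_A`, `SF_B`, `SF_C` are hypothetical placeholders, not real database entries.

```python
from collections import Counter


def architecture_similarity(domains_a, domains_b):
    """Multiset Jaccard index between two domain architectures.

    Each argument is a list of domain superfamily identifiers assigned
    to one protein. Repeated domains count separately; domain order is
    ignored. This scoring choice is an assumption made for illustration.
    """
    count_a, count_b = Counter(domains_a), Counter(domains_b)
    shared = sum((count_a & count_b).values())  # multiset intersection size
    total = sum((count_a | count_b).values())   # multiset union size
    return shared / total if total else 0.0


# Hypothetical architectures for two candidate paralogs.
protein_x = ["SF_A", "SF_B", "SF_B"]
protein_y = ["SF_A", "SF_B", "SF_C"]
score = architecture_similarity(protein_x, protein_y)
```

A score of 1.0 means the two proteins carry exactly the same domains, and 0.0 means they share none. Pairs above a chosen threshold could then be treated as candidate homologs. Because the comparison relies only on precomputed domain assignments, it needs no sequence alignment, which is consistent with the computational efficiency the abstract emphasizes.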


Keywords

100-150 million-year old whole-genome duplication
cellular process
computationally efficient protein domain architecture comparison method
detecting instances
domain architecture comparison
domain assignments available
Duplication mechanisms
functional migration
functionally different
Gene duplication
localization diversifies
non-WGD specific domain superfamilies
non-WGD superfamilies
public databases
sequence alignment
similar domain superfamilies
standard functional analysis
time points
whole genome duplication
yeast S. cerevisiae