Algorithm to find distant repeats in a single protein sequence

Bioinformatics Centre, Centre of Excellence in Structural Biology and Bio-computing, India.
Bioinformation (Impact Factor: 0.5). 02/2008; 3(1):28-32. DOI: 10.6026/97320630003028
Source: PubMed


Distant repeats in protein sequences play an important role in various aspects of protein analysis. A careful analysis of distant repeats would make it possible to establish a firm relation
between the repeats and their function and three-dimensional structure over the course of evolution. Further, it sheds light on the diversity of duplication events during evolution. To this end,
an algorithm has been developed to find all distant repeats in a protein sequence. Scores from the Point Accepted Mutation (PAM) matrix are used to identify amino acid
substitutions while detecting the distant repeats. Given the biological importance of distant repeats, the proposed algorithm will be of interest to structural biologists, molecular biologists,
biochemists and researchers involved in phylogenetic and evolutionary studies.
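
The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea: compare every pair of non-overlapping, equal-length windows in one sequence and keep the pairs whose summed substitution score reaches a threshold. The names `toy_score`, `SIMILAR_PAIRS`, `find_distant_repeats`, `length` and `min_score` are hypothetical, and the scoring values are illustrative assumptions standing in for a real PAM lookup, not values taken from a PAM matrix.

```python
from typing import Callable, List, Tuple

# Assumption: a toy stand-in for a PAM lookup. These residue pairs are
# treated as acceptable substitutions purely for illustration; a real
# implementation would read scores from an actual PAM matrix.
SIMILAR_PAIRS = {
    frozenset(pair)
    for pair in [("I", "L"), ("I", "V"), ("L", "V"),
                 ("D", "E"), ("K", "R"), ("S", "T"), ("F", "Y")]
}


def toy_score(a: str, b: str) -> int:
    """Illustrative score: 2 for identity, 1 for a 'similar' pair, -1 otherwise."""
    if a == b:
        return 2
    if frozenset((a, b)) in SIMILAR_PAIRS:
        return 1
    return -1


def find_distant_repeats(
    seq: str,
    length: int,
    min_score: int,
    score: Callable[[str, str], int] = toy_score,
) -> List[Tuple[int, int, int]]:
    """Return (start_i, start_j, total_score) for every pair of
    non-overlapping windows of the given length whose summed
    per-residue score is at least min_score."""
    hits = []
    n = len(seq)
    for i in range(n - length + 1):
        # Start j at i + length so the two windows never overlap.
        for j in range(i + length, n - length + 1):
            total = sum(score(seq[i + k], seq[j + k]) for k in range(length))
            if total >= min_score:
                hits.append((i, j, total))
    return hits


if __name__ == "__main__":
    # "KLVD" (position 0) and "RIVE" (position 7) differ at three of four
    # positions, but every difference is a 'similar' pair under toy_score.
    print(find_distant_repeats("KLVDWWWRIVE", length=4, min_score=5))
```

This brute-force comparison is quadratic in the sequence length for each window size, which is acceptable for single proteins; swapping `toy_score` for a function backed by a real PAM matrix would leave the rest of the sketch unchanged.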



Available from: Vasuki Ranjani Chellamuthu
  • Source
    • "As such, the sequence cannot be considered as a palindrome in the exact sense [4]. The importance of detecting such inverted repeats in both protein as well as nucleotide sequences and the methods used for the same has been elucidated before [5–7]. "
    ABSTRACT: Various types of sequences in the human genome are known to play important roles in different aspects of genomic functioning. Among these sequences, palindromic nucleic acid sequences are one such type that has been studied in detail and found to influence a wide variety of genomic characteristics. For a nucleotide sequence to be considered a palindrome, its complementary strand must read the same in the opposite direction. For example, both strands, i.e., the strand going from 5' to 3' and its complementary strand from 3' to 5', must be complementary. A typical nucleotide palindromic sequence would be TATA (5' to 3'), and its complementary sequence from 3' to 5' would be ATAT. Thus, a new method has been developed using dynamic programming to fetch the palindromic nucleic acid sequences. The new method uses less memory and thereby increases the overall speed and efficiency. The proposed method has been tested using the bacterial (3891 KB bases) and human chromosomal sequences (Chr-18: 74366 kb and Chr-Y: 25554 kb), and the computation time for finding the palindromic sequences is in milliseconds.
    Preview · Article · Mar 2013 · Bioinformation
  • Source
    ABSTRACT: Multiple sequence alignments (MSAs) are powerful tools in modern molecular biology that rely on sequence comparison methods. Based on MSAs, structural models, functional predictions, and phylogenetic trees can be created. MSAs are also used to infer the function of newly sequenced genes, predict new members of gene families, explore evolutionary relationships, annotate sequences, and make structural and functional predictions for genes and proteins. In this mini-review we emphasize the practical application of different MSA methods.
    Full-text · Article · Jun 2009
  • ABSTRACT: Internal repeats in protein sequences have wide-ranging implications for the structure and function of proteins. A keen analysis of the repeats in protein sequences may help us to better understand the structural organization of proteins and their evolutionary relations. In this paper, a mathematical method for searching for latent periodicity in protein sequences is developed. Using this method, we identified simple sequence repeats in the alkaline proteases and found that the sequences could show the same periodicity as their tertiary structures. This result may help us to reduce difficulties in the study of the relationship between sequences and their structures.
    No preview · Article · Sep 2011 · Biochemistry (Moscow)