
DegenRev: Degeneracy-Based Full Backtranslation Algorithm for Oligopeptide

07/2007

ABSTRACT Full backtranslation of oligopeptides is a promising approach for designing microarray oligonucleotides aimed at discovering new metabolic pathways. Protein-to-DNA reverse translation is time-consuming and can produce unreasonably large quantities of data. For this reason, most current applications rely on the degenerate genetic code or on data-mining techniques to select the best reverse translation of a short protein sequence (an oligopeptide). When the goal is only to design short oligos, however, having the complete set of sequences is particularly useful for solving the design problems of enzyme-specific microarray oligos. In this paper, we review existing bioinformatics applications that offer reverse translation, and we present a new algorithm, based on the degeneracy of the input oligopeptide, that efficiently computes a full reverse translation. We propose an implementation in the C programming language and report its performance on simulated and real biological datasets.
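To make the notions of "degeneracy" and "full reverse translation" concrete, here is a minimal C sketch. It is not the DegenRev algorithm itself, whose details the abstract does not give. It is a naive enumeration that assumes the standard genetic code, uppercase one-letter amino acid codes, and no stop codons. The names `CodonSet`, `find_codons`, `degeneracy`, `backtranslate_all`, and `MAX_PEPTIDE_LEN` are hypothetical and chosen only for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PEPTIDE_LEN 24  /* assumed cap: 6^24 still fits in 64 bits */

typedef struct {
    char aa;                /* one-letter amino acid code */
    int n;                  /* number of synonymous codons */
    const char *codons[6];
} CodonSet;

/* Standard genetic code, sense codons only (61 in total). */
static const CodonSet CODON_TABLE[] = {
    {'A', 4, {"GCT", "GCC", "GCA", "GCG"}},
    {'R', 6, {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}},
    {'N', 2, {"AAT", "AAC"}},
    {'D', 2, {"GAT", "GAC"}},
    {'C', 2, {"TGT", "TGC"}},
    {'Q', 2, {"CAA", "CAG"}},
    {'E', 2, {"GAA", "GAG"}},
    {'G', 4, {"GGT", "GGC", "GGA", "GGG"}},
    {'H', 2, {"CAT", "CAC"}},
    {'I', 3, {"ATT", "ATC", "ATA"}},
    {'L', 6, {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}},
    {'K', 2, {"AAA", "AAG"}},
    {'M', 1, {"ATG"}},
    {'F', 2, {"TTT", "TTC"}},
    {'P', 4, {"CCT", "CCC", "CCA", "CCG"}},
    {'S', 6, {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}},
    {'T', 4, {"ACT", "ACC", "ACA", "ACG"}},
    {'W', 1, {"TGG"}},
    {'Y', 2, {"TAT", "TAC"}},
    {'V', 4, {"GTT", "GTC", "GTA", "GTG"}},
};

/* Return the codon set for one residue, or NULL if it is not in the table. */
static const CodonSet *find_codons(char aa)
{
    size_t i, n = sizeof CODON_TABLE / sizeof CODON_TABLE[0];
    for (i = 0; i < n; i++)
        if (CODON_TABLE[i].aa == aa)
            return &CODON_TABLE[i];
    return NULL;
}

/* Degeneracy = product of per-residue codon counts, i.e. the number of
 * distinct DNA sequences encoding the peptide. Returns 0 on invalid input. */
unsigned long long degeneracy(const char *pep)
{
    unsigned long long total = 1;
    size_t i, len = strlen(pep);
    if (len == 0 || len > MAX_PEPTIDE_LEN)
        return 0;
    for (i = 0; i < len; i++) {
        const CodonSet *s = find_codons(pep[i]);
        if (s == NULL)
            return 0;
        total *= (unsigned long long)s->n;
    }
    return total;
}

/* Write every backtranslation of pep to out, one per line (out may be NULL
 * to count only). Returns the number of sequences, or -1 on invalid input. */
long long backtranslate_all(const char *pep, FILE *out)
{
    const CodonSet *sets[MAX_PEPTIDE_LEN];
    int idx[MAX_PEPTIDE_LEN] = {0};
    char dna[3 * MAX_PEPTIDE_LEN + 1];
    size_t i, pos, len = strlen(pep);
    long long count = 0;

    if (len == 0 || len > MAX_PEPTIDE_LEN)
        return -1;
    for (i = 0; i < len; i++)
        if ((sets[i] = find_codons(pep[i])) == NULL)
            return -1;
    dna[3 * len] = '\0';

    for (;;) {
        for (i = 0; i < len; i++)
            memcpy(dna + 3 * i, sets[i]->codons[idx[i]], 3);
        if (out != NULL)
            fprintf(out, "%s\n", dna);
        count++;
        /* Advance a mixed-radix counter; the last residue varies fastest. */
        for (pos = len; pos > 0; pos--) {
            if (++idx[pos - 1] < sets[pos - 1]->n)
                break;
            idx[pos - 1] = 0;
        }
        if (pos == 0)   /* every position wrapped around: enumeration done */
            return count;
    }
}
```

For example, methionine has one codon and leucine has six, so `degeneracy("ML")` is 6, and `backtranslate_all("ML", stdout)` would print six 6-nucleotide sequences, each beginning with ATG. Because the output grows as the product of the codon counts, this brute-force approach is practical only for short oligopeptides, which is the size problem the abstract describes.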



Keywords

bioinformatics applications
bring reverse translation solutions
C programming language
complete sequences
current applications use genetic degenerated code
design microarray oligonucleotides
design problems
design short oligos
DNA reverse translation
enzyme specific oligos
full backtranslation
full reverse translation
input oligopeptide degeneracy able
new algorithm
new metabolic pathways discovery
reverse translation
short protein sequence
simulated
unreasonable quantities