Article

SciDBMaker: new software for computer-aided design of specialized biological databases.

Unité de Protéomie Fonctionnelle & Biopréservation Alimentaire, Institut Supérieur des Sciences Biologiques Appliquées de Tunis, Université El Manar, Tunisie.
BMC Bioinformatics (impact factor: 2.75). 02/2008; 9:121. DOI:10.1186/1471-2105-9-121
Source: PubMed

ABSTRACT The exponential growth of research in molecular biology has brought a concomitant proliferation of databases for storing its findings. A variety of protein sequence databases exist. While all of these strive for completeness, the range of user interests is often beyond their scope. Large databases covering a broad range of domains tend to offer less detailed information than smaller, more specialized resources, so data from many sources often have to be combined to obtain a complete picture. Scientific researchers are continually developing new, specific databases to enhance their understanding of biological processes.
In this article, we present the implementation of a new tool for protein data analysis. With its easy-to-use user interface, this software provides the opportunity to build more specialized protein databases from a universal protein sequence database such as Swiss-Prot. A family of proteins known as bacteriocins is analyzed as 'proof of concept'.
SciDBMaker is stand-alone software for extracting protein data from the Swiss-Prot database, analyzing sequences (physicochemical profile calculations, homologous sequence searches and multiple sequence alignments) and building new, more specialized databases. It compiles information with relative ease, updates and compares data relevant to a given protein family, and could help solve the problem of dispersed biological search results.
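The extraction step described above can be illustrated with a minimal sketch. It is not SciDBMaker's implementation, which the abstract does not describe; it only assumes the standard Swiss-Prot flat-file layout (two-letter line codes such as ID, AC, DE, KW and SQ, with entries terminated by `//`). The function names `parse_flat_file` and `select_by_keyword` are hypothetical, and the two sample entries are invented toy records, not real Swiss-Prot data.

```python
from typing import Any, Dict, Iterator, List


def _new_entry() -> Dict[str, Any]:
    return {"id": "", "accession": "", "description": "",
            "keywords": [], "sequence": ""}


def parse_flat_file(text: str) -> Iterator[Dict[str, Any]]:
    """Yield one dict per entry from Swiss-Prot-style flat-file text."""
    entry = _new_entry()
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):            # end-of-entry marker
            if entry["id"]:
                yield entry
            entry = _new_entry()
            in_sequence = False
        elif in_sequence:                    # sequence lines follow the SQ line
            entry["sequence"] += line.replace(" ", "")
        elif line.startswith("ID   "):
            entry["id"] = line.split()[1]
        elif line.startswith("AC   ") and not entry["accession"]:
            # keep only the first (primary) accession number
            entry["accession"] = line[5:].split(";")[0].strip()
        elif line.startswith("DE   "):
            entry["description"] += line[5:].strip() + " "
        elif line.startswith("KW   "):
            for kw in line[5:].strip().rstrip(".").split(";"):
                if kw.strip():
                    entry["keywords"].append(kw.strip())
        elif line.startswith("SQ   "):
            in_sequence = True


def select_by_keyword(entries: Iterator[Dict[str, Any]],
                      keyword: str) -> List[Dict[str, Any]]:
    """Keep entries whose KW lines contain the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [e for e in entries
            if wanted in (k.lower() for k in e["keywords"])]


# Invented toy records in flat-file layout (NOT real Swiss-Prot entries).
SAMPLE = """\
ID   TOYA_EXAMP              Reviewed;          12 AA.
AC   X00001;
DE   RecName: Full=Hypothetical toy bacteriocin;
KW   Antibiotic; Bacteriocin.
SQ   SEQUENCE   12 AA;
     MKTWYGAGWY KC
//
ID   TOYB_EXAMP              Reviewed;          10 AA.
AC   X00002;
DE   RecName: Full=Hypothetical toy enzyme;
KW   Hydrolase.
SQ   SEQUENCE   10 AA;
     MSTAYLLKWE
//
"""

if __name__ == "__main__":
    for hit in select_by_keyword(parse_flat_file(SAMPLE), "bacteriocin"):
        print(hit["id"], hit["accession"], hit["sequence"])
```

Filtering on the KW lines is only one possible selection criterion; a specialized database could equally be built from description text, taxonomy or sequence similarity.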

  • Article: Protein sequence databases.
    Advances in protein chemistry 02/2000; 54:31-71. · 3.20 Impact Factor
  • Article: Reference points for comparisons of two-dimensional maps of proteins from different human cell types defined in a pH scale where isoelectric points correlate with polypeptide compositions.
    ABSTRACT: A highly reproducible, commercial and nonlinear, wide-range immobilized pH gradient (IPG) was used to generate two-dimensional (2-D) gel maps of [35S]methionine-labeled proteins from noncultured, unfractionated normal human epidermal keratinocytes. Forty one proteins, common to most human cell types and recorded in the human keratinocyte 2-D gel protein database were identified in the 2-D gel maps and their isoelectric points (pI) were determined using narrow-range IPGs. The latter established a pH scale that allowed comparisons between 2-D gel maps generated either with other IPGs in the first dimension or with different human protein samples. Of the 41 proteins identified, a subset of 18 was defined as suitable to evaluate the correlation between calculated and experimental pI values for polypeptides with known composition. The variance calculated for the discrepancies between calculated and experimental pI values for these proteins was 0.001 pH units. Comparison of the values by the t-test for dependent samples (paired test) gave a p-level of 0.49, indicating that there is no significant difference between the calculated and experimental pI values. The precision of the calculated values depended on the buffer capacity of the proteins, and on average, it improved with increased buffer capacity. As shown here, the widely available information on protein sequences cannot, a priori, be assumed to be sufficient for calculating pI values because post-translational modifications, in particular N-terminal blockage, pose a major problem. Of the 36 proteins analyzed in this study, 18-20 were found to be N-terminally blocked and of these only 6 were indicated as such in databases. The probability of N-terminal blockage depended on the nature of the N-terminal group. Twenty six of the proteins had either M, S or A as N-terminal amino acids and of these 17-19 were blocked. Only 1 in 10 proteins containing other N-terminal groups were blocked.
    Electrophoresis 15(3-4):529-39. · 3.30 Impact Factor
  • Article: Statistical determination of the average values of the extinction coefficients of tryptophan and tyrosine in native proteins.
    ABSTRACT: Spectroscopic measurement of protein concentration requires knowledge of the value of the relevant extinction coefficient. If the amino acid composition of a protein is known, however, extinction coefficients can be calculated approximately, provided that the values of the molar absorptivities for tryptophan and tyrosine residues in the protein are known. We have applied a matrix linear regression procedure and a mapping of average absolute deviations between experimental and calculated values to find molar extinction coefficients (epsilon M, 1 cm, 280 nm) of 5540 M-1 cm-1 for tryptophan and 1480 M-1 cm-1 for tyrosine residues in an "average" protein, as defined by a set of experimentally determined extinction coefficients for more than 30 proteins. Use of these values provides a significant improvement in extinction coefficient estimation over that obtained with the commonly used values obtained from solutions of model compounds in guanidine-HCl. The consistency of these results when compared to the large deviations often observed between experimentally determined extinction coefficients suggest that this method may offer acceptable accuracy in the initial estimation of molar absorptivities of globular proteins.
    Analytical Biochemistry 02/1992; 200(1):74-80. · 3.00 Impact Factor
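The extinction-coefficient estimate described in the last abstract above can be sketched as a simple weighted residue count. This is a minimal illustration using only the two average values quoted there (5540 M-1 cm-1 per tryptophan and 1480 M-1 cm-1 per tyrosine at 280 nm). It ignores any other chromophores, such as disulfide bonds, and is not claimed to be the calculation SciDBMaker performs. The function name `molar_extinction_280` and the test sequence are hypothetical.

```python
# Average molar absorptivities at 280 nm quoted in the abstract (M^-1 cm^-1).
EPS_TRP = 5540
EPS_TYR = 1480


def molar_extinction_280(sequence: str) -> int:
    """Estimate the molar extinction coefficient at 280 nm from W and Y counts.

    Assumption: only tryptophan (W) and tyrosine (Y) contribute.
    """
    seq = sequence.upper()
    return EPS_TRP * seq.count("W") + EPS_TYR * seq.count("Y")


if __name__ == "__main__":
    # Invented toy peptide with two W and two Y residues.
    print(molar_extinction_280("MKTWYGAGWYKC"))  # 2*5540 + 2*1480 = 14040
```

A sequence with neither residue yields 0, in which case absorbance at 280 nm cannot be used to estimate concentration by this route.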

Keywords

biological search results
broad range
databases
easy-to-use user interface
given protein family
homologous sequences search
Large databases
multiple sequence alignments
new specific databases
physicochemical profile calculations
protein data analysis
protein sequence databases
proteins
Scientific researchers
specialized databases
specialized protein databases
specialized resources
Swiss-Prot database
universal protein sequence database
various data relevant