ABSTRACT: The hypothesis that mimicry between a self and a microbial peptide antigen is strictly related to autoimmune pathology remains a debated concept in autoimmunity research. Clear evidence for a causal link between molecular mimicry and autoimmunity is still lacking. In recent studies we have demonstrated that viruses and bacteria share amino acid sequences with the human proteome to such a high extent that the molecular mimicry hypothesis becomes questionable as a causal factor in autoimmunity. Expanding upon our analysis, here we detail the bacterial peptide overlap with the human proteome at the penta-, hexa-, hepta- and octapeptide levels by exact peptide matching analysis and demonstrate that there is not a single human protein that does not harbor a bacterial pentapeptide or hexapeptide motif. This finding suggests that molecular mimicry between a self and a microbial peptide antigen cannot be assumed as a basis for autoimmune pathologies. Moreover, the data are discussed in relation to the microbial immune escape phenomenon and the possible vaccine-related autoimmune effects.
Self/Nonself - Immune Recognition and Signaling 01/2010; 1(4):328-334.
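The exact peptide matching described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the dictionary layout (protein ID mapped to sequence), and the toy sequences are all assumptions made for illustration.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings (k-mers) of a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def proteome_kmers(sequences, k):
    """Return the union of k-mers over every sequence in a proteome."""
    pooled = set()
    for seq in sequences:
        pooled |= kmers(seq, k)
    return pooled


def proteins_with_shared_kmer(human, bacterial, k):
    """Return IDs of human proteins containing at least one exact bacterial k-mer.

    `human` and `bacterial` are hypothetical dicts mapping protein ID -> sequence.
    """
    bacterial_set = proteome_kmers(bacterial.values(), k)
    return {pid for pid, seq in human.items() if kmers(seq, k) & bacterial_set}


# Toy sequences, invented for this example only.
human = {"H1": "MKTAYIAKQR", "H2": "GGGGGGG"}
bacterial = {"B1": "AAYIAKQWW"}

hits_penta = proteins_with_shared_kmer(human, bacterial, 5)
hits_hepta = proteins_with_shared_kmer(human, bacterial, 7)
```

In this toy case H1 shares the pentapeptides AYIAK and YIAKQ with B1 but no heptapeptide, so the match disappears as k grows. Real proteomes would be read from FASTA files, which the sketch leaves out.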
ABSTRACT: We study the usage of specific peptide platforms in protein composition. Using the pentapeptide as a unit of length, we find that in the universal proteome many pentapeptides are heavily repeated (even thousands of times), whereas some are quite rare, and a small number do not appear at all. To understand the physico-chemical-biological basis underlying peptide usage at the proteomic level, in this study we analyse the energetic costs for the synthesis of rare and never-expressed versus frequent pentapeptides. In addition, we explore residue bulkiness, hydrophobicity, and codon number as factors able to modulate specific peptide frequencies. Then, the possible influence of amino acid composition is investigated in zero- and high-frequency pentapeptide sets by analysing the frequencies of the corresponding inverse-sequence pentapeptides. As a final step, we analyse the pentadecamer oligodeoxynucleotide sequences corresponding to the never-expressed pentapeptides.
We find that only DNA context-dependent constraints (such as oligodeoxynucleotide sequence location in the minus strand, introns, pseudogenes, frameshifts, etc.) provide a coherent mechanistic platform to explain the occurrence of never-expressed versus frequent pentapeptides in the protein world.
This study is of importance in cell biology. Indeed, the rarity (or lack of expression) of specific 5-mer peptide modules implies the rarity (or lack of expression) of the corresponding n-mer peptide sequences (with n > 5), so possibly modulating protein compositional trends. Moreover, the data might further our understanding of the role exerted by rare pentapeptide modules as critical biological effectors in protein-protein interactions.
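The inverse-sequence comparison mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the helper names and the toy sequences are hypothetical, and the counting uses overlapping windows, which the abstract does not specify.

```python
from collections import Counter


def count_kmers(proteins, k=5):
    """Count every overlapping k-mer across an iterable of protein sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def inverse_frequencies(counts, peptides):
    """Map each peptide to (its own count, the count of its reversed sequence)."""
    return {p: (counts[p], counts[p[::-1]]) for p in peptides}


# Toy sequences, invented for this example only.
proteins = ["ACDEFACDEF", "FEDCA"]
counts = count_kmers(proteins)
comparison = inverse_frequencies(counts, ["ACDEF", "WWWWW"])
```

A pentapeptide and its inverse have the same amino acid composition. If composition alone drove usage, the two counts in each pair would be expected to be similar.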
ABSTRACT: The proteomes catalogued in the UniRef100 database were collected into a single proteome set and examined for actual versus theoretical pentapeptide occurrences. We found a highly diversified degree of pentapeptide redundancy. Numerically, 953 pentamers are expressed only once in the protein world, whereas 103 pentamers occur more than 50,000 times. Moreover, it seems that 417 potentially possible pentapeptides are not present in the protein world. On the whole, tracing the redundancy profile of the protein world as a function of pentapeptide occurrences reveals a quasi-Gaussian curve, with tails representing scarcely and repeatedly occurring 5-mers. Analysis of physico-chemical-biological parameters shows that codon number is the main factor influencing and favoring specific pentapeptide frequencies in the universal proteome composition. That is, when compared to the set of never-expressed 5-mers, the pentapeptides frequently represented in the universal proteome are endowed with a higher number of multi-codonic amino acids. In contrast, the degree of bulkiness and the level of hydrophobicity play a smaller role. Unexpectedly, the heat of formation of the pentapeptides appears to have the least influence.
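The codon-number factor discussed above can be made concrete with a short sketch. The table lists how many codons encode each amino acid in the standard genetic code. Summing those values over a pentapeptide is an assumption chosen for illustration, since the abstract does not say how codon number was aggregated.

```python
# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_NUMBER = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}

# Number of theoretically possible pentapeptides from 20 amino acids.
THEORETICAL_PENTAPEPTIDES = 20 ** 5


def codon_score(peptide):
    """Sum of codon numbers over the residues of a peptide (illustrative metric)."""
    return sum(CODON_NUMBER[residue] for residue in peptide)


rich = codon_score("LSRLS")  # built only from six-codon residues
poor = codon_score("MWMWM")  # built only from single-codon residues
```

With this metric, a pentapeptide of six-codon residues scores 30 and one of single-codon residues scores 5. These are the extremes of the range over which frequent and never-expressed sets could be compared.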
ABSTRACT: Major histocompatibility complex class I genes are among the most polymorphic genes characterized. The high level of polymorphism is essential for generating host immune responses. In humans, three distinct genomic loci encode human leukocyte antigen (HLA) class I genes, allowing individuals to express up to six different HLA class I molecules. In cattle, at least six distinct genomic loci are currently known, and the number of different bovine leukocyte antigen (BoLA) class I molecules expressed in individual animals is variable. The extent of allele variation within the cattle population is unknown. In this study, the number and variety of BoLA class I sequences expressed by 36 individuals were determined from full-length BoLA class I cDNA clones. Twenty distinct BoLA class I alleles were identified, with only four being previously reported. The number of expressed BoLA class I alleles in individual animals ranged between one and four, with none of the animals having an identical complement of BoLA class I molecules. Variation existed in both the number and the composition of expressed BoLA class I alleles; however, several BoLA class I alleles were found in multiple animals. Polymorphic amino acid sites were analyzed for positive and negative selection using the ADAPTSITE program. In the antigen recognition sites (ARS), eight of 62 positions were predicted to be under positive selection and three under negative selection. In contrast, in the non-antigen recognition sites (non-ARS), three of 278 positions were predicted to be under positive selection and 20 under negative selection, indicating that positive selection of amino acids occurs at a greater frequency within the antigen recognition sites.
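The comparison of selection frequencies reported above reduces to simple proportions. The sketch below only restates the counts given in the abstract. The variable names are illustrative, and no statistical test is implied.

```python
# Site counts as reported in the abstract.
ARS_TOTAL, ARS_POSITIVE, ARS_NEGATIVE = 62, 8, 3
NON_ARS_TOTAL, NON_ARS_POSITIVE, NON_ARS_NEGATIVE = 278, 3, 20


def fraction(selected, total):
    """Fraction of sites in a region predicted to be under a given selection type."""
    return selected / total


ars_pos = fraction(ARS_POSITIVE, ARS_TOTAL)                 # about 0.129
non_ars_pos = fraction(NON_ARS_POSITIVE, NON_ARS_TOTAL)     # about 0.011
ars_neg = fraction(ARS_NEGATIVE, ARS_TOTAL)                 # about 0.048
non_ars_neg = fraction(NON_ARS_NEGATIVE, NON_ARS_TOTAL)     # about 0.072

# How much more often positive selection is predicted inside the ARS.
positive_enrichment = ars_pos / non_ars_pos                 # about 12-fold
```

On these counts, roughly 13% of ARS positions versus about 1% of non-ARS positions are predicted to be under positive selection.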
ABSTRACT: Peptides derived from endogenous antigens can bind to MHC class I molecules. Those which bind with high affinity can invoke a CD8+ immune response, resulting in the destruction of infected cells. Much work in immunoinformatics has involved the algorithmic prediction of peptide binding affinity to various MHC-I alleles. A number of tools for MHC-I binding prediction have been developed, many of which are available on the web.
We hypothesize that peptides predicted by a number of tools are more likely to bind than those predicted by just one tool, and that the likelihood of a particular peptide being a binder is related to the number of tools that predict it, as well as the accuracy of those tools. To this end, we have built and tested a heuristic-based method of making MHC-binding predictions by combining the results from multiple tools. The predictive performance of each individual tool is first ascertained. These performance data are used to derive weights such that the predictions of tools with better accuracy are given greater credence. The combined tool was evaluated using ten-fold cross-validation and was found to significantly outperform the individual tools when a high specificity threshold is used. It performs comparably well to the best-performing individual tools at lower specificity thresholds. Finally, it also outperforms the combination of the tools resulting from linear discriminant analysis.
A heuristic-based method of combining the results of the individual tools better facilitates the scanning of large proteomes for potential epitopes, yielding more actual high-affinity binders while reporting very few false positives.
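The idea of weighting each tool's prediction by its measured performance can be sketched as a weighted vote. The abstract does not give the actual weighting formula, so the accuracy-proportional weights, the decision threshold, and the tool names below are all assumptions made for illustration.

```python
def accuracy_weights(accuracies):
    """Turn per-tool accuracies into weights that sum to 1 (assumed scheme)."""
    total = sum(accuracies.values())
    return {tool: acc / total for tool, acc in accuracies.items()}


def combined_score(predictions, weights):
    """Sum the weights of the tools that call the peptide a binder.

    `predictions` maps tool name -> bool (True if the tool predicts binding).
    """
    return sum(weights[tool] for tool, binds in predictions.items() if binds)


def is_binder(predictions, weights, threshold=0.5):
    """Call a peptide a binder when the weighted vote reaches the threshold."""
    return combined_score(predictions, weights) >= threshold


# Hypothetical tools and accuracies, invented for this example only.
weights = accuracy_weights({"toolA": 0.9, "toolB": 0.6, "toolC": 0.5})

only_best = is_binder({"toolA": True, "toolB": False, "toolC": False}, weights)
two_agree = is_binder({"toolA": True, "toolB": True, "toolC": False}, weights)
```

Here a call from the most accurate tool alone (weight 0.45) falls short of the threshold, while agreement between two tools (0.75) passes it. Raising the threshold trades sensitivity for specificity, the operating regime in which the abstract reports the combined method performing best.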
ABSTRACT (for a poster): One purpose of our work is the analysis of physico-chemical-biological factors affecting the usage of specific amino acid platforms in protein composition. The issue is of special importance in cell biology and immunology, since peptide amino acid composition dictates the relationships between primary structure and function (1,2). Using the pentapeptide as a unit of length, we recently found that the universal proteome contains many heavily repeated pentapeptides (664 times per pentapeptide as an average redundancy value) (3). In order to understand the factors underlying pentapeptide redundancy, we have analyzed the following physico-(bio)chemical parameters: heat of formation, side-chain bulkiness, hydrophobicity and amino acid codon number for the 5-mers most frequently expressed. Our data indicate that the pentapeptide redundancy appears modulated by the amino acid codon number whereas, unexpectedly, ΔG° is not the primary determinant in the peptide composition of the universal proteome.
References