High-throughput prediction of protein antigenicity using protein microarray data

Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California, Irvine, CA 92697, USA.
Bioinformatics (Impact Factor: 4.62). 10/2010; 26(23):2936-43. DOI: 10.1093/bioinformatics/btq551
Source: PubMed

ABSTRACT Discovery of novel protective antigens is fundamental to the development of vaccines for existing and emerging pathogens. Most computational methods for predicting protein antigenicity rely directly on homology with previously characterized protective antigens; however, homology-based methods will fail to discover truly novel protective antigens. Thus, there is a significant need for homology-free methods capable of screening entire proteomes for the antigens most likely to generate a protective humoral immune response.
Here we begin by curating two types of positive data: (i) antigens that elicit a strong antibody response in protected individuals but not in unprotected individuals, using human immunoglobulin reactivity data obtained from protein microarray analyses; and (ii) known protective antigens from the literature. The resulting datasets are used to train a sequence-based prediction model, ANTIGENpro, to predict the likelihood that a protein is a protective antigen. ANTIGENpro correctly classifies 82% of the known protective antigens when trained using only the protein microarray datasets. The accuracy on the combined dataset is estimated at 76% by cross-validation experiments. Finally, ANTIGENpro performs well when evaluated on an external pathogen proteome for which protein microarray data were obtained after the initial development of ANTIGENpro.
ANTIGENpro is integrated in the SCRATCH suite of predictors available at
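
The cross-validated accuracy estimate mentioned in the abstract can be illustrated with a small sketch. This is not ANTIGENpro's model or feature set: the nearest-centroid classifier on amino acid composition, the function names, and the fold assignment (every k-th example held out) are all assumptions chosen only to show how each protein is scored by a model that never saw it during training.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in a sequence."""
    counts = Counter(seq)
    n = max(len(seq), 1)
    return [counts.get(a, 0) / n for a in AMINO_ACIDS]

def train(seqs, labels):
    """Hypothetical stand-in model: one mean composition vector per class."""
    model = {}
    for label in sorted(set(labels)):
        vectors = [composition(s) for s, l in zip(seqs, labels) if l == label]
        model[label] = [sum(col) / len(vectors) for col in zip(*vectors)]
    return model

def predict(model, seq):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    x = composition(seq)
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, x))
    return min(model, key=lambda label: sq_dist(model[label]))

def cross_validated_accuracy(seqs, labels, k):
    """Hold out every k-th example in turn, train on the rest, pool the hits."""
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(seqs), k))
        train_seqs = [s for i, s in enumerate(seqs) if i not in test_idx]
        train_labels = [l for i, l in enumerate(labels) if i not in test_idx]
        model = train(train_seqs, train_labels)
        correct += sum(predict(model, seqs[i]) == labels[i] for i in test_idx)
    return correct / len(seqs)

# Made-up toy sequences (not real antigens): 1 = "antigen", 0 = "non-antigen".
toy_seqs = ["KKKKAKKE", "LLLLALLV", "KKEKKKAK", "LLVLLLAL", "KAKKKKEK", "LALLLLVL"]
toy_labels = [1, 0, 1, 0, 1, 0]
print(cross_validated_accuracy(toy_seqs, toy_labels, k=3))
```

Because every protein is held out exactly once, the pooled fraction of correct predictions estimates accuracy on unseen proteins rather than on the training data.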

  • Source
    ABSTRACT: Sporotrichosis is a polymorphic disease that affects both humans and animals worldwide. The fungus gains entry into a warm-blooded host through minor trauma to the skin, typically by contaminated vegetation or by scratches and bites from a diseased cat. Cellular and humoral responses triggered upon pathogen introduction play important roles in the development and severity of the disease. We investigated molecules expressed during the host-parasite interplay that elicit the humoral response in human sporotrichosis. For antigenic profiling, Sporothrix yeast cell extracts were separated by two-dimensional (2D) gel electrophoresis and probed with pooled sera from individuals with fixed cutaneous and lymphocutaneous sporotrichosis. Thirty-five IgG-seroreactive spots were identified as eight specific proteins by MALDI-ToF/MS. Remarkable cross-reactivity among S. brasiliensis, S. schenckii, and S. globosa was noted and antibodies strongly reacted with the 70-kDa protein (gp70), irrespective of clinical manifestation. Gp70 was successfully identified in multiple spots as 3-carboxymuconate cyclase. In addition, 2D-DIGE characterization suggested that the major antigen of sporotrichosis undergoes post-translational modifications involving glycosylation and amino acid substitution, resulting in at least six isoforms and glycoforms that were present in the pathogenic species but absent in the ancestral non-virulent S. mexicana. Although a primary environmental function related to the benzoate degradation pathway of aromatic polymers has been attributed to orthologs of this molecule, our findings support the hypothesis that gp70 is important for pathogenesis and invasion in human sporotrichosis. We propose a diverse panel of new putative candidate molecules for diagnostic tests and vaccine development. Outbreaks due to Sporothrix spp. have emerged over time, affecting thousands of patients worldwide. 
    A sophisticated host-pathogen interplay drives the manifestation and severity of infection, involving immune responses elicited upon traumatic exposure of the skin barrier to the pathogen followed by immune evasion. Using an immunoproteomics approach we characterized proteins of potential significance in pathogenesis and invasion that trigger the humoral response during human sporotrichosis. We found gp70 to be a cross-immunogenic protein shared among pathogenic Sporothrix spp. but absent in the ancestral environmental S. mexicana, supporting the hypothesis that gp70 plays key roles in pathogenicity. For the first time, we demonstrate with 2D-DIGE that post-translational modifications putatively involve glycosylation and amino acid substitution, resulting in at least six isoforms and glycoforms, all of them IgG-reactive. These findings of a convergent humoral response highlight gp70 as an important target for serological diagnosis and for vaccine development among phylogenetically related agents of sporotrichosis.
    Journal of Proteomics 11/2014; DOI:10.1016/j.jprot.2014.11.013 · 3.93 Impact Factor
  • Source
    ABSTRACT: In our earlier study, an immunoblot analysis using sera from febrile patients revealed that a 50-kDa band from an outer membrane protein fraction of Salmonella enterica serovar Typhi was recognized only by typhoid sera and not by sera from patients with other febrile illnesses. Here, we investigated the identities of the proteins contained in the immunogenic 50-kDa band to pinpoint the antigens responsible for its immunogenicity. We first used LC-MS/MS for protein identification, then used the online tool ANTIGENpro for antigenicity prediction, and produced recombinant proteins of the lead antigens for validation in an enzyme-linked immunosorbent assay (ELISA). We found that the proteins TolC, GlpK and SucB were specific to typhoid sera but reacted with antibodies differently under native and denatured conditions. This difference suggests the presence of both linear and conformational epitopes on these proteins.
    Applied Biochemistry and Biotechnology 08/2014; 174(5). DOI:10.1007/s12010-014-1173-y · 1.69 Impact Factor
  • Source
    ABSTRACT: With increasing efficiency, accuracy and speed we can access complete genome sequences from thousands of infectious microorganisms; however, the ability to predict antigenic targets of the immune system based on amino acid sequence alone is still needed. Here we use a Leptospira interrogans microarray expressing 91% (3359) of all leptospiral predicted ORFs (3667) and make an empirical accounting of all the antibody-reactive antigens recognized in sera from naturally infected humans; 191 antigens elicited an IgM and/or IgG response, representing 5% of the whole proteome. We classified the reactive antigens into 26 annotated COGs (clusters of orthologous groups), 26 JCVI Mainrole annotations, and 11 computationally predicted proteomic features. Altogether, 14 significantly enriched categories associated with immune recognition were identified, including mass spectrometry evidence of in vitro expression and in vivo mRNA up-regulation. Together this group of 14 enriched categories accounts for just 25% of the leptospiral proteome but contains 50% of the immunoreactive antigens. These findings are consistent with our previous studies of other gram-negative bacteria. This genome-wide approach provides an empirical basis to predict and classify antibody-reactive antigens based on structural, physical-chemical, and functional proteomic features, and a framework for understanding the breadth and specificity of the immune response to L. interrogans.
    Journal of Proteome Research 10/2014; 14(1). DOI:10.1021/pr500718t · 5.00 Impact Factor
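
The proportions quoted in the Leptospira interrogans abstract above can be checked with a few lines of arithmetic. The counts are taken from that abstract; the variable names are illustrative, and the fold enrichment is simply the ratio of the two rounded percentages it reports, not a statistic computed by the authors.

```python
# Counts quoted in the abstract.
predicted_orfs = 3667      # all predicted leptospiral ORFs
orfs_on_array = 3359       # ORFs expressed on the microarray
reactive_antigens = 191    # antigens eliciting an IgM and/or IgG response

# Share of the predicted proteome represented on the array.
array_coverage = orfs_on_array / predicted_orfs

# Share of the predicted proteome that was antibody-reactive.
reactive_fraction = reactive_antigens / predicted_orfs

# The 14 enriched categories cover 25% of the proteome but hold 50% of the
# immunoreactive antigens, so reactive antigens are over-represented there.
fold_enrichment = 0.50 / 0.25

print(f"array coverage:    {array_coverage:.1%}")
print(f"reactive fraction: {reactive_fraction:.1%}")
print(f"fold enrichment:   {fold_enrichment:.1f}x")
```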
