Clark, H.F. et al. The secreted protein discovery initiative (SPDI), a large-scale effort to identify novel human secreted and transmembrane proteins: a bioinformatics assessment. Genome Res. 13, 2265-2270 (2003).

Departments of Bioinformatics, Molecular Biology and Protein Chemistry, Genentech, Inc, South San Francisco, California 94080, USA.
Genome Research (Impact Factor: 14.63). 10/2003; 13(10):2265-70. DOI: 10.1101/gr.1293003
Source: PubMed


A large-scale effort, termed the Secreted Protein Discovery Initiative (SPDI), was undertaken to identify novel secreted and transmembrane proteins. In the first of several approaches, a biological signal sequence trap in yeast cells was utilized to identify cDNA clones encoding putative secreted proteins. A second strategy utilized various algorithms that recognize features such as the hydrophobic properties of signal sequences to identify putative proteins encoded by expressed sequence tags (ESTs) from human cDNA libraries. A third approach surveyed ESTs for protein sequence similarity to a set of known receptors and their ligands with the BLAST algorithm. Finally, both signal-sequence prediction algorithms and BLAST were used to identify single exons of potential genes from within human genomic sequence. The isolation of full-length cDNA clones for each of these candidate genes resulted in the identification of >1000 novel proteins. A total of 256 of these cDNAs are still novel, including variants and novel genes, per the most recent GenBank release version. The success of this large-scale effort was assessed by a bioinformatics analysis of the proteins through predictions of protein domains, subcellular localizations, and possible functional roles. The SPDI collection should facilitate efforts to better understand intercellular communication, may lead to new understandings of human diseases, and provides potential opportunities for the development of therapeutics.


Available from: Jeremy A Stinson
    • "These variants encode the ADAM12-Lb and ADAM12-Sb protein isoforms, respectively. ADAM12var-2a, containing the shorter exon 4a and encoding the ADAM12-Sa isoform, was later identified in a screen for novel secreted proteins [47]. ADAM12var-1a transcript and ADAM12-La protein isoform are not featured in any of the DNA/protein databases analyzed (Table 1). "
    ABSTRACT: Human ADAM12, transcript variant 1 (later on referred to as Var-1b), present in publicly available databases, contains the sequence 5'-GTAATTCTG-3' at nucleotide positions 340-348 of the coding region, at the 3' end of exon 4. The translation product of this variant, ADAM12-Lb, includes the three-amino-acid motif (114)VIL(116) in the prodomain. This motif is not conserved in ADAM12 from different species and is not present in other human ADAMs. Currently, it is not clear whether a shorter variant, Var-1a, encoding the protein version without the (114)VIL(116) motif, ADAM12-La, is expressed in humans. In this work, we have established that human mammary epithelial cells and breast cancer cells express both Var-1a and Var-1b transcripts. Importantly, the proteolytic processing and intracellular trafficking of the corresponding ADAM12-La and ADAM12-Lb proteins are different. While ADAM12-La is cleaved and trafficked to the cell surface in a manner similar to ADAM12 in other species, ADAM12-Lb is retained in the ER and is not proteolytically processed. Furthermore, the relative abundance of ADAM12-La and ADAM12-Lb proteins detected in several breast cancer cell lines varies significantly. We conclude that the canonical form of transmembrane ADAM12 is represented by Var-1a/ADAM12-La, rather than Var-1b/ADAM12-Lb currently featured in major sequence databases.
    Full-text · Article · Oct 2013 · PLoS ONE
    • "We recently identified a candidate gene called Esophageal cancer related gene-4 (Ecrg4) that we proposed plays a sentinel function to monitor set points of homeostasis [13], [14], [15], [16]. Constitutively expressed by numerous cell types, localized in many normal tissues, and found in selected biological fluids, Ecrg4 is a member of both the secretome [17], [18] and neuropeptidome [13], [14], [19], [20] that is tethered to the epithelial cell surface [13], [14], [15], [16]. In cancer, its expression is epigenetically regulated by DNA methylation of >16 CpG sites in its promoter region [21], [22], [23] and as such, it is highly down-regulated in epithelial cancers via hypermethylation [21], [22], [23], [24], [25], [26]. "
    ABSTRACT: We report an inverse relationship between expression of the orphan candidate tumor suppressor gene esophageal cancer related gene 4 (Ecrg4) and the mucosal epithelial cell response to infection in the middle ear (ME). First, we found constitutive Ecrg4 mRNA expression in normal, quiescent ME mucosa that was confirmed by immunostaining of mucosal epithelial cells and immunoblotting of tissue lysates for the 14 kDa Ecrg4 protein. Upon experimental ME infection, Ecrg4 gene expression rapidly decreased by over 80% between 3 and 48 hrs post infection. When explants of this infected mucosa were placed in culture and transduced with an adenovirus (AD) encoding the Ecrg4 gene (ADEcrg4), the proliferative and migratory responses of mucosal cells were significantly inhibited. ADEcrg4 transduction of control explants from uninfected MEs had no effect on basal growth and migration. Over-expression of Ecrg4 in vivo, by pre-injecting MEs with ADEcrg4 48 hrs prior to infection, prevented the natural down-regulation of Ecrg4, reduced mucosal proliferation, and prevented the inflammatory cell infiltration normally observed after infection. Taken together, these data support a hypothesis that Ecrg4 plays a role in coordinating the inflammatory and proliferative response to infection of mucosal epithelium, suggesting a possible mechanism for its putative anti-tumor activity.
    Full-text · Article · Apr 2013 · PLoS ONE
    • "transmembrane proteins with bioinformatic approaches (Clark et al., 2003). (2) A diagnostic potential exists when secreted proteins are found in the serum. "
    ABSTRACT: Type 1 diabetes (T1D) represents a serious health burden worldwide, complicated by the fact that disease onset can be preceded by a long period without evident clinical signs. It would therefore be of critical importance to detect the disease in its early stages. To this end, we seek here to identify early preinflammatory markers for autoimmune diabetes, mining our previously reported transcriptome data relevant to distinct early sub-phenotypes in the NOD mouse, associated with early insulin autoantibodies (E-IAA). More specifically, we focus on secreted or transmembrane protein transcripts, identifying in this category 71 differentially expressed transcripts that are regulated at the early preinflammatory stages of T1D in the pancreatic lymph nodes (PLN). Following the expression patterns of these 71 transcripts, correspondence analysis (a multivariate analysis method) reveals a clear-cut segregation of the individual samples according to the early sub-phenotype used. Thus the 71 transcripts coding for secreted proteins constitute a candidate set of predictive biomarkers for the development of autoimmune damage of the β cells of the pancreas. The majority of these genes have human orthologs and accordingly represent potential candidate biomarkers for the human disease. In addition, for predictive purposes, the analysis reveals that the candidate set could be reduced significantly in practice, since various genes display identical expression profiles.
    Full-text · Article · Sep 2012 · Gene