InterPro, progress and status in 2005

University of Oxford, Oxford, England, United Kingdom
Nucleic Acids Research (Impact Factor: 9.11). 02/2005; 33(Database issue):D201-5. DOI: 10.1093/nar/gki106
Source: PubMed


InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (, and for download by anonymous FTP (
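The per-type entry counts quoted above for release 8.0 can be checked against the stated total with a few lines of arithmetic. The following is only a minimal sketch: the dictionary keys, the variable names and the `summarize` helper are illustrative assumptions, and the numbers are copied from the abstract rather than read from any InterPro file or API.

```python
# Entry counts by type, copied from the abstract for InterPro release 8.0.
# The names used here are illustrative and not part of InterPro itself.
entry_counts = {
    "domain": 2573,
    "family": 8166,
    "repeat": 201,
    "active site": 26,
    "binding site": 21,
    "post-translational modification site": 20,
}

# Total number of entries as stated in the abstract ("11 007 entries").
stated_total = 11007


def summarize(counts):
    """Return the summed entry count and each type's percentage share."""
    total = sum(counts.values())
    shares = {name: 100.0 * n / total for name, n in counts.items()}
    return total, shares


total, shares = summarize(entry_counts)

# Report whether the per-type counts add up to the stated total.
print("sum of types:", total, "matches stated total:", total == stated_total)

# Show how the entries are distributed across the entry types.
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Under these assumptions the six per-type counts sum to the stated 11 007 entries, with families and domains accounting for nearly all of them.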



Available from: Sandra Orchard
  • Source
    • "The analysis was performed with a set of parameters as follows: number of repetitions, any; minimum width for each motif, 6; maximum width for each motif, 100; and maximum number of motifs to be found, 5. All obtained motifs were searched in the InterPro database with InterProScan [32]. The exon/intron structures of HMGR genes were obtained by comparing the genomic sequences and their predicted coding sequences (CDS) using GSDS ( "
    ABSTRACT: The terpene compounds represent the largest and most diverse class of plant secondary metabolites, which are important in plant growth and development. The 3-hydroxy-3-methylglutaryl coenzyme A reductase (HMGR; EC is one of the key enzymes contributing to terpene biosynthesis. To better understand the basic characteristics and evolutionary history of the HMGR gene family in plants, a genome-wide analysis of HMGR genes from 20 representative species was carried out. A total of 56 HMGR genes were identified in the 14 land plant genomes, but no genes were found in any of the 6 algal genomes. The gene structure and protein architecture of all plant HMGR genes were highly conserved. The phylogenetic analysis revealed that the plant HMGRs were derived from one ancestor gene and finally developed into four distinct groups, two in the monocot plants and two in dicot plants. Species-specific gene duplications, caused mainly by segmental duplication, led to the limited expansion of HMGR genes in Zea mays, Gossypium raimondii, Populus trichocarpa and Glycine max after the species diverged. The analysis of Ka/Ks ratios and expression profiles indicated that functional divergence after the gene duplications was restricted. The results suggested that the function and evolution of the HMGR gene family were highly conserved throughout the plant kingdom.
    Full-text · Article · Apr 2014 · PLoS ONE
  • Source
    • "Annotation was performed by using the GenDB, version 2.2 system (Meyer et al., 2003), supplemented by the tool JCoast, version 1.6 (Richter et al., 2008). For each predicted ORF observations have been collected from similarity searches against sequence databases NCBI-nr, Swiss-Prot, KEGG and genomesDB (Richter et al., 2008) and for protein family databases from Pfam (Bateman et al., 2004) and InterPro (Mulder et al., 2005). SignalP has been used for signal peptide predictions (Bendtsen et al., 2004) and TMHMM for transmembrane helix-analysis (Krogh et al., 2001). "
    ABSTRACT: The majority of strains belonging to the genus Pseudovibrio have been isolated from marine invertebrates such as tunicates, corals and particularly sponges, but the physiology of these bacteria is poorly understood. In this study, we analyse for the first time the genomes of two Pseudovibrio strains, FO-BEG1 and JE062. The strain FO-BEG1 is a required symbiont of a cultivated Beggiatoa strain, a sulfide-oxidizing, autotrophic bacterium, which was initially isolated from a coral. Strain JE062 was isolated from a sponge. The presented data show that both strains are generalistic bacteria capable of importing and oxidizing a wide range of organic and inorganic compounds to meet their carbon, nitrogen, phosphorous and energy requirements under both oxic and anoxic conditions. Several physiological traits encoded in the analysed genomes were verified in laboratory experiments with both isolates. Besides the versatile metabolic abilities of both Pseudovibrio strains, our study reveals a number of open reading frames and gene clusters in the genomes that seem to be involved in symbiont-host interactions. Both Pseudovibrio strains have the genomic potential to attach to host cells, interact with the eukaryotic cell machinery, produce secondary metabolites and supply the host with cofactors.
    Full-text · Article · Mar 2013 · Environmental Microbiology
  • Source
    • "Annotation was performed with GenDB, version 2.2 [40], supplemented with the tool JCoast, version 1.6 [41]. For each predicted ORF, observations were collected from similarity searches against the NCBI-nr, Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) and genomesDB sequence databases [41] and against protein family databases from Pfam [42] and InterPro [43]. SignalP was used for signal peptide predictions [44] and TMHMM was used for transmembrane helix analysis [45]. "
    ABSTRACT: Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of 'presumptive' aquaporin aqpZ genes and genes encoding 'active' lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests lynx gut microbes being most active for the metabolism of β-xylose containing plant N-glycans, although β-xylosidases sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota not only harbors gene sets underpinning sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80-100% wild rabbits) but also for the hydrolysis of prey-derived plant biomass. 
    Although the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely suggest a tighter, more coordinated and complex evolutionary and nutritional ecology scenario of carnivore gut microbial communities than has been previously assumed.
    Full-text · Article · Dec 2012 · PLoS ONE