InterPro, progress and status in 2005

EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
Nucleic Acids Research (Impact Factor: 8.81). 02/2005; 33(Database issue):D201-5. DOI: 10.1093/nar/gki106
Source: PubMed

ABSTRACT InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure data. One of the new match views is the InterPro Domain Architecture view, which shows the domain composition of protein matches. Two new entry types were introduced to better describe InterPro entries: these are active site and binding site. PIRSF and the structure-based SUPERFAMILY are the latest member databases to join InterPro, and CATH and PANTHER are soon to be integrated. InterPro release 8.0 contains 11 007 entries, representing 2573 domains, 8166 families, 201 repeats, 26 active sites, 21 binding sites and 20 post-translational modification sites. InterPro covers over 78% of all proteins in the Swiss-Prot and TrEMBL components of UniProt. The database is available for text- and sequence-based searches via a webserver (http://www.ebi.ac.uk/interpro), and for download by anonymous FTP (ftp://ftp.ebi.ac.uk/pub/databases/interpro).
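
The entry counts quoted for release 8.0 can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of InterPro or its tooling: the numbers are copied from the abstract above, while the names `ENTRY_COUNTS`, `REPORTED_TOTAL` and `type_shares` are hypothetical and introduced only for illustration.

```python
# Entry counts per type, copied from the abstract (InterPro release 8.0).
# The dictionary and function names are illustrative assumptions.
ENTRY_COUNTS = {
    "domain": 2573,
    "family": 8166,
    "repeat": 201,
    "active site": 26,
    "binding site": 21,
    "post-translational modification site": 20,
}

# Total number of entries as stated in the abstract ("11 007 entries").
REPORTED_TOTAL = 11007


def type_shares(counts):
    """Return each entry type's share of all entries, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


computed_total = sum(ENTRY_COUNTS.values())
shares = type_shares(ENTRY_COUNTS)

if __name__ == "__main__":
    print("sum of per-type counts:", computed_total)
    print("matches reported total:", computed_total == REPORTED_TOTAL)
    for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {pct:.2f}%")
```

The per-type counts add up to the reported 11 007 entries, and families and domains together make up the large majority of them.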

Available from: Sandra Orchard, Jun 04, 2015
  • ABSTRACT: A key focus in 21st-century integrative biology and drug discovery for neglected tropical and other diseases has been the use of BLAST-based computational methods to identify orthologous groups in pathogenic organisms, with a view to evaluating similarities and differences among species and thus allowing the transfer of annotation from known/curated proteins to new/non-annotated ones. Here we used a sensitive profile-based methodology to identify distant homologs, coupled to the NCBI's COG (Unicellular orthologs) and KOG (Eukaryote orthologs), which allowed us to perform comparative genomics analyses on five protozoan genomes. OrthoSearch was applied to five protozoan proteomes, showing that 3901 and 7473 orthologs can be identified by comparison with COG and KOG proteomes, respectively. The inferred core protozoan proteome comprised 418 Protozoa-COG orthologous groups and 704 Protozoa-KOG orthologous groups: (i) using COG, 31.58% (132/418) belong to category J (translation, ribosomal structure, and biogenesis) and 9.81% (41/418) to category O (post-translational modification, protein turnover, chaperones); (ii) using KOG, 21.45% (151/704) belong to category J and 13.92% (98/704) to category O. The phylogenomic analysis showed four well-supported clades for Eukarya, discriminating Multicellular [(i) human, fly, plant and worm] and Unicellular [(ii) yeast, (iii) fungi, and (iv) protozoa] species. These encouraging results attest to the usefulness of the profile-based methodology for comparative genomics in accelerating semi-automatic re-annotation, especially of the protozoan proteomes. This approach may also lend itself to applications in global health, for example, novel drug target discovery against pathogenic organisms previously considered difficult to research with traditional drug discovery tools.
    Omics A Journal of Integrative Biology 06/2014; DOI:10.1089/omi.2013.0172 · 2.73 Impact Factor
  • ABSTRACT: Vaccines are one of the most effective interventions to improve public health; however, the generation of highly effective vaccines for many diseases has remained difficult. Three chronic diseases that characterise these difficulties are malaria, tuberculosis and HIV, which alone account for half of the global infectious disease burden. The whole organism vaccine approach pioneered by Jenner in 1796 and refined by Pasteur in 1857 with the "isolate, inactivate and inject" paradigm has proved highly successful for many viral and bacterial pathogens causing acute disease but has failed with respect to malaria, tuberculosis and HIV as well as many other diseases. A significant advance of the past decade has been the elucidation of the genomes, proteomes and transcriptomes of many pathogens. This information provides the foundation for new 21st-century approaches to identify target antigens for the development of vaccines, drugs and diagnostic tests. Innovative genome-based vaccine strategies have shown potential for a number of challenging pathogens, including malaria. We advocate that genome-based rational vaccine design will overcome the problem of poorly immunogenic, poorly protective vaccines that has plagued vaccine developers for many years.
    International Journal for Parasitology 09/2014; 44(12). DOI:10.1016/j.ijpara.2014.07.010 · 3.40 Impact Factor
  • ABSTRACT: In silico screening in both the forward sense (traditional virtual screening) and the reverse sense (inverse virtual screening, IVS) is a helpful technique for interlacing the chemical universe of small molecules with the proteome. The former, which uses a protein structure and large chemical databases, is well known to the scientific community. We have chosen here to provide an overview of the latter, focusing on validation and target prioritization strategies. By comparing it to complementary or alternative wet-lab approaches, we put IVS in the broader context of chemical genomics, target discovery and drug design. By giving examples from the literature and our own example of how to validate the approach, we provide guidance on the issues related to IVS.
    Methods 09/2014; 71. DOI:10.1016/j.ymeth.2014.08.001 · 3.22 Impact Factor