IMGT/LIGM-DB: a systematized approach for ImMunoGeneTics database coherence and data distribution improvement
IMGT, the international ImMunoGeneTics database (http://imgt.cnusc.fr:8104), created by Marie-Paule Lefranc, Montpellier, France, is an integrated database specializing in antigen receptors and MHC of all vertebrate species. IMGT includes LIGM-DB, developed for immunoglobulins and T-cell receptors. LIGM-DB distributes high-quality data whose value is substantially increased by the LIGM expert annotations. The accuracy of LIGM-DB immunogenetics data rests on the standardization of biological knowledge related to keywords, annotation labels and gene identification. Managing such data, which result from biological research, requires a highly flexible implementation that can quickly reflect up-to-date results and integrate new knowledge. We developed a systematized approach and defined LIGM-DB systems that manage and carry out the major tasks of database maintenance. In this paper, we focus on the coherence system, which has become crucial for maintaining data quality as the database grows and biological knowledge continues to improve, and on the distribution system, which makes LIGM-DB data easy to access, download and reuse. Efforts have been made to improve the data distribution procedures and adapt them to current bioinformatics needs. Recently, we developed an API that allows Java programmers to remotely access LIGM-DB data and integrate them into other computing environments.
Available from: PubMed Central
- "Over the past few years, large collections of sequences have been deposited in the IMGT/LIGM-DB database [33,34]. From this database, 10,507 human sequences and 8,362 murine sequences were queried."
The immunoglobulin (IG) complementarity determining regions (CDRs) comprise VH CDR1, VH CDR2, VH CDR3, VL CDR1, VL CDR2 and VL CDR3. Of these, VH CDR3 plays a dominant role in recognizing and binding antigens. Three major mechanisms are involved in the formation of the VH repertoire: germline gene rearrangement, junctional diversity and somatic hypermutation. The mechanisms that generate the VH repertoire are similar in humans and mice, whereas the VH CDR3 amino acid (AA) composition differs. Previous studies have mainly focused on germline gene rearrangement and on the composition and structure of the CDR3 AA in humans and mice. However, the number of AA changes due to somatic hypermutation and the junctional mechanism have not been analyzed.
Here we analyzed 9,340 human and 6,657 murine unique productive sequences of immunoglobulin (IG) variable heavy (VH) domains derived from the IMGT/LIGM-DB database to understand how VH CDR3 AA compositions differ significantly between human and mouse. These sequences were identified and analyzed with IMGT/HighV-QUEST (http://www.imgt.org); the analysis covered gene usage, number of AA changes due to somatic hypermutation, AA length distribution of VH CDR3, AA composition, and junctional diversity.
Analyses of human and murine IG repertoires showed significant differences. A higher number of AA changes due to somatic hypermutation and more abundant N-region addition were found in humans than in mice, which might be an important factor leading to the differences in VH CDR3 amino acid composition.
These findings are a benchmark for understanding VH repertoires and can be used to characterize the VH repertoire during immune responses. The study will allow standardized comparison for high throughput results obtained by IMGT/HighV-QUEST, the reference portal for NGS repertoire.
Available from: Quentin Kaas
- "Queries in IMGT/GENE-DB can be performed according to IG and TR gene classification criteria and IMGT reference sequences have been defined for each allele of each gene based on one or, whenever possible, several of the following criteria: germline sequence, first sequence published, longest sequence, mapped sequence. IMGT/GENE-DB interacts dynamically with IMGT/LIGM-DB to download and display gene-related sequence data. This is the first example of an interaction between IMGT databases using the CLASSIFICATION concept."
ABSTRACT: IMGT, the international ImMunoGeneTics information system (http://imgt.cines.fr), was created in 1989 at Montpellier, France. IMGT is a high quality integrated knowledge resource specialized in immunoglobulins (IG), T cell receptors (TR), major histocompatibility complex (MHC) of human and other vertebrates, and related proteins of the immune system (RPI) which belong to the immunoglobulin superfamily (IgSF) and MHC superfamily (MhcSF). IMGT provides a common access to standardized data from genome, proteome, genetics and three-dimensional structures. The accuracy and the consistency of IMGT data are based on IMGT-ONTOLOGY, a semantic specification of terms to be used in immunogenetics and immunoinformatics. IMGT-ONTOLOGY has been formalized using XML Schema (IMGT-ML) for interoperability with other information systems. We are developing Web services to automatically query IMGT databases and tools. This is the first step towards IMGT-Choreography which will trigger and coordinate dynamic interactions between IMGT Web services to process complex significant biological and clinical requests. IMGT-Choreography will further increase the IMGT leadership in immunogenetics and immunoinformatics for medical research (repertoire analysis of the IG antibody sites and of the TR recognition sites in autoimmune and infectious diseases, AIDS, leukemias, lymphomas, myelomas), veterinary research (IG and TR repertoires in farm and wild life species), genome diversity and genome evolution studies of the adaptive immune responses, biotechnology related to antibody engineering (single chain Fragment variable (scFv), phage displays, combinatorial libraries, chimeric, humanized and human antibodies), diagnostics (detection and follow up of residual diseases) and therapeutical approaches (grafts, immunotherapy, vaccinology). IMGT is freely available at http://imgt.cines.fr.