Next generation sequencing for TCR repertoire profiling: Platform-specific features and correction algorithms.

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.
European Journal of Immunology (Impact Factor: 4.97). 07/2012; DOI: 10.1002/eji.201242517
Source: PubMed

ABSTRACT The TCR repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next generation sequencing (NGS) is becoming a powerful tool for deep TCR profiling; yet, questions abound regarding the methodological approaches for sample preparation and correct data interpretation. Accumulated PCR and sequencing errors along with library preparation bottlenecks and uneven PCR efficiencies lead to information loss, biased quantification, and generation of huge artificial TCR diversity. Here, we compare Illumina, 454, and Ion Torrent platforms for individual TCR profiling, evaluate the rate and character of errors, and propose advanced platform-specific algorithms to correct massive sequencing data. These developments are applicable to a wide variety of next generation sequencing applications. We demonstrate that advanced correction allows the removal of the majority of artificial TCR diversity with concomitant rescue of most of the sequencing information. Thus, this correction enhances the accuracy of clonotype identification and quantification as well as overall TCR diversity measurements.
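The abstract does not spell out the correction algorithms, so the following is only a minimal sketch of one common idea in this area, not the authors' method: a rare clonotype that differs by a single mismatch from a far more abundant clonotype is assumed to be a PCR or sequencing error and is folded into that neighbour. The function name `merge_error_variants`, the `max_ratio` threshold of 0.05, and the toy sequences are all illustrative assumptions.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def merge_error_variants(counts, max_ratio=0.05):
    """Fold rare one-mismatch variants into their most abundant neighbour.

    counts    -- dict mapping a clonotype sequence to its read count
    max_ratio -- hypothetical threshold: a variant is treated as an error
                 only if its count is at most max_ratio times the count
                 of the candidate parent clonotype
    """
    merged = Counter(counts)
    # Visit the rarest clonotypes first; ties are broken by sequence
    # so the result is deterministic.
    for seq in sorted(counts, key=lambda s: (counts[s], s)):
        parents = [
            p for p in merged
            if p != seq
            and len(p) == len(seq)
            and hamming(p, seq) == 1
            and merged[seq] <= max_ratio * merged[p]
        ]
        if parents:
            best = max(parents, key=lambda p: (merged[p], p))
            merged[best] += merged.pop(seq)
    return dict(merged)


# Toy data (made up): one dominant clonotype, one likely error variant,
# and one abundant neighbour that should be kept as a real clonotype.
toy_counts = {
    "TGTGCCAGC": 1000,
    "TGTGCCAGT": 3,    # one mismatch from the dominant clonotype, very rare
    "TGTGCAAGC": 400,  # one mismatch too, but too abundant to be an error
}
corrected = merge_error_variants(toy_counts)
print(corrected)
```

This sketch handles substitutions only. Insertions and deletions would need an alignment-based distance rather than a Hamming distance, and a real pipeline would likely also use base quality scores.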

  • ABSTRACT: Our ability to analyze adaptive immunity and engineer its activity has long been constrained by our limited ability to identify native pairs of heavy-light antibody chains and alpha-beta T-cell receptor (TCR) chains, both of which comprise coupled "halves of a key", collectively capable of recognizing specific antigens. Here we report a cell-based emulsion RT-PCR approach that allows the selective fusion of the native pairs of amplified TCR alpha and beta chain genes for complex samples. A new type of PCR suppression technique was developed that makes it possible to amplify the fused library with minimal noise for subsequent analysis by high-throughput paired-end Illumina sequencing. With this technique, a single analysis of a complex blood sample allows identification of multiple native TCR chain pairs. This approach may be extended to identify native antibody chain pairs and, more generally, pairs of mRNA molecules that are co-expressed in the same living cells.
    European Journal of Immunology 05/2013; · 4.97 Impact Factor
  • ABSTRACT: T and B cell repertoires are collections of lymphocytes, each characterised by its antigen-specific receptor. We review here classical technologies and analysis strategies developed to assess Immunoglobulin (IG) and T cell receptor (TR) repertoire diversity, and describe recent advances in the field. First, we describe the broad range of methodological tools developed in the past decades, each answering different questions and complementing the others in progressively identifying the level of repertoire alterations: global overview of the diversity by flow cytometry, IG repertoire descriptions at the protein level for the identification of IG reactivities, IG/TR CDR3 spectratyping strategies, and related molecular quantification or dynamics of T/B cell differentiation. Additionally, we introduce the recent technological advances in molecular biology tools allowing deeper analysis of IG/TR diversity by next-generation sequencing (NGS), offering systematic and comprehensive sequencing of IG/TR transcripts in a short amount of time. NGS provides several angles of analysis, such as clonotype frequency, CDR3 diversity, CDR3 sequence analysis, and V allele identification with a quantitative dimension, and therefore requires the development of high-throughput analysis tools. In this line, we discuss the recent efforts made for nomenclature standardisation and ontology development. We then present the variety of available statistical analysis and modelling approaches developed with regard to the various levels of diversity analysis, and reveal the increasing sophistication of modelling approaches. To conclude, we provide some examples of recent mathematical modelling strategies and perspectives that illustrate the active rise of a "next-generation" of repertoire analysis.
    Frontiers in Immunology 01/2013; 4.
  • ABSTRACT: Variable (V) domains of immunoglobulins (Ig) and T cell receptors (TCR) are generated from genomic V gene segments (V-genes). At present, such V-genes have been annotated only within the genomes of a few species. We have developed a bioinformatics tool that accelerates the task of identifying functional V-genes from genome datasets. Automated recognition is accomplished by detecting key V-gene signatures, such as recombination signal sequences, size of the exon region, and position of amino acid motifs within the translated exon. This algorithm also classifies extracted V-genes into either TCR or Ig loci. We describe the implementation of the algorithm and validate its accuracy by comparing V-genes identified from the human and mouse genomes with known V-gene annotations documented and available in public repositories. The advantages and utility of the algorithm are illustrated by using it to identify functional V-genes in the rat genome, where V-gene annotation is still incomplete. This allowed us to perform a comparative human-rodent phylogenetic analysis based on V-genes that supports the hypothesis that distinct evolutionary pressures shape the TCR and Ig V-gene repertoires. Our program, together with a graphical user interface, is available as open-source software, downloadable at .
    Immunogenetics 06/2013; · 2.89 Impact Factor

