Next generation sequencing for TCR repertoire profiling: Platform-specific features and correction algorithms

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.
European Journal of Immunology (Impact Factor: 4.52). 11/2012; 42(11). DOI: 10.1002/eji.201242517
Source: PubMed

ABSTRACT The TCR repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next generation sequencing (NGS) is becoming a powerful tool for deep TCR profiling; yet, questions abound regarding the methodological approaches for sample preparation and correct data interpretation. Accumulated PCR and sequencing errors along with library preparation bottlenecks and uneven PCR efficiencies lead to information loss, biased quantification, and generation of huge artificial TCR diversity. Here, we compare Illumina, 454, and Ion Torrent platforms for individual TCR profiling, evaluate the rate and character of errors, and propose advanced platform-specific algorithms to correct massive sequencing data. These developments are applicable to a wide variety of next generation sequencing applications. We demonstrate that advanced correction allows the removal of the majority of artificial TCR diversity with concomitant rescue of most of the sequencing information. Thus, this correction enhances the accuracy of clonotype identification and quantification as well as overall TCR diversity measurements.

  • ABSTRACT: Analysis of the intestinal B-cell system and properties of immunoglobulin A, the main antibody isotype produced in the gut, has dominated the rise of mucosal immunology as a discipline. Seminal work established concepts describing the induction, transport, and function of mucosal antibodies. Still, open questions remain and we lack a comprehensive view of how the various sites and pathways of immunoglobulin A induction are integrated to respond to gut antigens. Next-generation sequencing (NGS) offers a novel approach to study B-cell responses, which might substantially enhance our toolbox to answer key questions in the field and to take the next steps toward therapeutic exploitation of the mucosal B-cell system. In this review we discuss the potential, challenges, and emerging solutions for gut B-cell repertoire analysis by NGS. Mucosal Immunology advance online publication, 12 November 2014; doi:10.1038/mi.2014.103.
    Mucosal Immunology 11/2014; 8(1). DOI:10.1038/mi.2014.103 · 7.54 Impact Factor
  • ABSTRACT: Library preparation protocols for high-throughput DNA sequencing (HTS) include amplification steps in which errors can build up. In order to have confidence in the sequencing data, it is important to understand the effects of different Taq polymerases and PCR amplification protocols on the DNA molecules sequenced. We compared thirteen enzymes in three different marker systems: simple, single copy nuclear gene and complex multi-gene family. We also tested a modified PCR protocol, which has been suggested to reduce errors associated with amplification steps. We find that enzyme choice has a large impact on the proportion of correct sequences recovered. The most complex marker systems yielded fewer correct reads, and the proportion of correct reads was greatly affected by the enzyme used. Modified cycling conditions did reduce the number of incorrect sequences obtained in some cases, but enzyme had a much greater impact on the number of correct reads. Thus, the coverage required for the safe identification of genotypes using one of the low quality enzymes could be seven times larger than with more efficient enzymes in a biallelic system with equal amplification of the two alleles. Consequently, enzyme selection for downstream HTS has important consequences, especially in complex genetic systems.
    Scientific Reports 01/2015; 5:8056. DOI:10.1038/srep08056 · 5.08 Impact Factor
  • ABSTRACT: Next-generation sequencing technology has promoted the study of the human TCR repertoire, which is essential for adaptive immunity. To decipher the complexity of the TCR repertoire, we developed an integrated pipeline, TCRklass, using a K-string-based algorithm that has significantly improved accuracy and performance over existing tools. We tested TCRklass using manually curated short read datasets in comparison with in silico datasets; it showed higher precision and recall rates on CDR3 identification. We applied TCRklass to large datasets of two human and three mouse TCR repertoires; it demonstrated higher reliability on CDR3 identification and much less biased V/J profiling, which are the two components contributing to the diversity of the repertoire. Because of the sequencing cost, short paired-end reads generated by next-generation sequencing technology are and will remain the main source of data, and we believe that TCRklass is a useful and reliable toolkit for TCR repertoire analysis. Copyright © 2014 by The American Association of Immunologists, Inc.

