Next generation sequencing for TCR repertoire profiling: Platform-specific features and correction algorithms.

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Moscow, Russia.
European Journal of Immunology (Impact Factor: 4.52). 07/2012; DOI: 10.1002/eji.201242517
Source: PubMed

ABSTRACT The TCR repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next generation sequencing (NGS) is becoming a powerful tool for deep TCR profiling; yet, questions abound regarding the methodological approaches for sample preparation and correct data interpretation. Accumulated PCR and sequencing errors along with library preparation bottlenecks and uneven PCR efficiencies lead to information loss, biased quantification, and generation of huge artificial TCR diversity. Here, we compare Illumina, 454, and Ion Torrent platforms for individual TCR profiling, evaluate the rate and character of errors, and propose advanced platform-specific algorithms to correct massive sequencing data. These developments are applicable to a wide variety of next generation sequencing applications. We demonstrate that advanced correction allows the removal of the majority of artificial TCR diversity with concomitant rescue of most of the sequencing information. Thus, this correction enhances the accuracy of clonotype identification and quantification as well as overall TCR diversity measurements.
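The abstract does not describe the correction algorithms themselves, so the following is only a minimal sketch of one generic idea behind removing artificial diversity: folding a rare clonotype into a much more abundant clonotype that differs from it by a single mismatch. It is not the authors' method. The function names, the example CDR3 sequences, and the `max_mismatches` and `max_ratio` thresholds are all assumptions chosen for illustration.

```python
def hamming_distance(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def merge_error_variants(counts, max_mismatches=1, max_ratio=0.05):
    """Fold likely erroneous clonotypes into abundant, near-identical ones.

    counts: dict mapping a CDR3 sequence to its read count.
    A clonotype is merged into an already accepted clonotype when both have
    the same length, differ by at most `max_mismatches` positions, and the
    candidate's count is at most `max_ratio` times the accepted clonotype's
    original count. Both thresholds are hypothetical, illustrative values.
    """
    # Visit clonotypes from most to least abundant (ties broken by sequence).
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    corrected = {}  # accepted clonotype -> count after merging
    for seq, count in ordered:
        parent = None
        for accepted in corrected:  # insertion order is most abundant first
            if len(accepted) != len(seq):
                continue
            if (hamming_distance(accepted, seq) <= max_mismatches
                    and count <= max_ratio * counts[accepted]):
                parent = accepted
                break
        if parent is None:
            corrected[seq] = count      # keep as a distinct clonotype
        else:
            corrected[parent] += count  # rescue the reads, drop the variant
    return corrected


# Made-up read counts for four CDR3 amino acid sequences.
example_counts = {
    "CASSLGQGYEQYF": 1000,
    "CASSLGQGYEQYV": 10,   # one mismatch from the top clonotype, very rare
    "CASSPGTGYEQYF": 400,
    "CASSLGQAYEQYF": 300,  # one mismatch, but too abundant to be merged
}
corrected_counts = merge_error_variants(example_counts)
print(corrected_counts)
```

In this sketch the reads of a merged variant are added to the abundant clonotype rather than discarded, which mirrors the abstract's point that correction can remove artificial diversity while retaining most of the sequencing information. The pairwise comparison is quadratic in the number of clonotypes, so a real data set would need an indexed search.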

  • Source
    Frontiers in Immunology 10/2014; 5:539.
  • Source
    ABSTRACT: The next-generation sequencing technology has promoted the study on human TCR repertoire, which is essential for the adaptive immunity. To decipher the complexity of TCR repertoire, we developed an integrated pipeline, TCRklass, using K-string-based algorithm that has significantly improved the accuracy and performance over existing tools. We tested TCRklass using manually curated short read datasets in comparison with in silico datasets; it showed higher precision and recall rates on CDR3 identification. We applied TCRklass on large datasets of two human and three mouse TCR repertoires; it demonstrated higher reliability on CDR3 identification and much less biased V/J profiling, which are the two components contributing the diversity of the repertoire. Because of the sequencing cost, short paired-end reads generated by next-generation sequencing technology are and will remain the main source of data, and we believe that the TCRklass is a useful and reliable toolkit for TCR repertoire analysis. Copyright © 2014 by The American Association of Immunologists, Inc.
    Journal of immunology (Baltimore, Md. : 1950). 11/2014;
  • Source
    ABSTRACT: We previously identified a distinct mutation pattern in the antibody genes of B cells isolated from cerebrospinal fluid (CSF) that can identify patients who have relapsing-remitting multiple sclerosis (RRMS) and patients with clinically isolated syndromes who will convert to RRMS. This antibody gene signature (AGS) was developed using Sanger sequencing of single B cells. While potentially helpful to patients, Sanger sequencing is not an assay that can be practically deployed in clinical settings. In order to provide AGS evaluations to patients as part of their diagnostic workup, we developed protocols to generate AGS scores using next-generation DNA sequencing (NGS) on CSF-derived cell pellets without the need to isolate single cells. This approach has the potential to increase the coverage of the B-cell population being analyzed, reduce the time needed to generate AGS scores, and may improve the overall performance of the AGS approach as a diagnostic test in the future. However, no investigations have focused on whether NGS-based repertoires will properly reflect antibody gene frequencies and somatic hypermutation patterns defined by Sanger sequencing. To address this issue, we isolated paired CSF samples from eight patients who either had MS or were at risk to develop MS. Here, we present data that antibody gene frequencies and somatic hypermutation patterns are similar in Sanger and NGS-based antibody repertoires from these paired CSF samples. In addition, AGS scores derived from the NGS database correctly identified the patients who initially had or subsequently converted to RRMS, with precision similar to that of the Sanger sequencing approach. Further investigation of the utility of the AGS in predicting conversion to MS using NGS-derived antibody repertoires in a larger cohort of patients is warranted.
    Frontiers in Neurology 09/2014; 5:166.

