High-throughput, high-fidelity HLA genotyping with deep sequencing

Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94003, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.81). 05/2012; 109(22):8676-81. DOI: 10.1073/pnas.1206614109
Source: PubMed

ABSTRACT Human leukocyte antigen (HLA) genes are the most polymorphic in the human genome. They play a pivotal role in the immune response and have been implicated in numerous human pathologies, especially autoimmunity and infectious diseases. Despite their importance, however, they are rarely characterized comprehensively because of the prohibitive cost of standard technologies and the technical challenges of accurately discriminating between these highly related genes and their many alleles. Here we demonstrate a high-resolution and cost-effective methodology to type HLA genes by sequencing, which combines the advantage of long-range amplification, the power of high-throughput sequencing platforms, and a unique genotyping algorithm. We calibrated our method for HLA-A, -B, -C, and -DRB1 genes with both reference cell lines and clinical samples and identified several previously undescribed alleles with mismatches, insertions, and deletions. We have further demonstrated the utility of this method in a clinical setting by typing five clinical samples on an Illumina MiSeq instrument with a 5-d turnaround. Overall, this technology has the capacity to deliver low-cost, high-throughput, and accurate HLA typing by multiplexing thousands of samples in a single sequencing run, which will enable comprehensive disease-association studies with large cohorts. Furthermore, this approach can also be extended to include other polymorphic genes.

    • "In recent years, various next generation sequencing platforms, based on massively parallel clonal sequencing, have been used to develop high resolution and high throughput HLA typing systems [1–12]. In general, two basic strategies have been used: (1) an amplicon sequencing approach focused on the highly polymorphic regions (primarily exons) of HLA class I and class II genes [1–6], or (2) long-range PCR of full or partial length individual HLA gene loci, followed by fragmentation, shot-gun sequencing and assembly [7–11]. "
    ABSTRACT: Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.
    Human immunology 05/2015; DOI:10.1016/j.humimm.2015.05.002 · 2.28 Impact Factor
    • "Although high-throughput, locus specific methods for genotyping of human MHC (HLA) loci have recently been made available [22], [23], accurately genotyping MHC loci in non-model organisms remains a complicated challenge [24]. The biggest barriers to genotyping MHC genes occur at the PCR stage, because the MHC genes that encode the antigen binding regions often exist in multiple paralogous copies within genomes [2], [25]–[27], making traditional cloning followed by Sanger sequencing problematic. "
    ABSTRACT: Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a "gray zone" where low frequency alleles and high frequency artifacts can be difficult to disentangle and 2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci - Stepwise Threshold Clustering (STC) - that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.
    PLoS ONE 07/2014; 9(7):e100587. DOI:10.1371/journal.pone.0100587 · 3.23 Impact Factor
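    The idea of clustering reads at increasing similarity levels, then filtering whole clusters by frequency, can be illustrated with a rough sketch. This is not the STC implementation: a simple greedy, abundance-ordered clustering stands in for the quasi-Dirichlet algorithm named in the abstract, reads are assumed to be equal-length strings, and the function names, thresholds, and toy reads are all hypothetical.

```python
from collections import Counter

def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_cluster(counts, threshold):
    """Greedy stand-in for the clustering step (NOT quasi-Dirichlet):
    the most abundant sequences seed clusters, and each later sequence
    joins the first seed it matches at or above `threshold`."""
    clusters = []  # each cluster: {"seed": str, "size": int}
    for seq, n in counts.most_common():
        for cluster in clusters:
            if identity(seq, cluster["seed"]) >= threshold:
                cluster["size"] += n
                break
        else:
            clusters.append({"seed": seq, "size": n})
    return clusters

def stepwise_calls(reads, thresholds, min_fraction):
    """Cluster at each similarity threshold and keep only the seeds of
    clusters holding at least `min_fraction` of all reads."""
    counts = Counter(reads)
    total = sum(counts.values())
    calls = {}
    for t in thresholds:
        clusters = greedy_cluster(counts, t)
        calls[t] = [c["seed"] for c in clusters
                    if c["size"] / total >= min_fraction]
    return calls

# Toy library: two similar putative alleles plus two rare variants.
reads = (["ACGTACGTAC"] * 50 + ["ACGTACGTAT"] * 40
         + ["ACGTACGTAA"] * 6 + ["TTTTACGTAC"] * 4)
calls = stepwise_calls(reads, thresholds=[0.85, 1.0], min_fraction=0.1)
```

    In this toy input the two abundant sequences differ at one position, so they merge into one cluster at the looser threshold and separate at the strict one, while the rare variants fall below the frequency cutoff at both levels. The frequency test is applied to clusters rather than to individual reads, which is the design point the abstract emphasizes.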
    • "In a previously reported Illumina genotyping study [Wang et al., 2012], encoded sequencing adaptors were used to identify the source of sequence reads for each sample, requiring library construction for each sample. In this study, both primer MIDs and encoded sequencing adaptors were used to trace the source of sequence reads for each sample. "
    ABSTRACT: Accurate genotyping is important for genetic testing. Sanger sequencing based typing is the gold standard for genotyping, but it has been underused due to its high cost and low throughput. In contrast, short-read sequencing provides inexpensive and high throughput sequencing, holding great promise for reaching the goal of cost-effective and high-throughput genotyping. However, the short read length and the paucity of appropriate genotyping methods pose a major challenge. Here we present RCHSBT - Reliable, Cost-effective and High-throughput Sequence Based Typing pipeline - which takes short sequence reads as input but, using a unique variant calling and haploid sequence assembling algorithm, can accurately genotype with greater effective length per amplicon than even Sanger sequencing reads. The RCHSBT method was tested for the human MHC loci HLA-A, HLA-B, HLA-C, HLA-DQB1 and HLA-DRB1, on 96 samples using Illumina PE 150 reads. Amplicons as long as 950 bp were readily genotyped, achieving 100% typing concordance between RCHSBT-called genotypes and genotypes previously called by Sanger sequencing. Genotyping throughput was increased over ten times, and cost was reduced over five times, for RCHSBT as compared with Sanger sequence genotyping. We thus demonstrate RCHSBT to be a genotyping method comparable to Sanger sequencing based typing in quality, while being more cost-effective and higher throughput. This article is protected by copyright. All rights reserved.
    Human Mutation 02/2014; 34(12). DOI:10.1002/humu.22439 · 5.05 Impact Factor
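    The two-level read tracing described in the excerpt above (an encoded sequencing adaptor plus a primer MID) amounts to looking up each read by a pair of tags. The following is a minimal sketch under stated assumptions, not the pipeline used in either study: reads are assumed to arrive already labelled with an adaptor index, the MID is assumed to be a fixed-length prefix of the read with exact matching only, and the sample sheet, tag sequences, and function name are hypothetical.

```python
from collections import defaultdict

def demultiplex(reads, sample_sheet, mid_len):
    """Assign reads to samples by (adaptor index, primer MID) pair.

    reads:        iterable of (adaptor_index, sequence) tuples
    sample_sheet: dict mapping (adaptor_index, mid) -> sample name
    mid_len:      assumed fixed length of the MID at the start of each read
    """
    by_sample = defaultdict(list)
    unassigned = []
    for adaptor_index, seq in reads:
        mid, insert = seq[:mid_len], seq[mid_len:]
        sample = sample_sheet.get((adaptor_index, mid))
        if sample is None:
            # Unknown tag combination: keep the read aside, do not guess.
            unassigned.append((adaptor_index, seq))
        else:
            by_sample[sample].append(insert)
    return dict(by_sample), unassigned

# Hypothetical sample sheet: the same MID is reused under different adaptors.
sample_sheet = {
    ("IDX1", "ACGT"): "sample_A",
    ("IDX1", "TGCA"): "sample_B",
    ("IDX2", "ACGT"): "sample_C",
}
reads = [
    ("IDX1", "ACGTGGGATTT"),
    ("IDX1", "TGCACCCTTAA"),
    ("IDX2", "ACGTGGGATTT"),
    ("IDX2", "TTTTGGGATTT"),  # MID not in the sample sheet
]
by_sample, unassigned = demultiplex(reads, sample_sheet, mid_len=4)
```

    Because samples are keyed on the pair of tags, the same MID can be reused under different adaptors, so the number of traceable samples grows with the product of the two tag sets. A real workflow would also need to tolerate sequencing errors in the tags, which this sketch omits.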