High-throughput, high-fidelity HLA genotyping with deep sequencing

Stanford Genome Technology Center, Stanford University, Palo Alto, CA 94003, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.67). 05/2012; 109(22):8676-81. DOI: 10.1073/pnas.1206614109
Source: PubMed


Human leukocyte antigen (HLA) genes are the most polymorphic in the human genome. They play a pivotal role in the immune response and have been implicated in numerous human pathologies, especially autoimmunity and infectious diseases. Despite their importance, however, they are rarely characterized comprehensively because of the prohibitive cost of standard technologies and the technical challenges of accurately discriminating between these highly related genes and their many alleles. Here we demonstrate a high-resolution and cost-effective methodology to type HLA genes by sequencing, which combines the advantage of long-range amplification, the power of high-throughput sequencing platforms, and a unique genotyping algorithm. We calibrated our method for HLA-A, -B, -C, and -DRB1 genes with both reference cell lines and clinical samples and identified several previously undescribed alleles with mismatches, insertions, and deletions. We further demonstrated the utility of this method in a clinical setting by typing five clinical samples on an Illumina MiSeq instrument with a 5-d turnaround. Overall, this technology has the capacity to deliver low-cost, high-throughput, and accurate HLA typing by multiplexing thousands of samples in a single sequencing run, which will enable comprehensive disease-association studies with large cohorts. Furthermore, this approach can be extended to include other polymorphic genes.



Available from: Marcelo Fernande, Apr 18, 2014
    • "In recent years, various next generation sequencing platforms, based on massively parallel clonal sequencing, have been used to develop high resolution and high throughput HLA typing systems [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12]. In general, two basic strategies have been used: (1) an amplicon sequencing approach focused on the highly polymorphic regions (primarily exons) of HLA class I and class II genes [1] [2] [3] [4] [5] [6], or (2) long-range PCR of full or partial length individual HLA gene loci, followed by fragmentation, shot-gun sequencing and assembly [7–11]. "
    ABSTRACT: Compared to Sanger sequencing, next-generation sequencing offers advantages for high resolution HLA genotyping including increased throughput, lower cost, and reduced genotype ambiguity. Here we describe an enhancement of the Roche 454 GS GType HLA genotyping assay to provide very high resolution (VHR) typing, by the addition of 8 primer pairs to the original 14, to genotype 11 HLA loci. These additional amplicons help resolve common and well-documented alleles and exclude commonly found null alleles in genotype ambiguity strings. Simplification of workflow to reduce the initial preparation effort using early pooling of amplicons or the Fluidigm Access Array™ is also described. Performance of the VHR assay was evaluated on 28 well characterized cell lines using Conexio Assign MPS software which uses genomic, rather than cDNA, reference sequence. Concordance was 98.4%; 1.6% had no genotype assignment. Of concordant calls, 53% were unambiguous. To further assess the assay, 59 clinical samples were genotyped and results compared to unambiguous allele assignments obtained by prior sequence-based typing supplemented with SSO and/or SSP. Concordance was 98.7% with 58.2% as unambiguous calls; 1.3% could not be assigned. Our results show that the amplicon-based VHR assay is robust and can replace current Sanger methodology. Together with software enhancements, it has the potential to provide even higher resolution HLA typing. Copyright © 2015. Published by Elsevier Inc.
    Human immunology 05/2015; DOI:10.1016/j.humimm.2015.05.002 · 2.14 Impact Factor
    • "As a first step toward the establishment of a general HLA typing method (HLATyphon), we began with an algorithm similar to that of Wang et al. [11] and further optimized it to deal with the comparatively short reads generated by the MiSeq and to remove intronic and intergenic reads. First, the Sequence Polymorphism (SP) Reference Panel was obtained from the International Histocompatibility Working Group [4] "
    ABSTRACT: We report the development of a general methodology to genotype HLA class I and class II loci. A Whole Genome Amplification (WGA) step was used as a sample sparing methodology. HLA typing data could be obtained with as few as 300 cells, underlining the usefulness of the methodology for studies for which limited cells are available. The next generation sequencing platform was validated using a panel of cell lines from the International Histocompatibility Working Group (IHWG) for HLA-A, -B, and -C. Concordance with the known, previously determined HLA types was 99%. We next developed a panel of primers to allow HLA typing of alpha and beta chains of the HLA DQ and DP loci and the beta chain of the DRB1 locus. For the beta chain genes, we employed a novel strategy using primers in the intron regions surrounding exon 2, and the introns surrounding exons 3 through 4 (DRB1) or 5 (DQB1 and DPB1). Concordance with previously determined HLA Class II types was also 99%. To increase throughput and decrease cost, we developed strategies combining multiple loci from each donor. Multiplexing of 96 samples per run resulted in increases in throughput of approximately 8-fold. The pipeline developed for this analysis (HLATyphon) is available for download at https://github.com/LJI-Bioinformatics/HLATyphon. Copyright © 2015. Published by Elsevier Inc.
    Human immunology 05/2015; DOI:10.1016/j.humimm.2015.04.007 · 2.14 Impact Factor
    • "Although high-throughput, locus specific methods for genotyping of human MHC (HLA) loci have recently been made available [22], [23], accurately genotyping MHC loci in non-model organisms remains a complicated challenge [24]. The biggest barriers to genotyping MHC genes occur at the PCR stage, because the MHC genes that encode the antigen binding regions often exist in multiple paralogous copies within genomes [2], [25]–[27], making traditional cloning followed by Sanger sequencing problematic. "
    ABSTRACT: Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a "gray zone" where low frequency alleles and high frequency artifacts can be difficult to disentangle and 2) a similar sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci - Stepwise Threshold Clustering (STC) - that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency and similarity based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.
    PLoS ONE 07/2014; 9(7):e100587. DOI:10.1371/journal.pone.0100587 · 3.23 Impact Factor