ABSTRACT: The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data.
This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data.
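The idea of storing a sequence string and deriving a CE-compatible allele from it can be illustrated with a minimal sketch. It is not the ISFG nomenclature itself: the function names, the in-frame tetranucleotide assumption, and the example sequence are all hypothetical, and real loci need locus-specific rules for which bases count toward the repeat region.

```python
from itertools import groupby


def bracket_notation(repeat_region: str, unit_len: int = 4) -> str:
    """Collapse an in-frame repeat region into bracketed run-length form.

    Assumption: the string starts at the first base of a repeat unit and
    its length is an exact multiple of unit_len.
    """
    if len(repeat_region) % unit_len != 0:
        raise ValueError("repeat region is not a whole number of units")
    units = [
        repeat_region[i:i + unit_len]
        for i in range(0, len(repeat_region), unit_len)
    ]
    # groupby merges consecutive identical units into one run
    return " ".join(
        f"[{motif}]{sum(1 for _ in run)}" for motif, run in groupby(units)
    )


def ce_compatible_allele(repeat_region: str, unit_len: int = 4) -> str:
    """Length-based allele name: full units, then '.n' for n leftover bases."""
    full, extra = divmod(len(repeat_region), unit_len)
    return f"{full}.{extra}" if extra else str(full)


# Hypothetical repeat region, for illustration only
seq = "TCTATCTATCTGTCTATCTA"
print(bracket_notation(seq))      # [TCTA]2 [TCTG]1 [TCTA]2
print(ce_compatible_allele(seq))  # 5
```

Two sequences of equal length but different internal structure would share the same length-based allele name while differing in their bracketed strings, which is the extra polymorphism that sequence data reveal.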
ABSTRACT: Ancestry informative markers (AIMs) can be used to detect and adjust for population stratification and predict the ancestry of the source of an evidence sample. Autosomal single nucleotide polymorphisms (SNPs) are the best candidates for AIMs. It is essential to identify the most informative AIM SNPs across relevant populations. Several informativeness measures for ancestry estimation have been used for AIMs selection: absolute allele frequency differences (δ), F statistics (FST), and informativeness for assignment measure (In). However, their efficacy has not been compared objectively, particularly for determining affiliations of major US populations. In this study, these three measures were directly compared for AIMs selection among four major US populations, i.e., African American, Caucasian, East Asian, and Hispanic American. The results showed that the FST panel performed slightly better for population resolution based on principal component analysis (PCA) clustering than did the δ panel and both performed better than the In panel. Therefore, the 23 AIMs selected by the FST measure were used to characterize the four major American populations. Genotype data of nine sample populations were used to evaluate the efficiency of the 23-AIMs panel. The results indicated that individuals could be correctly assigned to the major population categories. Our AIMs panel could contribute to the candidate pool of AIMs for potential forensic identification purposes.
Full-text · Article · Dec 2015 · Deutsche Zeitschrift für die Gesamte Gerichtliche Medizin
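The three informativeness measures named in the abstract above can be sketched for a single biallelic SNP. The abstract does not say which estimators were used, so the forms below are assumptions: δ as an absolute frequency difference between two populations, FST as the heterozygosity-based (HT − HS)/HT with equal population weights, and In in the form commonly attributed to Rosenberg et al. Function names and test frequencies are hypothetical.

```python
import math


def delta(p1: float, p2: float) -> float:
    """Absolute allele frequency difference between two populations."""
    return abs(p1 - p2)


def fst(freqs: list[float]) -> float:
    """(HT - HS) / HT for one biallelic SNP, equal population weights."""
    p_bar = sum(freqs) / len(freqs)
    h_t = 2 * p_bar * (1 - p_bar)                            # pooled heterozygosity
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)   # mean within-population
    return (h_t - h_s) / h_t if h_t > 0 else 0.0


def informativeness_in(freqs: list[float]) -> float:
    """Informativeness for assignment (natural log) for one biallelic SNP."""
    k = len(freqs)
    total = 0.0
    for allele in (0, 1):
        # frequency of this allele in each population
        p_i = [p if allele == 0 else 1 - p for p in freqs]
        p_bar = sum(p_i) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        total += sum(p * math.log(p) for p in p_i if p > 0) / k
    return total


# A SNP fixed for different alleles in two populations is maximally informative
print(delta(1.0, 0.0), fst([1.0, 0.0]), informativeness_in([1.0, 0.0]))
```

A SNP with identical frequencies in every population scores zero on all three measures, so ranking candidate SNPs by any of them favours markers whose frequencies diverge between groups.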
ABSTRACT: Forensic and ancient DNA samples often are damaged and in limited quantity as a result of exposure to harsh environments and the passage of time. Several strategies have been proposed to address the challenges posed by degraded and low copy templates, including a PCR-based whole genome amplification method called degenerate oligonucleotide-primed PCR (DOP-PCR). This study assessed the efficacy of four modified versions of the original DOP-PCR primer that retain at least a portion of the 5′ defined sequence and alter the number of bases on the 3′ end. The use of each of the four modified primers resulted in improved STR profiles from environmentally-damaged bloodstains, contemporary human skeletal remains, American Civil War era bone samples, and skeletal remains of WWII soldiers over those obtained by previously described DOP-PCR methods and routine STR typing. Additionally, the modified DOP-PCR procedure allows for a larger volume of DNA extract to be used, reducing the need to concentrate the sample and thus mitigating the effects of concurrent concentration of inhibitors.
ABSTRACT: Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.
ABSTRACT: While capillary electrophoresis-based technologies have been the mainstay for human identity typing applications, there are limitations with this methodology's resolution, scalability, and throughput. Massively parallel sequencing (MPS) offers the capability to multiplex multiple types of forensically-relevant markers and multiple samples together in one run, all at an overall lower cost per nucleotide than traditional capillary electrophoresis-based methods, thus addressing some of these limitations. MPS also is poised to expand forensic typing capabilities by providing new strategies for mixture deconvolution with the identification of intra-STR allele sequence variants and the potential to generate new types of investigative leads with an increase in the overall number and types of genetic markers being analyzed. The beta version of the Illumina ForenSeq DNA Signature Prep Kit is an MPS library preparation method with a streamlined workflow that allows for targeted amplification and sequencing of 63 STRs and 95 identity SNPs, with the option to include an additional 56 ancestry SNPs and 22 phenotypic SNPs depending on the primer mix chosen for amplification, on the MiSeq desktop sequencer (Illumina). This study was divided into a series of experiments that evaluated reliability, sensitivity of detection, mixture analysis, concordance, and the ability to analyze challenged samples. Genotype accuracy, depth of coverage, and allele balance were used as informative metrics for the quality of the data produced. The ForenSeq DNA Signature Prep Kit produced reliable, reproducible results and obtained full profiles with DNA input amounts of 1 ng. Data were found to be concordant with current capillary electrophoresis methods, and mixtures at a 1:19 ratio were resolved accurately.
Data from the challenged samples showed concordant results with current DNA typing methods with markers in common and minimal allele drop out from the large number of markers typed on these samples. This set of experiments indicates the beta version of the ForenSeq DNA Signature Prep Kit is a valid tool for forensic DNA typing and warrants full validation studies of this MPS technology.
ABSTRACT: Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease, in less time, and at a substantially reduced cost per nucleotide, and hence per genome. More than 3,000 bacterial genomes have been sequenced and are available at finished status. Publicly available genomes can be readily downloaded; however, there are challenges to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
Full-text · Article · Aug 2015 · Frontiers in Bioengineering and Biotechnology
ABSTRACT: Mitochondrial DNA is a useful marker for population studies, human identification, and forensic analysis. Commonly used hypervariable regions I and II (HVI/HVII) were reported to contain as little as 25 % of mitochondrial DNA variants and therefore the majority of power of discrimination of mitochondrial DNA resides in the coding region. Massively parallel sequencing technology enables entire mitochondrial genome sequencing. In this study, buccal swabs were collected from 114 unrelated Estonians and whole mitochondrial genome sequences were generated using the Illumina MiSeq system. The results are concordant with previous mtDNA control region reports of high haplogroup HV and U frequencies (47.4 and 23.7 % in this study, respectively) in the Estonian population. One sample with the Northern Asian haplogroup D was detected. The genetic diversity of the Estonian population sample was estimated to be 99.67 and 95.85 %, for mtGenome and HVI/HVII data, respectively. The random match probability for mtGenome data was 1.20 versus 4.99 % for HVI/HVII. The nucleotide mean pairwise difference was 27 ± 11 for mtGenome and 7 ± 3 for HVI/HVII data. These data describe the genetic diversity of the Estonian population sample and emphasize the power of discrimination of the entire mitochondrial genome over the hypervariable regions.
Full-text · Article · Aug 2015 · Deutsche Zeitschrift für die Gesamte Gerichtliche Medizin
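The random match probability and genetic diversity figures in the Estonian mtDNA abstract above are linked by a simple relation, sketched below. The abstract does not state its formulas, so this assumes the commonly used definitions: random match probability as the sum of squared haplotype frequencies, and genetic diversity as the unbiased estimator n/(n − 1) × (1 − Σp²). The function names are hypothetical; the inputs (n = 114, 1.20 % and 4.99 %) are the reported values.

```python
def random_match_probability(counts: list[int]) -> float:
    """Sum of squared haplotype frequencies from haplotype counts."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def genetic_diversity(rmp: float, n: int) -> float:
    """Unbiased diversity estimate from a random match probability (assumed form)."""
    return n / (n - 1) * (1 - rmp)


n = 114  # unrelated Estonians sampled
gd_mtgenome = genetic_diversity(0.0120, n)  # reported RMP for mtGenome data
gd_hv = genetic_diversity(0.0499, n)        # reported RMP for HVI/HVII data

print(f"{gd_mtgenome:.2%}")  # 99.67%
print(f"{gd_hv:.2%}")        # 95.85%
```

Under these assumed definitions the reported random match probabilities reproduce the reported diversity values, which suggests, without confirming, that the study used this estimator.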
ABSTRACT: Allele distributions for 13 tetrameric short tandem repeat (STR) loci, CSF1PO, FGA, TH01, TPOX, VWA, D3S1358, D5S818, D7S820, D8S1179, D13S317, D16S539, D18S51, and D21S11, were determined in African American, United States Caucasian, Hispanic, Bahamian, Jamaican, and Trinidadian sample populations. There was little evidence for departures from Hardy-Weinberg expectations (HWE) in any of the populations. Based on the exact test, the loci that departed significantly from HWE are: D21S11 (p = 0.010, Bahamians); CSF1PO (p = 0.014, Trinidadians); TPOX (p = 0.011, Jamaicans and p = 0.035, U.S. Caucasians); and D16S539 (p = 0.043, Bahamians). After employing the Bonferroni correction for the number of loci analyzed (i.e., 13 loci per database), these observations are not likely to be significant. There is little evidence for association of alleles between the loci in these databases. The allelic frequency data are similar to other comparable data within the same major population group.
No preview · Article · Jul 2015 · Journal of Forensic Sciences
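The Bonferroni step mentioned in the abstract above can be worked through with the p-values it reports. This sketch assumes a nominal significance level of 0.05, which the abstract does not state; the variable names are illustrative.

```python
ALPHA = 0.05   # assumed nominal significance level (not given in the abstract)
N_LOCI = 13    # loci tested per population database

# Exact-test p-values reported in the abstract, keyed by (locus, population)
departures = {
    ("D21S11", "Bahamians"): 0.010,
    ("CSF1PO", "Trinidadians"): 0.014,
    ("TPOX", "Jamaicans"): 0.011,
    ("TPOX", "U.S. Caucasians"): 0.035,
    ("D16S539", "Bahamians"): 0.043,
}

# Bonferroni: divide the significance level by the number of tests
threshold = ALPHA / N_LOCI

still_significant = {
    key: p for key, p in departures.items() if p < threshold
}

print(f"corrected threshold = {threshold:.5f}")
print(f"departures surviving correction: {len(still_significant)}")
```

With 13 tests the corrected threshold is roughly 0.0038, and the smallest reported p-value is 0.010, so none of the five departures survives correction under this assumption.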
ABSTRACT: To perform a blind study to assess the capability of the Ion Personal Genome Machine® (PGM™) system to sequence forensically relevant genetic marker panels and to characterize unknown individuals for ancestry and possible relatedness.
Twelve genomic samples were provided by a third party for blinded genetic analysis. For these 12 samples, the mitochondrial genome and three PGM™ panels containing human identity single nucleotide polymorphisms (SNPs), ancestry informative SNPs, and short tandem repeats (STRs) were sequenced on the PGM™ system and analyzed.
All four genetic systems were run and analyzed on the PGM™ system in a reasonably quick time frame. Completeness of genetic profiles, depth of coverage, strand balance, and allele balance were informative metrics that illustrated the quality and reliability of the data produced. SNP genotypes allowed for identification of sex, paternal lineage, and population ancestry. STR genotypes were shown to be in complete concordance with genotypes generated by standard capillary electrophoresis-based technologies. Variants in the mitochondrial genome data provided information on population background and maternal relationships.
All results from analysis of the 12 genomic samples were consistent with sample information provided by the sample providers at the end of the blinded study. The relatively easy identification of intra-STR allele SNPs offered the potential for increased discrimination power. The promising nature of these results warrants full validation studies of this massively parallel sequencing technology and its further development for forensic data analysis.
Full-text · Article · Jun 2015 · Croatian Medical Journal
DESCRIPTION: National Institute of Justice report (January 2015), Office of Investigative and Forensic Sciences; in collaboration with the Forensic Technology Center of Excellence (FTCOE)
ABSTRACT: STR typing in forensic genetics has been performed traditionally using capillary electrophoresis (CE). However, CE-based methods have some limitations: only a small number of STR loci can be used, and results are affected by stutter products, dye artifacts, and low-level alleles. Massively parallel sequencing (MPS) has been considered a viable technology in recent years, allowing high-throughput coverage at a relatively affordable price. Some of the CE-based limitations may be overcome with the application of MPS. In this study, a prototype multiplex STR System (Promega) was amplified and prepared using the TruSeq DNA LT Sample Preparation Kit (Illumina) in 24 samples. Results showed that the MinElute PCR Purification Kit (Qiagen) was a better size selection method compared with the recommended diluted bead mixtures. The library input sensitivity study showed that a wide range of amplicon product (6–200 ng) could be used for library preparation without apparent differences in the STR profile. The PCR sensitivity study indicated that 62 pg may be the minimum input amount for generating complete profiles. Reliability study results on 24 different individuals showed that high depth of coverage (DoC) and balanced heterozygote allele coverage ratios (ACRs) could be obtained with 250 pg of input DNA, and 62 pg could generate complete or nearly complete profiles. These studies indicate that this STR multiplex system and the Illumina MiSeq can generate reliable STR profiles at a sensitivity level that competes with currently widely used CE-based methods.