ABSTRACT: Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles.
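The idea that sequence can separate alleles of equal length can be illustrated with a minimal Python sketch. It is not STRait Razor and not the authors' workflow; the function name `same_length_variants`, the locus label, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def same_length_variants(alleles):
    """Group observed allele sequences by (locus, length in bases) and
    return only the groups that contain more than one distinct sequence."""
    groups = defaultdict(set)
    for locus, sequence in alleles:
        groups[(locus, len(sequence))].add(sequence)
    # Keep only groups where length-based typing would merge distinct sequences
    return {key: sorted(seqs) for key, seqs in groups.items() if len(seqs) > 1}

# Hypothetical toy data (not from the study) at a made-up locus
observed = [
    ("LOCUS_A", "GATAGATAGATAGATA"),  # [GATA]4
    ("LOCUS_A", "GATAGATAGACAGATA"),  # same length, one intra-repeat SNP
    ("LOCUS_A", "GATAGATAGATA"),      # [GATA]3, a different length
]
variants = same_length_variants(observed)
```

In this toy input, length-based typing would report the two 16-base alleles as the same allele, whereas comparing the sequences keeps them apart.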
ABSTRACT: While capillary electrophoresis-based technologies have been the mainstay for human identity typing applications, there are limitations with this methodology's resolution, scalability, and throughput. Massively parallel sequencing (MPS) offers the capability to multiplex multiple types of forensically relevant markers and multiple samples together in one run, all at an overall lower cost per nucleotide than traditional capillary electrophoresis-based methods, thus addressing some of these limitations. MPS also is poised to expand forensic typing capabilities by providing new strategies for mixture deconvolution with the identification of intra-STR allele sequence variants and the potential to generate new types of investigative leads with an increase in the overall number and types of genetic markers being analyzed. The beta version of the Illumina ForenSeq DNA Signature Prep Kit is an MPS library preparation method with a streamlined workflow that allows for targeted amplification and sequencing of 63 STRs and 95 identity SNPs, with the option to include an additional 56 ancestry SNPs and 22 phenotypic SNPs depending on the primer mix chosen for amplification, on the MiSeq desktop sequencer (Illumina). This study was divided into a series of experiments that evaluated reliability, sensitivity of detection, mixture analysis, concordance, and the ability to analyze challenged samples. Genotype accuracy, depth of coverage, and allele balance were used as informative metrics for the quality of the data produced. The ForenSeq DNA Signature Prep Kit produced reliable, reproducible results and obtained full profiles with DNA input amounts of 1 ng. Data were found to be concordant with current capillary electrophoresis methods, and mixtures at a 1:19 ratio were resolved accurately.
Data from the challenged samples showed concordant results with current DNA typing methods with markers in common and minimal allele drop out from the large number of markers typed on these samples. This set of experiments indicates the beta version of the ForenSeq DNA Signature Prep Kit is a valid tool for forensic DNA typing and warrants full validation studies of this MPS technology.
ABSTRACT: Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and speed and at a substantially reduced cost per nucleotide, and hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available with finished status. Publicly available genomes can be readily downloaded; however, there are challenges in verifying the specific supporting data contained within the download and in identifying errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
Frontiers in Bioengineering and Biotechnology 08/2015; DOI:10.3389/fbioe.2015.00138
ABSTRACT: Mitochondrial DNA is a useful marker for population studies, human identification, and forensic analysis. Commonly used hypervariable regions I and II (HVI/HVII) were reported to contain as little as 25 % of mitochondrial DNA variants and therefore the majority of power of discrimination of mitochondrial DNA resides in the coding region. Massively parallel sequencing technology enables entire mitochondrial genome sequencing. In this study, buccal swabs were collected from 114 unrelated Estonians and whole mitochondrial genome sequences were generated using the Illumina MiSeq system. The results are concordant with previous mtDNA control region reports of high haplogroup HV and U frequencies (47.4 and 23.7 % in this study, respectively) in the Estonian population. One sample with the Northern Asian haplogroup D was detected. The genetic diversity of the Estonian population sample was estimated to be 99.67 and 95.85 %, for mtGenome and HVI/HVII data, respectively. The random match probability for mtGenome data was 1.20 versus 4.99 % for HVI/HVII. The nucleotide mean pairwise difference was 27 ± 11 for mtGenome and 7 ± 3 for HVI/HVII data. These data describe the genetic diversity of the Estonian population sample and emphasize the power of discrimination of the entire mitochondrial genome over the hypervariable regions.
Deutsche Zeitschrift für die Gesamte Gerichtliche Medizin 08/2015; DOI:10.1007/s00414-015-1249-4 · 2.71 Impact Factor
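The abstract above reports random match probability and genetic diversity without giving formulas. A minimal Python sketch follows, assuming the commonly used estimators: random match probability as the sum of squared haplotype frequencies, and genetic diversity as n/(n-1) times one minus that sum. These formulas, the function names, and the toy haplotype labels are assumptions, not details taken from the paper. Under this assumption, the reported values for n = 114 are mutually consistent, which the last two lines check.

```python
from collections import Counter

def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies in the sample (assumed estimator)."""
    n = len(haplotypes)
    return sum((count / n) ** 2 for count in Counter(haplotypes).values())

def diversity_from_rmp(rmp, n):
    """Genetic diversity as n/(n-1) * (1 - RMP) (assumed estimator)."""
    return n / (n - 1) * (1 - rmp)

# Hypothetical toy sample of four haplotypes (not data from the study)
toy = ["H1", "H1", "H2", "H3"]
toy_rmp = random_match_probability(toy)        # 0.5**2 + 0.25**2 + 0.25**2
toy_diversity = diversity_from_rmp(toy_rmp, len(toy))

# Consistency check with the values reported in the abstract (n = 114)
mtgenome_diversity = diversity_from_rmp(0.0120, 114)  # reported RMP 1.20 %
hv_diversity = diversity_from_rmp(0.0499, 114)        # reported RMP 4.99 %
```

Rounded to four decimals, the last two values come out at 0.9967 and 0.9585, matching the 99.67 % and 95.85 % stated in the abstract.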
ABSTRACT: Aim
To perform a blind study to assess the capability of the Ion Personal Genome Machine® (PGM™) system to sequence forensically relevant genetic marker panels and to characterize unknown individuals for ancestry and possible relatedness.
Methods
Twelve genomic samples were provided by a third party for blinded genetic analysis. For these 12 samples, the mitochondrial genome and three PGM™ panels containing human identity single nucleotide polymorphisms (SNPs), ancestry informative SNPs, and short tandem repeats (STRs) were sequenced on the PGM™ system and analyzed.
Results
All four genetic systems were run and analyzed on the PGM™ system in a reasonably quick time frame. Completeness of genetic profiles, depth of coverage, strand balance, and allele balance were informative metrics that illustrated the quality and reliability of the data produced. SNP genotypes allowed for identification of sex, paternal lineage, and population ancestry. STR genotypes were shown to be in complete concordance with genotypes generated by standard capillary electrophoresis-based technologies. Variants in the mitochondrial genome data provided information on population background and maternal relationships.
Conclusion
All results from analysis of the 12 genomic samples were consistent with sample information provided by the sample providers at the end of the blinded study. The relatively easy identification of intra-STR allele SNPs offered the potential for increased discrimination power. The promising nature of these results warrants full validation studies of this massively parallel sequencing technology and its further development for forensic data analysis.
Croatian Medical Journal 06/2015; 56(3):218-29. DOI:10.3325/cmj.2015.56.218 · 1.31 Impact Factor
DESCRIPTION: National Institute of Justice report (January 2015), Office of Investigative and Forensic Sciences; in collaboration with the Forensic Technology Center of Excellence (FTCOE)
ABSTRACT: STR typing in forensic genetics has been performed traditionally using capillary electrophoresis (CE). However, CE-based methods have some limitations: only a small number of STR loci can be used, and stutter products, dye artifacts, and low-level alleles complicate analysis. Massively parallel sequencing (MPS) has been considered a viable technology in recent years, allowing high-throughput coverage at a relatively affordable price. Some of the CE-based limitations may be overcome with the application of MPS. In this study, a prototype multiplex STR System (Promega) was amplified and prepared using the TruSeq DNA LT Sample Preparation Kit (Illumina) in 24 samples. Results showed that the MinElute PCR Purification Kit (Qiagen) was a better size selection method compared with recommended diluted bead mixtures. The library input sensitivity study showed that a wide range of amplicon product (6–200 ng) could be used for library preparation without apparent differences in the STR profile. The PCR sensitivity study indicated that 62 pg may be the minimum input amount for generating complete profiles. Reliability study results on 24 different individuals showed that high depth of coverage (DoC) and balanced heterozygote allele coverage ratios (ACRs) could be obtained with 250 pg of input DNA, and 62 pg could generate complete or nearly complete profiles. These studies indicate that this STR multiplex system and the Illumina MiSeq can generate reliable STR profiles at a sensitivity level that competes with current widely used CE-based methods.
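The two quality metrics named in this abstract, depth of coverage (DoC) and heterozygote allele coverage ratio (ACR), can be sketched in a few lines of Python. The abstract does not define them, so the definitions here are assumptions: DoC as the total read count over the called alleles at a locus, and ACR as the lower allele read count divided by the higher one. The function names and read counts are hypothetical.

```python
def depth_of_coverage(allele_depths):
    """Total reads at a locus: the sum of the read counts of its called alleles."""
    return sum(allele_depths)

def allele_coverage_ratio(depth_a, depth_b):
    """Heterozygote balance as lower depth divided by higher depth.
    A value of 1.0 means both alleles are equally covered."""
    if depth_a <= 0 or depth_b <= 0:
        raise ValueError("both alleles need a positive read count")
    return min(depth_a, depth_b) / max(depth_a, depth_b)

# Hypothetical read counts for one heterozygous locus (not data from the study)
reads = (480, 600)
doc = depth_of_coverage(reads)
acr = allele_coverage_ratio(*reads)
```

Defining the ratio as lower over higher keeps it between 0 and 1 regardless of which allele is listed first, so values from different loci can be compared directly.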
ABSTRACT: The majority of STR loci are not ideal for the analysis of forensic samples with degraded and/or low template DNA. One alternative to overcome these limitations is the use of bi-allelic markers, which have low mutation rates and shorter amplicons. Human identification (HID) InDel marker panels have been described in several countries, including Brazil. The commercial kit available is, however, mostly suitable for Europeans, with lower discrimination power for other population groups. Recently, a combination of 49 InDel markers used in four different ethnic groups in the United States has been shown to be more informative than another panel from Portugal, already tested in a Rio de Janeiro sample. However, these 49 InDels have yet to be applied to other admixed or isolated populations. We assessed the efficiency of this panel in two urban admixed populations (Rio de Janeiro, Brazil; Tripoli, Libya), and one isolated Native Brazilian community. All markers are in Hardy-Weinberg equilibrium (HWE) after the Bonferroni correction, and no linkage disequilibrium was detected. Assuming loci independence and no substructure effect, the cumulative random match probabilities (RMPs) were 2.7 × 10⁻¹⁸, 1.5 × 10⁻²⁰, and 4.5 × 10⁻²⁰ for the Native Brazilian, Rio de Janeiro, and Tripoli populations, respectively. The overall Fst value was 0.05512. Rio de Janeiro and Tripoli showed similar admixture levels; however, for Native Brazilians, one parental cluster represented over 60% of the total parental population. We conclude that this panel is suitable for HID on these urban populations, but is less efficient for the isolated
International Journal of Legal Medicine 03/2015; 129(2). DOI:10.1007/s00414-014-1137-3 · 2.69 Impact Factor
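The cumulative random match probability in the abstract above is stated under loci independence and no substructure effect. A minimal Python sketch of that kind of calculation for bi-allelic markers follows. It assumes Hardy-Weinberg genotype frequencies and takes the per-locus match probability as the sum of squared genotype frequencies; the function names and allele frequencies are hypothetical, and this is not presented as the authors' exact computation.

```python
import math

def locus_match_probability(p):
    """Match probability for one bi-allelic locus with allele frequency p,
    assuming Hardy-Weinberg genotype frequencies p^2, 2pq, q^2."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)

def cumulative_rmp(allele_freqs):
    """Product of per-locus match probabilities (assumes independent loci)."""
    return math.prod(locus_match_probability(p) for p in allele_freqs)

# Hypothetical allele frequencies for three InDel loci (not data from the study)
freqs = [0.5, 0.4, 0.3]
rmp = cumulative_rmp(freqs)
```

Because the per-locus value is smallest when both alleles are equally frequent, a panel whose markers drift away from p = 0.5 in a given population yields a larger cumulative value there, in line with the abstract's observation that the panel is less efficient in some groups.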
ABSTRACT: Background
Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per nucleotide basis and indeed on a per sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM.
Results
Twenty-four samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total of 1237 (SNP) variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged from 88.8% to 100%.
Conclusions
In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed. Overall, the results of this study, based on orthogonal concordance testing and phylogenetic scrutiny, support the conclusion that whole mtGenome sequence data with high accuracy can be obtained using the PGM platform.
ABSTRACT: Biothreats are a high priority concern for public safety and national security. The field of microbial forensics was developed to analyze evidence associated with biological crimes in which microbes or their toxins are used as weapons. Microbial forensics is the scientific discipline dedicated to analyzing evidence from a bioterrorism act, biocrime, hoax, or inadvertent microorganism/toxin release for attribution purposes. Microbial forensics combines the practices of epidemiology with the characterization of microbial and microbial-related evidence to assist in determining the specific source of the sample, as individualizing as possible, and/or the methods, means, processes and locations involved to determine the identity of the perpetrator(s) of an attack.
Reference Module in Biomedical Sciences, Edited by Michael Caplan, 01/2015: chapter Medical Microbiology; Elsevier Inc., ISBN: 978-0-12-801238-3
ABSTRACT: The European Journal of Human Genetics is the official Journal of the European Society of Human Genetics, publishing high-quality, original research papers, short reports, News and Commentary articles and reviews in the rapidly expanding field of human genetics and genomics.
European journal of human genetics: EJHG 12/2014; DOI:10.1038/ejhg.2014.247 · 4.35 Impact Factor
ABSTRACT: The TruSeq™ Forensic Amplicon library preparation protocol, originally designed to attach sequencing adapters to chromatin-bound DNA for chromatin immunoprecipitation sequencing (TruSeq™ ChIP-Seq), was used here to attach adapters directly to amplicons containing markers of forensic interest. In this study, the TruSeq™ Forensic Amplicon library preparation protocol was used to detect 160 single nucleotide polymorphisms (SNPs), including human identification SNPs (iSNPs), ancestry, and phenotypic SNPs (apSNPs) in 12 reference samples. Results were compared with those generated by a second laboratory using the same technique, as well as to those generated by whole genome sequencing (WGS). The genotype calls made using the TruSeq™ Forensic Amplicon library preparation protocol were highly concordant. The protocol described herein represents an effective and relatively sensitive means of preparing amplified nuclear DNA for massively parallel sequencing (MPS).
Deutsche Zeitschrift für die Gesamte Gerichtliche Medizin 11/2014; 129(1):1-6. DOI:10.1007/s00414-014-1108-8 · 2.71 Impact Factor
ABSTRACT: Mitochondrial DNA testing is a useful tool in the analysis of forensic biological evidence. In cases where nuclear DNA is damaged or limited in quantity, the higher copy number of mitochondrial genomes available in a sample can provide information about the source of a sample. Currently, Sanger-type sequencing (STS) is the primary method to develop mitochondrial DNA profiles. This method is laborious and time consuming. Massively parallel sequencing (MPS) can increase the amount of information obtained from mitochondrial DNA samples while improving turnaround time by decreasing the number of manipulations and more so by exploiting high throughput analyses to obtain interpretable results. In this study, 18 buccal swabs, three different tissue samples from five individuals, and four bone samples from casework were sequenced at hypervariable regions I and II using STS and MPS. Sample enrichment for STS and MPS was PCR-based. Library preparation for MPS was performed using the Nextera® XT DNA Sample Preparation Kit and sequencing was performed on the MiSeq™ (Illumina, Inc.). MPS yielded full concordance of base calls with STS results, and the newer methodology was able to resolve length heteroplasmy in homopolymeric regions. This study demonstrates that short amplicon MPS of mitochondrial DNA is feasible, can provide information not possible with STS, and lays the groundwork for development of a whole genome sequencing strategy for degraded samples.
ABSTRACT: STRait Razor (the STR Allele Identification Tool – Razor) was developed as a bioinformatic software tool to detect short tandem repeat (STR) alleles in massively parallel sequencing (MPS) raw data. The method of detection used by STRait Razor allows it to make reliable allele calls for all STR types in a manner that is similar to that of capillary electrophoresis. STRait Razor v2.0 incorporates several new features and improvements upon the original software, such as a larger default locus configuration file that increases the number of detectable loci (now including X-chromosome STRs and Amelogenin), an enhanced custom locus list generator, a novel output sorting method that highlights unique sequences for intra-repeat variation detection, and a genotyping tool that emulates traditional electropherogram data. Users also now have the option to choose whether the program detects autosomal, X-chromosome, Y-chromosome, or all STRs. Concordance testing was performed, and allele calls produced by STRait Razor v2.0 were completely consistent with those made by the original software.