Integrating the DNA Integrity Number (DIN) to Assess Genomic DNA (gDNA) Quality Control Using the Agilent 2200 TapeStation System

Abstract

Next Generation Sequencing (NGS) requires the input of high molecular weight genomic DNA (gDNA) to construct quality libraries for large-scale sequencing projects, such as the 100K Pathogen Genome Project. The assessment of DNA integrity is a critical first step in obtaining meaningful data, and intact DNA is a key element for successful library construction. The Agilent 2200 TapeStation System plays an important role in the determination of DNA quality using the Genomic DNA assay. Profiles generated on the 2200 TapeStation System yield information on concentration, allow a visual inspection of the DNA quality, and generate a DNA Integrity Number (DIN), which is a value automatically assigned by the software that provides an indication of integrity (that is, lack of degradation). This application note describes a new software algorithm that has been developed to extract information about DNA sample integrity from the 2200 TapeStation System electrophoretic trace.
Authors
Nguyet Kong, Whitney Ng, Lucy Cai,
Alvin Leonardo, and Bart C. Weimer
100K Pathogen Genome Project
Population Health and Reproduction
Department
School of Veterinary Medicine
University of California-Davis
Davis, CA, USA
Lenore Kelly
Agilent Technologies, Inc.
Santa Clara, CA, USA
Application Note
Introduction
Reduced costs and high-throughput methods have made microbial whole genome sequencing (WGS) accessible to many applications in infectious disease, food safety, and public health. The production of thousands of genomes involves a consortium of government, academic, and industrial partners in a global effort to make these sequences public. The 100K Pathogen Genome Project (http://100kgenome.vetmed.ucdavis.edu/) is sequencing 100,000 bacterial pathogens from around the globe. This large-scale next-generation sequencing project requires high-throughput procedures for DNA extraction before library construction and sequencing [1].
Genomic DNA (gDNA) extracts are often evaluated on agarose gels, but this approach is not suitable for a high-throughput, automated workflow. Size estimation against a ladder, coupled with densitometry to determine concentration, often relies on low-resolution images and cannot be automated. Assessment of gDNA quality is crucial, because the next step in library preparation for automated sequencing is DNA shearing, which requires high molecular weight gDNA [2,3]. The Agilent 2200 TapeStation System and the associated Agilent Genomic DNA ScreenTape assay have the potential to become the standard in DNA quality assessment and quantification, as well as to provide the remaining QC checks for the entire workflow [4].
The 2200 TapeStation Analysis Software generates an electropherogram that provides a detailed visual assessment of the DNA size distribution and fragments, virtual gel images, and sample concentration. In addition, the software automatically generates a value referred to as the DNA Integrity Number (DIN), which indicates the level of sample degradation. Classical gel electrophoresis, by contrast, does not adequately determine sample integrity. These advantages provide a quantitative basis for selecting the gDNA samples that proceed to the next phase of library construction for WGS.
DIN was developed to remove the manual interpretation of DNA integrity by evaluating the entire electrophoretic trace. The DIN software algorithm classifies total DNA on a numeric scale from 1 to 10, with 1 being the most degraded and 10 being the most intact (that is, high molecular weight). The algorithm was derived from approximately 7,000 gDNA traces provided by Genomic DNA ScreenTape users, covering samples from whole and dried blood, saliva, and human tissues from fresh, frozen, and FFPE sources [4]. The DIN facilitates the interpretation of electropherograms, allows for the comparison of samples, and helps ensure the repeatability of experiments and the quantitation of high-quality gDNA moving into library construction.
Table 1. Bacterial Isolates Used to Investigate DNA Integrity Estimations

Bacterium        Gram reaction   Approx. genome size (Mb)   GC content (%)   Average DIN value
Campylobacter    Negative        1.7                        30               8.8
Staphylococcus   Positive        2.8                        32               8.9
Listeria         Positive        2                          38               8.9
Escherichia      Negative        5                          51               8.3
Salmonella       Negative        5                          52               8.6
Methods
As with all whole genome sequencing projects, the 100K Pathogen Genome Project sample preparation workflow begins with isolation of high molecular weight gDNA, followed by quality control metrics (intact gDNA, A260/230, and A260/280 ratios) prior to production of sheared DNA for library construction. Specific bacterial isolates with a range of GC content and genome sizes were chosen to validate DNA integrity assessment using a 2200 TapeStation System. After lysis, gDNA was isolated using the Qiagen QIAamp DNA Mini Kit (51306) according to the manufacturer’s instructions [5,6]. Prior to shearing and library construction, the isolated gDNA was analyzed on the 2200 TapeStation System to confirm high molecular weight and to obtain the DIN value [7-10].
Results and Discussion
gDNA was extracted from bacterial samples with a range of GC content using the Qiagen QIAamp DNA Mini Kit. The gDNA was analyzed on the 2200 TapeStation System with the Genomic DNA ScreenTape assay to obtain electropherograms, which showed high molecular weight gDNA in a gel image similar to an agarose gel (Figures 1 and 2). The gDNA data files used were from already constructed libraries and were re-analyzed with the 2200 TapeStation Analysis Software (version A.01.05) to obtain the DIN value. This version of the software displays an electropherogram, a virtual gel, and a DIN value calculated from the electropherogram of the gDNA. The DIN value indicates the intactness of the DNA, giving a numerical measure of integrity that can be used to compare across samples (Figure 2). This value can be checked before proceeding with the next steps in library construction.
Figure 1. Classical agarose gel with the upper marker of 10 kb.
Figure 2. Electropherograms and gel images of genomic DNA from the Agilent 2200 TapeStation with DIN values.
[Figure 2 panels, one electropherogram and gel image per microbe, lanes A1 to D1, with three DIN values each: Campylobacter 8.9, 8.9, 8.5; Staphylococcus 8.6, 9.2, 9.1; Listeria 9.0, 9.0, 8.7; Escherichia 8.4, 8.2, 8.4; Salmonella 8.6, 8.5, 8.6.]
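The relationship between the per-isolate DIN values printed in Figure 2 and the averages in Table 1 can be checked with a short calculation. The following is a minimal sketch, not part of the original workflow: the dictionary names are illustrative, and the per-isolate values are the rounded numbers shown in the figure, so a recomputed mean may differ from Table 1 in the last digit.

```python
from statistics import mean

# DIN values as printed in Figure 2 (three isolates per genus).
# These are already rounded to one decimal place in the figure.
FIGURE2_DIN = {
    "Campylobacter": [8.9, 8.9, 8.5],
    "Staphylococcus": [8.6, 9.2, 9.1],
    "Listeria": [9.0, 9.0, 8.7],
    "Escherichia": [8.4, 8.2, 8.4],
    "Salmonella": [8.6, 8.5, 8.6],
}

# Average DIN values as reported in Table 1.
TABLE1_AVERAGE_DIN = {
    "Campylobacter": 8.8,
    "Staphylococcus": 8.9,
    "Listeria": 8.9,
    "Escherichia": 8.3,
    "Salmonella": 8.6,
}


def average_din(values):
    """Return the arithmetic mean of a list of DIN values."""
    return mean(values)


if __name__ == "__main__":
    for genus, values in FIGURE2_DIN.items():
        recomputed = average_din(values)
        reported = TABLE1_AVERAGE_DIN[genus]
        print(f"{genus}: recomputed {recomputed:.2f}, Table 1 {reported}")
```

Because the figure values are rounded, agreement to within about 0.1 DIN unit is the most that this comparison can show.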
Three independent isolates of each bacterium were re-analyzed with the software, giving average DIN values of 8.3 to 8.9 (Table 1), which showed that the input gDNA was acceptable. The software scores each sample on a predefined numeric scale from 1 to 10. Figures 3 and 4 show electropherograms and the corresponding gel image for samples ranging from intact (DIN 9.2) to degraded (DIN 1.1). The specifications for the Genomic DNA ScreenTape System indicate that the linear concentration range for samples is 10–100 ng/µL, and that the DIN functional range is 5–300 ng/µL [8]. Within the Analysis Software, the concentration of gDNA is shown under the samples (data not shown). The electropherogram and gel for the DIN 1.1 sample show that this degraded sample is too dilute to be within the useful range, while the concentrations of the higher-DIN gDNA samples were comfortably within these working ranges.
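The two concentration ranges quoted above can be expressed as a simple pre-check on a measured concentration. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, the bounds are treated as inclusive (the source does not say), and the example concentrations are illustrative rather than measured values from this study.

```python
# Ranges quoted from the Genomic DNA ScreenTape System specifications [8].
LINEAR_CONC_RANGE_NG_UL = (10.0, 100.0)     # linear concentration range
DIN_FUNCTIONAL_RANGE_NG_UL = (5.0, 300.0)   # range over which DIN is functional


def within(value, bounds):
    """Return True if value lies inside bounds (inclusive, by assumption)."""
    low, high = bounds
    return low <= value <= high


def concentration_flags(conc_ng_ul):
    """Report whether a concentration falls in each specified range."""
    return {
        "in_linear_range": within(conc_ng_ul, LINEAR_CONC_RANGE_NG_UL),
        "in_din_range": within(conc_ng_ul, DIN_FUNCTIONAL_RANGE_NG_UL),
    }


if __name__ == "__main__":
    # Illustrative concentrations only; not data from the application note.
    for conc in (2.0, 7.0, 50.0, 250.0):
        print(conc, concentration_flags(conc))
```

A sample that falls outside the DIN functional range, as described for the DIN 1.1 sample, would be flagged here before its DIN is interpreted.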
Figure 3. Sample electropherograms to show the DIN in the software. Samples range from intact (DIN 9.2) to degraded (DIN 1.1).
[Figure 3 panels: DIN 9.2, DIN 6.2, DIN 3.0, and DIN 1.1. Y-axis: sample intensity (FU); x-axis: size markers from 100 to 48,500 bp.]
Figure 4. Sample bacterial gel image corresponding to the electropherograms of the DIN range in Figure 3.
[Figure 4 lanes A1 to E1, with DIN values 9.2, 6.2, 3.0, and 1.1.]
Samples with a DIN > 7 were determined to be acceptable to progress to the next step of library construction (Table 1). Figure 5 shows gDNA with an average DIN of 8.6 that produced quality final libraries with an average size of 267 bp. In contrast, a degraded gDNA sample (DIN of 6) produced a final library with an average size of 198 bp, which is outside the typical 250–500 bp final library requirement for WGS (Figure 6). With the new software update, the Agilent 2200 TapeStation with the Genomic DNA ScreenTape assay automatically determines the DIN value for each gDNA sample using the new algorithm. In this assay, the libraries produced from gDNA with a higher DIN were of better quality than those produced from gDNA in the lower range. The DIN was successfully used to assess the quality of gDNA.
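The acceptance criteria in this paragraph (DIN > 7 for input gDNA, and a final library average size of 250–500 bp) can be written as two small checks. This is a minimal sketch, not the TapeStation software's logic: the function names are hypothetical, and treating the library size bounds as inclusive is an assumption.

```python
DIN_THRESHOLD = 7.0                 # samples with DIN > 7 were accepted
LIBRARY_SIZE_RANGE_BP = (250, 500)  # typical final library size for WGS


def gdna_is_acceptable(din):
    """Return True if the input gDNA exceeds the DIN threshold."""
    return din > DIN_THRESHOLD


def library_size_is_acceptable(mean_size_bp):
    """Return True if the library average size is in range (inclusive, by assumption)."""
    low, high = LIBRARY_SIZE_RANGE_BP
    return low <= mean_size_bp <= high


if __name__ == "__main__":
    # The two cases reported in the text (Figures 5 and 6).
    cases = [("Figure 5", 8.6, 267), ("Figure 6", 6.0, 198)]
    for label, din, size_bp in cases:
        print(
            f"{label}: DIN {din} acceptable={gdna_is_acceptable(din)}, "
            f"library {size_bp} bp acceptable={library_size_is_acceptable(size_bp)}"
        )
```

Applied to the two reported cases, the DIN 8.6 input passes both checks and the DIN 6 input fails both, which matches the outcome described in the text.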
Conclusion
Agilent Technologies has designed a software algorithm that assesses DNA quality and produces a quantitative measure of it. The DIN algorithm was developed to remove user-dependent interpretation of DNA quality and to provide a standardized assessment. However, successful library construction depends on several variables, and it is essential that other quality assessments are made in addition to the DIN to achieve optimum results. Characterization of gDNA samples with DIN is independent of the instrument, sample concentration, and the operator, allowing for unbiased comparison of samples. The researcher is no longer tied to an arbitrary classification of total DNA, and the DIN can be used to ensure the consistency of library construction.
Figure 5. Three electropherograms of input gDNA from Salmonella (A) that produced the examples of quality final libraries (B), shown assayed with the Agilent D1000 ScreenTape assay.
[Figure 5 panels A and B. Y-axis: sample intensity (FU); x-axis size markers: 100 to 48,500 bp in (A), 25 to 1,500 bp in (B).]
Figure 6. An example electropherogram of input gDNA with a DIN of 6 (A) that produced a low quality final library (B), shown assayed with the Agilent D1000 ScreenTape assay.
[Figure 6 panels A and B. Y-axis: sample intensity (FU); x-axis size markers: 100 to 48,500 bp in (A), 25 to 1,500 bp in (B). Peak labels shown: 10,030 and 198.]
www.agilent.com/chem
Agilent shall not be liable for errors contained herein or for incidental or consequential
damages in connection with the furnishing, performance, or use of this material.
Information, descriptions, and specifications in this publication are subject to change
without notice.
© Agilent Technologies, Inc., 2014
Printed in the USA
December 18, 2014
5991-5442EN
Acknowledgement
We gratefully acknowledge the technical assistance provided
by Carol Huang, Regina Agulto, San Mak, Kendra Liu, Patrick
Ancheta and Christina Kong from the laboratory of Dr. Bart
Weimer at University of California, Davis.
References
1. N. Kong, et al. “Automated Library Construction Using KAPA Library Preparation Kits on the Agilent NGS Workstation Yields High-Quality Libraries for Whole-Genome Sequencing on the Illumina Platform” Agilent Technologies, publication number 5991-4296EN (2014).
2. M. A. Quail, et al. “A large genome center’s improvements to the Illumina sequencing system” Nature Methods 5, 1005-1010 (2008).
3. S. Wilkening, et al. “Genotyping 1000 yeast strains by next-generation sequencing” BMC Genomics 14:90 (2013).
4. M. Gassmann & B. McHoull, “DNA Integrity Number (DIN) with the Agilent 2200 TapeStation System and the Agilent Genomic DNA ScreenTape Assay” Agilent Technologies, publication number 5991-5258EN (2014).
5. Qiagen QIAamp DNA Mini Kit: http://www.qiagen.com/us/products/catalog/sample-technologies/dna-sample-technologies/genomic-dna/qiaamp-dna-mini-kit/
6. R. Jeannotte, et al. “High-Throughput Analysis of Foodborne Bacterial Genomic DNA Using Agilent 2200 TapeStation and Genomic DNA ScreenTape System” Agilent Technologies, publication number 5991-4003EN (2014).
7. N. Kong, et al. “Quality Control of High-Throughput Library Construction Pipeline for KAPA HTP Library Using Agilent 2200 TapeStation” Agilent Technologies, publication number 5991-5141EN (2014).
8. “Agilent Genomic DNA ScreenTape System Quick Guide” Agilent Technologies, publication number G2964-90040 Rev. B (2013).
9. “Agilent 2200 TapeStation User Manual” Agilent Technologies, publication number G2964-90002 Rev. B (2013).
10. “Agilent High Sensitivity D1K ScreenTape System Quick Guide” Agilent Technologies, publication number G2964-90131 Rev. B (2013).
For More Information
These data represent typical results. For more information
on our products and services, visit our Web site at
www.agilent.com/chem.
... Peaks were eliminated by fragment size (< 3000 bp) and integrated area (< 2). The software algorithm classi es DNA by the DIN, which is scaled 1-10 based on the level of degradation, where 1 is the most degraded and 10 is the least degraded 17 . A DIN > 7 is accepted as high molecular weight DNA. ...
... A DIN of 7 or higher was deemed satisfactory, indicating high genomic DNA quality with minimal degradation and maintained structural integrity. Kong et al. 17 note that this DIN threshold, as measured by the Agilent 2200 TapeStation System, ensures fewer fragmentation events and reliable performance for sequencing and PCR applications. ...
Preprint
Full-text available
This study assesses the feasibility of extracting high-quality DNA from blood samples stored at -20°C for up to 21 years under suboptimal conditions. It addresses sample mishandling in research, where many samples lack proper biobank protocols. Prior studies focused on short-term storage and controlled conditions, highlighting the negative effects of freeze-thaw cycles. This study evaluates whether DNA from long-term stored samples under suboptimal conditions can still meet quality standards for research purposes. Genomic DNA was extracted from 1,012 capillary blood samples from the Diabetes Prediction in Skåne study. Samples were stored at -20°C for 7 to 21 years, and DNA was isolated using QIAamp DNA Blood Mini kits. DNA quantity, purity, and quality were analyzed using spectrophotometry and automated electrophoresis. Overall, 75.7% of samples met quality standards for DNA quantity (≥20 ng/µL) and purity (A260/280 ratio 1.7–1.9), with the highest proportion in 12-year samples (83.5%). DNA quality was further assessed in 270 samples, where 57.8% had a DNA Integrity Number (DIN) of 7 or higher. Despite some contamination, the majority of samples were suitable for downstream applications like next-generation sequencing. This study suggests that historical blood samples stored under suboptmal conditions can still be viable for modern genomic analyses.
... Eigen-decomposition of the ST at each voxel yielded its three eigenvalues and their eigenvectors, which represent magnitude and direction of orientation of cells, respectively. The tertiary eigenvector, which has the smallest eigenvalue, represents the vector following the orientation of myocyte aggregates in their longitudinal axis due to correspondence with lowest intensity variation 49 . ...
... DNA integrity was highest for fresh frozen samples (without xation), with mean DIN values of 7.0 (with cryo-X-PCI) and 7.2 (control) and a range of 0.9 (cryo-X-PCI) and 1.0 (control) (Supplementary Table 3). All fresh frozen samples had DIN values that exceeded the minimum cut-off (DIN > 6, Fig. 3 and Supplementary Table 3) 49 . In 4% PFA-xed samples, mean DIN values were lower (mean 6.9, range 2.6), and were the lowest for 10% F samples (mean 3.0, range 1.6) ( Fig. 3 and Supplementary Table 3). ...
Preprint
Full-text available
Snap frozen biopsies serve as a valuable clinical resource of archival material for disease research, as they enable a comprehensive array of downstream analyses to be performed, including extraction and sequencing of nucleic acids. Obtaining three-dimensional (3D) structural information prior to multi-omics is more challenging but could potentially allow for better characterisation of tissues and targeting of clinically relevant cells. Conventional histological techniques are limited in this regard due to their destructive nature and the reconstruction artifacts produced by sectioning, dehydration, and chemical processing. These limitations are particularly notable in soft tissues such as the heart. In this study, we assessed the feasibility of using synchrotron-based cryo-X-ray phase contrast imaging (cryo-X-PCI) of snap frozen myocardial biopsies and 3D structure tensor analysis of aggregated myocytes, followed by nucleic acid (DNA and RNA) extraction and analysis. We show that optimal sample preparation is the key driver for successful structural and nucleic acid preservation which is unaffected by the process of cryo-X-PCI. We propose that cryo-X-PCI has clinical value for 3D tissue analysis of cardiac and potentially non-cardiac soft tissue biopsies prior to nucleic acid investigation.
... DNA templates of lower molecular weight may result in substantially decreased amplification efficiency. The Agilent TapeStation system measures the integrity of a DNA sample and generates a DNA integrity number (DIN) ranging from 1 to 10, where a low score indicates substantial DNA degradation and a score of "10" indicates high-quality DNA [24]. ...
Chapter
Polymerase chain reaction (PCR) is a laboratory technique used to amplify a targeted region of DNA, demarcated by a set of oligonucleotide primers. Long-range PCR is a form of PCR optimized to facilitate the amplification of large fragments. Using the adapted long-range PCR protocol described in this chapter, we were able to generate PCR products of 6.6, 7.2, 13, and 20 kb from human genomic DNA samples. For some of the long PCRs, successful amplification was not possible without the use of PCR enhancers. Thus, we also evaluated the impact of some enhancers on long-range PCR and included the findings as part of this updated chapter.Key wordsLong-range polymerase chain reaction (PCR)Long ampliconsPCR enhancersPCR additivesAgarose gel electrophoresisDNA polymeraseProofreading enzymeThermal cyclingPrimer designPharmacogenetics
... gDNA from the IHMS Protocol Q had 260/280 ratio ranging from 1.89 to 1.96, whereas four out of 10 samples extracted from the QP kit had 260/280 ratio below 1.8. The gDNA integrity was assessed with DNA Integrity Number (DIN) ranging from 1 to 10, where 1 indicates highly degraded gDNA and 10 represents highly intact gDNA (Nguyet et al., 2014). No significant difference in DIN values was found between the two protocols (p = 0.82); however, one sample with lower 260/280 ratio of 1.55 from the QP kit had also appreciably low gDIN value of 2.8. ...
Article
Full-text available
Gut microbiome plays a significant role in HIV-1 immunopathogenesis and HIV-1-associated complications. Previous studies have mostly been based on 16S rRNA gene sequencing, which is limited in taxonomic resolution at the genus level and inferred functionality. Herein, we performed a deep shotgun metagenomics study with the aim to obtain a more precise landscape of gut microbiome dysbiosis in HIV-1 infection. A reduced tendency of alpha diversity and significantly higher beta diversity were found in HIV-1-infected individuals on antiretroviral therapy (ART) compared to HIV-1-negative controls. Several species, such as Streptococcus anginosus, Actinomyces odontolyticus, and Rothia mucilaginosa, were significantly enriched in the HIV-1-ART group. Correlations were observed between the degree of immunodeficiency and gut microbiome in terms of microbiota composition and metabolic pathways. Furthermore, microbial shift in HIV-1-infected individuals was found to be associated with changes in microbial virulome and resistome. From the perspective of methodological evaluations, our study showed that different DNA extraction protocols significantly affect the genomic DNA quantity and quality. Moreover, whole metagenome sequencing depth affects critically the recovery of microbial genes, including virulome and resistome, while less than 5 million reads per sample is sufficient for taxonomy profiling in human fecal metagenomic samples. These findings advance our understanding of human gut microbiome and their potential associations with HIV-1 infection. The methodological assessment assists in future study design to accurately assess human gut microbiome.
... DNA concentration was determined using Quant-iT dsDNA Assay Kit (#P11496 Thermo Fisher Scientific, USA) following manufacturer instructions. Numerical assessment of DNA integrity using 2200 TapeStation software with Genomic DNA ScreenTape (Agilent Technologies, USA) generated a DNA Integrity Number (DIN; Kong et al., 2014). The average DIN of 8.9 (±0.43) indicates mostly intact DNA was obtained. ...
Article
The effects of maternal glucocorticoids (e.g. corticosterone, CORT) on offspring interest biologists due to increasing environmental perturbations. While little is known about the impact of maternal CORT on offspring fitness, it may modulate telomere length and compromise offspring health. Here, we use a modified real-time quantitative PCR assay to assess telomere length for small DNA quantities (<60 ng). We tested the hypothesis that increased maternal CORT during gestation decreases offspring telomere length. While CORT-driven telomere shortening is well established within individuals, cross-generational effects remain unclear. We treated wild-caught gravid female eastern fence lizards (Sceloporus undulatus) with daily transdermal applications of CORT, at ecologically relevant levels, from capture to laying. Maternal CORT treatment did not alter maternal telomere length, although baseline maternal CORT concentrations had a weak, negative correlation with maternal telomere length. There was no relation between mother and offspring telomere length. There was a trend for maternal CORT treatment to shorten telomeres of sons but not daughters. Our treatment replicated exposure to a single stressor per day, likely underestimating effects seen in the wild where stressors may be more frequent. Future research should further explore fitness consequences of maternal CORT effects.
... According to the manufacturer, for the classification of total DNA, DIN is used and ranged from 1 to 10, where 1 is the most degraded and 10 is the most intact DNA (with high molecular weight) (Kong et al., 2016). ...
Article
Full-text available
Lavender (genus Lavandula L., family Lamiaceae Lindl.) is a commercially valuable crop grown all over the world due to the great value of its essential oil, which is widely used in medicine and cosmetology. To identify the genetic relationships between various lavender cultivars, molecular and genetic mechanisms that encode their economically valuable characteristics and plant genotype, it is necessary to work out methods of molecular genetic analysis for these particular objects. The aim of the presented study was to assess the quantity and quality of DNA isolated from L. angustifolia by different ways. Lavender plants of the cultivar 'Belyanka' in vitro and ex situ were investigated. For estimation of essential oils inclusion accumulation in leaf tissues, the leaf sections were made, stained with Sudan III and investigated under light microscope. To isolate DNA from intact young leaves four different commercial kits such as DiamondDNA™, PureLink ® Plant Total DNA Purification Kit, GeneJET Plant Genomic DNA Purification Kit and MagNA Pure Compact Nucleic Acid Isolation Kit I were used. Our data confirmed that the quality of the DNA isolated from the cultivar 'Belyanka' leaves was dependent from plant material and presence of essential oil inclusions and other metabolites in tissues. The best results were demonstrated with kits utilizing silica-based membrane technology (PureLink and GeneJET). RAPD-PCR showed that all kits can be used for this analysis. The following sequencing needs more accurate plant material type selection and accounting all scores (not only spectrophotometric data) for clear results.
Article
Biological samples are important resources for scientific research. These samples are stored in biobanks over years until needed, and some of them can never be retrieved if they are improperly stored, causing them to be wasted. Thus, they are priceless, and they should be used correctly and effectively. Sample quality substantially affects biomedical research results. However, sample misidentification or mix-up is common. It is necessary to establish quality standards for sample identification. In this study, we used the Advanta Sample ID genotyping panel to detect homology identification and cross-contamination. We compared the single-nucleotide polymorphism (SNP) typing results of two different samples and calculated the similarity score of homologous sample pairs and nonhomologous sample pairs. Through analysis, we obtained a similarity score cutoff point of 0.8620, which was an effective way to distinguish homology and nonhomology. Cross-contamination was detected in two sets of mixtures (STD8:STD6 and jj3:1-P) mixed at a series of special ratios. Sensitivity was dependent on the sample characteristics and mixing ratios. Finally, we assessed the effect of sample degradation degree on SNP genotyping and found that degraded samples with a minimal DNA integrity number of 1.9 had complete genotyping results. On the whole, this study shows that the Sample ID panel is reliable for homology identification and cross-contamination analysis. Moreover, this technology has promising further applications in biological sample quality control.
Chapter
Full-text available
Homology has been a contentious topic for discussion for almost 200 years and the debate is ongoing. In its simplest definition, homology means “descended from a common ancestor.” Because of genetic recombination, or the replacement of one kind of character or trait with a different kind that can fulfil the same role, identifying homologs and indeed defining homology in detail is fraught with difficulty. In this chapter, I detail some of the history of the concept and link the concept to its uses in phylogeny reconstruction, developmental biology and networks of gene sharing.
Chapter
Full-text available
Genome methylation in bacteria is an area of intense interest because it has broad implications for bacteriophage resistance, replication, genomic diversity via replication fidelity, response to stress, gene expression regulation, and virulence. Increasing interest in bacterial DNA modification is coming about with investigation of host/microbe interactions and the microbiome association and coevolution with the host organism. Since the recognition of DNA methylation being important in Escherichia coli and bacteriophage resistance using restriction/modification systems, more than 43,600 restriction enzymes have been cataloged in more than 3600 different bacteria. While DNA sequencing methods have made great advances there is a dearth of method advances to examine these modifications in situ. However, the large increase in whole genome sequences has led to advances in defining the modification status of single genomes as well as mining new restriction enzymes, methyltransferases, and modification motifs. These advances provide the basis for the study of pan-epigenomes, population-scale comparisons among pangenomes to link replication fidelity and methylation status along with mutational analysis of mutLS. Newer DNA sequencing methods that include SMRT and nanopore sequencing will aid the detection of DNA modifications on the ever-increasing whole genome and metagenome sequences that are being produced. As more sequences become available, larger analyses are being done to provide insight into the role and guidance of bacterial DNA modification to bacterial survival and physiology.
Chapter
Full-text available
The comparison of multiple genome sequences sampled from a bacterial population reveals considerable diversity in both the core and the accessory parts of the pangenome. This diversity can be analysed in terms of microevolutionary events that took place since the genomes shared a common ancestor, especially deletion, duplication, and recombination. We review the basic modelling ingredients used implicitly or explicitly when performing such a pangenome analysis. In particular, we describe a basic neutral phylogenetic framework of bacterial pangenome microevolution, which is not incompatible with evaluating the role of natural selection. We survey the different ways in which pangenome data is summarised in order to be included in microevolutionary models, as well as the main methodological approaches that have been proposed to reconstruct pangenome microevolutionary history.
Technical Report
Full-text available
The initial step in Next Generation Sequencing is to construct a library from genomic DNA. To gain the optimum result, extracted DNA must be of high molecular weight with limited degradation. High-throughput sequencing projects, such as the 100K Pathogen Genome Project, require methods to rapidly assess the quantity and quality of genomic DNA extracts. In this study, assessment of the applicability of the Agilent 2200 TapeStation was done using genomic DNA from nine foodborne pathogens using several accepted high-throughput methods. The Agilent 2200 TapeStation System with Genomic DNA ScreenTape and Genomic DNA Reagents was easy to use with minimal manual intervention. An important advantage of the 2200 TapeStation over other high-throughput methods was that high molecular weight genomic DNA quality and quantity can be quantified apart from lower molecular weight size ranges, providing a distinct advantage in the library construction pipeline and over other methods available for this important step in the Next Generation Sequencing process.
Technical Report
Full-text available
Next Generation Sequencing requires the input of high molecular weight genomic DNA to construct quality libraries for whole genome bacterial sequencing. Large scale sequencing projects, such as the 100K Pathogen Genome Project, require methods to rapidly assess the quantity and quality of the input DNA using high-throughput methods that are fast and cost effective. In this study, the Agilent 2200 TapeStation and Agilent 2100 Bioanalyzer Systems were used to assess a few critical quality control steps for library construction. With minimal manual intervention , the Agilent 2200 TapeStation System determined the quality of genomic DNA, fragmented DNA, and final libraries constructed from multiple types of foodborne pathogens. The Agilent 2200 TapeStation System provided a single platform that effectively evaluated the necessary quality control steps, which provided a distinct advantage to decrease the time needed for library construction and a common instrument methodology for quality control.
Background: The throughput of next-generation sequencing machines has increased dramatically over the last few years; yet the cost and time for library preparation have not changed proportionally, thus representing the main bottleneck for sequencing large numbers of samples. Here we present an economical, high-throughput library preparation method for the Illumina platform, comprising a 96-well based method for DNA isolation for yeast cells, a low-cost DNA shearing alternative, and adapter ligation using heat inactivation of enzymes instead of bead cleanups. Results: Up to 384 whole-genome libraries can be prepared from yeast cells in one week using this method, for less than 15 euros per sample. We demonstrate the robustness of this protocol by sequencing over 1000 yeast genomes at ~30x coverage. The sequence information from 768 yeast segregants derived from two divergent S. cerevisiae strains was used to generate a meiotic recombination map at unprecedented resolution. Comparisons to other datasets indicate a high conservation of recombination at a chromosome-wide scale, but differences at the local scale. Additionally, we detected a high degree of aneuploidy (3.6%) by examining the sequencing coverage in these segregants. Differences in allele frequency allowed us to attribute instances of aneuploidy to gains of chromosomes during meiosis or mitosis, both of which showed a strong tendency to missegregate specific chromosomes. Conclusions: Here we present a high-throughput workflow to sequence the genomes of a large number of yeast strains at a low price. We have used this workflow to obtain recombination and aneuploidy data from hundreds of segregants, which can serve as a foundation for future studies of linkage, recombination, and chromosomal aberrations in yeast and higher eukaryotes.
The Wellcome Trust Sanger Institute is one of the world's largest genome centers, and a substantial amount of our sequencing is performed with 'next-generation' massively parallel sequencing technologies: in June 2008 the quantity of purity-filtered sequence data generated by our Genome Analyzer (Illumina) platforms reached 1 terabase, and our average weekly Illumina production output is currently 64 gigabases. Here we describe a set of improvements we have made to the standard Illumina protocols to make the library preparation more reliable in a high-throughput environment, to reduce bias, tighten insert size distribution and reliably obtain high yields of data.
M. Gassmann & B. McHoull, DNA Integrity Number (DIN) with the Agilent 2200 TapeStation System and the Agilent Genomic DNA ScreenTape Assay, Agilent Technologies, publication number 5991-5258EN (2014).
Qiagen QIAamp DNA Mini Kit: http://www.qiagen.com/us/products/catalog/sample-technologies/dna-sample-technologies/genomic-dna/qiaamp-dna-mini-kit/
R. Jeannotte, et al., High-Throughput Analysis of Foodborne Bacterial Genomic DNA Using Agilent 2200 TapeStation and Genomic DNA ScreenTape System, Agilent Technologies, publication number 5991-4003EN (2014).
Agilent High Sensitivity D1K ScreenTape System Quick Guide, Agilent Technologies, publication number G2964-90131 Rev. B (2013).