Automation of PacBio SMRTbell NGS library preparation for bacterial genome sequencing

STANDARD OPERATING PROCEDURE Open Access
Nguyet Kong¹, Whitney Ng², Kao Thao³, Regina Agulto¹, Allison Weis¹, Kristi Spittle Kim⁴, Jonas Korlach⁴, Luke Hickey⁴, Lenore Kelly⁵, Stephen Lappin⁵ and Bart C. Weimer¹*
Abstract
Background: The PacBio RS II uses single molecule, real-time (SMRT) DNA sequencing technology to sequence genomes and detect DNA modifications. The starting point for high-quality sequence production is high molecular weight genomic DNA (gDNA). Automating the library preparation process requires high-throughput methods to assess the genomic DNA and to verify the size and amount of the sheared DNA fragments and of the final library.
Findings: Library construction was automated on the Agilent NGS workstation, using Bravo accessories for heating, shaking, cooling, and magnetic bead manipulation for template purification. The quality control methods from gDNA input to final library were evaluated using the Agilent Bioanalyzer System and the Agilent TapeStation System.
Conclusions: Automated protocols for PacBio 10 kb library preparation produced libraries with technical performance similar to that of manually generated libraries. The TapeStation System proved to be a reliable QC method that can be used in a 96-well plate format and gives results equivalent to those of the standard Bioanalyzer System. The DNA Integrity Number (DIN), which the TapeStation System software calculates when analyzing genomic DNA, helps confirm that the starting genomic DNA is not degraded. In this respect, the gDNA assay on the TapeStation System is preferable to the DNA 12000 assay on the Bioanalyzer System, which cannot run genomic DNA; nor can the Bioanalyzer work directly from 96-well plates.
Keywords: PacBio SMRTbell NGS library preparation, Bacterial genomic DNA, Automation, NGS workstation,
TapeStation System, Bioanalyzer
Introduction
Increased throughput from next generation sequencing methods has revealed new information about the function and structure of bacterial genomes. Using short reads to produce draft genomes leads to problems with GC content bias and repeat regions that make it tedious to produce closed genome assemblies. This technical note discusses how the PacBio RS II, which uses single molecule, real-time DNA sequencing, improves genome assembly through extra-long read lengths. Long reads reduce the number of contigs, which facilitates accurate de novo assembly of whole bacterial genomes. The real-time technology of the PacBio RS II allows determination of not only the full, closed gDNA sequence, but also epigenetic modifications and plasmid DNA sequence simultaneously.
The 100K Pathogen Genome Project [1] is using the PacBio 10 kb SMRTbell Template Preparation kit to produce 1,000 closed genomes. The scale of this project required automation of the construction of the sequencing (SMRTbell) library. To prepare libraries for sequencing in this way, gDNA must be cut into fragments with a target size of 10 kb. Generating long sub-reads depends on starting with high-quality gDNA, so that shearing yields the target fragment size and the DNA concentration at each step of library construction is matched to the reagent concentrations of that step. Gel electrophoresis is a traditional, low-resolution method in which size is estimated against a ladder and concentration is estimated on an agarose gel by comparing peak density to a standard; since it cannot be automated, it is not suitable for a project of this size. Another way to measure size and concentration is to use the Agilent 2100 Bioanalyzer with the DNA 12000 assay, but the instrument only runs 12 samples at a time and cannot be automated. We discuss the automation of library preparation with the SMRTbell Template Preparation kit, as well as analysis of gDNA, fragmented DNA and the final libraries ready for sequencing with both Agilent electrophoresis platforms: the Agilent 2100 Bioanalyzer System using the DNA 12000 assay and the Agilent TapeStation System using the Genomic DNA ScreenTape and matching reagents.
* Correspondence: bcweimer@ucdavis.edu
¹ Population Health and Reproduction Department, School of Veterinary Medicine, University of California-Davis, Davis, CA, USA
Full list of author information is available at the end of the article
© The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Kong et al. Standards in Genomic Sciences (2017) 12:27
DOI 10.1186/s40793-017-0239-1
Procedure
Campylobacter jejuni, Listeria monocytogenes, Vibrio fluvialis and Salmonella enterica serovar Enteritidis were cultured in the appropriate culture medium under the growth conditions listed in Tables 1 and 2. Bacteria were cultured on the appropriate agar and pellets were made for extraction. DNA was extracted from the cell pellets using a kit and clean-up was accomplished with a spin column [2–4]. Absorbance ratios at 260/280 and 260/230 were measured with a NanoDrop 2000 UV-vis spectrophotometer (Thermo Fisher Scientific, Waltham MA). A Qubit 2.0 Fluorometer (Q32866) was used with a Qubit dsDNA HS Assay Kit (Q32854, both from Invitrogen, Carlsbad CA) to measure the gDNA concentration and confirm a DNA input of 10 μg before shearing. The quantity and size distribution of the purified gDNA were first evaluated with the Agilent 2200 TapeStation Nucleic Acid System (G2965AA) controlled by Agilent 2200 TapeStation Software A.01.05, using the Agilent Genomic DNA ScreenTape (5067-5365) and the Agilent Genomic DNA Reagents (5067-5366), with samples drawn from a 96-well plate [5, 6].
Genomic DNA was sheared using the Covaris g-TUBE device (520079) according to the manufacturer's specifications [7]. After fragmentation, DNA was evaluated with the TapeStation System using the Genomic DNA assay and also with the Agilent 2100 Bioanalyzer System using the Agilent DNA 12000 assay (5067-1508) [8, 9]. Both of these methods consume minimal sample and return both sizing and quantitation. The sheared gDNA input into library construction was normalized for all samples to between 1–5 μg for PacBio SMRTbell 10 kb Library Preparation.
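The normalization step above is simple arithmetic: the volume to transfer is the target mass divided by the measured concentration. The following is a minimal sketch, not part of the published protocol; the function name, the sample names, the concentrations and the target mass are all hypothetical values chosen for illustration.

```python
def volume_for_mass_ul(target_mass_ug: float, conc_ng_per_ul: float) -> float:
    """Return the volume (in microlitres) that contains target_mass_ug of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    # 1 ug = 1000 ng, so convert the target mass before dividing.
    return target_mass_ug * 1000.0 / conc_ng_per_ul


# Illustrative values only (assumptions, not measurements from the study).
sheared_conc_ng_per_ul = {"sample_A": 120.0, "sample_B": 85.0}
target_ug = 5.0  # hypothetical target input mass

volumes_ul = {
    name: round(volume_for_mass_ul(target_ug, conc), 1)
    for name, conc in sheared_conc_ng_per_ul.items()
}
print(volumes_ul)
```

In practice the computed volume would also need to fit within the reaction volume allowed by the kit, which this sketch does not check.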
The SMRTbell Template Preparation kit from Pacific
Biosciences (Menlo Park CA) was used on the Agilent
NGS Workstation (G5522A, Agilent Technologies, Santa
Clara CA). The workflow to construct the final DNA
libraries for sequencing is shown in Fig. 1 and involved
automation of these steps:
1. Determine the quality of the gDNA
2. Fragment the gDNA using a Covaris g-TUBE device
3. QC the sizing and adjust the concentration
4. Repair DNA damage and repair the ends of the fragmented DNA
5. Purify the DNA
6. Blunt-end ligate using blunt adapters
7. Purify the template for submission to a sequencer
In Fig. 2, panels a (Post Shearing Clean-up) and b (10 kb Library Prep Runset Dual SPRI) show two of the VWorks protocol graphical user interfaces, which help with the NGS Workstation setup and deck layout to optimize the use of reagent volumes. This interface allows the user to view the progress of the procedure. The Excel template in Fig. 2c assists with laying out the reagent amounts and calculations, and provides a record of each batch of reagent preparation and the lot numbers.
With automation, this workflow takes about 7 h for post-shearing clean-up and library construction. Once the PacBio 10 kb library was made, the size of the final library was confirmed with the Agilent 2200 TapeStation using the Genomic DNA ScreenTape assay and with the Agilent 2100 Bioanalyzer System using the Agilent DNA 12000 assay. Libraries were quantified using a Qubit 2.0 Fluorometer (Q32866) with a Qubit dsDNA HS Assay Kit (Q32854, both from Invitrogen, Carlsbad CA) to measure the library concentration before submission to the sequencing facility. The sequencing facility anneals the sequencing primer and binds polymerase to the SMRTbell templates before loading the library onto the PacBio RS II.
Table 1 Organisms used in this study
Discussion
Genomic DNA isolated from four model organisms with a range of GC content was made into libraries on the Agilent NGS Workstation with the PacBio SMRTbell Template Preparation kit for sequencing on the PacBio RS II. Finished sequences showed GC content very close to the known GC content, showing that this process produced minimal bias (Table 1).
Table 2 gDNA quality, average shearing size and average final library for each bacterium
Fig. 1 PacBio SMRTbell Template Preparation Workflow for PacBio RS II system. This workflow is used to prepare libraries from DNA fragmented with the Covaris g-TUBE and concentrated with AMPure magnetic beads before following the PacBio SMRTbell 10 kb Library Preparation procedures
Fig. 2 VWorks protocols and Excel workbook for PacBio Library Preparation. The VWorks protocols and Excel workbook for the PacBio Library Preparation method provide an interactive, visual layout for the end user. a Post Shearing Cleanup Form. b 10 kb Library Prep Runset Dual SPRI Form. c PacBio Library Excel Workbook
Fig. 3 Quantitation of Genomic DNA. Electropherogram (a) and gel image (b) of high molecular weight gDNA from the Agilent 2200 TapeStation using the Genomic DNA ScreenTape System. Campylobacter (green), Listeria (blue), Vibrio (aqua), and Salmonella (red). Green lines at the bottom of the gel image are internal standards added to permit quantitation. The lower marker is not shown in the electropherogram
Fig. 4 Appearance of sheared DNA from Agilent 2100 Bioanalyzer analysis. Representative electropherogram (a) and virtual gel (b), generated with the Agilent 2100 Bioanalyzer system with the DNA 12000 Kit and used for visual inspection of sheared bacterial genomic DNA, with the average shearing size for Campylobacter (green, 10 kb), Listeria (blue, 13.5 kb), Vibrio (aqua, 11.6 kb), and Salmonella (red, 17 kb). Peaks near 35 s are the lower marker internal standard for the DNA 12000 kit. A typical electropherogram using the Agilent Bioanalyzer 2100 DNA 12000 kit shows the lower marker at 35 s and the upper marker at 90 s. The sheared DNA and the red upper marker, seen in the gel image, co-elute
Fig. 5 Appearance of sheared DNA from Agilent 2200 TapeStation analysis. Representative electropherogram (a) and virtual gel (b) of sheared bacterial genomic DNA, generated with the Agilent 2200 TapeStation genomic DNA Kit, with the average shearing size for Campylobacter (green, 16 kb), Listeria (blue, 12 kb), Vibrio (aqua, 14 kb), and Salmonella (red, 20 kb). Green lines at the bottom of the gel image are internal standards added to permit quantitation. The lower marker is not shown in the electropherogram
Fig. 6 Appearance of DNA libraries from Agilent 2100 Bioanalyzer analysis. Representative electropherogram (a) and virtual gel (b), generated with the Agilent 2100 Bioanalyzer system with the DNA 12000 Kit and used for visual inspection of the sizes of DNA libraries prepared for sequencing with the PacBio SMRTbell 10 kb Template Preparation Kit on the Agilent NGS Workstation. A typical electropherogram using the Agilent Bioanalyzer 2100 DNA 12000 kit shows the lower marker at 35 s and the upper marker at 90 s. The DNA libraries and the upper marker co-elute; the sharper peak is the upper marker, shown in red on the gel image. The average library sizes are: Campylobacter (green, 9.1 kb), Listeria (blue, 9.5 kb), Vibrio (aqua, 10 kb), and Salmonella (red, 15 kb)
Fig. 7 Appearance of DNA libraries from Agilent 2200 TapeStation analysis. Representative electropherogram (a) and virtual gel (b) of the sizes of DNA libraries (generated with the Agilent 2200 TapeStation DNA genomics Kit) prepared for sequencing with the PacBio SMRTbell 10 kb Template Preparation Kit on the Agilent NGS Workstation. The average library size for Campylobacter (green, 16 kb), Listeria (blue, 12 kb), Vibrio (aqua, 14 kb), and Salmonella (red, 20 kb) is displayed on the software screen. Green lines at the bottom of the gel image are internal standards added to permit quantitation. The lower marker is not shown in the electropherogram
For the best results in producing genomic sequences, it is important that the starting material be relatively free of organics and protein, and be at least 50 kilobases in size to ensure that long fragments can be obtained for sequencing. The microbes used are listed in Table 1 and include four genera of varying genome length and GC content. The organisms were cultured and genomic DNA was extracted, followed by spin column clean-up. The quality of the gDNA was measured with the NanoDrop, and the 260/280 nm and 260/230 nm ratios were calculated. A 260/280 nm ratio and a 260/230 nm ratio of 1.8 were required for further use of each extraction. The Agilent 2200 TapeStation System with the Genomic DNA assay was used to assess the size and concentration of each sample, as shown in Fig. 3, where an electropherogram overlay and virtual gel images are shown for the four model organisms, together with the DIN calculated by the TapeStation software. The DNA Integrity Number (DIN) helped establish a cut-off for the suitability of the gDNA for further work and can be useful for library construction.
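The gDNA acceptance criteria described above can be expressed as a small screening function. This is a minimal sketch under stated assumptions, not the authors' implementation: the 1.8 ratio is interpreted here as a minimum, the 50 kb size threshold is taken from the text, and the DIN cut-off (`MIN_DIN`) is a hypothetical placeholder because no numeric value is given. The class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GdnaQc:
    sample: str
    a260_280: float  # NanoDrop absorbance ratio
    a260_230: float  # NanoDrop absorbance ratio
    size_kb: float   # gDNA size estimate from the TapeStation
    din: float       # DNA Integrity Number from the TapeStation software


MIN_RATIO = 1.8     # from the text, interpreted as a minimum (assumption)
MIN_SIZE_KB = 50.0  # from the text
MIN_DIN = 7.0       # hypothetical cut-off; set per laboratory


def qc_failures(qc: GdnaQc, min_din: float = MIN_DIN) -> List[str]:
    """Return the names of the failed criteria; an empty list means a pass."""
    failures = []
    if qc.a260_280 < MIN_RATIO:
        failures.append("260/280")
    if qc.a260_230 < MIN_RATIO:
        failures.append("260/230")
    if qc.size_kb < MIN_SIZE_KB:
        failures.append("size")
    if qc.din < min_din:
        failures.append("DIN")
    return failures


# Illustrative records only, not data from the study.
good = GdnaQc("sample_A", 1.9, 2.0, 55.0, 8.5)
poor = GdnaQc("sample_B", 1.6, 2.0, 30.0, 8.5)
print(qc_failures(good), qc_failures(poor))
```

Returning the list of failed criteria, rather than a single pass/fail flag, keeps a record of why an extraction was rejected, which is useful when many samples are screened from a 96-well plate.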
Following qualification of the gDNA, the next step is to shear the gDNA into the target fragment size required for library construction using a Covaris g-TUBE device according to the manufacturer's instructions. It is important to check the fragment size and the DNA amount before proceeding with library construction. Traditionally, this has been done with the Agilent 2100 Bioanalyzer system with the DNA 12000 kit, and these results are shown in Fig. 4 as overlaid electropherograms and a virtual gel image, together with the sizing ladder provided. The DNA 12000 kit uses both a lower and an upper marker as internal standards. For these samples, with a target size of 10 kb, the DNA fragments usually run together with the upper marker, which can be easily seen on the gel image since it is shown in red. In the electropherogram view, the upper marker is the sharp peak at 90 s. The Agilent 2200 TapeStation System with the gDNA ScreenTape assay can also be used to check the fragment size, as shown in Fig. 5. This assay has a wider sizing range, can quantify genomic DNA larger than 12 kb because it has no upper marker, and can run directly from a 96-well plate. It is important to determine the correct sizing so that the sequencing facility can properly load the libraries on the sequencer.
Libraries were made following the PacBio SMRTbell 10 kb Library Preparation procedure on the Agilent NGS Workstation and were traditionally confirmed with the Agilent 2100 Bioanalyzer System with the DNA 12000 kit, as shown in Fig. 6. With SMRTbell templates around 10 kb in size, however, it is difficult to determine the correct sizing for those libraries, as these constructs also run with the upper marker, shown in red on the virtual gel images. Since the Agilent 2200 TapeStation System can size larger fragments, up to 60 kb, it can determine the size more accurately, as shown in Fig. 7.
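The sizing argument above can be illustrated with a short range check. This is a sketch under assumptions: the 12 kb and 60 kb upper limits follow the figures mentioned in the text, while the safety `margin` (the fraction of the range near the upper limit that is treated as unreliable, for example because of co-elution with an upper marker) is a hypothetical parameter, as are the function and dictionary names.

```python
# Upper sizing limits in base pairs, following the figures given in the text.
UPPER_SIZING_LIMIT_BP = {
    "Bioanalyzer DNA 12000": 12_000,
    "TapeStation Genomic DNA ScreenTape": 60_000,
}


def assays_able_to_size(fragment_bp: int, margin: float = 0.2) -> list:
    """List the assays whose usable range covers fragment_bp.

    margin is a hypothetical safety fraction: a fragment must fall below
    (1 - margin) * upper limit to be considered reliably sized.
    """
    if not 0.0 <= margin < 1.0:
        raise ValueError("margin must be in [0, 1)")
    return [
        assay
        for assay, limit_bp in UPPER_SIZING_LIMIT_BP.items()
        if fragment_bp <= limit_bp * (1.0 - margin)
    ]


print(assays_able_to_size(5_000))   # both assays cover a 5 kb fragment
print(assays_able_to_size(10_000))  # only the TapeStation assay remains
```

With the illustrative 20% margin, a 10 kb library falls too close to the DNA 12000 upper limit to be sized with confidence, which mirrors the co-elution problem described for Fig. 6.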
Conclusion
The PacBio SMRTbell 10 kb Library preparation kit can be used with automation such as the Agilent Bravo to prepare microbial libraries with minimal GC bias. QC of the starting DNA and of the required fragment preparation with the Covaris g-TUBE can be done with the Agilent 2200 TapeStation and the gDNA ScreenTape assay directly from the 96-well plates used by the Bravo to prepare the libraries.
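As a rough illustration of the throughput argument, the number of QC batches can be compared for a project of 1,000 genomes, using 12 samples per Bioanalyzer run and 96 samples per plate as stated in the text. The sketch assumes one QC measurement per sample, which is a simplification; the function name is hypothetical.

```python
def batches_needed(n_samples: int, samples_per_batch: int) -> int:
    """Return the number of batches needed to measure n_samples once each."""
    if n_samples < 0 or samples_per_batch <= 0:
        raise ValueError("n_samples must be >= 0 and samples_per_batch > 0")
    # Integer ceiling division, avoiding floating point.
    return -(-n_samples // samples_per_batch)


n_genomes = 1000
bioanalyzer_runs = batches_needed(n_genomes, 12)  # 12 samples per run
plates = batches_needed(n_genomes, 96)            # 96-well plate format
print(bioanalyzer_runs, plates)
```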
Abbreviations
DIN: DNA integrity number; gDNA: Genomic DNA; NGS: Next generation
sequencing; SMRT: Single molecule, real-time
Acknowledgements
We gratefully acknowledge the technical assistance provided by Kerry Le,
Sum Leung, Christina Kong, Lucy Cai, Alvin Leonardo, Vivian Lee, Surene
Foutouhi and Patrick Ancheta. We thank the 100K Pathogen Genome
Sequencing Project for providing the cultures to conduct the study.
Funding
Funding provided to BCW (NIH - 1R01HD065122-01A1; NIH - U24-DK097154;
AGILENT TECHNOLOGIES THOUGHT LEADER AWARD, FDA - 5U01FD003572-04).
Availability of data and materials
All data analyzed during this study are included in this published article.
Authors' contributions
NK isolated DNA, conducted experiments, analyzed TapeStation data, and
wrote the manuscript; WN and KT conducted experiments and analyzed
TapeStation data; RA & AW isolated DNA; KSK, JK and LH provided technical
assistance with library preparation; LK conceived of experiments, analyzed
data, and wrote the manuscript; SL provided programming of the
automation to run the protocols; BCW conceived of experiments, analyzed
data, and wrote the manuscript. All authors read and approved the final
manuscript.
Competing interests
Agilent Technologies provided test instruments and initial funding to BCW.
Pacific Biosciences provided PacBio SMRTBell 10 kb Library Preparation Kit
and sequencing.
Consent for publication
Not applicable.
Ethics approval and consent to participate
Not applicable.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Author details
¹ Population Health and Reproduction Department, School of Veterinary Medicine, University of California-Davis, Davis, CA, USA. ² Genentech, S. San Francisco, CA, USA. ³ University of California-San Francisco, San Francisco, CA, USA. ⁴ Pacific Biosciences, Menlo Park, CA, USA. ⁵ Agilent Technologies, Inc., Santa Clara, CA, USA.
Received: 2 July 2016 Accepted: 26 February 2017
References
1. 100K Pathogen Genome Project. 2013 [cited 2016 June 30]; Available from: http://www.100kgenomes.org.
2. Kong N, et al. Production and analysis of high molecular weight genomic DNA for NGS pipelines using Agilent DNA extraction kit (p/n 200600). 2013. doi:10.13140/RG.2.1.2961.4807.
3. QIAamp DNA Mini Kit. 2016 [cited 2016 June 30]; Available from: https://www.qiagen.com/us/shop/sample-technologies/dna/dna-preparation/QIAamp-DNA-Mini-Kit#resources.
4. Greenspoon SA, et al. QIAamp spin columns as a method of DNA isolation for forensic casework. J Forensic Sci. 1998;43(5):1024–30.
5. Agilent Genomic DNA ScreenTape System Quick Guide (p/n G2964-90040). 2016 [cited 2016 June 30]; Available from: http://www.agilent.com/cs/library/usermanuals/Public/ScreenTape_gDNA_QG.pdf.
6. Agilent 2200 TapeStation User Manual (p/n G2964-90002). 2016 [cited 2016 June 30]; Available from: http://www.agilent.com/cs/library/usermanuals/Public/G2964-90002_TapeStationPalpatine_USR_EN.pdf.
7. Covaris. Covaris USER MANUAL: g-TUBE. 2012 [cited 2016 June 30]; Available from: http://covarisinc.com/wp-content/uploads/pn_010154.pdf.
8. Agilent 2100 Bioanalyzer User Manual (p/n G2946-90004). 2016 [cited 2016 June 30]; Available from: https://www.agilent.com/cs/library/usermanuals/Public/G2946-90004_Vespucci_UG_eBook_(NoSecPack).pdf.
9. Agilent Technologies, Inc. Agilent DNA 7500 and DNA 12000 Kit Quick Start Guide. 2013 [cited 2016 June 30]; Available from: http://www.agilent.com/cs/library/usermanuals/Public/G2938-90025_DNA7500-12000_QSG.pdf.
... Following the annealing of the sequencing primer v4 to the SMRTbell template, the complex was bound by DNA polymerase (Sequel II Binding Kit 2.0, Pacific Biosciences, USA). AMPure PB Bead Purification was then used to remove free primers and polymerase, after which a Sequel Sequencing Kit 2.0 (PacBio) was used for library sequencing, with 10 h videos being captured for each SMRT Cell 8 M with the Sequel II sequencing platform (BGI-Shenzhen, China) [27]. ...
Article
Full-text available
The identification of oleaginous yeast species capable of simultaneously utilizing xylose and glucose as substrates to generate value-added biological products is an area of key economic interest. We have previously demonstrated that the Cutaneotrichosporon dermatis NICC30027 yeast strain is capable of simultaneously assimilating both xylose and glucose, resulting in considerable lipid accumulation. However, as no high-quality genome sequencing data or associated annotations for this strain are available at present, it remains challenging to study the metabolic mechanisms underlying this phenotype. Herein, we report a 39,305,439 bp draft genome assembly for C. dermatis NICC30027 comprised of 37 scaffolds, with 60.15% GC content. Within this genome, we identified 524 tRNAs, 142 sRNAs, 53 miRNAs, 28 snRNAs, and eight rRNA clusters. Moreover, repeat sequences totaling 1,032,129 bp in length were identified (2.63% of the genome), as were 14,238 unigenes that were 1,789.35 bp in length on average (64.82% of the genome). The NCBI non-redundant protein sequences (NR) database was employed to successfully annotate 11,795 of these unigenes, while 3,621 and 11,902 were annotated with the Swiss-Prot and TrEMBL databases, respectively. Unigenes were additionally subjected to pathway enrichment analyses using the Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), Cluster of Orthologous Groups of proteins (COG), Clusters of orthologous groups for eukaryotic complete genomes (KOG), and Non-supervised Orthologous Groups (eggNOG) databases. Together, these results provide a foundation for future studies aimed at clarifying the mechanistic basis for the ability of C. dermatis NICC30027 to simultaneously utilize glucose and xylose to synthesize lipids.
... Be it in microbiology [59], synthetic biology [60][61][62], endocrinology [63], or genetics [58,[64][65][66], laboratory biologists are increasingly trusting automated liquid handling workstations to streamline their protocols. Genomics laboratories at prominent institutions have also already dipped their feet in liquid-handling automation, be it for gene expression, NGS, or third-generation sequencing for a number of diseases [67][68][69][70][71][72][73][74][75][76][77][78][79][80][81][82][83]. ...
Article
Full-text available
In research and clinical genomics laboratories today, sample preparation is the bottleneck of experiments, particularly when it comes to high-throughput next generation sequencing (NGS). More genomics laboratories are now considering liquid-handling automation to make the sequencing workflow more efficient and cost effective. The question remains as to its suitability and return on investment. A number of points need to be carefully considered before introducing robots into biological laboratories. Here, we describe the state-of-the-art technology of both sophisticated and do-it-yourself (DIY) robotic liquid-handlers and provide a practical review of the motivation, implications and requirements of laboratory automation for genome sequencing experiments.
... The bottom panel shows an example of a time trace in which adenine ("A" pulse) and cytosine ("C" pulse) are detected. A and C are reproduced with permission from Ref. [156], B is reproduced with permission from Ref. [273]. . The pore is used to capture and linearize nucleic acid molecules, which are detected by a decrease in the electrical current (I blocked ) during a translocation event. ...
Article
Nucleic acids are important biomarkers for disease detection, monitoring, and treatment. Advances in technologies for nucleic acid analysis have enabled discovery and clinical implementation of nucleic acid biomarkers. However, challenges remain with technologies for nucleic acid analysis, thereby limiting the use of nucleic acid biomarkers in certain contexts. Here, we review single-molecule technologies for nucleic acid analysis that can be used to overcome these challenges. We first discuss the various types of nucleic acid biomarkers important for clinical applications and conventional technologies for nucleic acid analysis. We then discuss technologies for single-molecule in vitro and in situ analysis of nucleic acid biomarkers. Finally, we discuss other ultra-sensitive techniques for nucleic acid biomarker detection.
Article
Full-text available
Here, we report the draft genome sequence of Desemzia sp. strain C1, which was isolated from oil-contaminated soil in South Korea and produces hydrogen peroxide (H 2 O 2 ). The genome of Desemzia sp. strain C1 contains genes encoding various oxidases involved in H 2 O 2 production and resistance to oxidative stress.
Article
Pacific Biosciences has developed a platform that may sequence one molecule of DNA in a period via the polymerization of that strand with one enzyme. Single-molecule real-time sequencing by Pacific BioSciences’ technology is one of the most widely utilized third-generation sequencing technologies. PacBio single-molecule real-time Sequencing uses the Zero-mode waveguide’s ingenuity to distinguish the best fluorescence signal from the stable fluorescent backgrounds generated by disorganized free-floating nucleotides. PacBio single-molecule real-time sequencing does not require PCR amplification, and the browse length is a hundred times longer than next-generation sequencing. It will only cover high-GC and high-repeat sections and is more accurate in quantifying low-frequency mutations. PacBio single-molecule real-time sequencing will have a relatively high error rate of 10%-15% (which is practically a standard flaw of existing single-molecule sequencing technology). In contrast to next-generation sequencing, however, the errors are unintentionally random. As a result, multiple sequencing will effectively rectify the bottom deviance. Unlike second-generation sequencing, PacBio sequencing may be a technique for period sequencing and doesn’t need an intermission between browse steps. These options distinguish PacBio sequencing from second-generation sequencing, therefore it’s classified because of the third-generation sequencing. PacBio sequencing produces extremely lengthy reads with a high error rate and low yield. Short reads refine alignments/assemblies/detections to single-nucleotide precision, whereas PacBio long reads provide reliable alignments, scaffolds, and approximate detections of genomic variations. Through extraordinarily long sequencing reads (average >10,000 bp) and high accord precision, the PacBio Sequencing System can provide a terribly high depth of genetic information. 
To measure and promote the event of modern bioinformatics tools for PacBio sequencing information analysis, a good browse machine is required.
Article
Over the last decade, whole transcriptome profiling, also known as RNA-sequencing (RNA-seq), has quickly gained traction as a reliable method for unbiased assessment of gene expression. Integration of RNA-seq expression data into other omics datasets (e.g., proteomics, metabolomics, or epigenetics) solidifies our understanding of cell-specific regulatory patterns, yielding pathways to investigate the key rules of gene regulation. A limitation to efficient, at-scale utilization of RNA-seq is the time-demanding library preparation workflows, which is a 2-day or longer endeavor per cohort/sample size. To tackle this bottleneck, we designed an automated workflow that increases throughput capacity, while minimizing human error to enhance reproducibility. To this end, we converted the manual protocol of the NEBNext Directional Ultra II RNA Library Prep Kit for Illumina on the Beckman Coulter liquid handler, Biomek i7 Hybrid workstation. A total of 84 RNA samples were isolated from two human cell lines and subjected to comparative manual and automated library preparation methods. Qualitative and quantitative results indicated a high degree of similarity between libraries generated manually or through automation. Yet, there was a significant reduction in both hands-on and assay time from a 2-day manual to a 9-hour automated workflow. Using linear regression analysis, we found the Pearson correlation coefficient between libraries generated manually or by automation to be almost identical to a sample being sequenced twice (R²= 0.985 vs 0.983). This demonstrates that high-throughput automated workflows can be of great benefit to genomic laboratories by enhancing efficiency of library preparation, reducing hands-on time and increasing throughput potential.
Preprint
Full-text available
Viruses play crucial roles in the ecology of microbial communities, yet they remain relatively understudied in their native environments. Despite many advancements in high-throughput whole-genome sequencing (WGS), sequence assembly, and annotation of viruses, the reconstruction of full-length viral genomes directly from metagenomic sequencing is possible only for the most abundant phages and requires long-read sequencing technologies. Additionally, the prediction of their cellular hosts remains difficult from conventional metagenomic sequencing alone. To address these gaps in the field and to accelerate the study of viruses directly in their native microbiomes, we developed an end-to-end bioinformatics platform for viral genome reconstruction and host attribution from metagenomic data using proximity-ligation sequencing (i.e., Hi-C). We demonstrate the capabilities of the platform by recovering and characterizing the metavirome of a variety of metagenomes, including a fecal microbiome that has also been sequenced with accurate long reads, allowing for the assessment and benchmarking of the new methods. The platform can accurately extract numerous near-complete viral genomes even from highly fragmented short-read assemblies and can reliably predict their cellular hosts with minimal false positives. To our knowledge, this is the first software for performing these tasks. Being significantly cheaper than long-read sequencing of comparable depth, the incorporation of proximity-ligation sequencing in microbiome research shows promise to greatly accelerate future advancements in the field.
Article
We isolated a strain of Lactobacillus nenjiangensis, designated SH-Y15, from traditional suan-cai from northeastern China; the strain was selected for its high capacity to degrade nitrites at low temperatures. The complete genome of SH-Y15 consists of a single circular chromosome and a plasmid. Its total length is 2,249,893 bp, and the G+C content is 39.68%.
Chapter
Genome methylation in bacteria is an area of intense interest because it has broad implications for bacteriophage resistance, replication, genomic diversity via replication fidelity, response to stress, gene expression regulation, and virulence. Interest in bacterial DNA modification is growing with the investigation of host/microbe interactions and of the microbiome's association and coevolution with the host organism. Since DNA methylation was recognized as important in Escherichia coli and in bacteriophage resistance through restriction/modification systems, more than 43,600 restriction enzymes have been cataloged in more than 3600 different bacteria. While DNA sequencing methods have made great advances, there is a dearth of methods for examining these modifications in situ. However, the large increase in whole genome sequences has led to advances in defining the modification status of single genomes as well as in mining new restriction enzymes, methyltransferases, and modification motifs. These advances provide the basis for the study of pan-epigenomes, that is, population-scale comparisons among pangenomes that link replication fidelity and methylation status, along with mutational analysis of mutLS. Newer DNA sequencing methods, including SMRT and nanopore sequencing, will aid the detection of DNA modifications in the ever-increasing number of whole genome and metagenome sequences being produced. As more sequences become available, larger analyses are being done to provide insight into the role of bacterial DNA modification in bacterial survival and physiology.
Article
Next generation sequencing is in the process of evolving from a technology used for research purposes to one which is applied in clinical diagnostics. Recently introduced high throughput and benchtop instruments offer fully automated sequencing runs at a lower cost per base and faster assay times. In turn, the complex and cumbersome library preparation, starting with isolated nucleic acids and resulting in amplified and barcoded DNA with sequencing adapters, has been identified as a significant bottleneck. Library preparation protocols usually consist of a multistep process and require costly reagents and substantial hands-on time. Considerable emphasis will need to be placed on standardisation to ensure robustness and reproducibility. This review presents an overview of the current state of automation of library preparation for next generation sequencing. Major challenges associated with library preparation are outlined and different automation strategies are classified according to their functional principle. Pipetting workstations allow high-throughput processing yet offer limited flexibility, whereas microfluidic solutions offer great potential due to miniaturisation and decreased investment costs. For the emerging field of single cell transcriptomics, for example, microfluidics enables the singularisation of tens of thousands of cells in nanolitre droplets and the barcoding of the RNA to assign each nucleic acid sequence to its cell of origin. Finally, two applications, the characterisation of bacterial pathogens and sequencing in human immunogenetics, are outlined and the benefits of automation are discussed.
Technical Report
The Agilent DNA Extraction Kit (p/n 200600) was compared to standard methods such as bead beating and enzyme treatment for preparation of genomic DNA from the prokaryote Listeria monocytogenes. Using this extraction kit, with modifications, to lyse the bacteria and isolate high molecular weight DNA reproducibly yielded high-quality DNA suitable for further applications such as polymerase chain reactions to produce amplicons, or next-generation DNA sequencing. The quality of the high molecular weight DNA, and the comparison of extraction methods, was shown on the Agilent 2200 TapeStation with the Agilent Genomic DNA ScreenTape (p/n 5067-5365) and Agilent Genomic DNA Reagents (p/n 5067-5366).
Article
The Detroit Police Crime Lab has historically used Chelex as a method to isolate DNA for amplification and typing of bloodstains at the HLADQA1, PM and D1S80 loci. However, preliminary validation of several STR systems for casework has demonstrated that the Chelex procedure is not the best method of DNA isolation for STR amplifications for our purposes. Long-term storage at -20 °C in the presence of unbuffered Chelex beads (approximately 1 year), combined with multiple freeze-thaw cycles, resulted in signal loss at a locus for many database samples. Therefore, we have employed the QIAamp spin column as an alternative method of DNA isolation for amplification and typing of the STR loci currently being validated for use in the laboratory. Moreover, we determined that QIAamp-isolated DNA is also suitable for HLADQA1, PM and D1S80 typing. A matrix study was performed to determine whether the QIAamp DNA procedure would give better results on bloodstains deposited on "problem surfaces" such as leather, dirt and various dyed fabrics. Again, QIAamp-isolated DNA was more readily typeable than Chelex-isolated DNA. We also successfully replaced the phenol/chloroform extraction steps used in our laboratory for differential extractions, a common method for separating sperm and non-sperm fractions of sexual assault evidence, with QIAamp spin columns. The QIAamp-extracted DNA performed as well in all PCR amplification and typing procedures tested (PM, HLADQA1, D1S80, and STR (PowerPlex)) as the phenol/chloroform-extracted DNA isolated by Centricon or EtOH precipitation. Thus, we concluded that QIAamp spin columns are a superior method for isolating DNA to be typed at a variety of loci.
QIAamp DNA Mini Kit. 2016 [cited 2016 June 30]; Available from: https://www.qiagen.com/us/shop/sample-technologies/dna/dna-preparation/QIAamp-DNA-Mini-Kit#resources.
Agilent Technologies, Inc. Agilent DNA 7500 and DNA 12000 Kit Quick Start Guide. 2013 [cited 2016 June 30]; Available from: http://www.agilent.com/cs/library/usermanuals/Public/G2938-90025_DNA7500-12000_QSG.pdf.