Complete genome sequence of Agrobacterium fabrum ARqua1

Abstract

We report the complete genome of Agrobacterium fabrum ARqua1 generated from Oxford Nanopore and Illumina sequencing. The genome of ARqua1 has a total length of 5,714,310 bp, comprising a circular chromosome, a linear chromosome, and two plasmids. In total, 5,446 genes were predicted, of which 5,288 were annotated.
| Biotechnology | Announcement
Bo Lan,1 Qian Zhang,2 Kangquan Yin1
AUTHOR AFFILIATIONS See affiliation list on p. 2.
KEYWORDS genomes, Agrobacterium, genome editing
Agrobacterium fabrum is a well-known natural agent that induces hairy root disease
in plants and has been widely used in transgenic technology (1–3). Among its strains, A.
fabrum ARqua1 has been used for genetic transformation in more than 60 plant species
(4). ARqua1 is derived from strain R1000, which has the C58 chromosomal background
from A. tumefaciens A136 (5). Recently, a draft shotgun assembly of A. fabrum
ARqua1 consisting of 19 contigs was reported (6). This report describes the complete
genome assembly of A. fabrum ARqua1, which will facilitate genetic engineering of the
“engineer” and the generation of improved Agrobacterium strains for plant transformation
and genome editing (7).
The ARqua1 strain was obtained from Shanghai Weidi Biotechnology Co. Ltd. The strain
was streaked onto Luria broth (LB) plates, and a single colony was selected for
amplification with streptomycin (100 mg/L). The same genomic DNA, extracted using
the cetyltrimethylammonium bromide (CTAB) method (8), was used for both Oxford
Nanopore and Illumina sequencing. For Oxford Nanopore sequencing, DNA was sheared
using g-TUBE, and large DNA fragments (>20 kb) were selected using BluePippin (Sage
Science, Beverly, MA). Nanopore libraries were prepared using the SQK-LSK109 Genomic
DNA Ligation Kit (Oxford Nanopore Technologies, UK) and sequenced on an FLO-PRO002
R9.4.1 flow cell. Basecalling and adapter trimming were performed using Guppy
version 3.2.6. Read QC was performed using fastp version 0.23.1 (9). Reads with quality
scores less than 6.0 or shorter than 2,000 bp were removed. Nanopore sequencing yielded
101,392 clean reads with an N50 of 26,601 bp. Illumina libraries were prepared using
the TrueLib DNA Library Rapid Prep Kit and sequenced on an Illumina NovaSeq 6000
(Illumina, San Diego, CA, USA). Illumina sequencing yielded 12,333,606 paired-end reads
(mean read length, 147 bp), which were trimmed using fastp version 0.23.1 (9), resulting
in a total of 988,367,215 bp (~173-fold coverage). Nanopore sequencing data were used
for de novo genome assembly using Canu version 1.5 (10), and post-assembly
correction was conducted using Racon version 3.4.3 (11). Illumina sequencing data were
used for further correction using Pilon version 1.22 (12). Circularization of chromosomal
and plasmid contigs was conducted using Circlator version 1.5.5 (13) with the parameter
“minimus2 --no_pre_merge.” The genome was annotated using Prodigal version 2.6.3 (14),
RepeatMasker version 4.0.5 (15), Infernal version 1.1.3 (16), and tRNAscan-SE version
2.0 (17). Functional annotation of genes was conducted using the eggNOG
version 4.0, Pfam version 32.0, Swiss-Prot (access date 2019-07-31), and TrEMBL (access date
2019-07-13) databases. Collinearity analysis was performed using the One Step MCScanX
plugin in TBtools version 1.120 (18). Default parameters were used for all of the above
software unless otherwise specified.
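The read-level numbers quoted above (the length and quality cutoffs, the read N50, and the fold coverage) follow from simple arithmetic. The Python sketch below is illustrative only and is not the authors' pipeline, which used fastp and the other tools cited above. The function names are hypothetical, the toy reads are invented for the example, and the N50 definition used (the read length at which the running total of lengths, sorted longest first, reaches half of all bases) is the conventional one, assumed here rather than stated in the text.

```python
from typing import Iterable, List, Tuple

# Values reported in this announcement.
GENOME_LENGTH_BP = 5_714_310        # total assembly length
ILLUMINA_TRIMMED_BP = 988_367_215   # Illumina bases remaining after trimming


def fold_coverage(total_bases: int, genome_length: int) -> float:
    """Sequenced bases divided by genome length."""
    return total_bases / genome_length


def passes_filter(length: int, mean_quality: float,
                  min_length: int = 2000, min_quality: float = 6.0) -> bool:
    """Keep a read only if it meets both cutoffs described in the text
    (quality score of at least 6.0 and length of at least 2,000 bp)."""
    return length >= min_length and mean_quality >= min_quality


def n50(lengths: Iterable[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered: List[int] = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("n50 requires at least one length")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for a non-empty list")


if __name__ == "__main__":
    # Hypothetical (length, mean quality) pairs, not real data from this study.
    toy_reads: List[Tuple[int, float]] = [
        (26_601, 9.1), (1_500, 12.0), (30_000, 5.2), (8_000, 7.0),
    ]
    kept = [length for length, quality in toy_reads if passes_filter(length, quality)]
    print("reads kept:", len(kept), "N50:", n50(kept))
    # 988,367,215 bp / 5,714,310 bp is about 173-fold.
    print(f"Illumina coverage: {fold_coverage(ILLUMINA_TRIMMED_BP, GENOME_LENGTH_BP):.0f}x")
```

Dividing the trimmed Illumina yield by the assembly length reproduces the ~173-fold figure given in the text.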
November 2023 Volume 12 Issue 11 10.1128/MRA.00554-23 1
Editor J. Cameron Thrash, University of Southern
California, Los Angeles, California, USA
Address correspondence to Kangquan Yin,
yinkq@bjfu.edu.cn.
The authors declare no conflict of interest.
See the funding table on p. 3.
Received 29 June 2023
Accepted 22 August 2023
Published 9 October 2023
Copyright © 2023 Lan et al. This is an open-access
article distributed under the terms of the Creative
Commons Attribution 4.0 International license.
The assembly resulted in four contigs: a circular chromosome, a linear chromosome,
and two plasmids. The assembled genome had a GC content of 59.03%. Collinearity
analysis showed that contig 4 of the current ARqua1 genome corresponds to plasmid pRiA4b and
confirmed that the chromosomal background of the ARqua1 strain is A. fabrum C58
(Fig. 1).
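For readers who want to check a GC figure such as the 59.03% reported above against the deposited sequences, the calculation can be sketched as follows. This is a minimal illustration and not the authors' method: the function names are hypothetical, the toy contigs are invented, and excluding ambiguous bases (such as N) from the denominator is an assumption.

```python
from typing import Dict


def gc_content(sequence: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are left out of the denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / (gc + at)


def genome_gc(contigs: Dict[str, str]) -> float:
    """GC percentage over all contigs pooled, so longer contigs weigh more."""
    return gc_content("".join(contigs.values()))


if __name__ == "__main__":
    # Hypothetical toy contigs, not sequence from ARqua1.
    toy = {"contig1": "GGCCGGAT", "contig2": "ATGCNN"}
    for name, seq in toy.items():
        print(name, f"{gc_content(seq):.2f}%")
    print("pooled", f"{genome_gc(toy):.2f}%")
```

Pooling the contigs before counting gives a length-weighted value, which is the usual meaning of a whole-genome GC content; averaging per-contig percentages would overweight the small plasmids.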
ACKNOWLEDGMENTS
This work was supported by grants from the Fundamental Research Funds for the Central
Universities (2021ZY80) and Science and Technology Innovation of Inner Mongolia
Autonomous Region (2022JBGS0020).
AUTHOR AFFILIATIONS
1School of Grassland Science, Beijing Forestry University, Beijing, Beijing, China
2Chinese Academy of Agricultural Sciences, Lanzhou Institute of Husbandry and
Pharmaceutical Sciences, Lanzhou, China
AUTHOR ORCIDs
Kangquan Yin http://orcid.org/0000-0002-4627-6585
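[Figure 1 image not reproduced. Panel a tracks: A. fabrum C58 (GCA_000092025.1), A. fabrum ARqua1 (this study, contigs 1–4), and A. rhizogenes A4 (GCA_018138105.1); replicon labels: Chr1, Chr2, pAtC58, pTiC58, pRiA4, pArA4. Panel b tracks: A. fabrum ARqua1 (this study, contigs 1–4) and the draft assembly GCA_012971785.1 (numbered contigs).]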
FIG 1 Genome synteny maps. (a) Synteny links between current A. fabrum ARqua1 genome and A. rhizogenes A4 genome (GCA_018138105.1,
https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_018138105.1/), and synteny links between current A. fabrum ARqua1 genome and A. fabrum C58 genome
(GCA_000092025.1, https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_000092025.1/). The red lines indicate the synteny links between the plasmid pRiA4 of
A. rhizogenes A4 and contig 4 of the current A. fabrum ARqua1 genome. (b) Synteny links between the current A. fabrum ARqua1 genome and the first draft
genome of A. fabrum ARqua1 (GCA_012971785.1, https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_012971785.1/). The blue lines indicate the synteny links
between contig 11 of the draft genome of A. fabrum ARqua1 and contig 4 of the current A. fabrum ARqua1 genome.
Announcement Microbiology Resource Announcements
FUNDING
Funder: MOE | Fundamental Research Funds for the Central Universities (Fundamental Research Fund for the Central Universities); Grant(s): 2021ZY80; Author(s): Kangquan Yin
Funder: Science and Technology Innovation of Inner Mongolia Autonomous Region; Grant(s): 2022JBGS0020; Author(s): Kangquan Yin
AUTHOR CONTRIBUTIONS
Bo Lan, Conceptualization, Data curation, Formal analysis, Funding acquisition,
Investigation, Methodology, Project administration, Resources, Supervision, Writing –
original draft, Writing – review and editing | Qian Zhang, Data curation, Resources,
Writing – original draft | Kangquan Yin, Conceptualization, Data curation, Formal analysis,
Funding acquisition, Investigation, Methodology, Project administration, Resources,
Supervision, Writing – original draft, Writing – review and editing
DATA AVAILABILITY
The BioProject accession number is PRJNA976066. The SRA accession numbers are SRR24759617
(Illumina) and SRR24759618 (ONT).
REFERENCES
1. Chilton MD, Tepfer DA, Petit A, David C, Casse-Delbart F, Tempé J. 1982.
Agrobacterium rhizogenes inserts T-DNA into the genomes of the host
plant root cells. Nature 295:432–434. https://doi.org/10.1038/295432a0
2. Ozyigit II, Dogan I, Artam TE. 2013. Agrobacterium rhizogenes mediated
transformation and its biotechnological applications in crops, p 1–48. In
Hakeem RK, Ahmad P, Ozturk M (ed), Crop improvement: new
approaches and modern techniques. Springer, Boston.
https://doi.org/10.1007/978-1-4614-7028-1
3. Yin K, Gao C, Qiu JL. 2017. Progress and prospects in plant genome
editing. Nat Plants 3:1–6. https://doi.org/10.1038/nplants.2017.107
4. Christey MC. 1997. Transgenic crop plants using Agrobacterium
rhizogenes mediated transformation, p 99–111. In Doran PM (ed), Hairy
roots: Culture and applications. Harwood Academic Publishers,
Amsterdam.
5. Kiryushkin AS, Ilina EL, Guseva ED, Pawlowski K, Demchenko KN. 2022.
Hairy CRISPR: genome editing in plants using hairy root transformation.
Plants 11:51. https://doi.org/10.3390/plants11010051
6. Thompson MG, Cruz-Morales P, Moore WM, Pearson AN, Keasling JD,
Scheller HV, Shih PM. 2020. Draft genome sequence of Agrobacterium
fabrum ARqua1. Microbiol Resour Announc 9:e00506-20.
7. Rodrigues SD, Karimi M, Impens L, Van Lerberge E, Coussens G, Aesaert
S, Rombaut D, Holtappels D, Ibrahim HMM, Van Montagu M, Wagemans
J, Jacobs TB, De Coninck B, Pauwels L. 2021. Efficient CRISPR-mediated
base editing in Agrobacterium spp. Proc Natl Acad Sci U S A
118:e2013338118. https://doi.org/10.1073/pnas.2013338118
8. Wilson K. 2001. Preparation of genomic DNA from bacteria. Curr Protoc
Mol Biol 56:2–4. https://doi.org/10.1002/0471142727.mb0204s56
9. Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ
preprocessor. Bioinformatics 34:i884–i890.
https://doi.org/10.1093/bioinformatics/bty560
10. Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. 2017.
Canu: scalable and accurate long-read assembly via adaptive k-mer
weighting and repeat separation. Genome Res 27:722–736.
https://doi.org/10.1101/gr.215087.116
11. Vaser R, Sović I, Nagarajan N, Šikić M. 2017. Fast and accurate de novo
genome assembly from long uncorrected reads. Genome Res 27:737–
746. https://doi.org/10.1101/gr.214270.116
12. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo
CA, Zeng Q, Wortman J, Young SK, Earl AM. 2014. Pilon: an integrated
tool for comprehensive microbial variant detection and genome
assembly improvement. PLoS One 9:e112963.
https://doi.org/10.1371/journal.pone.0112963
13. Hunt M, Silva ND, Otto TD, Parkhill J, Keane JA, Harris SR. 2015. Circlator:
automated circularization of genome assemblies using long sequencing
reads. Genome Biol 16:1–10. https://doi.org/10.1186/s13059-015-0849-0
14. Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. 2010.
Prodigal: prokaryotic gene recognition and translation initiation site
identification. BMC Bioinf 11:1–11.
https://doi.org/10.1186/1471-2105-11-119
15. Chen N. 2004. Using RepeatMasker to identify repetitive elements in
genomic sequences. Curr Protoc Bioinf 5:4–10.
https://doi.org/10.1002/0471250953.bi0410s05
16. Nawrocki EP, Eddy SR. 2013. Infernal 1.1: 100-fold faster RNA homology
searches. Bioinformatics 29:2933–2935.
https://doi.org/10.1093/bioinformatics/btt509
17. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved
detection of transfer RNA genes in genomic sequence. Nucleic Acids Res
25:955–964. https://doi.org/10.1093/nar/25.5.955
18. Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, Xia R. 2020.
TBtools: an integrative toolkit developed for interactive analyses of big
biological data. Mol Plant 13:1194–1202.
https://doi.org/10.1016/j.molp.2020.06.009