Shigella Draft Genome Sequences: Resources for Food Safety and Public Health
Allison M. Weis (a), Brent Gilpin (b), Bihua C. Huang (a), Nguyet Kong (a), Poyin Chen (a), Bart C. Weimer (a)
(a) School of Veterinary Medicine, 100K Pathogen Genome Project, UC Davis, Davis, California, USA
(b) Institute of Environmental Science & Research Ltd., Christchurch, New Zealand
ABSTRACT Shigella is a major foodborne pathogen that infects humans and non-human primates and is the major cause of dysentery and reactive arthritis worldwide. This is the initial public release of 16 Shigella genome sequences from four species sequenced as part of the 100K Pathogen Genome Project.
Shigella spp. are Gram-negative enteric pathogens that infect humans and nonhuman primates. They are an important cause of dysentery, affecting more than 80 million people and causing more than 700,000 deaths each year worldwide (1, 2). The burden of disease is carried largely by children in developing nations, where 99% of infections occur, and most cases (70%) and deaths (60%) occur in children age 5 and under (1, 2). Rare cases of shigellosis can lead to reactive arthritis (3). Shigella is spread by direct contact with an infected person or by ingesting contaminated food or water (1, 4). The infective dose can be as few as 10 organisms, making Shigella a foodborne pathogen of global importance because of its wide distribution, its link to water quality, and the risk it poses to public health (4).
The genus Shigella is composed of four species: S. dysenteriae, S. flexneri, S. boydii, and S. sonnei, all of which cause acute bloody diarrhea (2, 5). Shigella genomics has emerged as an important tool in basic and clinical applications for diagnosis and classification, and will inform treatment plans (5, 6), but source tracking using whole-genome sequencing remains challenging because relatively few genomes are publicly available. In this release, the 100K Pathogen Genome Project sequenced and assembled the genomes of 16 novel Shigella isolates of the four species: two S. boydii, three S. dysenteriae, nine S. flexneri, and two S. sonnei isolates (Table 1).
Received 15 February 2017 Accepted 6 March 2017 Published 20 April 2017
Citation Weis AM, Gilpin B, Huang BC, Kong N, Chen P, Weimer BC. 2017. Shigella draft genome sequences: resources for food safety and public health. Genome Announc 5:e00176-17. https://doi.org/10.1128/genomeA.00176-17.
Copyright © 2017 Weis et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.
Address correspondence to Bart C. Weimer, bcweimer@ucdavis.edu.
PROKARYOTES
Volume 5 Issue 16 e00176-17 genomea.asm.org 1
The 100K Pathogen Genome Project (http://www.100kgenomes.org) is a large-scale sequencing effort to inform food safety and public health through genome-based identification and source tracking (7, 8). All Shigella isolates were shipped to Bart Weimer’s laboratory (UC Davis, Davis, CA). DNA isolation, sequencing, and assembly were done as previously described (7–9). Briefly, isolates were checked for purity (10) prior to extracting genomic DNA (gDNA) from cultures grown on brain heart infusion agar (catalog no. 241830; BD Difco, Franklin Lakes, NJ) for 1 to 2 days at 37°C. Cells were lysed (11), gDNA was purified using the Qiagen QIAamp DNA minikit (catalog no. 51306), and quality was measured using the Agilent 2200 TapeStation system with the Genomic DNA ScreenTape (12). After isolation, gDNA was fragmented using a Covaris E220 (13), end repaired (5′), adenylated (3′), and ligated with double-stranded DNA (dsDNA) adapters (NEXTflex-96 DNA barcode; Bioo Scientific, Austin, TX). gDNA (1 μg) was used for library construction with the Kapa high-throughput (HTP) library preparation kit (catalog no. KK8234; Kapa Biosystems, Boston, MA) on the Agilent Bravo automated liquid handling platform, workstation option B (Santa Clara, CA). The libraries were size selected using dual SPRI selection (0.2× to 0.6×) to produce libraries with fragments between 300 and 450 bp. Final library amplification was done with eight cycles using the Kapa HiFi HotStart ReadyMix, followed by a 1× SPRI bead cleanup. Prior to sequencing, the library size was confirmed using the Agilent 2100 Bioanalyzer system with the high-sensitivity DNA kit (14, 15), and the libraries were quantified with a quantitative PCR (qPCR)-based Kapa library quantification kit (catalog no. KK4824), pooled with multiplexing of up to 96 isolates, and sequenced on the Illumina HiSeq 2000 with PE100 plus index read at BGI@UC Davis (Sacramento, CA). The paired-end reads were assembled using CLC Genomics Workbench version 6.5.1 (Qiagen).
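As a rough planning check on the sequencing setup in the methods (PE100 reads, up to 96 isolates multiplexed), mean depth of coverage can be estimated as total sequenced bases divided by genome size. The sketch below is illustrative only: the 4.3-Mb genome size is an assumption chosen to be close to the assembly sizes in Table 1, the 100× target is an assumed round number, and the function names are hypothetical rather than part of the authors' pipeline.

```python
import math

def expected_depth(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Mean depth of coverage for paired-end data (two reads per pair)."""
    return 2 * read_pairs * read_length / genome_size

def pairs_for_depth(target_depth: float, read_length: int, genome_size: int) -> int:
    """Smallest number of read pairs that reaches the target mean depth."""
    return math.ceil(target_depth * genome_size / (2 * read_length))

READ_LENGTH = 100          # PE100, as stated in the methods
GENOME_SIZE = 4_300_000    # assumption: approximate Shigella assembly size
TARGET_DEPTH = 100         # assumption: illustrative target depth
ISOLATES_PER_POOL = 96     # maximum multiplexing stated in the methods

pairs_per_isolate = pairs_for_depth(TARGET_DEPTH, READ_LENGTH, GENOME_SIZE)
pairs_per_pool = pairs_per_isolate * ISOLATES_PER_POOL

print(f"read pairs per isolate: {pairs_per_isolate:,}")
print(f"read pairs per 96-isolate pool: {pairs_per_pool:,}")
```

Under these assumptions each isolate needs about 2.15 million read pairs; actual depth per isolate will vary with pooling balance and read quality, which this estimate ignores.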
Accession number(s). Sequences can be found in the NCBI SRA 100K Project
BioProject PRJNA186441 and in GenBank (Table 1).
ACKNOWLEDGMENTS
We thank the Weimer lab for their efforts in isolate logistics and technical assistance, and we thank all of the collaborators of the 100K Pathogen Genome Project.
This project was funded by the 100K Pathogen Genome Project with initial funding
from Agilent Technologies to produce these sequences.
REFERENCES
1. Kotloff KL, Winickoff JP, Ivanoff B, Clemens JD, Swerdlow DL, Sansonetti PJ, Adak GK, Levine MM. 1999. Global burden of Shigella infections: implications for vaccine development and implementation of control strategies. Bull World Health Organ 77:651–666.
2. WHO. 2005. Guidelines for the control of shigellosis, including epidemics due to Shigella dysenteriae 1. World Health Organization, Geneva, Switzerland.
3. Gaston JSH, Lillicrap MS. 2003. Arthritis associated with enteric infection. Best Pract Res Clin Rheumatol 17:219–239.
4. DuPont HL, Levine MM, Hornick RB, Formal SB. 1989. Inoculum size in shigellosis and implications for expected mode of transmission. J Infect Dis 159:1126–1128. https://doi.org/10.1093/infdis/159.6.1126.
5. Hale TL. 1991. Genetic basis of virulence in Shigella species. Microbiol Rev 55:206–224.
6. Yang F, Yang J, Zhang XB, Chen LH, Jiang Y, Yan YL, Tang XD, Wang J, Xiong ZH, Dong J, Xue Y, Zhu YF, Xu XY, Sun LL, Chen SX, Nie H, Peng JP, Xu JG, Wang Y, Yuan ZH, Wen YM, Yao ZJ, Shen Y, Qiang BQ, Hou YD, Yu J, Jin Q. 2005. Genome dynamics and diversity of Shigella species, the etiologic agents of bacillary dysentery. Nucleic Acids Res 33:6445–6458. https://doi.org/10.1093/nar/gki954.
7. Weis AM, Clothier KA, Huang BC, Kong N, Weimer BC. 2016. Draft genome sequences of Campylobacter jejuni strains that cause abortion in livestock. Genome Announc 4(6):e01324-16. https://doi.org/10.1128/genomeA.01324-16.
8. Weis AM, Storey DB, Taff CC, Townsend AK, Huang BC, Kong NT, Clothier KA, Spinner A, Byrne BA, Weimer BC. 2016. Genomic comparison of Campylobacter spp. and their potential for zoonotic transmission between birds, primates, and livestock. Appl Environ Microbiol 82:7165–7175. https://doi.org/10.1128/AEM.01746-16.
9. Weis AM, Huang BC, Storey DB, Kong N, Chen P, Arabyan N, Gilpin B, Mason C, Townsend AK, Smith WA, Byrne BA, Taff CC, Weimer BC. 2017. Large-scale release of Campylobacter draft genomes: resources for food safety and public health from the 100K pathogen genome project. Genome Announc 5(1):e00925-16. https://doi.org/10.1128/genomeA.00925-16.
10. Kong N, Ng W, Lee V, Kelly L, Weimer BC. 2013. Production and analysis of high molecular weight genomic DNA for NGS pipelines using Agilent DNA extraction kit (p/n 200600). Application note. Agilent Technologies, Santa Clara, CA. https://www.agilent.com/cs/library/applications/5991-3722EN.pdf.
11. Jeannotte R, Lee E, Kong N, Ng W, Kelly L, Weimer BC. 2014. High-throughput analysis of foodborne bacterial genomic DNA using Agilent 2200 TapeStation and genomic DNA ScreenTape system. Application note. Agilent Technologies, Santa Clara, CA. https://www.agilent.com/cs/library/applications/5991-4003EN.pdf.
12. Kong N, Ng W, Cai L, Leonardo A, Kelly L, Weimer BC. 2014. Integrating the DNA integrity number (DIN) to assess genomic DNA (gDNA) quality control using the Agilent 2200 TapeStation system. Application note. Agilent Technologies, Santa Clara, CA. http://www.agilent.com/cs/library/applications/5991-5442EN.pdf.
TABLE 1 Shigella species draft genome sequence information

GenBank accession no.  Strain ID  Species         Depth (×)  No. of contigs  No. of bases
MSJS00000000           BCW_4868   S. boydii       115        243             4,863,576
MSJT00000000           BCW_4869   S. boydii       108        297             4,246,029
MSJU00000000           BCW_4870   S. dysenteriae  114        285             4,018,103
MSJV00000000           BCW_4871   S. dysenteriae  117        299             4,078,019
MSJW00000000           BCW_4872   S. dysenteriae  72         292             4,490,659
MSJX00000000           BCW_4874   S. flexneri     109        269             4,252,909
MSJY00000000           BCW_4875   S. flexneri     90         249             4,196,256
MSJZ00000000           BCW_4876   S. flexneri     101        293             4,396,898
MSKA00000000           BCW_4877   S. flexneri     100        296             4,330,224
MSKC00000000           BCW_4879   S. flexneri     124        287             4,167,963
MSKB00000000           BCW_4880   S. flexneri     106        267             4,224,783
MSKD00000000           BCW_4881   S. flexneri     170        253             4,334,622
MSKG00000000           BCW_4882   S. flexneri     96         297             4,099,589
MSKF00000000           BCW_4883   S. flexneri     118        289             4,305,926
MSKE00000000           BCW_4885   S. sonnei       101        299             4,392,417
MSKH00000000           BCW_4886   S. sonnei       100        286             4,530,575
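
For readers who want to work with the Table 1 values programmatically, the following minimal sketch parses the rows and summarizes them by species. The per-species summary is not part of the original announcement; the row text is transcribed from Table 1, and the names `parse_row` and `summarize` are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Rows transcribed from Table 1:
# accession, strain ID, species, depth (x), no. of contigs, no. of bases
TABLE_1 = """\
MSJS00000000 BCW_4868 S. boydii 115 243 4,863,576
MSJT00000000 BCW_4869 S. boydii 108 297 4,246,029
MSJU00000000 BCW_4870 S. dysenteriae 114 285 4,018,103
MSJV00000000 BCW_4871 S. dysenteriae 117 299 4,078,019
MSJW00000000 BCW_4872 S. dysenteriae 72 292 4,490,659
MSJX00000000 BCW_4874 S. flexneri 109 269 4,252,909
MSJY00000000 BCW_4875 S. flexneri 90 249 4,196,256
MSJZ00000000 BCW_4876 S. flexneri 101 293 4,396,898
MSKA00000000 BCW_4877 S. flexneri 100 296 4,330,224
MSKC00000000 BCW_4879 S. flexneri 124 287 4,167,963
MSKB00000000 BCW_4880 S. flexneri 106 267 4,224,783
MSKD00000000 BCW_4881 S. flexneri 170 253 4,334,622
MSKG00000000 BCW_4882 S. flexneri 96 297 4,099,589
MSKF00000000 BCW_4883 S. flexneri 118 289 4,305,926
MSKE00000000 BCW_4885 S. sonnei 101 299 4,392,417
MSKH00000000 BCW_4886 S. sonnei 100 286 4,530,575
"""

def parse_row(line: str) -> dict:
    """Split one whitespace-delimited table row into typed fields."""
    accession, strain, genus, species, depth, contigs, bases = line.split()
    return {
        "accession": accession,
        "strain": strain,
        "species": f"{genus} {species}",
        "depth": int(depth),
        "contigs": int(contigs),
        "bases": int(bases.replace(",", "")),  # drop thousands separators
    }

def summarize(rows: list) -> dict:
    """Group rows by species and compute the count and mean of each metric."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["species"]].append(row)
    return {
        species: {
            "n": len(group),
            "mean_depth": mean(r["depth"] for r in group),
            "mean_contigs": mean(r["contigs"] for r in group),
            "mean_bases": mean(r["bases"] for r in group),
        }
        for species, group in groups.items()
    }

rows = [parse_row(line) for line in TABLE_1.strip().splitlines()]
summary = summarize(rows)

for species, stats in sorted(summary.items()):
    print(
        f"{species}: n={stats['n']}, "
        f"mean depth={stats['mean_depth']:.1f}x, "
        f"mean contigs={stats['mean_contigs']:.1f}, "
        f"mean bases={stats['mean_bases']:,.0f}"
    )
```

The species counts produced this way (two S. boydii, three S. dysenteriae, nine S. flexneri, two S. sonnei) match the counts stated in the text.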
13. Jeannotte R, Lee E, Arabyan N, Kong N, Thao K, Huang BH, Kelly L, Weimer BC. 2014. Optimization of Covaris settings for shearing bacterial genomic DNA by focused ultrasonication and analysis using Agilent 2200 TapeStation. Application note. Agilent Technologies, Santa Clara, CA. http://cn.agilent.com/cs/library/applications/5991-5075EN.pdf.
14. Kong N, Ng W, Foutouhi A, Huang BH, Kelly L, Weimer BC. 2014. Quality control of high-throughput library construction pipeline for KAPA HTP library using an Agilent 2200 TapeStation. Application note. Agilent Technologies, Santa Clara, CA. http://www.agilent.com/cs/library/applications/5991-5141EN.pdf.
15. Kong N, Thao K, Huang C, Appel M, Lappin S, Knapp L, Kelly L, Weimer BC. 2014. Automated library construction using KAPA library preparation kits on the Agilent NGS workstation yields high-quality libraries for whole-genome sequencing on the Illumina platform. Application note. Agilent Technologies, Santa Clara, CA. http://www.agilent.com/cs/library/applications/5991-4296EN.pdf.