fmicb-08-00996 May 31, 2017 Time: 15:54 # 1
PERSPECTIVE
published: 02 June 2017
doi: 10.3389/fmicb.2017.00996
Edited by:
Sabah Bidawid,
Health Canada, Canada
Reviewed by:
Young Min Kwon,
University of Arkansas, United States
Sheng Chen,
Hong Kong Polytechnic University,
Hong Kong
*Correspondence:
Roger C. Levesque
rclevesq@ibis.ulaval.ca
Lawrence Goodridge
lawrence.goodridge@mcgill.ca
These authors have contributed
equally to this work.
Specialty section:
This article was submitted to
Food Microbiology,
a section of the journal
Frontiers in Microbiology
Received: 29 March 2017
Accepted: 17 May 2017
Published: 02 June 2017
Citation:
Emond-Rheault J-G, Jeukens J,
Freschi L, Kukavica-Ibrulj I, Boyle B,
Dupont M-J, Colavecchio A,
Barrere V, Cadieux B, Arya G,
Bekal S, Berry C, Burnett E,
Cavestri C, Chapin TK, Crouse A,
Daigle F, Danyluk MD, Delaquis P,
Dewar K, Doualla-Bell F, Fliss I,
Fong K, Fournier E, Franz E,
Garduno R, Gill A, Gruenheid S,
Harris L, Huang CB, Huang H,
Johnson R, Joly Y, Kerhoas M,
Kong N, Lapointe G, Larivière L,
Loignon S, Malo D, Moineau S,
Mottawea W, Mukhopadhyay K,
Nadon C, Nash J, Ngueng Feze I,
Ogunremi D, Perets A, Pilar AV,
Reimer AR, Robertson J, Rohde J,
Sanderson KE, Song L, Stephan R,
Tamber S, Thomassin P, Tremblay D,
Usongo V, Vincent C, Wang S,
Weadge JT, Wiedmann M,
Wijnands L, Wilson ED, Wittum T,
Yoshida C, Youfsi K, Zhu L,
Weimer BC, Goodridge L and
Levesque RC (2017) A Syst-OMICS
Approach to Ensuring Food Safety
and Reducing the Economic Burden
of Salmonellosis.
Front. Microbiol. 8:996.
doi: 10.3389/fmicb.2017.00996
A Syst-OMICS Approach to Ensuring
Food Safety and Reducing the
Economic Burden of Salmonellosis
Jean-Guillaume Emond-Rheault1, Julie Jeukens1, Luca Freschi1,
Irena Kukavica-Ibrulj1, Brian Boyle1, Marie-Josée Dupont1, Anna Colavecchio2,
Virginie Barrere2, Brigitte Cadieux2, Gitanjali Arya3, Sadjia Bekal4, Chrystal Berry3,
Elton Burnett2, Camille Cavestri5, Travis K. Chapin6, Alanna Crouse2, France Daigle7,
Michelle D. Danyluk6, Pascal Delaquis8, Ken Dewar2,9, Florence Doualla-Bell4,
Ismail Fliss5, Karen Fong10, Eric Fournier4, Eelco Franz11, Rafael Garduno12,
Alexander Gill13, Samantha Gruenheid2, Linda Harris14, Carol B. Huang15,
Hongsheng Huang16, Roger Johnson3, Yann Joly2, Maud Kerhoas7, Nguyet Kong15,
Gisèle Lapointe17, Line Larivière2, Stéphanie Loignon5, Danielle Malo2, Sylvain Moineau5,
Walid Mottawea2,18, Kakali Mukhopadhyay2, Céline Nadon3, John Nash3,
Ida Ngueng Feze2, Dele Ogunremi16, Ann Perets3, Ana V. Pilar2, Aleisha R. Reimer3,
James Robertson3, John Rohde19, Kenneth E. Sanderson20, Lingqiao Song2,
Roger Stephan21, Sandeep Tamber13, Paul Thomassin2, Denise Tremblay5,
Valentine Usongo4, Caroline Vincent4, Siyun Wang10, Joel T. Weadge22,
Martin Wiedmann23, Lucas Wijnands11, Emily D. Wilson22, Thomas Wittum24,
Catherine Yoshida3, Khadija Youfsi4, Lei Zhu2, Bart C. Weimer15, Lawrence Goodridge2*
and Roger C. Levesque1*
1Institute for Integrative and Systems Biology, Université Laval, Québec City, QC, Canada, 2McGill University, Montréal, QC,
Canada, 3National Microbiology Laboratory, Public Health Agency of Canada, Ottawa, ON, Canada, 4Laboratoire de Santé
Publique du Québec, Sainte-Anne-de-Bellevue, QC, Canada, 5Université Laval, Québec City, QC, Canada, 6Institute of
Food and Agricultural Sciences, University of Florida, Gainesville, FL, United States, 7Département de Microbiologie,
Infectiologie et Immunologie, Université de Montréal, Montréal, QC, Canada, 8Agriculture and Agri-Food Canada,
Summerland, BC, Canada, 9Génome Québec Innovation Center, Montréal, QC, Canada, 10 Food Safety Engineering, Faculty
of Land and Food Systems, University of British Columbia, Vancouver, BC, Canada, 11 National Institute for Public Health and
the Environment, Bilthoven, Netherlands, 12 Canadian Food Inspection Agency, Halifax, NS, Canada, 13 Bureau of Microbial
Hazards, Health Canada, Ottawa, ON, Canada, 14 UC Davis Food Science and Technology, Davis, CA, United States,
15 UC Davis School of Veterinary Medicine, Davis, CA, United States, 16 Canadian Food Inspection Agency, Ottawa, ON,
Canada, 17 Food Science, University of Guelph, Guelph, ON, Canada, 18 Department of Microbiology and Immunology,
Faculty of Pharmacy, Mansoura University, Mansoura, Egypt, 19 Department of Microbiology and Immunology, Dalhousie
University, Halifax, NS, Canada, 20 Department of Biological Sciences, University of Calgary, Calgary, AB, Canada, 21 Institute
for Food Safety and Hygiene, University of Zurich, Zurich, Switzerland, 22 Biological and Chemical Sciences, Wilfrid Laurier
University, Waterloo, ON, Canada, 23 Department of Food Science, Cornell University, Ithaca, NY, United States, 24 College of
Veterinary Medicine, The Ohio State University, Columbus, OH, United States
The Salmonella Syst-OMICS consortium is sequencing 4,500 Salmonella genomes and
building an analysis pipeline for the study of Salmonella genome evolution, antibiotic
resistance and virulence genes. Metadata, including phenotypic as well as genomic
data, for isolates of the collection are provided through the Salmonella Foodborne Syst-
OMICS database (SalFoS), at https://salfos.ibis.ulaval.ca/. Here, we present our strategy
and the analysis of the first 3,377 genomes. Our data will be used to draw potential
links between strains found in fresh produce, humans, animals and the environment.
The ultimate goals are to understand how Salmonella evolves over time, improve the
accuracy of diagnostic methods, develop control methods in the field, and identify
prognostic markers for evidence-based decisions in epidemiology and surveillance.
Keywords: Salmonella, foodborne pathogen, next-generation sequencing, bacterial genomics, phylogeny,
antibiotic resistance, database
Frontiers in Microbiology | www.frontiersin.org 1June 2017 | Volume 8 | Article 996
Emond-Rheault et al. The Salmonella Syst-OMICS Project
IMPORTANCE OF FOODBORNE
Salmonella AS A MODEL IN
LARGE-SCALE BACTERIAL GENOMICS
Salmonella enterica is a foodborne bacterial pathogen with at
least 2,600 serotypes (Gal-Mor et al., 2014)¹ that contaminates a
diversity of foods and is a leading cause of foodborne illness
and mortality globally. There are an estimated 93.3
million cases of gastroenteritis due to non-typhoidal Salmonella
infections each year, resulting in approximately 155,000 deaths
(Majowicz et al., 2010). In Canada, non-typhoidal salmonellosis
accounts for more than 88,000 cases of foodborne illness
each year, and has among the highest incidence rates of any
bacterial foodborne pathogen (Thomas et al., 2015). S. enterica is
responsible for more than 50% of fresh produce-borne outbreaks,
the highest number of foodborne outbreaks of any inspected
food commodity in North America (Kozak et al., 2013). Because
of its remarkable genomic diversity, Salmonella is found in
complex environmental and ecological niches and survives in
harsh environments for long periods (Podolak et al., 2010; Fatica
and Schneider, 2011). Several research groups have identified
relationships between some of the 2,557 S. enterica serotypes and
specific foods, which suggests that some food commodities act
as reservoirs for particular serotypes (Kim, 2010; Jackson et al.,
2013; Nuesch-Inderbinen et al., 2015).
Salmonella outbreaks are monitored with support from
the PulseNet surveillance system in 86 countries² (Ribot
and Hise, 2016; Scharff et al., 2016). PulseNet Canada³ is
a national surveillance system used to quickly identify and
respond to foodborne disease outbreaks, centralized at the
National Microbiology Laboratory in Winnipeg, MB, and
working in close collaboration with a network of federal and
provincial public health laboratories and epidemiologists. Still,
despite the availability of thousands of sequenced genomes,
knowledge of genome evolution integrated with transmission and
epidemiology is limited for produce-related outbreaks.
Studies of S. enterica population structure in humans,
animals, food and the environment are central to understanding
the biodiversity, evolution, ecology and epidemiology of this
pathogen. However, studies describing the genetic structure
of Salmonella populations are commonly based on isolates
drawn overwhelmingly from clinical collections (Hoffmann
et al., 2014). This approach has resulted in a limited view
of Salmonella's evolutionary history (D’costa et al., 2006;
Perry and Wright, 2014). In Salmonella, as in many other
bacterial pathogens, there is limited knowledge of how genome
content, rearrangements and the complement of genes, including
those acquired by horizontal gene transfer (HGT), contribute
to strain-specific phenotypes, including virulence (Casadevall,
2017). Various studies have sought to resolve the population
structure of Salmonella using complementary subtyping methods
including pulsed-field gel electrophoresis (PFGE), multiple
loci VNTR analysis (MLVA), 7-gene housekeeping schemes,
whole-genome multi-locus sequence typing (wgMLST) profiles,
pan- and core genome studies, and CRISPR analysis to define
molecular signatures, pathogen subtypes and the potential for
pathogenicity (Shariat and Dudley, 2014; Rouli et al., 2015; Liu
et al., 2016). Next-generation sequencing (NGS) coupled with
whole-genome comparison is well-positioned to become the gold
standard subtyping method, as it offers previously unmatched
resolution for phylogenetic analysis and rapid subtyping during
investigation of food contamination and outbreaks (Ashton et al.,
2016; Bekal et al., 2016).
¹ https://www.cdc.gov/salmonella/reportspubs/salmonella-atlas/serotype-snapshots.html
² http://www.pulsenetinternational.org/networks/usa/
³ https://www.nml-lnm.gc.ca/Pulsenet/index-eng.htm
THE Syst-OMICS STRATEGY
The application of genomics to infectious pathogens via
whole-genome sequencing (WGS) is transforming the practice of
Salmonella diagnostics, epidemiology and surveillance. Genomic
data are increasingly
used to understand infectious disease epidemiology (Didelot
et al., 2017). With rapidly falling costs and turnaround time,
microbial WGS and analysis is becoming a viable strategy to
identify the geographic origin of bacterial pathogens (Weedmark
et al., 2015; Hoffmann et al., 2016). The objective of the
Canadian-based international Syst-OMICS consortium is to
sequence a minimum of 4,500 genomes, include the data in the
Salmonella Foodborne Syst-OMICS database (SalFoS) at
https://SalFoS.ibis.ulaval.ca/, share this information plus available
metadata with Canadian federal and provincial regulators and
the food industry, and develop pipelines to study these genomes.
Genomics data will support molecular epidemiology and source
attribution of outbreaks and have the potential to enable future
genotypic antimicrobial susceptibility testing, as well as the
identification of
novel therapeutic targets and prognostic markers. Moreover, the
large-scale genomics and evolutionary biology tools developed
may lead to new strategies for countering not only Salmonella
infections, but other pathogens as well (Little et al., 2012).
FIGURE 1 | Unrooted maximum likelihood tree of 3,377 Salmonella enterica genomes based on 196,774 SNPs using FastTree 2.1.9 (1000 bootstraps). The six
S. enterica subspecies names and specific epithets are indicated on the upper right tree. S. enterica subspecies enterica is split into two major lineages, clade A and
clade B, as proposed by Timme et al. (2013). The two S. bongori (V) isolates contained in SalFoS were not included in the phylogenetic tree because they
unnecessarily decrease resolution within the S. enterica subspecies. The numbers of genomes within each S. enterica subspecies are 3,235 enterica (2,648 in clade A
and 587 in clade B), 51 houtenae, 32 diarizonae, 28 salamae, 17 indica, 8 arizonae and 6 with unknown subspecies. Tree tips were colored based on the source of
each isolate. The number of isolates represented for each source category is shown in parentheses.
The Syst-OMICS project is based upon a systems approach
(flowchart and screening method available in Supplementary
File 1). First, the genome diversity of 4,500 isolates will be assessed
using high-quality WGS, assembly, annotation and phylogeny.
These data will be used for in silico serotyping (Yoshida et al.,
2016), as well as analysis of virulence (Chen et al., 2012),
antibiotic resistance (Jia et al., 2016) and mobilome gene content
(Lanza et al., 2014). Based on these genomic data, a funnel-type
model will be applied such that 300 isolates will be selected
for in vitro high-throughput screening (HTS) in cell lines to
determine attachment, adhesion, invasion and replication of each
isolate (protocol adapted to 96-well plates from Forest et al.,
2007). From the results, isolates will be categorized as being
of high, medium, or low virulence. A limited number of those
isolates will then be selected for further screening in vivo using a
mouse model (Roy et al., 2007) and in vitro using gastrointestinal
fermenter models (Kheadr et al., 2010; Le Blay et al., 2012).
These data will identify isolates that represent the different levels
of virulence and that will be used to develop novel diagnostic and
control tools. We propose to enhance food safety and lower
the economic burden of salmonellosis through a farm-to-table
systematic approach to control Salmonella, with a focus on
new control methods in agricultural production, more specific
diagnostics and improved bacterial subtyping methods to support
investigation of foodborne outbreaks, as no single intervention is
likely to produce meaningful and lasting effects.
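The funnel-type selection described above (many sequenced isolates, a screened subset sorted into high, medium and low virulence, then a few representatives carried forward) can be illustrated with a minimal Python sketch. The text does not state how the categories are assigned, so the single numeric score, the two cut-off values and the function names below are all hypothetical and serve only to show the shape of the workflow.

```python
from typing import Dict, List


def categorize_virulence(scores: Dict[str, float],
                         low_cut: float,
                         high_cut: float) -> Dict[str, str]:
    """Assign each isolate to 'low', 'medium' or 'high'.

    `scores` maps an isolate ID to one summary score from the cell-line
    screen. The score and both cut-offs are illustrative assumptions.
    """
    if low_cut >= high_cut:
        raise ValueError("low_cut must be smaller than high_cut")
    categories = {}
    for isolate, score in scores.items():
        if score >= high_cut:
            categories[isolate] = "high"
        elif score < low_cut:
            categories[isolate] = "low"
        else:
            categories[isolate] = "medium"
    return categories


def pick_representatives(categories: Dict[str, str],
                         per_category: int) -> Dict[str, List[str]]:
    """Narrow the funnel: keep up to `per_category` isolates per level.

    Isolates are taken in sorted ID order so the result is reproducible.
    """
    picked: Dict[str, List[str]] = {"high": [], "medium": [], "low": []}
    for isolate in sorted(categories):
        level = categories[isolate]
        if len(picked[level]) < per_category:
            picked[level].append(isolate)
    return picked
```

In practice the selection step would likely also weigh phylogenetic position and source, which this sketch ignores.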
THE Salmonella FOODBORNE
Syst-OMICS DATABASE (SalFoS)
The Salmonella Foodborne Syst-OMICS database (SalFoS) is an online web
application that relies on a MySQL 5 database. It was designed not
only to store data for the Salmonella strain collection but also to
provide access to each isolate’s phenotypic, genomic, virulence,
serotype, mobilome and epidemiological data. Different levels
of access may be granted, but data modification is strictly
reserved to the curators. Each record includes isolate identification, host,
provider, date of isolation, geographical origin, phenotypic data,
DNA extraction details, NGS information and genome assembly
statistics. SalFoS currently contains NGS data and unpublished
draft genomes from produce, human, animal and environmental
isolates. Upon publication, draft genomes of SalFoS will become
available at NCBI and EnteroBase⁴.
⁴ http://enterobase.warwick.ac.uk/
The SalFoS collection currently contains 2,498 entries for
Salmonella, as well as for Citrobacter, Hafnia and Proteus,
three genera often identified as false positives by a number of
Salmonella detection schemes. It includes previously described
collections such as the unique Salmonella Genetic Stock Centre
strains, described at http://people.ucalgary.ca/~kesander/. This
collection was assembled with the aim of representing maximal
genomic diversity.
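To make the record structure listed for SalFoS more concrete, the following sketch models a few of those fields as a single table. It is an illustration under stated assumptions, not the actual SalFoS schema: the table name, column names and demo rows are invented, and Python's built-in sqlite3 module stands in for the MySQL 5 server so the example runs without any setup.

```python
import sqlite3

# Hypothetical table covering some of the fields named in the text
# (identification, host, provider, date, origin, assembly statistics).
SCHEMA = """
CREATE TABLE isolate (
    isolate_id        TEXT PRIMARY KEY,
    host              TEXT,
    provider          TEXT,
    isolation_date    TEXT,     -- ISO 8601 date string
    geographic_origin TEXT,
    n_scaffolds       INTEGER,  -- genome assembly statistic
    n50_bp            INTEGER   -- genome assembly statistic
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    rows = [
        ("DEMO-001", "human", "Lab X", "2015-06-01", "QC, Canada", 35, 462000),
        ("DEMO-002", "produce", "Lab Y", "2014-09-12", "FL, United States", 48, 390000),
        ("DEMO-003", "human", "Lab X", "2016-01-20", "ON, Canada", 29, 510000),
    ]
    conn.executemany("INSERT INTO isolate VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def isolates_by_host(conn: sqlite3.Connection, host: str) -> list:
    """Return isolate IDs for one host category, sorted by ID."""
    cur = conn.execute(
        "SELECT isolate_id FROM isolate WHERE host = ? ORDER BY isolate_id",
        (host,),
    )
    return [row[0] for row in cur.fetchall()]
```

Read-only helpers such as `isolates_by_host` mirror the access model described in the text, in which users can query the data while modification is reserved to curators.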
SEQUENCING 4,500 Salmonella
GENOMES: OBJECTIVES AND
STRATEGY
Our working hypothesis is that a very high-quality, large-scale
bacterial genome database available through a user-friendly
pipeline will have a major impact on epidemiology, diagnosis,
prevention and treatment. By generating a comprehensive
genome sequence database truly representative of the
foodborne Salmonella population, we will: (1) assemble a
large and representative strain collection, with associated
genome data, useful for antimicrobial testing, identification of
resistance markers, data mining for new therapeutic targets
and development of machine learning strategies; (2) develop
platforms and pipelines to manage and analyze this information,
which will allow identification of prognostic markers, fast
epidemiological tracking and reduction of socio-economic
costs. We seek to develop user-friendly tools that will enable
epidemiologists, microbiologists, clinicians and others to
interpret genomic data, thus leading to informed decisions in
cases of food contamination and outbreaks. The contamination
of fresh produce by Salmonella will be addressed through
the development of natural solutions to control the presence
of Salmonella on fruits and vegetables as they are growing
in the fields. New tests will also be developed so that fresh
produce can be quickly and efficiently tested for the presence
of Salmonella before being sold to consumers. In the context
of outbreak investigation, the genomic data will be used to
assess high-quality SNPs and core/whole genome MLST for
their usefulness in genetic discrimination in addition to other
emerging methods such as CRISPR and prophage sequence
typing. As for outbreak investigation software, the National
Microbiology Laboratory-Public Health Agency of Canada
group has implemented the Integrated Rapid Infectious
Disease Analysis project (IRIDA)⁵ and developed the SNVPhyl
phylogenomics pipeline that is in use by PulseNet Canada
for microbial genomic epidemiology (Petkau et al., 2016).
A complementary system called the Metagenomics Computation
and Analytics Workbench (MCAW) is being implemented as a
computing service for food safety (Edlund et al., 2016; Weimer
et al., 2016).
FIGURE 2 | The resistome of 3,377 Salmonella strains. Gene, protein or variant presence were determined using the RGI-CARD (Mcarthur et al., 2013). Of 1,003
unique resistomes, only those present in at least five strains are shown; the histogram at the top represents absolute frequency. Other resistomes are condensed in
the “Rare genes” column. AMR genes and variants are grouped by antibiotic family or function. Genes encoding efflux pumps, which are generally conserved, have
been removed for figure clarity. Green: perfect match to a gene in the CARD, red: similar to a gene in the CARD, according to curated homology cut-offs, gray:
perfect match and/or similar to a gene in the CARD, black: no match in the CARD, (wildcard) represents multiple forms (exact number between square brackets) of
the same gene or protein, specific variants conferring resistance (protein variant models).
Sequencing for this project is performed on an Illumina MiSeq
instrument (at the Plateforme d’Analyses Génomiques of the
IBIS, Université Laval, Quebec City, QC, Canada), at a rate
of 120 genomes per week, using 300 bp paired-end libraries,
and with a median coverage of 45×. In order to perform core
genome phylogenetic analysis, the pan-genome, i.e., the complete
repertoire of genes of a species, is determined using recently
developed software capable of handling high-quality NGS data
from thousands of genomes: Saturn V version 1.0⁶ (Jeukens
et al., 2017). Additional analyses focus on genes implicated in
virulence, using comparative genomics to identify confirmed
and predicted virulence factors (Yang et al., 2008), and on resistome
identification based on the Comprehensive Antibiotic Resistance
Database (CARD) (Mcarthur et al., 2013; Jia et al., 2016). A set
of new reference Salmonella genomes representing maximal
genomic diversity among foodborne pathogens will then be
selected for PacBio Sequel sequencing so that each can be fully assembled
and annotated as a single circular chromosome.
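The throughput and coverage figures quoted above can be turned into a short back-of-the-envelope calculation. The sketch below uses the standard relation coverage = (number of reads × read length) / genome size. The genome size of about 4.8 Mb is an assumption (a typical value for S. enterica, not a figure from this article), and the result ignores read trimming and uneven coverage.

```python
import math


def weeks_to_sequence(n_genomes: int, genomes_per_week: int) -> float:
    """Sequencing time at a constant weekly rate."""
    return n_genomes / genomes_per_week


def read_pairs_for_coverage(coverage: int,
                            genome_size_bp: int,
                            read_length_bp: int) -> int:
    """Read pairs needed to reach a target mean coverage.

    Each pair contributes 2 * read_length_bp bases, so
    pairs = coverage * genome_size / (2 * read_length).
    """
    bases_needed = coverage * genome_size_bp
    return math.ceil(bases_needed / (2 * read_length_bp))


# Figures from the text: 4,500 genomes at 120 genomes per week.
weeks = weeks_to_sequence(4500, 120)  # 37.5 weeks

# 45x coverage with 300 bp paired-end reads; 4.8 Mb is an assumed genome size.
pairs = read_pairs_for_coverage(45, 4_800_000, 300)  # 360,000 read pairs
```

Under these assumptions, the full 4,500-genome target corresponds to roughly 37.5 weeks of continuous sequencing at the stated rate.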
⁵ http://dev.irida.ca/
⁶ https://github.com/ejfresch/saturnV
THE IBIS BIOINFORMATICS PIPELINE
FOR GENOME ASSEMBLY
When working with hundreds or thousands of genomes, analysis
software for assembly, annotation, statistics for quality control
and selection of additional reference genomes is required to
extract relevant information in an automated and reliable fashion
with minimal human intervention. Ideally, this software should
be platform independent and able to analyze sequence data
directly without being tied to proprietary data formats. This
ensures maximal flexibility and reduces lag time to a minimum.
We are currently using an integrated pipeline for de novo
assembly of microbial genomes based on the A5 pipeline (Tritt
et al., 2012). It was parallelized on a Silicon Graphics UV 300
using up to 120 cores to accommodate raw data from 120
genomes and provide assembly statistics as well as reference
genome alignment metrics in as little as 2 h. This automated
approach currently results in a median of 35 scaffolds per genome
(median N50 = 462 kb).
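For readers unfamiliar with the assembly statistics quoted above, the N50 is the scaffold length at which the scaffolds of that length or longer together contain at least half of the assembled bases. The sketch below is a generic illustration of that definition, not code from the pipeline; the function name and the toy scaffold lengths are made up.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembly size.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy assembly of 300 bases: 100 + 80 = 180 >= 150, so the N50 is 80.
toy_n50 = n50([100, 80, 50, 30, 20, 20])
```

A higher N50 together with a lower scaffold count generally indicates a more contiguous assembly, which is why both values are reported per genome.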
PHYLOGENY OF Salmonella
Once isolates from a given outbreak are sequenced, patterns of
shared variations can be used to infer which isolates within the
outbreak are most closely related to each other (e.g., Didelot
et al., 2017). As a future strategy for the Syst-OMICS project, this
could be applied to partially sampled and on-going Salmonella
outbreaks. Here, as a first step in the study of S. enterica diversity
and epidemiology, we used 3,377 genomes; 1,627 were from
a collaboration with UC Davis (Bart C. Weimer), and 1,750
were part of SalFoS. All genomes with >100 scaffolds were
eliminated; this filter typically removes the vast majority of low
coverage (i.e., low quality) assemblies and mixed cultures. As
our assembly pipeline also includes alignment on a suite of
reference genomes, it is also possible to ensure that genomes used
belong to S. enterica. The core (conserved) genome was identified
with Saturn V, and consisted of 839 genes, which were used for
phylogenetic analysis. This number of core genes, which seems
small compared to other studies (2,882 core genes for 73 genomes
from 2 subspecies, Leekitcharoenphon et al., 2012), is due to both
the extensive diversity and the high number of genomes used.
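The two steps described above, discarding fragmented assemblies and identifying genes shared by every remaining genome, can be sketched as follows. This is a simplified illustration: it treats the core genome as a strict set intersection of gene identifiers, which is an assumption and not a description of how Saturn V works internally. The genome and gene names are placeholders.

```python
from typing import Dict, List, Set


def filter_assemblies(scaffold_counts: Dict[str, int],
                      max_scaffolds: int = 100) -> List[str]:
    """Keep genomes whose assembly has at most `max_scaffolds` scaffolds.

    The text eliminates genomes with more than 100 scaffolds, so a genome
    with exactly 100 scaffolds is retained.
    """
    return sorted(g for g, n in scaffold_counts.items() if n <= max_scaffolds)


def core_genome(gene_content: Dict[str, Set[str]]) -> Set[str]:
    """Genes present in every genome (strict intersection)."""
    gene_sets = [set(genes) for genes in gene_content.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Placeholder data: g3 is both fragmented and divergent.
scaffold_counts = {"g1": 35, "g2": 100, "g3": 412}
gene_content = {
    "g1": {"geneA", "geneB", "geneC"},
    "g2": {"geneA", "geneB"},
    "g3": {"geneA", "geneD"},
}

kept = filter_assemblies(scaffold_counts)
core_kept = core_genome({g: gene_content[g] for g in kept})
core_all = core_genome(gene_content)
```

The toy data also show why the core shrinks as more, and more diverse, genomes are added: every additional genome can only remove genes from the intersection, never add them.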
As depicted in Figure 1, this population of S. enterica strains
could be divided into seven major groups. They correspond to
S. enterica subspecies enterica clades A and B and a collection
of branching subspecies previously defined as salamae, arizonae,
diarizonae, houtenae and indica. The large number of
strains (3,377) included in our analysis and their wide-ranging
sources (including environmental, human, animal and food) are
essential to understanding the diversity of Salmonella as a foodborne
pathogen and to defining levels of virulence. The remarkable
genomic diversity exhibited in Figure 1 is thought to enable the
colonization of a wide range of ecological niches. The Salmonella
Syst-OMICS consortium will provide fine-scale analysis of this
diversity via virulence factors, antibiotic resistance genes as well
as complete core and accessory genomes.
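The idea that patterns of shared variation indicate which isolates are most closely related can be illustrated with pairwise SNP distances over an alignment of core genome sites. The sketch below is only a distance calculation, not the maximum likelihood method used for Figure 1; the function names and the short toy sequences are made up.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(a: str, b: str) -> int:
    """Count aligned positions where two sequences differ.

    Positions where either sequence has a character other than
    A, C, G or T (for example N or a gap) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b)
               if x != y and x in "ACGT" and y in "ACGT")


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """SNP distance for every pair of isolates, keyed by sorted name pair."""
    names = sorted(alignment)
    return {(p, q): snp_distance(alignment[p], alignment[q])
            for p, q in combinations(names, 2)}


def closest_pair(alignment: Dict[str, str]) -> Tuple[str, str]:
    """Return the pair of isolates separated by the fewest SNPs."""
    distances = pairwise_distances(alignment)
    return min(distances, key=distances.get)


# Toy alignment of three isolates over eight variable sites.
toy = {"iso1": "ACGTACGT", "iso2": "ACGTACGA", "iso3": "TCGAACGA"}
```

In an outbreak setting, pairs separated by very few SNPs are candidates for a shared source, although interpreting such distances requires epidemiological context.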
LINKING SalFoS WITH THE
COMPREHENSIVE ANTIBIOTIC
RESISTANCE DATABASE
The SalFoS database is intended to become an established
platform for searching and comparing multiple genome
sequences for Salmonella isolates. The database will also
incorporate genome annotation and serotype prediction based on
SISTR (Yoshida et al., 2016). Close attention to the links between
specific genomic islands and patterns of SNPs in the core genome
will help identify diagnostic sequences and SNP combinations
for the development of new Salmonella subtyping methods with
the highest resolution to date. This will be done using de novo
island prediction with IslandViewer (Langille and Brinkman,
2009; Dhillon et al., 2015) as well as with gene presence-absence
from Saturn V.
As an additional feature, we routinely determine the resistome
of the genomes in SalFoS, i.e., the genes and variants likely
involved in antibiotic resistance. This is done using the Resistance
Gene Identifier (RGI) available with the CARD (Mcarthur et al.,
2013; Jia et al., 2016), at http://arpcard.mcmaster.ca/. Figure 2
summarizes the resistomes of 3,377 genomes. In fact, the original
dataset contained 1,003 unique resistomes, composed of various
combinations of 195 different genes and variants. Despite this
impressive diversity, the most striking feature shown in Figure 2
is that the two most frequently observed resistomes, which are
extremely similar, account for 23% of the strains. They are
therefore highly conserved and warrant further investigation.
These results will be exploited to study and understand the pool
of resistance genes present in Salmonella strains, with a focus on
strains found in fresh produce, to understand the links between
foodborne Salmonella and environmental strains with respect to
resistance genes.
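The summary statistics reported above (the number of unique resistomes and the share of strains covered by the most frequent ones) follow from treating each strain's resistome as an unordered set of genes and variants. The sketch below shows that counting step on placeholder data; it does not call RGI or CARD, and the strain and gene names are invented.

```python
from collections import Counter
from typing import Dict, Iterable


def resistome_profiles(strain_genes: Dict[str, Iterable[str]]) -> Counter:
    """Count how many strains carry each distinct resistome.

    A resistome is represented as a frozenset so that two strains with the
    same genes in a different order are counted as the same profile.
    """
    return Counter(frozenset(genes) for genes in strain_genes.values())


def top_fraction(profiles: Counter, k: int = 2) -> float:
    """Fraction of strains accounted for by the k most frequent resistomes."""
    total = sum(profiles.values())
    if total == 0:
        raise ValueError("no strains provided")
    top = sum(count for _, count in profiles.most_common(k))
    return top / total


# Placeholder input: five strains, three distinct resistomes.
strains = {
    "s1": ["geneA", "geneB"],
    "s2": ["geneB", "geneA"],
    "s3": ["geneA"],
    "s4": ["geneA", "geneB", "geneC"],
    "s5": ["geneA"],
}
profiles = resistome_profiles(strains)
```

Applied to the real dataset, `len(profiles)` would correspond to the count of unique resistomes and `top_fraction(profiles, 2)` to the proportion of strains carrying the two most common ones.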
LINKING GENOMIC AND CLINICAL DATA
It will be essential to match phenotypic, epidemiological and
available clinical Salmonella data (antibiotic resistance, virulence,
and anonymized clinical observations) to the genomic data
produced. We will categorize metadata in SalFoS so that isolates
can be sorted by phenotype, allowing rapid identification of
linked genomic signatures and the development of prognostic
approaches for diagnostics, epidemiology and surveillance.
We will develop tools to rapidly collate data for a given
strain type and produce a concise phenotypic and clinical
profile that provides users with an evidence-based decision-
making platform. The Canadian Food Inspection Agency,
Health Canada, Agriculture Canada, provincial public health
laboratories and the National Microbiology Laboratory-Public
Health Agency of Canada group are expected to be end-users of
the project's outcomes.
FUTURE GENOMIC AND BIOLOGICAL
STUDIES OF Salmonella
We will continuously improve SalFoS by adding Salmonella
strains, NGS data and analysis as well as experimental results.
Another aim of the Syst-OMICS consortium is to avoid
duplication of efforts in Salmonella genomics and enhance
interest from researchers having common goals. Additional
members are welcome to join in and expand on our original
Genome Canada project. We also intend to seek collaboration
with other groups to connect our database with those
developed for other Salmonella genomes. Finally, the Salmonella
Syst-OMICS project could be a model for other groups interested
in the bacterial genomics of infectious diseases, a strategy that
we are also pursuing for Pseudomonas aeruginosa (Freschi et al.,
2015).
AUTHOR CONTRIBUTIONS
J-GER, JJ, LF, IK-I and RL collected strains, performed the
analyses and drafted the manuscript. BB provided support for
sequencing and analysis. MD contributed to the development of
SalFoS. All other authors handled strains and collected metadata.
All authors revised the manuscript.
ACKNOWLEDGMENTS
We express our gratitude to members of the genomics analysis
and bioinformatics platforms at IBIS. We also acknowledge
Betty Wilkie, Ketna Mistry, Robert Holtslander and Shaun
Kenaghan from the NML Salmonella reference laboratory for
their assistance with serotyping. RL, LG, AG, ST, PD, DM,
SG, SB, FD, SW, SM, GL, INF, YJ, PT, CN, RG, JoR, JW are
funded by Genome Canada, provincial genome centers Génome
Québec and Genome BC, and the Ontario Ministry of Research
and Innovation. SM holds a Tier 1 Canada Research Chair in
Bacteriophages.
SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found
online at: http://journal.frontiersin.org/article/10.3389/fmicb.2017.00996/full#supplementary-material
REFERENCES
Ashton, P. M., Nair, S., Peters, T. M., Bale, J. A., Powell, D. G., Painset, A., et al.
(2016). Identification of Salmonella for public health surveillance using whole
genome sequencing. PeerJ 4:e1752. doi: 10.7717/peerj.1752
Bekal, S., Berry, C., Reimer, A. R., Van Domselaar, G., Beaudry, G., Fournier, E.,
et al. (2016). Usefulness of high-quality core genome single-nucleotide variant
analysis for subtyping the highly clonal and the most prevalent Salmonella
enterica serovar heidelberg clone in the context of outbreak investigations.
J. Clin. Microbiol. 54, 289–295. doi: 10.1128/JCM.02200-15
Casadevall, A. (2017). The pathogenic potential of a microbe. mSphere 2:e00015-17.
doi: 10.1128/mSphere.00015-17
Chen, L., Xiong, Z., Sun, L., Yang, J., and Jin, Q. (2012). VFDB 2012 update: toward
the genetic diversity and molecular evolution of bacterial virulence factors.
Nucleic Acids Res. 40, D641–D645. doi: 10.1093/nar/gkr989
D’costa, V. M., Mcgrann, K. M., Hughes, D. W., and Wright, G. D. (2006). Sampling
the antibiotic resistome. Science 311, 374–377. doi: 10.1126/science.1120800
Dhillon, B. K., Laird, M. R., Shay, J. A., Winsor, G. L., Lo, R., Nizam, F., et al.
(2015). IslandViewer 3: more flexible, interactive genomic island discovery,
visualization and analysis. Nucleic Acids Res. 43, 27. doi: 10.1093/nar/gkv401
Didelot, X., Fraser, C., Gardy, J., and Colijn, C. (2017). Genomic infectious disease
epidemiology in partially sampled and ongoing outbreaks. Mol. Biol. Evol. 34,
997–1007. doi: 10.1093/molbev/msw275
Edlund, S. B., Beck, K. L., Haiminen, N., Parida, L. P., Storey, D. B., Weimer,
B. C., et al. (2016). Design of the MCAW compute service for food safety
bioinformatics. IBM J. Res. Dev. 60:12. doi: 10.1147/JRD.2016.2584798
Fatica, M. K., and Schneider, K. R. (2011). Salmonella and produce: survival in
the plant environment and implications in food safety. Virulence 2, 573–579.
doi: 10.4161/viru.2.6.17880
Forest, C., Faucher, S. P., Poirier, K., Houle, S., Dozois, C. M., and Daigle, F.
(2007). Contribution of the stg fimbrial operon of Salmonella enterica serovar
typhi during interaction with human cells. Infect. Immun. 75, 5264–5271.
doi: 10.1128/iai.00674-07
Freschi, L., Jeukens, J., Kukavica-Ibrulj, I., Boyle, B., Dupont, M. J., Laroche, J.,
et al. (2015). Clinical utilization of genomics data produced by the international
Pseudomonas aeruginosa consortium. Front. Microbiol. 6:1036.
doi: 10.3389/fmicb.2015.01036
Gal-Mor, O., Boyle, E. C., and Grassl, G. A. (2014). Same species, different diseases:
how and why typhoidal and non-typhoidal Salmonella enterica serovars differ.
Front. Microbiol. 5:391. doi: 10.3389/fmicb.2014.00391
Hoffmann, M., Luo, Y., Monday, S. R., Gonzalez-Escalona, N., Ottesen, A. R.,
Muruvanda, T., et al. (2016). Tracing origins of the Salmonella bareilly strain
causing a food-borne outbreak in the United States. J. Infect. Dis. 213, 502–508.
doi: 10.1093/infdis/jiv297
Hoffmann, M., Zhao, S., Pettengill, J., Luo, Y., Monday, S. R., Abbott, J., et al.
(2014). Comparative genomic analysis and virulence differences in closely
related Salmonella enterica serotype Heidelberg isolates from humans, retail
meats, and animals. Genome Biol. Evol. 6, 1046–1068. doi: 10.1093/gbe/evu079
Jackson, B. R., Griffin, P. M., Cole, D., Walsh, K. A., and Chai, S. J. (2013).
Outbreak-associated Salmonella enterica serotypes and food commodities,
United States, 1998–2008. Emerg. Infect. Dis. 19, 1239–1244.
doi: 10.3201/eid1908.121511
Jeukens, J., Freschi, L., Vincent, A. T., Emond-Rheault, J. G., Kukavica-Ibrulj, I.,
Charette, S. J., et al. (2017). A pan-genomic approach to understand the basis of
host adaptation in Achromobacter. Genome Biol. Evol. doi: 10.1093/gbe/evx061
[Epub ahead of print].
Jia, B., Raphenya, A. R., Alcock, B., Waglechner, N., Guo, P., Tsang, K. K.,
et al. (2016). CARD 2017: expansion and model-centric curation of
the comprehensive antibiotic resistance database. Nucleic Acids Res. 45,
D566–D573. doi: 10.1093/nar/gkw1004
Kheadr, E., Zihler, A., Dabour, N., Lacroix, C., Le Blay, G., and Fliss, I. (2010).
Study of the physicochemical and biological stability of pediocin PA-1 in the
upper gastrointestinal tract conditions using a dynamic in vitro model. J. Appl.
Microbiol. 109, 54–64. doi: 10.1111/j.1365-2672.2009.04644.x
Kim, S. (2010). Salmonella serovars from foodborne and waterborne diseases in
Korea, 1998–2007: total isolates decreasing versus rare serovars emerging. J. Kor.
Med. Sci. 25, 1693–1699. doi: 10.3346/jkms.2010.25.12.1693
Kozak, G. K., Macdonald, D., Landry, L., and Farber, J. M. (2013). Foodborne
outbreaks in Canada linked to produce: 2001 through 2009. J. Food Prot. 76,
173–183. doi: 10.4315/0362-028X.JFP-12-126
Langille, M. G. I., and Brinkman, F. S. L. (2009). IslandViewer: an integrated
interface for computational identification and visualization of genomic islands.
Bioinformatics 25, 664–665. doi: 10.1093/bioinformatics/btp030
Lanza, V. F., De Toro, M., Garcillan-Barcia, M. P., Mora, A., Blanco, J., Coque,
T. M., et al. (2014). Plasmid flux in Escherichia coli ST131 sublineages, analyzed
by plasmid constellation network (PLACNET), a new method for plasmid
reconstruction from whole genome sequences. PLoS Genet. 10:e1004766.
doi: 10.1371/journal.pgen.1004766
Le Blay, G., Hammami, R., Lacroix, C., and Fliss, I. (2012). Stability and inhibitory
activity of pediocin PA-1 against Listeria sp. in simulated physiological
conditions of the human terminal ileum. Probiotics Antimicrob. Proteins 4,
250–258. doi: 10.1007/s12602-012-9111-1
Leekitcharoenphon, P., Lukjancenko, O., Friis, C., Aarestrup, F. M., and Ussery,
D. W. (2012). Genomic variation in Salmonella enterica core genes for
epidemiological typing. BMC Genomics 13:88. doi: 10.1186/1471-2164-13-88
Little, T. J., Allen, J. E., Babayan, S. A., Matthews, K. R., and Colegrave, N. (2012).
Harnessing evolutionary biology to combat infectious disease. Nat. Med. 18,
217–220. doi: 10.1038/nm.2572
Liu, Y. Y., Chen, C. C., and Chiou, C. S. (2016). Construction of a pan-genome allele
database of Salmonella enterica serovar Enteritidis for molecular subtyping and
disease cluster identification. Front. Microbiol. 7:2010.
doi: 10.3389/fmicb.2016.02010
Majowicz, S. E., Musto, J., Scallan, E., Angulo, F. J., Kirk, M., O’Brien, S. J., et al.
(2010). The global burden of nontyphoidal Salmonella gastroenteritis. Clin.
Infect. Dis. 50, 882–889. doi: 10.1086/650733
McArthur, A. G., Waglechner, N., Nizam, F., Yan, A., Azad, M. A., Baylay,
A. J., et al. (2013). The comprehensive antibiotic resistance database.
Antimicrob. Agents Chemother. 57, 3348–3357.
doi: 10.1128/AAC.00419-13
Nuesch-Inderbinen, M., Cernela, N., Althaus, D., Hachler, H., and Stephan, R.
(2015). Salmonella enterica serovar Szentes, a rare serotype causing a 9-month
outbreak in 2013 and 2014 in Switzerland. Foodborne Pathog. Dis. 12, 887–890.
doi: 10.1089/fpd.2015.1996
Perry, J. A., and Wright, G. D. (2014). Forces shaping the antibiotic resistome.
Bioessays 36, 1179–1184. doi: 10.1002/bies.201400128
Petkau, A., Mabon, P., Sieffert, C., Knox, N., Cabral, J., Iskander, M., et al. (2016).
SNVPhyl: a single nucleotide variant phylogenomics pipeline for microbial
genomic epidemiology. bioRxiv 092940. doi: 10.1101/092940
Podolak, R., Enache, E., Stone, W., Black, D. G., and Elliott, P. H. (2010).
Sources and risk factors for contamination, survival, persistence, and
heat resistance of Salmonella in low-moisture foods. J. Food Prot. 73,
1919–1936.
Ribot, E. M., and Hise, K. B. (2016). Future challenges for tracking foodborne
diseases: PulseNet, a 20-year-old US surveillance system for foodborne diseases,
is expanding both globally and technologically. EMBO Rep. 17, 1499–1505.
doi: 10.15252/embr.201643128
Rouli, L., Merhej, V., Fournier, P. E., and Raoult, D. (2015). The bacterial
pangenome as a new tool for analysing pathogenic bacteria. New Microbes New
Infect. 7, 72–85. doi: 10.1016/j.nmni.2015.06.005
Roy, M.-F., Riendeau, N., Bédard, C., Hélie, P., Min-Oo, G., Turcotte, K.,
et al. (2007). Pyruvate kinase deficiency confers susceptibility to Salmonella
Typhimurium infection in mice. J. Exp. Med. 204, 2949–2961.
doi: 10.1084/jem.20062606
Scharff, R. L., Besser, J., Sharp, D. J., Jones, T. F., Gerner-Smidt, P., and Hedberg, C. W.
(2016). An economic evaluation of PulseNet. Am. J. Prev. Med. 50, S66–S73.
doi: 10.1016/j.amepre.2015.09.018
Shariat, N., and Dudley, E. G. (2014). CRISPRs: molecular signatures used for
pathogen subtyping. Appl. Environ. Microbiol. 80, 430–439.
doi: 10.1128/AEM.02790-13
Thomas, M. K., Murray, R., Flockhart, L., Pintar, K., Fazil, A., Nesbitt, A., et al.
(2015). Estimates of foodborne illness–related hospitalizations and deaths in
Canada for 30 specified pathogens and unspecified agents. Foodborne Pathog.
Dis. 12, 820–827. doi: 10.1089/fpd.2015.1966
Timme, R. E., Pettengill, J. B., Allard, M. W., Strain, E., Barrangou, R., Wehnes, C.,
et al. (2013). Phylogenetic diversity of the enteric pathogen Salmonella enterica
subsp. enterica inferred from genome-wide reference-free SNP characters.
Genome Biol. Evol. 5, 2109–2123. doi: 10.1093/gbe/evt159
Tritt, A., Eisen, J. A., Facciotti, M. T., and Darling, A. E. (2012). An integrated
pipeline for de novo assembly of microbial genomes. PLoS ONE 7:e42304.
doi: 10.1371/journal.pone.0042304
Weedmark, K. A., Mabon, P., Hayden, K. L., Lambert, D., Van Domselaar, G.,
Austin, J. W., et al. (2015). Clostridium botulinum Group II isolate
phylogenomic profiling using whole-genome sequence data. Appl. Environ.
Microbiol. 81, 5938–5948. doi: 10.1128/aem.01155-15
Weimer, B. C., Storey, D. B., Elkins, C. A., Baker, R. C., Markwell, P., Chambliss,
D. D., et al. (2016). Defining the food microbiome for authentication, safety,
and process management. IBM J. Res. Dev. 60:13.
doi: 10.1147/JRD.2016.2582598
Yang, J., Chen, L., Sun, L., Yu, J., and Jin, Q. (2008). VFDB 2008 release: an
enhanced web-based resource for comparative pathogenomics. Nucleic Acids
Res. 36, D539–D542. doi: 10.1093/nar/gkm951
Yoshida, C. E., Kruczkiewicz, P., Laing, C. R., Lingohr, E. J., Gannon, V. P.,
Nash, J. H., et al. (2016). The Salmonella in silico typing resource (SISTR):
an open web-accessible tool for rapidly typing and subtyping draft Salmonella
genome assemblies. PLoS ONE 11:e0147101.
doi: 10.1371/journal.pone.0147101
Conflict of Interest Statement: The handling Editor declared a shared affiliation,
though no other collaboration, with the authors ST and AG, and the handling
Editor states that the process met the standards of a fair and objective review.
The other authors declare that the research was conducted in the absence of any
commercial or financial relationships that could be construed as a potential conflict
of interest.
Copyright © 2017 Emond-Rheault, Jeukens, Freschi, Kukavica-Ibrulj, Boyle, Dupont,
Colavecchio, Barrere, Cadieux, Arya, Bekal, Berry, Burnett, Cavestri, Chapin,
Crouse, Daigle, Danyluk, Delaquis, Dewar, Doualla-Bell, Fliss, Fong, Fournier,
Franz, Garduno, Gill, Gruenheid, Harris, Huang, Huang, Johnson, Joly, Kerhoas,
Kong, Lapointe, Larivière, Loignon, Malo, Moineau, Mottawea, Mukhopadhyay,
Nadon, Nash, Ngueng Feze, Ogunremi, Perets, Pilar, Reimer, Robertson, Rohde,
Sanderson, Song, Stephan, Tamber, Thomassin, Tremblay, Usongo, Vincent, Wang,
Weadge, Wiedmann, Wijnands, Wilson, Wittum, Yoshida, Youfsi, Zhu, Weimer,
Goodridge and Levesque. This is an open-access article distributed under the terms
of the Creative Commons Attribution License (CC BY). The use, distribution or
reproduction in other forums is permitted, provided the original author(s) or licensor
are credited and that the original publication in this journal is cited, in accordance
with accepted academic practice. No use, distribution or reproduction is permitted
which does not comply with these terms.
... In Africa, Escherichia coli and Salmonella sp. have the highest prevalence in fresh produce (Paudyal et al., 2017). Salmonella is a genus of rodshaped, Gram negative and facultative anaerobic bacteria that contaminates diversity of foods from different environmental sources and a leading cause of foodborne diseases (Emond-Rheault et al., 2017). Diseases due to consumption of Salmonella-contaminated foods are important causes of morbidity and mortality, and a significant contributor to socio-economic challenges (WHO, 2015). ...
... This study revealed the presence of Salmonella in 71.43% (30/42) of the fresh produce samples. Although no African data is available, Salmonella was reported to be responsible for more than 50% of fresh produce-borne outbreaks in North Africa (Emond-Rheault et al., 2017). A study on green leafy lettuce in Thailand, with good agricultural practices and effective safety control programs reported 23.33% of samples to be positive for Salmonella (Chanseyha et al., 2018). ...
... This suggests the unhygienic distribution conditions and the persistence of Salmonella, which is capable of surviving and proliferating in diverse environment, including conditions of transportation, storage and retailing (Akhtar et al., 2010). Salmonella survives in complex ecological niches and harsh environments over a long period (Emond-Rheault et al., 2017). ...
Article
Full-text available
This study investigated the microbial load of fresh produce retailed in Umuahia, Nigeria, and assessed the prevalence and antibiotic resistance of Salmonella. The loads of bacteria and coliforms as well as presence of Salmonella in 42 fresh produce samples were determined by standard microbiological methods. Antimicrobial susceptibility profile of the Salmonella isolates was determined using disc diffusion assay, while the multidrug resistant isolates were assessed for tolerance to different concentrations of acetic acid (0.5-5%) and NaCl (1-5%). The total bacterial and coliform counts ranged from 7.42 to 8.59 and 4.75 to 6.53 log 10 CFUg-1 , respectively. Salmonella was detected in 30 (71.43%) samples. All 24 Salmonella isolates (100%) were resistant to amoxicillin, augmentin, cefuxoxime, cefuroxime and ceftazidime, while absolute susceptibility (100%) was only recorded for ofloxacin. Nine resistance patterns were demonstrated by the isolates, being resistant to at least 5/10 antibiotics and at most 8/10 antibiotics. All selected multidrug resistant isolates except Salmonella sp. cror1 survived in 5% NaCl, while no growth was observed for 7/8 isolates in 1.5% acetic acid. The high prevalence of Salmonella in retailed fresh produce and high frequency of multidrug resistance amongst the isolates suggest the need for increased awareness about hygienic practices and effective regulation of medically important antimicrobials.
... Tüm bu omik araçlar, söz konusu patojenin konakçı-patojen etkileşimleri, RNA düzenleyici mekanizmaları ve strese adaptasyon süreçleri gibi temel biyolojik mekanizmalarının incelenmesi ve yorumlanması için kapsamlı bir analiz imkânı sağlamaktadır. Ayrıca; bu platformlar özellikle Listeria monocytogenes gibi ölümcül türlerin anlaşılmasına katkı sağlayarak, gıda güvenliği ve halk sağlığı alanında önemli bir perspektif sunmaktadır (Emond-Rheault et al., 2017;Becavin et al., 2017). ...
... Bu amaç çerçevesinde genomik çeşitlilik, antibiyotik direnç ve virülans ile ilgili genomik ve fenotipik meta verileri içeren bir analiz ağıyla veritabanı oluşturulmuştur. Projenin hedefleri arasında, tanı yöntemlerinin doğruluğunu artırmak, gözetim ve epidemiyolojiye yardımcı olacak prognostik belirteçler tanımlamak ve ayrıca sahada kontrol yöntemleri geliştirmek gibi önemli amaçlar bulunmaktadır (Emond-Rheault et al., 2017). ...
... Three Salmonella enterica serotype Enteritidis strains, S3 (isolated from Human), S187 (leafy greens), and S5-483 (Human), were collected from the Salmonella Foodborne Syst-Omics database [38]. Phage SF1 was isolated from cattle feces collected in Greater Vancouver, British Columbia, Canada, using strain S5-483 as a host. ...
... From evolved bacterial isolates, various mutation types (e.g., frameshift, point, and nonsense mutations) were recorded (Fig. 4A, Tables S5, S6, and S7). S187 R isolates had the highest number (107) of mutations, followed by S187 PR group (38). Much fewer mutations were found in S3 PR (8), S3 R (11), S5-483 PR (7), and S5-483 R (12) isolates. ...
Article
Full-text available
Parasite–host co-evolution results in population extinction or co-existence, yet the factors driving these distinct outcomes remain elusive. In this study, Salmonella strains were individually co-evolved with the lytic phage SF1 for 30 days, resulting in phage extinction or co-existence. We conducted a systematic investigation into the phenotypic and genetic dynamics of evolved host cells and phages to elucidate the evolutionary mechanisms. Throughout co-evolution, host cells displayed diverse phage resistance patterns: sensitivity, partial resistance, and complete resistance, to wild-type phage. Moreover, phage resistance strength showed a robust linear correlation with phage adsorption, suggesting that surface modification-mediated phage attachment predominates as the resistance mechanism in evolved bacterial populations. Additionally, bacterial isolates eliminating phages exhibited higher mutation rates and lower fitness costs in developing resistance compared to those leading to co-existence. Phage resistance genes were classified into two categories: key mutations, characterized by nonsense/frameshift mutations in rfaH-regulated rfb genes, leading to the removal of the receptor O-antigen; and secondary mutations, which involve less critical modifications, such as fimbrial synthesis and tRNA modification. The accumulation of secondary mutations resulted in partial and complete resistance, which could be overcome by evolved phages, whereas key mutations conferred undefeatable complete resistance by deleting receptors. In conclusion, higher key mutation frequencies with lower fitness costs promised strong resistance and eventual phage extinction, whereas deficiencies in fitness cost, mutation rate, and key mutation led to co-existence. Our findings reveal the distinct population dynamics and evolutionary trade-offs of phage resistance during co-evolution, thereby deepening our understanding of microbial interactions.
... To that end, the Salmonella Foodborne Syst-OMICS research database (SalFoS, https://salfos.ibis.ulaval.ca/, accessed on 30 December 2021), which was developed as part of a project on the diagnostics, surveillance, and control of Salmonella, contains a total of 3143 draft genomes of Salmonella (as of December 2021) and is a very useful repository of Salmonella genomes to identify biomarkers for surveillance and food outbreak investigations and to facilitate the development of tools to control salmonellosis [10]. In addition, the bioinformatics analysis of 500 Salmonella genomes completed by Rakov and colleagues (2019) led to the identification of 70 allelic variants of in silico expressed VGs associated with pathogenesis either manifesting as gastrointestinal or invasive disease [11]. ...
... The second group was made of 34 Salmonella strains sourced from the SalFoS database (SalFoS; https://salfos.ibis.ulaval. ca/, accessed on 27 January 2020) [10], which was developed as part of the Genome Canada Salmonella Foodborne Syst-OMICS project and was provided by Dr. R. C. Levesque (Laval University, Quebec City, QC, Canada). The serovar designations of the strains belonging to the second group as well as the metadata of the organisms were not provided at the time of testing (blind testing), which allowed us to infer the serovar designation using the AmpliSeq Salm_227VG procedure (see below), and the results were compared with the widely used Salmonella genome serovar designation tool-the Salmonella In Silico Typing Resource (SISTR) [17]. ...
Article
Full-text available
We have developed a targeted, amplicon-based next-generation sequencing method to detect and analyze 227 virulence genes (VG) of Salmonella (AmpliSeqSalm_227VG) for assessing the pathogenicity potential of Salmonella. The procedure was developed using 80 reference genomes representing 75 epidemiologically-relevant serovars associated with human salmonellosis. We applied the AmpliSeqSalm_227VG assay to (a) 35 previously characterized field strains of Salmonella consisting of serovars commonly incriminated in foodborne illnesses and (b) 34 Salmonella strains with undisclosed serological or virulence attributes, and were able to divide Salmonella VGs into two groups: core VGs and variable VGs. The commonest serovars causing foodborne illnesses such as Enteritidis, Typhimurium, Heidelberg and Newport had a high number of VGs (217–227). In contrast, serovars of subspecies not commonly associated with human illnesses, such as houtenae, arizonae and salame, tended to have fewer VGs (177–195). Variable VGs were not only infrequent but, when present, displayed considerable sequence variation: safC, sseL, sseD, sseE, ssaK and stdB showed the highest variation and were linked to strain pathogenicity. In a chicken infection model, VGs belonging to rfb and sse operons showed differences and were linked with pathogenicity. The high-throughput, targeted NGS-based AmpliSeqSalm_227VG procedure provided previously unknown information about variation in select virulence genes that can now be applied to a much larger population of Salmonella for evaluating pathogenicity of various serovars of Salmonella and for risk assessment of foodborne salmonellosis.
... This could be partly due to the laborious nature of detection techniques, which include plaque assays followed by examination under a transmission electron microscope (TEM) to identify "bulb-like" baseplate structures at the base of phage tails indicative of TSPs (Bhandare et al., 2024;Knecht et al., 2020). The decreasing costs of sequencing and the availability of improved bioinformatics tools have facilitated the construction of large-scale genome and metagenome datasets (Emond-Rheault et al., 2017;Wattam et al., 2014). High-throughput in silico detection of TSP-encoding genes in genomic data would not only provide further details regarding the diversity of TSPs in virulent phages but could also be used to identify TSPs in prophages. ...
... as a dataset. SalFoS contains genomic sequences from 2850 diverse Salmonella isolates from the environment, plant, and animal food products, as well as from human infections 25 . ...
Article
Full-text available
The bacterial genus Salmonella includes diverse isolates with multiple variations in the structure of the main polysaccharide component (O antigen) of membrane lipopolysaccharides. In addition, some isolates produce a transient (T) antigen, such as the T1 polysaccharide identified in the 1960s in an isolate of Salmonella enterica Paratyphi B. The structure and biosynthesis of the T1 antigen have remained enigmatic. Here, we use biophysical, biochemical and genetic methods to show that the T1 antigen is a complex linear glycan containing tandem homopolymeric domains of galactofuranose and ribofuranose, linked to lipid A–core, like a typical O antigen. T1 is a phase-variable antigen, regulated by recombinational inversion of the promoter upstream of the T1 genetic locus through a mechanism not observed for other bacterial O antigens. The T1 locus is conserved across many Salmonella isolates, but is mutated or absent in most typhoidal serovars and in serovar Enteritidis.
... The province of Ontario had 18.7 cases per 100,000 persons in 2018 [2]; this figure is slightly lower than the national incidence rate of 19.3 cases per 100,000 persons in the same year [2]. With considerations for lost work, medical care, and economic losses to food companies and restaurants, the estimated economic burden of salmonellosis in Canada is CAD 1 billion annually [3]. ...
Article
Full-text available
This study’s goal was to determine the prevalence, temporal trends, seasonal patterns, and temporal clustering of Salmonella enterica isolated from environmental samples from Ontario’s poultry breeding flocks between 2009 and 2018. Clusters of common serovars and those of human health concern were identified using a scan statistic. The period prevalence of S. enterica was 25.3% in broiler breeders, 6.4% in layer breeders, and 28.6% in turkey breeders. An overall decreasing trend in S. enterica prevalence was identified in broiler breeders (from 27.8% in 2009 to 22.1% in 2018) and layer breeders (from 15.4% to 4.9%), while an increasing trend was identified in turkey breeders (from 12.0% to 24.5%). The most common serovars varied by commodity. Among broiler breeders, S. enterica serovars Kentucky (42.4% of 682 submissions), Heidelberg (19.2%), and Typhimurium (5.4%) were the most common. Salmonella enterica serovars Thompson (20.0% of 195 submissions) and Infantis (16.4%) were most common among layer breeders, and S. enterica serovars Schwarzengrund (23.6% of 1368 submissions), Senftenberg (12.9%), and Heidelberg and Uganda (9.6% each) were most common among turkey breeders. Salmonella enterica ser. Enteritidis prevalence was highest in submissions from broiler breeders (3.7% of 682 broiler breeder submissions). Temporal clusters of S. enterica serovars were identified for all poultry commodities. Seasonal effects varied by commodity, with most peaks occurring in the fall. Our study provides information on the prevalence and temporality of S. enterica serovars within Ontario’s poultry breeder flocks that might guide prevention and control programs at the breeder level.
Article
Full-text available
Non-Typhoidal Salmonella (NTS) is one of the most common foodborne pathogens worldwide, with poultry products being the major vehicle for pathogenesis in humans. The use of bacteriophage (phage) cocktails has recently emerged as a novel approach to enhancing food safety. Here, a multi-receptor Salmonella phage cocktail of five phages was developed and characterized. The cocktail targets four receptors: O-antigen, BtuB, OmpC, and rough Salmonella strains. Structural analysis indicated that all five phages belong to unique families or subfamilies. Genome analysis of four of the phages showed they were devoid of known virulence or antimicrobial resistance factors, indicating enhanced safety. The phage cocktail broad antimicrobial spectrum against Salmonella, significantly inhibiting the growth of all 66 strains from 20 serovars tested in vitro. The average bacteriophage insensitive mutant (BIM) frequency against the cocktail was 6.22×10−6 in S. Enteritidis, significantly lower than that of each of the individual phages. The phage cocktail reduced the load of Salmonella in inoculated chicken skin by 3.5 log10 CFU/cm2 after 48 hours at 25 and 15°C, and 2.5 log10 CFU/cm2 at 4°C. A genome-wide transduction assay was used to investigate the transduction efficiency of the selected phage in the cocktail. Only one of the four phages tested could transduce the kanamycin resistance cassette at a low frequency comparable to that of phage P22. Overall, the results support the potential of cocktails of phage that each target different host receptors to achieve complementary infection and reduce the emergence of phage resistance during biocontrol applications.
Article
Full-text available
Salmonella enterica is a zoonotic pathogen and a leading cause of foodborne gastroenteritis in humans. Here, we report the draft genome sequences of two Salmonella Uzaramo isolates, which were isolated from poultry organs during routine post-mortem examination in South Africa. Currently, whole-genome sequences on Salmonella Uzaramo are scanty.
Article
Full-text available
Accurate detection of all Salmonella serovars present in a sample is important in surveillance programs. Current detection protocols are limited to detection of a predominant serovar, missing identification of less abundant serovars in a sample. An alternative method, called CRISPR-SeroSeq, serotyping by sequencing of amplified CRISPR spacers, was employed to detect multiple serovars in a sample without the need of culture isolation. The CRISPR-SeroSeq method successfully detected 34 most frequently reported Salmonella serovars in pure cultures and target serovars at 104 CFU/mL in 27 Salmonella-negative environmental enrichment samples post-spiked with one of 15 different serovars, plus 2 additional serovars at 1 log CFU/mL higher abundance. When the method was applied to 442 naturally contaminated environmental samples collected from 192 poultry farms, 25 different serovars were detected from 430 of the samples. In 73.1% of the samples, 2 to 7 serovars were detected, with Salmonella Kiambu (55.7%), Salmonella Infantis (48.4%), Salmonella Kentucky (27.1%), Salmonella Livingstone (26.6%), and Salmonella Mbandaka/Montevideo (23.4%) being the most prevalent on the farms. Single isolates from 384 samples were also analyzed using a traditional serotyping method, and the same serovar identified by culture was detected by CRISPR-SeroSeq in 96.1% (369/384) of samples, with the former missing detection of additional and sometimes critical serovars. The surveillance data obtained via CRISPR-SeroSeq revealed a significant emergence of Salmonella Kiambu and Salmonella Rissen on poultry farms in Ontario. The results highlight the effectiveness of the CRISPR-SeroSeq approach in detecting multiple Salmonella serovars in poultry environmental samples under applied conditions, providing updated surveillance information on Salmonella serovars on poultry farms in Ontario. 
IMPORTANCE The CRISPR-SeroSeq method represents an alternative molecular tool to the traditional culture-based serotyping method that can detect multiple Salmonella serovars in a sample and provide rapid serovar results without the need of selective enrichment and culture isolation. The evaluation results can facilitate implementation of the method in routine Salmonella surveillance on poultry farms and in outbreak investigations. The application of the method can increase the accuracy of current serovar prevalence information. The results highlight the effectiveness of the validated method and the need for monitoring Salmonella serovars in poultry environments to improve current surveillance programs. The updated surveillance data provide timely information on emergence of different Salmonella serovars on poultry farms in Ontario and support on-farm risk assessment and risk management of Salmonella.
Preprint
Full-text available
Motivation The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research lab and into the front lines of surveillance and outbreak response requires user-friendly, reproducible, and scalable pipelines that have been well validated. Results SNVPhyl (Single Nucleotide Variant Phylogenomics) is a bioinformatics pipeline for identifying high-quality SNVs and constructing a whole genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible, and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity and identify and remove regions of high SNV density (indicative of recombination). SNVPhyl is able to correctly distinguish outbreak from non-outbreak isolates across a range of variant-calling settings, sequencing-coverage thresholds, or in the presence of contamination. Availability SNVPhyl is available as a Galaxy workflow, Docker and virtual machine images, and a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy .
Article
Full-text available
Over the past decade, there has been a rising interest in Achromobacter sp., an emerging opportunistic pathogen responsible for nosocomial and cystic fibrosis (CF) lung infections. Species of this genus are ubiquitous in the environment, can outcompete resident microbiota, and are resistant to commonly used disinfectants as well as antibiotics. Nevertheless, the Achromobacter genus suffers from difficulties in diagnosis, unresolved taxonomy and limited understanding of how it adapts to the CF lung, not to mention other host environments. The goals of this first genus-wide comparative genomics study were to clarify the taxonomy of this genus and identify genomic features associated with pathogenicity and host adaptation. This was done with a widely applicable approach based on pan-genome analysis. First, using all publicly available genomes, a combination of phylogenetic analysis based on 1,780 conserved genes with average nucleotide identity and accessory genome composition allowed the identification of a largely clinical lineage composed of A. xylosoxidans A insuavis A. dolens and A. ruhlandii. Within this lineage, we identified 35 positively selected genes involved in metabolism, regulation and efflux-mediated antibiotic resistance. Second, resistome analysis showed that this clinical lineage carried additional antibiotic resistance genes compared to other isolates. Finally, we identified putative mobile elements that contribute 53% of the genus's resistome and support horizontal gene transfer between Achromobacter and other ecologically similar genera. This study provides strong phylogenetic and pan-genomic bases to motivate further research on Achromobacter, and contributes to the understanding of opportunistic pathogen evolution.
Article
Full-text available
Virulence is a microbial property that is realized only in susceptible hosts. There is no absolute measurement for virulence, and consequently it is always measured relative to a standard, usually another microbe or host. This article introduces the concept of pathogenic potential, which provides a new approach to measuring the capacity of microbes for virulence. The pathogenic potential is proportional to the fraction of individuals who become symptomatic after infection with a defined inoculum and can include such attributes as mortality, communicability, and the time from infection to disease. The calculation of the pathogenic potential has significant advantages over the use of the lethal dose that kills 50% of infected individuals (LD 50 ) and allows direct comparisons between individual microbes. An analysis of the pathogenic potential of several microbes for mice reveals a continuum, which in turn supports the view that there is no dividing line between pathogenic and nonpathogenic microbes.
Article
Full-text available
Genomic data is increasingly being used to understand infectious disease epidemiology. Isolates from a given outbreak are sequenced, and the patterns of shared variation are used to infer which isolates within the outbreak are most closely related to each other. Unfortunately, the phylogenetic trees typically used to represent this variation are not directly informative about who infected whom - a phylogenetic tree is not a transmission tree. However, a transmission tree can be inferred from a phylogeny while accounting for within-host genetic diversity by colouring the branches of a phylogeny according to which host those branches were in. Here we extend this approach and show that it can be applied to partially sampled and ongoing outbreaks. This requires computing the correct probability of an observed transmission tree and we herein demonstrate how to do this for a large class of epidemiological models. We also demonstrate how the branch colouring approach can incorporate a variable number of unique colours to represent unsampled intermediates in transmission chains. The resulting algorithm is a reversible jump Monte-Carlo Markov Chain, which we apply to both simulated data and real data from an outbreak of tuberculosis. By accounting for unsampled cases and an outbreak which may not have reached its end, our method is uniquely suited to use in a public health environment during real-time outbreak investigations. We implemented this transmission tree inference methodology in an R package called TransPhylo, which is freely available from https://github.com/xavierdidelot/TransPhylo.
We built a pan-genome allele database with 395 genomes of Salmonella enterica serovar Enteritidis and developed computer tools for analysis of whole genome sequencing (WGS) data of bacterial isolates for disease cluster identification. A web server (http://wgmlst.imst.nsysu.edu.tw) was set up with the database and the tools, allowing users to upload WGS data to generate whole genome multilocus sequence typing (wgMLST) profiles and to perform cluster analysis of wgMLST profiles. The usefulness of the database in disease cluster identification was demonstrated by analyzing a panel of genomes from 55 epidemiologically well-defined S. Enteritidis isolates provided by the Minnesota Department of Health. The wgMLST-based cluster analysis revealed distinct clades that were concordant with the epidemiologically defined outbreaks. Thus, using a common pan-genome allele database, wgMLST can be a promising WGS-based subtyping approach for disease surveillance and outbreak investigation across laboratories.
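The cluster analysis of wgMLST profiles mentioned above can be sketched as follows. This is a minimal illustration, not the web server's implementation: it assumes each profile is a list of allele identifiers in a fixed locus order, that missing calls are recorded as `None`, and that isolates are grouped by single-linkage clustering at a hypothetical allele-difference threshold. All names and data are invented for the example.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Count loci where both profiles have a call and the alleles differ."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def single_linkage_clusters(profiles, threshold):
    """Group isolates linked by a chain of pairwise distances <= threshold.

    profiles: dict mapping isolate name -> list of allele ids (None = missing).
    Returns a sorted list of sorted name lists.
    """
    parent = {name: name for name in profiles}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if allele_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in profiles:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy profiles over five loci (hypothetical data).
toy_profiles = {
    "iso1": [1, 1, 1, 1, 1],
    "iso2": [1, 1, 1, 1, 2],
    "iso3": [1, 1, 1, 2, 2],
    "iso4": [5, 6, 7, 8, 9],
    "iso5": [5, 6, 7, 8, None],
}
toy_clusters = single_linkage_clusters(toy_profiles, threshold=1)
```

Single linkage is used here only because it is simple to state. Note that it joins iso1 and iso3 through iso2 even though they differ at two loci, so the choice of threshold and linkage method affects which isolates are reported as one cluster.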
The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR. CARD is ontologically structured, model centric, and spans the breadth of AMR drug classes and resistance mechanisms, including intrinsic, mutation-driven and acquired resistance. It is built upon the Antibiotic Resistance Ontology (ARO), a custom built, interconnected and hierarchical controlled vocabulary allowing advanced data sharing and organization. Its design allows the development of novel genome analysis tools, such as the Resistance Gene Identifier (RGI) for resistome prediction from raw genome sequence. Recent improvements include extensive curation of additional reference sequences and mutations, development of a unique Model Ontology and accompanying AMR detection models to power sequence analysis, new visualization tools, and expansion of the RGI for detection of emergent AMR threats. CARD curation is updated monthly based on an interplay of manual literature curation, computational text mining, and genome analysis.
Under intense scrutiny for safety and authenticity, our food supply encompasses probiotic supplementation, fermentation organisms, pathogenic bacteria, and microbial toxins - in short, the microbiome and metabolome of food. Recent claims regarding probiotic supplements, additives, and cultured foods highlight the need for widely accepted protocols for evidence-based oversight of such products, as well as specific methods to assess their safety and authenticity. Rapid improvements in high-throughput sequencing technologies, curated and annotated reference databases of whole genome sequences, bacterial strain banks, and novel informatics techniques coupled to a scalable computing platform are poised to provide a robust solution extendable to encompass systematic authentication of the microbiome and its variations up and down the supply chain. Members of the Sequence the Food Supply Chain Consortium are working to characterize and quantify the microbiome at a baseline and after processing. They are also working to create reference databases and develop a Metagenomics Computation and Analytics Workbench, capable of verifying the effectiveness of good manufacturing practices and monitoring control measures highlighted in a site's Hazard Analysis Critical Control Point plan. In this paper, we propose how microbial ecology, evolvability, and phylogenetic diversity exhort the application of new molecular techniques to assure safety, authenticity, and traceability for wholesome food.
The techniques of microbe community genome sequencing as applied to environmental samples - metagenomics - offer powerful insight into microbial community structure and ecology that can affect food safety decisions for public health security. In this paper, the design and characteristics of a new informatics service, the Metagenomics Computation and Analytics Workbench (MCAW), are presented and illustrated with reference to the analysis of metagenomics data. The service is designed to meet the requirements for analyzing metagenomic and metatranscriptomic sequence data to assess microbial hazards and food authentication in the supply chain. Moreover, MCAW provides for reliable storage and management of raw genomic sequences and analysis results, high-volume informatics processing, meticulous tracking of data provenance and processing steps, and function-rich visualization of results.
Regardless of where they get their information from, Americans are very likely to learn almost instantly whenever there is an outbreak of bacterial pathogens (Salmonella, Listeria, or the “bad” Escherichia coli) from contaminated food products. This is a huge achievement and a great benefit for public health: the earlier this information reaches consumers, the fewer people will be affected, and the more time public health and other authorities have to identify and contain the source of the outbreak. However, despite its contribution to public health, most Americans are not aware that a little-known government program called “PulseNet USA” detects nearly all foodborne outbreaks of pathogenic bacteria. This is a bit odd because PulseNet has not only been very efficient in detecting foodborne disease but has thereby positively impacted public health and saved millions of dollars since it was founded 20 years ago. PulseNet is now undergoing profound changes as it both expands internationally to protect consumers in other countries and invests heavily, financially and scientifically, in new technologies such as next-generation sequencing (NGS) to further improve its capacity to detect food contamination. PulseNet is a national surveillance network, based in Atlanta at the US Centers for Disease Control and Prevention, that detects outbreaks of foodborne bacterial pathogens in real time [1], [2]. Most of the detection itself is done at 83 accredited state, local, and federal laboratories that are connected with each other via an efficient communications network. PulseNet, both the center in Atlanta and the individual laboratories, works closely with epidemiologists and other public health officials to investigate the source of an outbreak, establish appropriate public health measures, and assist federal agencies with improving the safety of the food supply. Simply stated, PulseNet's goal is to link information about people who have likely consumed the same contaminated food, even if …
In April 2015, Public Health England implemented whole genome sequencing (WGS) as a routine typing tool for public health surveillance of Salmonella, adopting a multilocus sequence typing (MLST) approach as a replacement for traditional serotyping. The WGS derived sequence type (ST) was compared to the phenotypic serotype for 6,887 isolates of S. enterica subspecies I, and of these, 6,616 (96%) were concordant. Of the 4% (n = 271) of isolates of subspecies I exhibiting a mismatch, 119 were due to a process error in the laboratory, 26 were likely caused by the serotype designation in the MLST database being incorrect and 126 occurred when two different serovars belonged to the same ST. The population structure of S. enterica subspecies II–IV differs markedly from that of subspecies I and, based on current data, defining the serovar from the clonal complex may be less appropriate for the classification of this group. Novel sequence types that were not present in the MLST database were identified in 8.6% of the total number of samples tested (including S. enterica subspecies I–IV and S. bongori) and these 654 isolates belonged to 326 novel STs. For S. enterica subspecies I, WGS MLST derived serotyping is a high throughput, accurate, robust, reliable typing method, well suited to routine public health surveillance. The combined output of ST and serovar supports the maintenance of traditional serovar nomenclature while providing additional insight on the true phylogenetic relationship between isolates.
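The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names and the short labels for the three mismatch causes are hypothetical.

```python
# Figures quoted in the abstract for S. enterica subspecies I.
total_isolates = 6887
concordant = 6616

# Breakdown of the mismatches as reported (labels are shorthand).
mismatch_causes = {
    "laboratory process error": 119,
    "incorrect serotype designation in MLST database": 26,
    "two serovars sharing one ST": 126,
}

mismatched = total_isolates - concordant
concordance_pct = 100 * concordant / total_isolates
mismatch_pct = 100 * mismatched / total_isolates

# The three causes should account for every mismatched isolate.
breakdown_is_complete = sum(mismatch_causes.values()) == mismatched

print(f"mismatched isolates: {mismatched}")
print(f"concordance: {concordance_pct:.1f}%")
print(f"mismatch rate: {mismatch_pct:.1f}%")
print(f"breakdown accounts for all mismatches: {breakdown_is_complete}")
```

The three causes sum to the 271 mismatched isolates, and the concordance rounds to the 96% stated in the text.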