Exploiting the explosion of information associated with whole genome
sequencing to tackle Shiga toxin-producing Escherichia coli (STEC) in
global food production systems
Eelco Franz (a), Pascal Delaquis (b), Stefano Morabito (c), Lothar Beutin (d), Kari Gobius (e), David A. Rasko (f), Jim Bono (g),
Nigel French (h), Jacek Osek (i), Bjørn-Arne Lindstedt (j), Maite Muniesa (k), Shannon Manning (l), Jeff LeJeune (m),
Todd Callaway (n), Scott Beatson (o), Mark Eppinger (p), Tim Dallman (q), Ken J. Forbes (r), Henk Aarts (a), David L. Pearl (s),
Victor P.J. Gannon (t), Chad R. Laing (t), Norval J.C. Strachan (u,⁎)
(a) National Institute for Public Health and the Environment, Bilthoven, the Netherlands
(b) Agriculture and Agri-Food Canada, Summerland, British Columbia, Canada
(c) National Institute of Health, Rome, Italy
(d) Federal Institute for Risk Assessment, Berlin, Germany
(e) CSIRO, Archerfield BC Queensland, Australia
(f) University of MD School of Medicine, Baltimore, USA
(g) USDA/ARS, Clay Centre, NE, USA
(h) EpiLab, Infectious Disease Research Centre, Institute of Veterinary Animal and Biomedical Sciences, Massey University, New Zealand
(i) National Veterinary Research Institute, Pulawy, Poland
(j) Unit of Gene Technology, Akershus University Hospital Lørenskog, Norway
(k) Department of Microbiology, University of Barcelona, Barcelona, Spain
(l) Microbiology and Molecular Genetics, MI State University, East Lansing, USA
(m) Ohio Agricultural Research and Development Center, Wooster, USA
(n) USDA/ARS, College Station, TX, USA
(o) Australian Infectious Disease Research Centre, The University of Queensland, St. Lucia, Australia
(p) Department of Biology, & South Texas Center for Emerging Infectious Diseases (STCEID), The University of Texas at San Antonio, San Antonio, TX, USA
(q) Health Protection Agency, London, United Kingdom
(r) School of Medicine and Dentistry, The University of Aberdeen, Aberdeen, Scotland, United Kingdom
(s) Ontario Veterinary College, University of Guelph, Guelph, Canada
(t) Publ Health Agency Canada, Lab Foodborne Zoonoses, Lethbridge, Canada
(u) School of Biological Sciences, The University of Aberdeen, Aberdeen, Scotland, United Kingdom
Article info
Article history:
Received 7 March 2014
Received in revised form 27 June 2014
Accepted 4 July 2014
Available online 11 July 2014
Keywords:
Whole genome sequencing
Shiga toxin producing E. coli (STEC)
E. coli O157
Abstract
The rates of foodborne disease caused by gastrointestinal pathogens continue to be a concern in both the devel-
oped and developing worlds. The growing world population, the increasing complexity of agri-food networks
and the wide range of foods now associated with STEC are potential drivers for increased risk of human disease.
to help address the issues associated with these pathogenic microorganisms. This position paper, arising from an
OECD funded workshop, provides a brief overview of next generation sequencing technologies and software. It
then uses the agent–host–environment paradigm as a basis to investigate the potential benefits and pitfalls of
WGS in the examination of (1) the evolution and virulence of STEC, (2) epidemiology from bedside diagnostics
to investigations of outbreaks and sporadic cases and (3) food protection from routine analysis of foodstuffs to
global food networks. A number of key recommendations are made that include: validation and standardization
of acquisition, processing and storage of sequence data including the development of an open access “WGSNET”;
building up of sequence databases from both prospective and retrospective isolates; development of a suite of
open-access software specific for STEC accessible to non-bioinformaticians that promotes understanding of
both the computational and biological aspects of the problems at hand; prioritization of research funding to
op a supply of individuals working in bioinformatics/software development; training for clinicians, epidemiolo-
gists, the food industry and other stakeholders to ensure uptake of the technology and finally review of progress
of implementation of WGS. Currently the benefits of WGS are being slowly teased out by academic, government,
and industry or private sector researchers around the world. The next phase will require a coordinated interna-
effective and timely manner.
© 2014 Published by Elsevier B.V.

International Journal of Food Microbiology 187 (2014) 57–72
⁎ Corresponding author. Tel.: +44 1224 272699; fax: +44 1224 272703.
E-mail address: firstname.lastname@example.org (N.J.C. Strachan).
0168-1605/© 2014 Published by Elsevier B.V.
Contents
1. Introduction
2. Critical advances in genomic technologies and analytical tools
3. Genomics in evolution and virulence
3.1. Population structure and virulence of E. coli O157
3.2. Population structure and emergence of non-O157 STEC
3.3. Emergence and evolution of STEC: role of horizontal gene-transfer
3.4. Molecular risk assessment of STEC
4. Genomics in epidemiology
4.1. Bedside diagnostics and clinical decision-making
4.2. Outbreak detection
4.3. Sporadic cases
4.4. Identifying the source of infection (source attribution and tracing genotypes)
4.5. Antibiotic resistance
4.6. Asymptomatic carriers
5. Genomics in food protection
6. Discussion
7. Conclusions/recommendations
Acknowledgements
References
Rates of infection with foodborne pathogens and their attendant
economic burden remain stubbornly high in industrialized and devel-
oping countries despite persistent efforts to increase the safety of the
food supply (Havelaar et al., 2010; Newell et al., 2010; Scharff, 2012).
Shifts in the traditional association of foodborne pathogens from foods
or re-emergence of known and new pathogens are reported by public
health authorities worldwide (Jones et al., 2008). The dynamic ge-
netic content of relevant microorganisms, changes in agricultural
production systems due to the introduction of new technologies,
climate change, and increasing worldwide trade in food products
and raw materials for food production to meet evolving global consumer
demand undoubtedly contribute to food safety risks (Morse, 1995).
Tragically, the European Shiga toxin-producing enteroaggregative
Escherichia coli O104:H4 outbreak associated with fenugreek sprouts
produced from imported seed in 2011 provided a clear example of the
sudden emergence of a highly virulent and unusual hybrid strain of
foodborne pathogen and illustrated the urgent need for improved
analytical tools to deal with such crises (Soon et al., 2013). During
this outbreak, the value of next generation sequencing (Table 1 for
definitions) technology to understanding the virulence, origins and
(Mellmann et al., 2011).
Shiga toxin-producing E. coli (STEC) are a diverse group of
enteric pathogens capable of causing severe gastrointestinal disease
(haemorrhagic colitis), acute kidney failure (haemolytic uremic syn-
drome: HUS), and chronic post-infection sequelae (e.g. irritable bowel
syndrome and end-stage renal disease) or death (Spinale et al., 2013;
Tarr et al., 2005). While large foodborne outbreaks and significant
numbers of sporadic cases caused by STEC serotype O157:H7 have
been reported since the early 1980's as well as emergence of sorbitol
fermenting strains (Werber et al., 2011), infections and outbreaks
caused by other serotypes (i.e. non-O157 STEC) with various combina-
tions of (putative) virulence traits (i.e. virulence profiles) are increas-
ingly recognized (Mathusa et al., 2010; Gould et al., 2013; Scallan
et al., 2011). The emergence of non-O157 STEC serotypes has become
a serious challenge for the agri-food sector and public health/regulatory
disease. Foods associated with STEC-outbreaks have included both
locally produced and imported goods. The prevention and control of
the global spread of foodborne disease caused by STEC requires continuous
and integrated international efforts to track the spread of illnesses,
to anticipate the emergence of new or altered forms of the pathogen
and to devise strategies that lessen the risk of contamination in food-
stuffs distributed through increasingly complex international supply
Microbiological techniques typically require enrichment, culture
on selective media and serotyping to detect and characterize STEC
for diagnostic and epidemiological purposes. In recent years various
molecular techniques based on the recognition of specific DNA
sequences or proteins have supplemented classical techniques to en-
able faster and more accurate recognition and improved characteri-
zation of STEC. However, these techniques are mainly adapted to
the detection of virulent serotypes known to cause human disease.
Given the evident genotypic plasticity and anticipated diversity
of virulence in STEC (Steyert et al., 2012; Karch et al., 2012), there
is a clear need to improve the means to detect and type potentially
virulent isolates in clinical settings, for epidemiological purposes
and to facilitate research on the behavior of STEC along the food
production-distribution chain. In the context of food safety, geno-
mics data are providing the means to identify marker genes associat-
ed with pathogen stress survival, growth and/or virulence. There is
also the potential to correlate disease incidence with the distribution
and diversity of specific genomic sequences within a geographic
area, to anticipate changes in pathogenicity, to develop new tracking
and diagnostic methods and to investigate pathogen behavior in
food chains. On a practical level, the linking of genomics data to
phenotypic response could lead to far more detailed and accurate
quantitative microbiological risk assessment in foods, although this
remains to be implemented (reviewed in (Brul et al., 2012)). In
addition, it should be stressed that the application of these genomic
methods to STEC will need to be done using individual isolates
because the attribution of virulence factors to a specific isolate would
not be possible using sequence data derived from mixed cultures
(i.e., metagenomic data).
Different methodological approaches are applied to characterize
STEC strains or to process and store associated typing data. The collection
of data may take place in laboratories serving different jurisdictions
within individual countries. At present, lack of clarity about the avail-
ability and type of data hinders access to critical information that is
needed to perform robust comparative analyses of STEC across geo-
graphic regions. Hence it is not yet possible to fully realize the potential
of genomics to address the objective of enhancing food safety systems
to enable the mitigation of disease caused by STEC. Resolving this
would deliver benefits that include global improvements to the health
and well-being of citizens, increasing confidence in the safety of the
food supply and reduced disruptions in the trade of agricultural
The present manuscript summarizes the findings of a workshop,
held in July 2013, and attended by international experts in the use of
advanced genomic technologies for the characterization of bacterial
pathogens (specifically STEC) to identify means to overcome these bar-
riers. The workshop aimed to improve the flow of critical data between
scientists from many countries attempting to determine the origin
of STEC, how foods become contaminated, which specific strains are
implicated and why they cause such serious illness. In particular the
workshop investigated potential applications for genomic technologies
to span the complex interactions along agent (STEC organism), host
(animal reservoir and humans) and environment (soil, water, food
etc.) paradigm (Thomas and Weber, 2001). This encompasses the
three main areas of evolution/emergence, food protection and epidemi-
ogies followed by a summary of immediate practical benefits and
potential value for theoretical studies on STEC evolution/virulence, epi-
demiology and food protection. There then follows a comprehensive
discussion which is both cross-cutting and identifies the potential
major benefits that genomics will bring to STEC research, policy and
that were considered to this end is depicted in Fig. 2.
2. Critical advances in genomic technologies and analytical tools
Novel methodological approaches developed in the mid to late
1990s laid the foundation for the advent of “next generation sequencing”
(NGS) technologies that have vastly enhanced the speed, increased
decade platforms based on pyrosequencing (454 Life Sciences, now
Roche Diagnostics), reversible dye-terminator sequencing by synthesis
(Illumina) and sequencing by ligation (SOLiD, Life Technologies) have
captured most of the demand for high-throughput technology needed
for the sequencing of large eukaryotic genomes (e.g. the 3.2 billion
base pair (Gb) human genome). The choice of a specific platform is
guided by application since available systems have variable characteris-
tics (e.g., cost, read-length and number of reads) (Table 2). Although
sequencing costs have fallen significantly it should be stressed that
estimates (for example www.genome.gov/sequencingcosts/) rapidly
change and do not account for upstream costs (culture, DNA extraction
and library construction) or post-sequencing computational analysis
to assemble, close, edit or annotate sequences. Numerous analysis
tation, and interpretation of these data, which will facilitate the transi-
tion to use in diagnostic laboratories. Such a transition will rapidly
enhance our capacity to characterize STEC from clinical, environmental
or food samples.
Sequencing with WGS technologies for bacterial pathogens requires
prior shearing or enzymatic cleavage of the DNA into short fragments
that vary in length according to platform. Computational bio-informatic
(SRS) into contiguous sequences. Epidemiological or evolutionary inference
based on these assemblies is dependent on their quality (fidelity
and completeness); imperfections, such as erroneous
bled genomes. However, tools are being developed which can help
identify these problems (Cook and Ussery, 2013). Several algorithms
bly or for mapping against existing backbone sequences, and an introduction
to these techniques is already available (Edwards and Holt, 2013).
Manufacturers of WGS provide software suited to the alignment of SRS
lengths for specific platforms. A number of open source resources are
also available that vary in flexibility, performance and ease-of-use
(Kisand and Lettieri, 2013). The selection of an assembly tool depends
largely on the sequencing platform and purpose of sequencing, which
can range from the generation of a draft assembly for comparative
genomic analyses against available complete sequences to the resolution
of research questions that require more detailed understanding. Draft
assemblies of contiguous sequences (contigs) containing gaps and no
information on true order and orientation of contiguous sequences with-
in the genome are generally sufficient for the detection of genetic deter-
minants associated with species or sub-species, known genes related to
specific functions such as antibiotic resistance, insertions or deletion
events (InDels), or single nucleotide polymorphisms (SNPs). In contrast,
repetitive regions such as prophage and genomic islands are often poorly
assembled in draft assemblies. Two publications (Kisand and Lettieri,
2013; Edwards and Holt, 2013) provide a comprehensive overview of
bioinformatic tools currently available for the assembly, ordering and
annotation of bacterial genomes for comparative analysis. Additional
tools are available in the commercial and public domains for gap closure,
removal of errors due to sequencing artefacts, the resolution of repeat
sequences, and polishing to correct consensus errors where finished sequences
are also available for more detailed analysis of genome structure
and/or function (e.g. Swain et al., 2012).

Table 1. Definitions.
Strain: The descendants of a single isolation in pure culture, usually made up of a succession of cultures ultimately derived from an initial
A category that circumscribes a genomically coherent group of individual isolates/strains sharing a high degree of similarity in (many) independent genotypic and phenotypic features.
Subdivision of a species distinguishable from other strains on the basis of a characteristic set of antigens.
Group of strains belonging to the same species with a common mode of action with respect to the infection process and virulence.
A group of species with a common line of descent from an immediate ancestral species. Within E. coli used to define groups among which little recombination of chromosomal genes occurs.
A group of monophyletic organisms that comprises all the evolutionary descendants of a common ancestor. Within E. coli often used as groups with shared multi-locus-sequence-type
A person considered directly affected by an outbreak.
A person who excretes the pathogen (STEC) but presents no
An outbreak is the occurrence of cases of disease in excess of what would normally be expected in a defined community, geographical area or season.
Whole genome sequencing: a laboratory process that determines the complete DNA sequence of an organism's genome.
Next generation sequencing refers to high-throughput sequencing technologies developed in the post-Sanger period.
Single nucleotide polymorphism: a genetic variation referring to a difference in DNA sequence occurring at a specific nucleotide within the genome.
Is a set of overlapping DNA segments that together represent a consensus region of DNA. If sufficient sequence data is collected, the sequence derived from the overlapping sequence reads can be combined to determine the complete sequence of the source DNA; otherwise the assembly of the shorter sequence reads will result in a collection of contigs that represent the nearly complete genome with gaps of missing sequence.
The total of all genetic elements in a genome that can mediate horizontal transfer of DNA sequences such as plasmids and
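As a rough illustration of the point that draft assemblies are generally sufficient for detecting known genetic determinants, the following Python sketch searches a set of contigs for marker sequences on both strands. The contig and marker sequences and the names `find_markers` and `reverse_complement` are invented for illustration. Exact string matching is a deliberate simplification; real analyses would normally rely on alignment-based searches that tolerate mismatches.

```python
from typing import Dict, List, Tuple

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_markers(
    contigs: Dict[str, str], markers: Dict[str, str]
) -> List[Tuple[str, str, int, str]]:
    """Report (marker, contig, 0-based position, strand) for exact matches.

    For '-' strand hits the position refers to where the reverse
    complement of the marker starts on the contig as written.
    """
    hits = []
    for marker_name, marker_seq in markers.items():
        for contig_name, contig_seq in contigs.items():
            candidates = (("+", marker_seq), ("-", reverse_complement(marker_seq)))
            for strand, query in candidates:
                pos = contig_seq.find(query)
                if pos != -1:
                    hits.append((marker_name, contig_name, pos, strand))
    return hits


# Hypothetical toy data: two short "contigs" and three "markers".
contigs = {
    "contig_1": "ATGCGTACCGGTTAGCATCGA",
    "contig_2": "GGCTAACGTTTACCGGATAAC",
}
markers = {
    "marker_A": "CCGGTTAGC",    # present on the forward strand of contig_1
    "marker_B": "ATCCGGTAAAC",  # present on the reverse strand of contig_2
    "marker_C": "TTTTTTTTTT",   # absent from both contigs
}

hits = find_markers(contigs, markers)
for hit in hits:
    print(hit)
```

A marker that happens to span the boundary between two contigs would be missed by this kind of search, which is one reason why poorly assembled repetitive regions limit what can be concluded from draft assemblies.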
Rapid growth in sequencing capacity has produced a parallel in-
crease in the availability of genomic data. Several public databases
store data reliably and provide free/public access to the sequencing
information. For example, the well-known National Center for Biotech-
nology Information (NCBI, http://www.ncbi.nlm.nih.gov/) receives and
provides access to genomic data for a range of species, including bacte-
ria, and tools (e.g. BLAST microbial genome) to search for sequences of
interest in completed genomes. Currently (February 2014), the NCBI
database contains 10 completed STEC genomes: five from serotype
O157:H7, two from serotype O145:H28 and one each from serotypes
O111:H−, O26:H11 and O103:H2. In addition to the completed
genomes, draft genomes are available for O157 (40), O111 (11),
O145 (6), O26 (7), O103 (3) and O45 (1). Specialized databases and
tool sets designed specifically for the study of bacterial genomic data
and web-based resources to alleviate data management, computational
and analytical complexities implicit to the multiplication of datasets
have been developed and are summarized in Table 3.

Fig. 1. Extension of agent, host, environment paradigm incorporating the omics revolution.
In addition to the development of software platforms for the storage
ry for STEC/EHEC only and widely-applicable genomic analyses with
expert-guided comparative analyses of E. coli would seem to be a logical
step in the evolution of these platforms. This type of platform would
provide an excellent framework for carrying out pathogroup-specific
analyses and make use of the decades of genotypic and phenotypic
work carried out by STEC researchers around the world. For example,
a free and open source E. coli-based computational platform that combines
all publicly available genome sequences and allows for the upload
of user-specific sequences, as well as providing organism-specific anal-
yses, could be of great benefit to the E. coli community. Such analyses
could include Shiga-toxin typing, anti-microbial resistance profiling,
virulence gene identification, phylogenetic analyses, identification of
lineage- and clade-specific biomarkers (presence/absence of specific
genes, as well as SNPs), and epidemiological analyses. One such plat-
form currently under development is SuperPhy (http://lfz.corefacility.
3. Genomics in evolution and virulence
In the current post-genomics era we have transitioned from sequencing
archetypical isolates from a group of pathogens to sequencing
larger samples in an effort to understand the population structure and
genomic plasticity of the pathogen associated with human disease.
WGS technologies with higher throughput and speed promote the con-
cept of genomic epidemiology. WGS typing strategies provide enriched
polymorphism databases that are fundamental in offering higher
phylogenetic resolution and accuracy that typically cannot be achieved
using the limited lower resolution markers of the traditional methodologies
commonly used in public health laboratories. The wealth and continuously
growing numbers of whole genome sequence information for
different E. coli offer insight into the plasticity of the pathovars (Steyert
et al., 2012) even for strains of genetically highly similar serotypes
(Eppinger et al., 2011a,b; Manning et al., 2008; Bono et al., 2012).
3.1. Population structure and virulence of E. coli O157
The evolution and virulence of serogroup O157, the prevailing cause
of STEC-associated HUS, has received the most attention. The first
evolutionary model, based on multi-locus enzyme electrophoresis
(MLEE), proposed the stepwise evolution of E. coli O55:H7 to O157:H7
(Feng et al., 1998). Subsequently, multi-locus sequence typing (MLST)
was used to describe the population structure of E. coli O157 (Qi et al.,
2004). However, both MLEE and MLST rely on the characterization of
only a few genes (representing only a fraction of the entire genome)
and do not have the discriminating power to differentiate closely related
isolates (Noller et al., 2003). A refined classification system, based on
a PCR-based assay that interrogates the repeat length at six genic and
intergenic chromosomal loci, i.e., the “lineage-specific polymorphism
assay” (LSPA), ultimately separated E. coli O157:H7 into lineages I, I/II,
uted among bovine and human isolates (Yang et al., 2004; Sharma et al.,
2009; Ziebell et al., 2008; Franz et al., 2012). The introduction of whole
genome sequencing (WGS) driven by the development of high-speed,
high-throughput next generation sequencing technologies provides opportunities
for systematic analysis with much greater resolution of more
closely related genotypes thus enabling greater insight into genome
divergence (Hazen et al., 2013; Abu-Ali and Manning, 2011).
Fig. 2. Overview of the benefits (green) and challenges (red) that genomics bring to STEC research.
Table 2. Overview of the two major sequencing technologies.
Sequence technology | Platform | Reads | Read length (a)
Illumina/Ion Torrent | 400,000–400,000,000 | 200–600 | Single-end and | Low | Variant (SNP) discovery e.g. outbreak, surveillance etc.
PacBio | 22,000–47,000 | 4600–8500 | High | Complete (de novo) genome assembly e.g. virulence, horizontal gene-transfer, evolution
(a) While longer read lengths give more accurate information on the relative positions of the bases in a genome, costs are higher than for shorter reads and errors are more common.
(b) Paired end runs give additional positioning information in the genome, making it a good choice for de novo genome assembly as well as making it easier to resolve structural re-arrangements such as deletions, insertions and inversions. Experiments designed for SNP identification are best served by paired-end runs.
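The read numbers and read lengths listed for the two technologies can be related to sequencing depth with simple arithmetic: expected average coverage is the number of reads multiplied by the read length, divided by the genome size. The sketch below assumes a genome of about 5.5 Mb, roughly the size of the sequenced E. coli O157:H7 chromosomes. This figure, the function names and the pairing of the lowest read count with the shortest read length are illustrative assumptions rather than values reported in the table.

```python
def expected_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth of coverage, assuming reads are spread evenly over the genome."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage: int, read_length_bp: int, genome_size_bp: int) -> int:
    """Smallest number of reads giving at least the target average coverage."""
    total_bases = target_coverage * genome_size_bp
    return -(-total_bases // read_length_bp)  # integer ceiling division


# Assumption: approximate genome size of an E. coli O157:H7 isolate.
GENOME_SIZE_BP = 5_500_000

# Lower bounds of the ranges listed for each technology.
short_read_depth = expected_coverage(400_000, 200, GENOME_SIZE_BP)
long_read_depth = expected_coverage(22_000, 4_600, GENOME_SIZE_BP)

print(f"Short reads, 400,000 x 200 bp: about {short_read_depth:.1f}x")
print(f"Long reads, 22,000 x 4,600 bp: about {long_read_depth:.1f}x")
print("200 bp reads needed for 30x:", reads_needed(30, 200, GENOME_SIZE_BP))
```

Average coverage says nothing about how evenly reads are distributed or about error rates, so it is only a first planning figure when choosing a platform for a given application.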
The first STEC genomes sequenced were the O157:H7 strains
EDL933 (Perna et al., 2001) and Sakai (Hayashi et al., 2001). These
sequences showed considerable variation largely due to mobile genetic
elements that cause DNA segment insertion and deletion events, rather
than single-nucleotide changes (Kudva et al., 2002). Indeed it is such
InDels that generate the diverse restriction fragments in PFGE. In
order to accurately characterize the diversity within E. coli O157:H7
both single-nucleotide changes and large region turnover events must
be taken into consideration (Laing et al., 2009).
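The need to consider both kinds of variation can be illustrated with a minimal Python sketch that computes a SNP distance on aligned core sequences alongside a distance based on presence or absence of accessory elements. The isolate names, sequences, element names and the use of a Jaccard-type distance are all hypothetical choices made for illustration and are not taken from the studies cited above.

```python
from typing import Set


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions that differ between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in "ACGT" and b in "ACGT"
    )


def gene_content_distance(elements_a: Set[str], elements_b: Set[str]) -> float:
    """Jaccard distance between two sets of accessory elements (0 = identical)."""
    union = elements_a | elements_b
    if not union:
        return 0.0
    return 1.0 - len(elements_a & elements_b) / len(union)


# Hypothetical isolates: an aligned core region plus accessory element content.
isolates = {
    "isolate_1": ("ACGTACGTAC", {"prophage_A", "prophage_B", "plasmid_X"}),
    "isolate_2": ("ACGTACGTAC", {"prophage_A", "plasmid_X"}),
    "isolate_3": ("ACGAACGTTC", {"prophage_A", "prophage_B", "plasmid_X"}),
}

core_1, acc_1 = isolates["isolate_1"]
for name in ("isolate_2", "isolate_3"):
    core, acc = isolates[name]
    print(
        name,
        "SNPs:", snp_distance(core_1, core),
        "gene content distance:", round(gene_content_distance(acc_1, acc), 3),
    )
```

In this toy data set isolate_2 is indistinguishable from isolate_1 by SNPs but differs in accessory content, whereas isolate_3 shows the opposite pattern, so either measure alone would miss part of the diversity.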
In general, the genotypic diversity of E. coli O157 in the cattle reser-
voir exceeds that of clinical human isolates (Besser et al., 2007). Identi-
fying genetic subtypes that are non-randomly distributed among
bovine and human isolates has not only expanded our understanding
of the evolutionary pathways of E. coli O157 but will also provide useful
insight into virulence mechanisms. WGS could significantly contribute
to this aim since it allows for comparative genomics of strains
and subsequent identification of genetic elements that differentiate
human- and cattle-associated strains. Several studies identified E. coli
O157 genetic subtypes that are rarely isolated from diseased humans
(Kim et al., 1999; Besser et al., 2007; Zhang et al., 2004; Bono et al.,
2007; Whitworth et al., 2010). A combined analysis of several genotyp-
ing methods and super network construction confirmed that the E. coli
O157:H7 population is distributed among three major lineages (Laing
et al., 2009). Numerous lineage II-specific genome signatures of isolates
found predominantly in bovine isolates, some of which appear to be in-
tion in the bovine rumen, and underlying differentiation into bovine
super shedders, have now been catalogued (Eppinger et al., 2011a,b;
Bono et al., 2012). However, recent studies showed that this non-
random distribution of lineages, clades, Shiga toxin (Stx)-subtypes and
SNPs among bovine and clinical isolates may vary by geographical loca-
tion (Franz et al., 2012; Mellor et al., 2012, 2013). The phylogeographic
structuring of E. coli O157 populations suggests divergent evolution of
E. coli O157 in different geographical locations. This might in turn be
related to observed differences in E. coli O157 disease incidence linked
to many factors not necessarily related to the bacterium, such as diet
of host and host microbiomes (Fig. 3). There are also differences in the
preponderant strain types extant in any area over time.
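One simple way to ask whether a genetic subtype is non-randomly distributed between bovine and human isolates is a test of association on a 2×2 table. The sketch below implements a two-sided Fisher's exact test using only the Python standard library. The isolate counts are hypothetical, and the choice of test is an assumption made for illustration, not the method used in the studies cited above.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1 = a + b
    col1 = a + c
    col2 = b + d
    n = row1 + c + d

    def weight(x: int) -> int:
        # Unnormalised probability of x counts in the top-left cell.
        return comb(col1, x) * comb(col2, row1 - x)

    lowest = max(0, row1 - col2)
    highest = min(row1, col1)
    observed = weight(a)
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(n, row1)


# Hypothetical counts of isolates by lineage (rows) and source (columns).
#                human  bovine
lineage_i = (30, 10)
lineage_ii = (5, 25)

p_value = fisher_exact_two_sided(*lineage_i, *lineage_ii)
print(f"Two-sided p-value for lineage by source: {p_value:.3g}")
```

A small p-value would only indicate an association within the sampled isolates. As noted above, such distributions may vary by geographical location, so results from one region should not be assumed to hold elsewhere.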
The observed non-random distribution of E. coli O157 genetic sub-
types among bovine and human clinical isolates might be the result of
a differentiation in virulence, transmission capacity and survival, or
some combination of all three (Franz et al., 2012). It is of importance
that genetic differences between genotypes differing in epidemiological
importance are also studied phenotypically. The phenotypic implica-
tions of SNPs in functional genes and/or regulatory regions can be
assessed by comparative phenotyping studies. Human-biased strains
like clade 8 (showing an association with hospitalization and HUS) ad-
hered significantly better to epithelial cells and demonstrated higher
expression levels of virulence genes (including Stx production) com-
pared to bovine-associated clade 2 strains (Abu-Ali and Manning,
2011; Neupane et al., 2011). In turn, bovine-biased O157 strains seem
to be more resistant to adverse environmental conditions (concluded
from observed upregulation of acid resistance and stress fitness-
associated genes in the bovine-biased genotype relative to the clinical
genotype, and higher survival in a model stomach) (Vanaja et al.,
2010). Recently, a relatively high frequency of human isolates was
found to have mutations in the general stress response gene rpoS
while such mutations were absent in bovine isolates and rare among
food isolates (van Hoek et al., 2013). This would indicate that survival
in the bovine gastrointestinal tract and the food processing environ-
ment require a functional general stress response system. On the
other hand, the expression of locus of enterocyte effacement (LEE)-
encoded virulence genes in E. coli O157 is negatively regulated by rpoS
(Dong and Schellhorn, 2010) and rpoS attenuated strains showed
broader nutritional abilities and increased competitive abilities under
sient accidental host for E. coli O157 (with possibly less optimal intesti-
nal conditions compared to cattle), it has been hypothesized that the
human gastrointestinal tract system could select for rpoS mutants
which are characterized by increased nutrient scavenging abilities at
the expense of stress-resistance (van Hoek et al., 2013). Mutations in
the rpoS operon, causing the abolishment of the negative regulation of
virulence genes, likely contribute to the accidental nature of STEC
(O157) pathogenesis in humans. Recently it was shown how EHEC
coopts established mechanisms for sensing the metabolites and stress
cues in the environment, to induce virulence factors in a temporal and
energy-efficient manner, culminating in disease (Njoroge et al., 2012).
WGS used together with other “omics” techniques could be used to
identify genetic differences that correlate to distinct phenotypes, there-
by shedding light on selective forces shaping different E. coli O157
3.2. Population structure and emergence of non-O157 STEC
Although LEE (locus of enterocyte effacement)-positive STEC have
been responsible for the majority of STEC disease outbreaks, LEE nega-
tive non-O157 STEC are increasingly associated with outbreaks in the
United States and Europe (Gould et al., 2013; Brooks et al., 2005;
Buvens et al., 2012; Preussel et al., 2013). In addition, LEE-negative
non-O157 STEC from diverse serogroups have been found to cause
severe disease, including HUS (Johnson et al., 2006; Mellmann et al.,
2008; Cooper et al., 2014). With the exception of the recent O104:H4

Table 3. Specialized databases and tools designed to alleviate data management, computational and analytical complexities implicit to the multiplication of datasets and associated investigative tools for the study of bacterial genomic data.
Resource | Description
Bacterial Isolate Genome Sequence Database (BIGSdb) | Storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible computational format (Jolley and Maiden, 2010).
Global Microbial Identifier | Integration of existing data from public genome sequence repositories with important metadata in a global system designed to enable aggregation and analysis of genomic data for micro-organisms in
 | Offers a wide array of tools including specialized searches, comparative analysis tools, visual browsers, and annotation pipelines (Wattam et al., 2014).
Rapid Annotation using | Fully-automated service for annotating complete or nearly complete bacterial genomes that provides high quality annotations for these genomes across the whole phylogenetic tree (Overbeek et al., 2014).
Center for Genomic (www.genomicepidemiology.org/) | Enables the submission of raw sequence data for extraction of MLST genes, etc., a resource that is particularly suitable for non-bioinformaticians and is especially useful for epidemiologists who are interested in classifying strains based on genetic characteristics and linking to epidemiological data.
Genomes (IMG) system | Integrates publicly available genomes and provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context.
 | This is an attempt to relieve data overload and an increasing analytical burden for the rapidly growing volume of genomic data by synthesizing genomic data using a pangenomic approach, which is particularly appropriate to bacterial species given the large variation in gene content among closely related strains (Laing et al., 2010).

outbreak in Germany (Rasko et al., 2011), non-O157 STEC and LEE-
negative STEC have received much less attention at the whole genome
level than E. coli O157 and LEE-positive strains. Recent comparative ge-
nomics have revealed the broad phylogenetic diversity of LEE-negative
STEC and that these strains vary significantly in their virulence reper-
toire (Steyert et al., 2012).
The clinical diagnostic benefit of WGS became apparent through
ly virulent and unusual STEC O104:H4 outbreak of HUS in Germany in
May 2011 (Frank et al., 2011). Preliminary genetic characterization by
traditional PCR methods suggested that this STEC strain should be clas-
sified within the enteroaggregative pathotype of E. coli (Scheutz et al.,
2011; Bielaszewska et al., 2011). Subsequent application of high-
throughput sequencing technologies allowed the genome and origins
break evolved (Mellmann et al., 2011; Rasko et al., 2011). The rapid re-
lease of sequence data to the research community (Rohde et al., 2011)
not only facilitated the development of diagnostic tools to more effi-
ciently identify the source and infected patients, but also demonstrated
a high level of similarity between the outbreak strain and the phylotype
B1, serotype O104:H4 enteroaggregative strain 55989, originally isolat-
ed from an HIV-patient in central Africa (Rohde et al., 2011; Bernier
et al., 2002). Subsequent genome comparison with other O104:H4
enteroaggregative strains confirmed this relationship (Rasko et al.,
2011). After the outbreak in Germany and France, several cases of HUS
caused by E. coli O104:H4 appeared in France and Turkey with no
clear epidemiological links. Comparative genomics revealed that the
isolates from cases that took place after the summer 2011 outbreaks
were not derived directly from the outbreak, but instead shared a
close common ancestor (Grad et al., 2013). This supports the view
that genetically related virulent O104:H4 isolates are less rare than pre-
can take place in parts of the genome that are exchanged among bacte-
ria, and that these regions contain genes involved in adaptation to local
environments (Grad et al., 2013).
The German outbreak clearly challenges the dogma that E. coli
pathotypes are separate entities (Fig. 4). It is now generally accepted
Fig. 3. Annual incidence of E. coli O157 related disease.
Fig. 4. Relationships between human diarrheagenic E. coli pathotypes including LEE (locus
of enterocyte effacement) positive and LEE-negative Shiga toxin-producing E. coli. DAEC:
Diffuse-adhering E. coli; STEC: Shiga toxin-producing E. coli; AEEC: Attaching and
effacing E. coli; EIEC: Enteroinvasive E. coli; EPEC: Enteropathogenic E. coli; EHEC:
Enterohaemorrhagic E. coli; ETEC: Enterotoxigenic E. coli; EAEC: Enteroaggregative
E. coli; Stx-EAEC: Shiga toxin-producing Enteroaggregative E. coli.
that E. coli has a core genome (of around 2200 genes) that is common to
all E. coli and that this is supplemented in each strain with an “accessory
genome” (Chaudhuri et al., 2010; Chaudhuri and Henderson, 2012;
Rasko et al., 2008; Touchon et al., 2009). The integration of bacterio-
phages, acquisition of pathogenicity islands and horizontal gene trans-
fer of strain-specific genes then result in a mosaic structure to the
E. coli genome (Escobar-Paramo et al., 2004). This makes the E. coli ge-
nome highly flexible and dynamic with numerous opportunities for
new types to emerge. Although prediction of future evolutionary events
is extremely difficult, genomics will provide us with more insights into
mechanisms underlying the evolution of STEC. For example, it was
shown that a specific phylogenetic background is required for the
acquisition of virulence factors located on plasmid and pathogenicity
islands (Escobar-Paramo et al., 2004). This type of information will
give more direction to our understanding of STEC evolution. However,
a better understanding of the conditions favoring horizontal gene
transfer of virulence determinants among E. coli and which selective
advantages promote the subsequent emergence of new variants is also
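The distinction between a core genome shared by all strains and a strain-specific accessory genome can be illustrated with simple set operations. The following is a minimal sketch only: the function name, strain names and gene labels are hypothetical placeholders, and real pan-genome analyses work on clustered orthologous gene families derived from annotated assemblies rather than on hand-written labels.

```python
from typing import Dict, Set, Tuple


def core_and_accessory(
    gene_content: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split per-strain gene sets into a shared core and per-strain accessory sets."""
    if not gene_content:
        return set(), {}
    # Core genome: genes present in every strain.
    core = set.intersection(*gene_content.values())
    # Accessory genome: whatever each strain carries beyond the core.
    accessory = {strain: genes - core for strain, genes in gene_content.items()}
    return core, accessory


# Toy input: strain names and gene labels are illustrative placeholders only.
strains = {
    "strain_A": {"gene1", "gene2", "gene3", "stx2", "eae"},
    "strain_B": {"gene1", "gene2", "gene3", "stx1"},
    "strain_C": {"gene1", "gene2", "gene3", "aggR", "stx2"},
}

core, accessory = core_and_accessory(strains)
print(sorted(core))
for strain in sorted(accessory):
    print(strain, sorted(accessory[strain]))
```

In this toy case the three shared labels form the core, while toxin and adherence genes fall into the accessory sets, mirroring the mosaic structure described above.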
3.3. Emergence and evolution of STEC: role of horizontal gene-transfer
Despite considerable efforts, the mechanisms and evolutionary
forces underlying the evolution and emergence of new STEC types are
not well understood. Phylogenetic analysis revealed that the gain
and loss of virulence elements, including the LEE-island, the plasmid-
borne haemolysin, and phage-encoded Shiga toxins, has occurred
several times and in parallel in separate lineages (Reid et al., 2000).
This convergent evolution suggests a selective advantage associated
with the build-up of specific combinations of virulence factors. These
selective advantages likely operate on the bacterium as well as the
phage (Canchaya et al., 2003; Muniesa and Schmidt, 2014). It should
be stressed that such selective advantages are more likely to operate
in the bovine reservoir or the external environment rather than in
humans since the latter are considered transient accidental hosts and
STEC/EHEC lineages are unlikely to have evolved specifically to cause
infections in humans.
Whole-genome comparison of STEC strains revealed that phage
induced lysogeny is deeply involved in the evolution and emergence
of EHEC (Ogura et al., 2007; Allison, 2007). Sequence comparisons of
the Stx-phages revealed large variation suggesting extensive genetic
exchange between phages (Steyert et al., 2012; Muniesa et al., 2004).
Recent data support the hypothesis that Stx-phages themselves have a
genetic mosaic structure, and that recombination events between the host,
phages and their remnants within the same bacterial cell are a major
driver of the evolution of Stx phage variants and the subsequent
dissemination of Shigatoxigenic potential (Smith et al., 2012). Stx
phages are converting phages that transform nonpathogenic or mildly
tremely persistent in the natural environment (Muniesa et al., 2004).
Consequently, environments like sewage, wastewater, and even food
function as a stable genetic reservoir for Stx phages that can infect a
large range of E. coli serogroups and related species (Imamovic et al.,
Clearly, the Stx phages are an extremely important driver of the
emergence and evolution of STEC. Recent work has shown cross-
regulation between Stx-phages and virulence factors such as the Type
III secretion system and that down regulation of the LEE is driven by
Stx phage (Muniesa and Schmidt, 2014; Xu et al., 2012). However, little
mobilization of the phage complement, and the underlying genetics for
observed variation of Stx production and how Stx phages affect the
fitness and biology of the E. coli host cell. Comparative genomics of the
mobilome, and in particular prophage profiling of Stx phages and host
integration sites have the potential to reveal evolutionary pathways
and help to classify strains while providing insights into differences in
the mobilome-borne virulence complement (Abu-Ali and Manning,
2011; Eppinger et al., 2011a,b). The identification of the genetic characteristics
of high-level toxin producers is crucial for assessing the risk that
infected individuals will develop severe clinical symptoms, thereby offering
improved diagnostic risk assessment. In addition, potentially new therapeutic
targets are revealed, enabling suppression of toxin production
during human infection. For example, by identifying DNA segments
that are characteristic of Stx2a prophages present in EAEC-STEC O104:
H4 strains, the detection of these sequences in STEC from the bovine
reservoir, and subsequent successful transduction of Stx-negative
tion of this particular clone (Beutin et al., 2013). Whereas traditional
molecular typing methods are restricted to the analysis of sequence
variation within specific conserved genes belonging to the causative
agent, WGS enables tracing the movement of genetic elements that
contribute to virulence and other important phenotypes that are trans-
ferred between strains and analysis of gene flux between bacterial
communities (Baquero and Tobes, 2013). This will provide us with
an extremely powerful tool for improving our understanding of STEC
evolution and emergence.
3.4. Molecular risk assessment of STEC
The observation that some STEC strains cause outbreaks and severe
disease such as HUS and hemorrhagic colitis, whereas others are associ-
is based upon the serotype association with human epidemics and HUS
(Karmali et al., 2003). Although informative as an ex post facto determi-
nant of virulence potential, it faces some limitations. First, significant
variation in virulence occurs within the same serotype. Second, it is an
indirect marker and tells little about the phenotype or specific virulence
gene complement of the organisms. Third, it is not useful for detailed
epidemiological investigations. Finally, classification by SPT may be
affected by differences in the relative occurrence of various serotypes
in different geographic locations. The local trade in food, animal feed
and animals may introduce virulent types into regions where they
cols, effective public and veterinary health actions and clinical manage-
ment develop a more proactive approach to STEC risk assessment
utilizing genetic information. Although extremely difficult because of
associations between (specific) genetic content of strains and their epi-
demiology as well as using this information to elucidate mechanisms
The ability to produce an attaching and effacing (AE) cytopathology
and the subsequent production of Stx are considered the hallmarks of
highly virulent STEC (Karmali et al., 2010). Indeed, the LEE-island
(encoding the genes responsible for the AE phenotype) as well as the
Stx-subtype are strongly associated with disease incidence and severity
of symptoms among humans (Boerlin et al., 1999; Ethelberg et al.,
2004; Friedrich et al., 2002). There have been several reports (Fuller
et al., 2011) that indicate that Stx2-positive isolates exhibit greater pathogenicity
(i.e. diarrhea plus HUS) than Stx1-positive isolates. It has also been recently
reported (Shringi et al., 2012) that Stx2a is 25 times more potent
than Stx2c in both Vero cell and human renal proximal tubule epithelial
with disease and conversely some LEE-negative STEC strains have been
reported to cause a range of clinical symptoms including hemorrhagic
colitis and HUS (Mellmann et al., 2008; Pradel et al., 2008). There is
mounting evidence suggesting that the pathogenesis of STEC infection
involves many additional effector molecules, which are encoded on prophages
in “exchangeable effector loci” outside the LEE and are named
non-LEE effectors, encoded by nle genes (Coombes et al., 2008; Tobe et al., 2006).
Molecular risk assessment approaches based on an evaluation of the
virulence gene content derived from a number of pathogenicity islands
risk to human health (Coombes et al., 2008; Bugarel et al., 2010; Brandt
et al., 2011). However, since these nle genes are also frequent in some
EPEC strains (Bugarel et al., 2010), they are not very well suited to be
used alone as diagnostic markers for EHEC. In addition, nle genes have
been found to be highly correlated (i.e. to occur together) with each
other, with the intimin genes and with other virulence genes (Ju et al.,
2013). This makes it redundant to screen for this large set of virulence
genes. The term enterohemorrhagic E. coli (EHEC) is often defined as a
subgroup of STEC that are characterized by certain serotypes that fre-
quently occur in outbreaks and are associated with severe clinical illness
(Nataro and Kaper, 1998). Recently, two putative genes, called Z2098
and Z2099, were identified from the genomic island OI-57 that were
closely associated with EHEC, but rarely found in EPEC, STEC and non-
of molecular markers that can differentiate between high risk strains and
strains posing lower risk will likely be accelerated by the application of
WGS on strains differing in isolation source and clinical manifestations.
Ongoing large scale sequencing projects of strains classified as EPEC,
EHEC and other E. coli pathotypes with detailed source and epidemiolog-
ical data will enable the discovery of additional biomarkers specific for
pathogenic E. coli.
4. Genomics in epidemiology
Epidemiology is “the study of diseases and their determinants
in populations” (Giesecke, 2002). Studying the epidemiology of STEC
bial typing methods in order to characterize isolates beyond the species
level. Classical immunological serotyping is an important basis for
differentiating STEC and often the starting point in studying the epide-
miology of STEC. As mentioned above, the association of serotypes
with diseases of varying severity in humans and with sporadic disease
or outbreaks has led to the proposal that STEC can be classified into 5
seropathotypes, A to E (Karmali et al., 2003). However, epidemiological
studies, outbreak investigations, and early detection of geographically
dispersed foodborne disease outbreaks are dependent on subtyping
methods that discriminate beyond the level of serotype. Several high-
ing Pulsed-field gel electrophoresis (PFGE) (Rivas et al., 2006) and
Multiple-Locus Variable number tandem repeat Analysis (MLVA)
(Bustamante et al., 2012). Multi-locus sequence typing (MLST) has
been shown to be of limited value for STEC typing because of the low
variability in the housekeeping genes which are used (Noller et al.,
2003). Also, a number of PCR-based assays have been developed to subtype
isolates based on the presence or absence of genes encoding
virulence factors that include subtypes of Stx, attaching proteins like
intimin, and other colonization factors (Beutin et al., 1994). Virulence
profiles are important to determine the pathogenic potential of the
organism but are of more limited value for epidemiological studies
(e.g. identification of outbreaks). With these molecular sub-typing
tools already in hand there arises the question of how whole genome
sequencing may further elucidate the aetiology of STEC infections in
humans. The WGS work (Grad et al., 2013) on the E. coli O104 outbreak
and related strains exemplifies what is possible (see above).
4.1. Bedside diagnostics and clinical decision-making
treatment should be based solely on symptoms and prior to confirma-
tion of the causal agent. The identification of STEC by culture methods
is a time consuming and challenging task because of the lack of specific
culture media and the time required for the organism to grow to readily
detectable levels. Therefore, the primary interest of the physician with
regard to potential STEC infections is the rapid detection of Stx. This
can be performed through the gold standard Vero cell cytotoxicity
assay (Staples et al., 2012), enzyme-linked immunosorbent assays
(ELISA) (Staples et al., 2012; Willford et al., 2009), and at the genomic
level by PCR. In larger hospitals the future detection of Stx, and selected
markers for most E. coli pathovars, may be done on highly automated
real-time PCR based platforms capable of analyzing thousands of samples
almost without intervention. This latter technique is likely to result
in high numbers of false STEC diagnoses since only the genes encoding
for Stx, either within the STEC strain or in the free Stx encoding phages,
are detected and not the toxin itself. For example, it has been reported
that 62% of healthy individuals were found to excrete Stx phages in
their faeces (Martinez-Castillo et al., 2013). However, excretion of the
phage in faeces can be used as a marker of STEC in the GI tract, and
metagenomic sequencing can be used as a diagnostic tool.
Once an isolate is obtained detailed information on the virulence
profile may allow for an assessment of risk by the treating clinician
during the early onset of the disease. Knowledge of the virulence profile
may guide whether hospitalization, a specific therapy or strict hygiene
measures are required to prevent further complications or spread.
Several studies have been published that describe the association of
virulence factors or genetic backbone (e.g. by SNPs) with increased
morbidity (see molecular risk assessment chapter). Although some
microarray and high-throughput PCR systems for determining the virulence
profile exist (Bugarel et al., 2010; Bruant et al., 2006; Gonzales
et al., 2011), WGS has the potential to make a substantial contribution
since it is not restricted to a specific choice of target genes. All relevant
information can be extracted in silico from the sequence data, including
serotype, virulence and antibiotic resistance profiles, and genetic back-
side the hospital by public health authorities for outbreak detection and
tracing sources of contamination. Extant databases that utilize classical
typing can still be used for these purposes since MLST, PFGE and
MLVA patterns can also be obtained in silico from the sequence data
(Carrillo et al., 2012). However, for techniques such as MLVA, which
involve repetitive elements, problems in assembly using NGS can occur
if short-read technologies are being used.
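The idea of extracting a virulence or resistance profile in silico can be sketched as a search for marker sequences in assembled contigs. The example below is a deliberately simplified illustration: it looks for exact matches on either strand, whereas practical pipelines rely on alignment with identity and coverage thresholds against curated databases. The function names, marker names and all sequences are hypothetical toy values, not real gene sequences.

```python
from typing import Dict, List

# Translation table used to complement DNA bases (other symbols are left as-is).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def screen_markers(contigs: List[str], markers: Dict[str, str]) -> Dict[str, bool]:
    """Report, per marker, whether it occurs exactly on either strand of any contig."""
    upper_contigs = [c.upper() for c in contigs]
    result = {}
    for name, seq in markers.items():
        fwd = seq.upper()
        rev = reverse_complement(fwd)
        result[name] = any(fwd in c or rev in c for c in upper_contigs)
    return result


# Toy data: sequences and marker names are placeholders, not real genes.
contigs = ["TTGACCGATTACAGGCTA", "CCATGGAAACTTTGG"]
markers = {
    "marker_1": "GATTACA",    # present on the forward strand of contig 1
    "marker_2": "AAAGTTTCC",  # present only as reverse complement in contig 2
    "marker_3": "GGGGGGGG",   # absent
}

profile = screen_markers(contigs, markers)
print(profile)
```

Because the marker panel is just an input dictionary, the same sequence data can be re-screened whenever new targets become relevant, which is the practical advantage over assays with a fixed set of target genes.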
4.2. Outbreak detection
As a result of the increased speed and decreased cost of acquisition
and the high resolution of whole genome sequences, comparative
genomic analysis of bacterial pathogens is rapidly becoming part of
many outbreak investigations (Rasko et al., 2011; Frank et al., 2011). A
significant proportion of STEC isolates are associated with outbreaks
(e.g. approximately 80% of E. coli O157 cases in Scotland), though this
number can vary dramatically from year to year (Locking et al., 2011).
There is a need to determine which cases are actually part of an out-
break (i.e. have the same common source), and if so, identify what
that source is and then to act to either reduce contamination of the
source or exposure to it.
Public health officials/epidemiologists routinely detect an outbreak
essarily) geographical location that can include the application of statistical
methodologies such as the spatial scan statistic (Pearl et al., 2006).
Confirmation of cases belonging to an outbreak can be conducted by
case–control studies which involve interviewing patients and controls
(non-affected) and looking for risk factors (e.g. increased odds of exposure
to a water or food source). However, these data can be problematic
to obtain because they depend upon locating the infected person, their
willingness to be interviewed, and their ability to remember exposure
(often a number of weeks after the exposure actually took place). This
is particularly problematic for food products that form a minor part of
the diet (e.g. herbs) and for multi-ingredient foods and complex
meals. There will also be a sporadic background of cases that will not
be part of the outbreak but will confound the identification of the out-
break. Typing of the isolated organisms can aid in detection of outbreak
cases, lead to a more specific case definition and is required to
definitively link a source to the outbreak. This is currently performed by
serotyping, PFGE, MLVA or some combination of methods (PulseNet,
www.cdc.gov/pulsenet/about/index.html). However, serotyping does
not have the resolution required for consistent detection of outbreaks
(Barrett et al., 1994), whilst PFGE data are of fairly high resolution and
can be readily shared, issues arise with regard to digitization of the
data into consistent pulsotypes. However, the challenge for WGS is
the large amount of data to be transferred and stored, as well as the
development of internationally harmonized analysis methods. Further,
although MLVA is also a useful tool for studying outbreaks, it is still a
new methodology and no international standardization efforts have
been undertaken (Christiansson et al., 2011). Finally, attaching signifi-
cance to single locus variants can also be misleading (Underwood
et al., 2013).
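The case–control comparison mentioned above, in which increased odds of exposure among cases point to a risk factor, reduces to an odds ratio computed from a 2 × 2 table. The sketch below uses the standard log-based (Woolf) approximation for the confidence interval. The function name and the counts are invented for illustration and are not taken from any cited study.

```python
import math
from typing import Tuple


def odds_ratio(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return the odds ratio with an approximate 95% confidence interval."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive for this approximation")
    # Odds of exposure among cases divided by odds of exposure among controls.
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical counts: 30 of 40 cases and 20 of 60 controls report the exposure.
estimate, lower, upper = odds_ratio(30, 10, 20, 40)
print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests an association between exposure and illness, although recall problems and sporadic background cases of the kind described above can bias the counts themselves.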
Conventionally, one outbreak strain is associated with an outbreak
and all isolates within one outbreak are indistinguishable from each
other, though recent studies have clearly demonstrated the genomic
heterogeneity in isolates derived from single outbreaks of human dis-
ease that were acquired during the time course of the outbreak
(Eppinger et al., 2011a,b; Hasan et al., 2012; Reeves et al., 2011). How-
ever, variation among isolates from affected human cases can occur
within an outbreak (Underwood et al., 2013; Turabelidze et al., 2013).
This can occur by two primary mechanisms. The first is co-infection,
where more than one strain of STEC is present in the source (e.g. food,
water, animal faeces etc.) (Proctor et al., 2002). The second mechanism
petting farm outbreak where there were 93 human cases of E. coli O157
of which 22% suffered HUS. The SNP analysis was able to demonstrate
that a single strain successfully spread through the farm by clonal
expansion before human cases began. Such mutational changes could
however happen at any point in the transmission pathway and the
higher the resolution of the typing method the greater the likelihood
of detecting such events.
discovery is crucial in formulating a molecular-guided surveillance and
miology. Canonical SNPs have been implemented in an efficient and
cheap typing assay that surpasses classical technologies (serotyping,
PFGE, MLST) in terms of phylogenetic accuracy and resolution. An
example is the NeoSEEK system (Neogen corporation, Lansing, US)
for rapid, isolate free, highly multiplexed detection and identification
of STEC using mass spectrometry-based detection of specific targets
(http://www.neogen.com/FoodSafety/NS_STEC.asp) (Norman et al.,
2012). In the future, the constantly growing databases of STEC/EHEC
isolate collections, their recorded associated metadata, and detected
mutational genetic markers will become invaluable in public health, veterinary
and surveillance settings. For example, a SNP panel obtained from
comparative genome sequencing microarray data identified nine distinct
clades of O157:H7 which have differential epidemiological importance
(i.e. disease incidence and virulence) (Manning et al., 2008). An analysis
of 25 O157:H7 genomes from three foodborne disease outbreaks identified
1225 SNPs in this clonal expansion and provided insights into the
genome heterogeneity dynamics of O157:H7 over the time course of single
outbreaks (Eppinger et al., 2011a,b). This enabled public health workers
not only to track and distinguish separate unrelated outbreaks and to
classify isolates as members of an outbreak population, but also to work
towards refining standards currently used in outbreak investigations
(e.g. indistinguishable pulsed field patterns). These SNP typing methods
may also prove very useful for risk-based monitoring and surveillance
to the contaminated source, monitor its spread and transmission, and fa-
cilitate accurate risk assessment based on the isolates' carried virulence
profiles. The development and use of WGS analysis pipelines specific for
STEC and other foodborne pathogens is therefore an important future
4.3. Sporadic cases
Sporadic cases are defined as those not belonging to general out-
breaks. In particular they are symptomatic cases of laboratory-
confirmed infection in which only members of a single household
were affected, and where secondary spread beyond the household
was ruled out (Locking et al., 2011). WGS, through SNP analysis, offers
the opportunity to determine the genetic relatedness of cases and as
such whether a strain from an apparently sporadic case could arise
from the same source as another case. For robust decisions to be made
as to whether cases are really sporadic or part of a diffuse outbreak it
for WGS data need to be developed as has been done for PFGE data
(Pearl et al., 2007).
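Genetic relatedness of cases through SNP analysis can be illustrated by counting pairwise differences between aligned sequences. This is a minimal sketch that assumes an alignment of equal-length sequences has already been produced (for example from variant calls against a reference); the function names, case labels and sequences are toy placeholders. No relatedness threshold is applied, since choosing one is precisely the kind of decision the text notes still needs to be developed for WGS data.

```python
from itertools import combinations
from typing import Dict, Tuple


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences carry different unambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        # Ignore gaps and ambiguous calls such as N rather than counting them as SNPs.
        if x in valid and y in valid and x != y
    )


def pairwise_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Return the SNP distance for every pair of isolates in the alignment."""
    return {
        (a, b): snp_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Toy alignment: labels and sequences are illustrative only.
alignment = {
    "case_1": "ACGTACGTAC",
    "case_2": "ACGTACGTAT",
    "case_3": "ACCTAGGTNT",
}

distances = pairwise_distances(alignment)
for pair, dist in distances.items():
    print(pair, dist)
```

Here case_1 and case_2 differ at a single position while case_3 is more distant from both, which is the kind of pattern that might prompt a closer look at two apparently sporadic cases.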
Case–control studies have previously been used to determine risk
factors for infection utilizing sporadic cases (Locking et al., 2001; Jaros
et al., 2013). These studies have traditionally not made use of typing
information but this is now changing as typing information becomes
available (Jaros et al., 2013). For other pathogens such as Campylobacter
(Mughini Gras et al., 2012), typing information to help identify the
source (e.g. source attribution, more details of which follow in the next
section) has been combined with epidemiological case–control data.
This approach is potentially useful when there is variation in type at
either the reservoir level or along the pathway of infection. It is likely
that methods such as MLVA would be useful here but the higher
resolution of WGS is potentially of much greater discriminatory value
(Cody et al., 2010).
4.4. Identifying the source of infection (source attribution and
Source attribution determines the proportion of cases allocated to
a specific source. Three different points where attribution can be con-
ducted along the food transmission pathway include production, distri-
bution and consumption (Pires et al., 2009). Attribution conducted at
the point of production (e.g. at farm level) enables the relative impor-
tance of the reservoirs to be determined. There has been considerable
success in this area for Campylobacter (Wilson et al., 2008; Sheppard
et al., 2009). Although comparative genotyping studies using non-
traditional typing methods identified common subpopulations of STEC
O157 from cattle and affected humans (Yang et al., 2004; Besser et al.,
2007), other reservoirs are generally neglected. Consequently, compre-
hensive source attribution studies for STEC are lacking which might
limit our understanding of STEC ecology. Sheep are generally neglected
as an important reservoir, although they shed comparable numbers of
STEC O157 into the environment (at least in Scotland) (Strachan et al.,
2005). Other animals including pigs and pigeons have been identified
as shedders of stx2e and stx2f-producing STEC respectively (Schmidt
human isolates with animal isolates at higher resolution than MLST or
application of source attribution methodologies and to determine
geographical differences in phylogeny (Mather et al., 2013). It should
be noted that for source attribution a higher level of discrimination is
not necessarily needed since the goal is not to identify a single source
of an outbreak or a cluster of cases, but rather to relate groups of bacterial
strains to particular reservoirs/sources and then attribute human
sporadic cases to these sources (EFSA, 2013). The applied method has
to allow for some genetic diversity between isolates from human and
animal/food sources, but only to the degree that it can still be assumed
that they originate from the same source. However, WGS might
contribute considerably to future source attribution studies for STEC
when 7 locus MLST combined with determination of virulence and
other functional genes (antimicrobial resistance, markers for persistence,
metabolic markers, host specificity markers etc.) provides better
insight into identifying food/animal strains relevant for human disease
than 7 locus MLST alone. Thus the increased genomic resolution that
can be achieved, either with SNPs or with the many more loci and
their associated alleles, in combination or with selection of different
subsets from the full array, offers exciting possibilities in interpretation
of the biology of STEC.
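The principle of attributing human cases to reservoirs by relating groups of strains to sources can be sketched with a simple frequency-proportional scheme: each human case is divided among sources in proportion to how common its type is among isolates from each source. This is an illustrative assumption and not necessarily the method used in the cited studies, which may apply more elaborate population genetic or Bayesian models. The function name, source names and type labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def attribute_cases(
    human_types: List[str], source_isolates: Dict[str, List[str]]
) -> Dict[str, float]:
    """Return the proportion of human cases attributed to each source."""
    # Relative frequency of each type within each source.
    freqs: Dict[str, Dict[str, float]] = {}
    for source, types in source_isolates.items():
        counts = Counter(types)
        total = sum(counts.values())
        freqs[source] = {t: n / total for t, n in counts.items()} if total else {}

    attributed = {source: 0.0 for source in source_isolates}
    attributed["unattributed"] = 0.0  # cases whose type was seen in no source
    for case_type in human_types:
        weights = {s: f.get(case_type, 0.0) for s, f in freqs.items()}
        norm = sum(weights.values())
        if norm == 0:
            attributed["unattributed"] += 1
            continue
        # Split the case among sources in proportion to the type's frequency.
        for source, weight in weights.items():
            attributed[source] += weight / norm

    n_cases = len(human_types)
    return {s: v / n_cases for s, v in attributed.items()} if n_cases else attributed


# Toy data: types T1 to T4 and both reservoirs are placeholders.
sources = {
    "cattle": ["T1", "T1", "T2", "T2"],
    "sheep": ["T2", "T3", "T3", "T3"],
}
human_cases = ["T1", "T2", "T3", "T4"]

proportions = attribute_cases(human_cases, sources)
for source, share in proportions.items():
    print(f"{source}: {share:.3f}")
```

The scheme tolerates some diversity within a reservoir because it works on type frequencies rather than exact matches between individual isolates, in line with the point that source attribution does not require the highest level of discrimination.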
Another way that STEC cases can be attributed to a source is in terms
of the pathway leading to human exposure, e.g. foodborne, waterborne,
environmental (including direct contact with animals and their faeces
and secondary transmission). This can be addressed in a number of
ways including analysis of outbreak data, quantitative microbiological
risk assessment and epidemiological analysis of sporadic cases
could help elucidate transmission along these pathways. Indeed it has
been shown that only certain genotypes (and associated phenotypes)
of STEC O157 are capable of long-term environmental persistence
(van Hoek et al., 2013; Franz et al., 2011). Comparative genomics on
isolates differing in environmental persistence might reveal additional
genetic differences that can be used as markers for increased persis-
tence as well as leading to the potential discovery of the mechanisms
4.5. Antibiotic resistance
Currently, infections with STEC are not routinely treated with antibiotics
because many antibiotics induce the Stx-encoding phage into the
lytic cycle and/or increase the production of Stx as a consequence of
the multiple stx gene copies produced after the replication of the
phage genome (Herold et al., 2004; Wong et al., 2000). Consequently,
the risk of HUS is generally considered to increase with the use of antibiotics
(Wong et al., 2000). However, specific antibiotic
treatments may be used in the future after clinical trials have demonstrated
their safety. WGS can be used to determine the resistome of
the isolate. Further, the resistome may be used as a means of typing
the organisms and may elucidate the ecology of resistance genes in
STEC populations. The presence of Extended Spectrum β-Lactamase
(ESBL) genes in clinical isolates of EHEC O26 (Venturini et al., 2010),
STEC O104 (Frank et al., 2011) and O157 (Litrup et al., 2007) shows
how widely this form of resistance has spread between commensal
information on the types of ESBL-resistance located on the genome,
thereby giving information on the source of the resistance gene.
Although WGS can identify which antibiotic genes (and allelic variants)
are present on both the chromosome and plasmids (McArthur et al.,
2013), a major short-coming is that at the moment expression needs
to be confirmed by a phenotypic assay. Studies are required to deter-
mine the association between the presence of these resistance genes
and AMR phenotypes.
4.6. Asymptomatic carriers
Asymptomatic carriers are commonly associated with STEC outbreaks
(Silvestro et al., 2004). An important public health question
is what should be done with these individuals, particularly
if they are likely to come into contact with vulnerable groups (for example,
the parent of a young child or a nursery/kindergarten teacher). It is important
to ascertain whether the strains excreted by these individuals are avir-
ulent and distinct from the pathogenic outbreak strain or whether the
host–microbe interaction is for some reason not resulting in morbidity.
The application of WGS offers the opportunity to investigate this
(e.g. by screening for potential defects in pathways that regulate expression
of the virulence gene complement), although a phenotypic test
(e.g. Vero cell assay) may also be required. Together, these results will
help guide the option for treatment of the asymptomatic host. Furthermore,
based on the results obtained for Stx phages, it is expected that a relatively
large fraction of the general population asymptomatically carry and
shed STEC in their faeces (Hong et al., 2009). Asymptomatic carriers
might be more important in STEC ecology than previously thought.
It is possible that some STEC strains cycle in human populations
(maintained by secondary transmission) rather than animal reservoirs.
Research into the role of host genetics and the microbiome of the
human gastrointestinal tract will be important in understanding the
nature of STEC carriage and virulence in humans.
5. Genomics in food protection
It is clear that food safety is a key consideration in the formulation of
regulations and policies designed to enhance the security, defense and
economic robustness of national agri-food sectors. Individual nations
have adopted explicit policies to deal with the emergence of STEC disease
and the risk of food contamination. One current and specific example
concerns the enactment of a new USDA policy which identifies six
STEC serogroups (O26, O45, O103, O111, O121 and O145) in addition
to O157:H7 as adulterants in beef trim and ground beef (USDA-FSIS,
2011). Compliance will require the implementation of expensive new
testing programs, leading to criticism by the US agri-food industry
(Hodges, 2012). Concern with the policy has also triggered trade dis-
putes with nations that export food products to the US market,
prompting calls to satisfy the World Trade Organization Sanitary and
Phytosanitary Agreement which stipulates that risk assessment must
clearly and irrefutably demonstrate the public health ramifications of
non-O157 STECs in beef (http://www.wto.org/english/tratop_e/sps_e/spsagr_e.htm).
Here, accurate data on the geographic distribution and
diversity of STEC are clearly needed to inform risk assessments used by
regulators and decision makers to deal with the issue in domestic
agri-food systems and international trade.
The Codex Alimentarius defines food hygiene as comprising
“conditions and measures necessary for the production, processing,
dictions may deploy several layers of prevention, intervention and
response readiness to ensure the protection of food supplies against
threats to food hygiene and to reduce both the frequency and impact
of food contamination events. Protection strategies can incorporate a
including good agricultural practices (GAPS), hazard analysis critical
control point (HACCP) programs in manufacturing and distribution,
and surveillance to verify outcomes on regional or national scales.
Despite such measures, persistent rates of foodborne infections caused
by pathogens such as STEC emphasize the need to improve the efficacy
of current measures or to develop alternative, validated means to
reduce the inherent vulnerability of food supplies.
Criteria used to verify the performance of current food protection
systems commonly rely on pathogen prevalence data derived from
microbiological analyses that employ cultural techniques for the
romolecular traits (substrate utilization, serotype, etc.). As discussed
elsewhere in the present work, detection methods that require cultiva-
tion of microorganisms are time consuming and often fail to recover
STEC targets that are present in low numbers, metabolically stressed
or in a viable but non-culturable state. Where isolation is successful,
the identification of isolates using serotyping schemes is typically labo-
rious, notably for the phenotypically diverse STEC group (Ballmer et al.,
2007). Consequently, alternative methods based on the recognition of
virulence genes or discrete regions of the genome (typically by PCR)
are increasingly used to delineate the STEC from other E. coli or to iden-
tify specific seropathotypes. However, the presence of virulence genes
does not necessarily mean that the organism will cause disease in a
human host. Further, if PCR and metagenomic approaches are used to
identify an array of virulence genes directly from food samples there is
no guarantee that these will be from the same organism. The ability
to quickly and accurately detect serotypes with well-known causal
E. Franz et al. / International Journal of Food Microbiology 187 (2014) 57–72
relationships to human disease is undeniably beneficial in a clinical set-
ting or the food quality control laboratory. However, the resolution
afforded by targeting discrete genomic sequences is more limited than
the inherent diversity of the STEC group. Difficulties encountered in
for the sprout-associated German outbreak in 2011 clearly illustrated
the constraints imposed by conventional approaches when they are
applied to the analysis of rare or atypical STEC (Altmann et al., 2011).
Analyses based on the detection of discrete sequences may also yield
ambiguous data and fail to recognize genomic variants of known
seropathotypes or novel variants containing altered complements of
virulence genes. Moreover, considerable progress has been achieved
nucleic acids from complex biological samples (Mertens et al., 2014;
Ercolini, 2013); these developments pave the way for direct detection
and characterization of a range of seropathotypes in various sample
matrices including food, water or clinical materials using enhanced
sequencing technologies. These data will make it possible to elucidate
transmission routes, to better identify which STEC strains
really matter, and to gain insights into the ecology and evolution of
these pathogens. In addition, they will enable population-based epidemiological
studies or broader ecological investigations in food systems
to support the efficacy of interventions.
ing to the spread of human pathogens, including STEC, across interna-
tional borders. To date, the vast majority of STEC WGS data deposited
in accessible depositories are derived from clinical isolates. Unfortu-
nately, metadata that could be of value in determining the role of food
in the epidemiology of the isolates is rare and, when present, incomplete
(Flynn, 2013). In addition, comparatively few WGS from isolates
recovered through food surveillance programs are available. This is
unfortunate as the information would make it possible to monitor
and characterize domestic and cross-border movement of STEC
seropathotypes, with obvious benefits for the validation of intervention
strategies on both scales. It should be noted that there could be a reluctance
to disseminate national data for fear that it could be used by competitors
to fuel trade disputes. However, the latter have historically
been aggravated by assumptions about the prevalence and characteris-
were apposite data are available.
Clearly, the continuing reduction in sequencing costs and the shortening
of the ‘time to result’ make WGS an attractive strategy for both
research and diagnostics. Being a high-resolution tool, high-throughput
sequencing will increasingly influence diagnostics, epidemiology, risk
management, and patient care (Bertelli and Greub, 2013). However,
several obstacles must be overcome before the routine application of
WGS becomes a reality (Fig. 2). One important concern is that failure
to properly harmonize the storage and interpretation of WGS data will
lead to fragmentation of data so that, for example, similar pathogens
may be regarded as different due to the use of distinct approaches of
sequence assembly and annotation (EFSA, 2013). In addition, general
availability of the data and associated meta-data is not assured. Specific
of these databases, and criteria for granting access rights as well as legal
restrictions for exporting information on class 3 hazard group pathogens
like STEC. Only once this has been accomplished can WGS be fully
exploited for public health, epidemiological, food safety and research
The most significant advantage of WGS is that typing of pathogens
can be conducted at a much higher phylogenetic accuracy and resolu-
tion than with traditional typing methods. This gives the opportunity
to refine evolutionary models, source attribution studies, microbial
risk assessments and epidemiological investigations. However, this
will require either new or refined modeling approaches and/or novel
algorithms to cope with WGS data (Muellner et al., 2013).
Paradoxically, the high level of resolution obtained by WGS
might also lead to an important pitfall. WGS does not change the overall
E. coli phylogeny based on lower resolution typing methods (see Figs. 3
and 4 in Chaudhuri and Henderson, 2012; Sahl et al., 2012) but WGS
will provide the ability to study shorter-term evolutionary processes,
to separate closely related strains, and to inform us about the enormous
genetic diversity within this family of organisms as well as providing
insight into how this may have arisen.
Large-scale bacterial whole genome sequencing for genomic epide-
miology outbreak investigations has become a reality (Mellmann
et al., 2011; Rohde et al., 2011; Rasko et al., 2011). The sequencing
speed achieved by NGS technology enables genomes to be completely sequenced
and functionally annotated within days or even hours. However,
such outbreak investigations need a close collaboration between different
scientific disciplines and constant exchange of meta-, experimental-
and in-silico-derived data sets. After recognition of an outbreak, the
investigation would initiate the collection of isolates for bacterial culti-
vation and DNA preparation as well as strain-associated metadata.
This is a prerequisite and serves as a robust foundation for both a whole
array of experimental biochemical assays and DNA-based technologies,
such as genome sequencing, whole genome sequence typing and
discovery assessing both the genomic backbone and mobilome and to
relate the sequence based in-silico findings to the recorded strain
associated metadata and virulence phenotypes.
Current source attribution methods have to allow for some genetic
diversity between isolates from human and animal/food sources, but
only to the degree that it can still be assumed that they originate from
the same source. It is likely that WGS source attribution (e.g. using
SNPs) will show that all isolates are different and that the degree
of difference will have to be handled in a probabilistic framework.
This should readily be achieved using either existing or extensions
to existing software (e.g. STRUCTURE (Pritchard et al., 2000) and
ASYMMETRIC ISLAND (Wilson et al., 2008) models). With respect to microbial
risk assessment, WGS will allow for better incorporation of variability
between bacterial strains in risk assessment models. Importantly, high
levels of genome similarity do not imply similar behavior in the food
chain or similar levels of virulence since small genetic changes may result
in large phenotypic differences. It is therefore of high importance to
link data on genome sequences with phenotypic data describing the
persistence in different niches (environment, food) as well as with
in vitro or in vivo dose–response assessment. At this stage other gene-
expression studies, transcriptomics, proteomics, metabolomics and
high-throughput phenotyping might provide a useful addition. These
genotype–phenotype association studies will deliver markers that pre-
dict behavior in the environment, food chain and human host and
thereby be highly informative in the risk assessment.
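To make the idea of handling SNP differences in a probabilistic framework more concrete, the sketch below assigns a human isolate to candidate sources by comparing its alleles with smoothed allele frequencies in each source population. It is a deliberately naive illustration that assumes independent loci and a uniform prior; it is not an implementation of STRUCTURE or of the asymmetric island model, and the function names, source labels and genotypes are hypothetical.

```python
import math
from collections import Counter


def allele_frequencies(genotypes, pseudocount=1.0, alphabet="ACGT"):
    """Smoothed per-locus allele frequencies for one source population.

    `genotypes` is a list of equal-length strings, one character per SNP locus.
    The pseudocount keeps alleles unseen in a source from having zero probability.
    """
    n_loci = len(genotypes[0])
    total = len(genotypes) + pseudocount * len(alphabet)
    freqs = []
    for locus in range(n_loci):
        counts = Counter(g[locus] for g in genotypes)
        freqs.append({a: (counts[a] + pseudocount) / total for a in alphabet})
    return freqs


def attribute(isolate, sources, pseudocount=1.0):
    """Posterior probability of each source for one isolate.

    Assumes independent loci, a uniform prior over sources, and that every
    allele in `isolate` is one of A, C, G or T.
    """
    log_lik = {}
    for name, genotypes in sources.items():
        freqs = allele_frequencies(genotypes, pseudocount)
        log_lik[name] = sum(math.log(f[a]) for f, a in zip(freqs, isolate))
    # Normalise in log space to avoid underflow when many loci are used.
    top = max(log_lik.values())
    weights = {name: math.exp(v - top) for name, v in log_lik.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


if __name__ == "__main__":
    # Made-up SNP profiles at four loci, for illustration only.
    reference = {
        "cattle": ["AAGT", "AAGT", "AAGC"],
        "sheep": ["GCGT", "GCTT", "GCGT"],
    }
    print(attribute("AAGT", reference))
```

The output is a probability for each source rather than a yes/no match, which is the property needed once every isolate differs from every other at WGS resolution.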
Once isolates have been obtained from outbreaks, rapid WGS will
help elucidate the mechanisms of spread and identify the
source. In addition, WGS must be used in tandem with classical outbreak
epidemiology. A database of sequences will need to have been collected
so that these comparisons can be conducted. This database will also be
vital to understand sporadic cases.
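The comparison of an outbreak isolate against a reference database can be sketched as a pairwise SNP distance search. The example below is a minimal illustration that assumes the sequences have already been aligned to a common set of SNP positions; the function names, the database contents and the distance threshold are hypothetical, and no particular threshold is implied to be appropriate for STEC.

```python
def snp_distance(a, b):
    """Number of differing positions between two aligned SNP profiles.

    Positions where either profile has an ambiguous call ("N") are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y and x != "N" and y != "N")


def closest_matches(query, database, max_snps):
    """Database entries within `max_snps` of the query, nearest first."""
    hits = [(name, snp_distance(query, seq)) for name, seq in database.items()]
    close = [hit for hit in hits if hit[1] <= max_snps]
    return sorted(close, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Made-up aligned SNP profiles, for illustration only.
    database = {
        "food_A": "ACGTACGT",
        "cattle_B": "ACGTACGA",
        "water_C": "TCGAACTT",
    }
    print(closest_matches("ACGTACGT", database, max_snps=1))
```

Matches found in this way are candidates only; as noted above, they need to be interpreted alongside classical outbreak epidemiology.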
The availability of high resolution typing of isolates will have in-
creasing impact on societal perception of infection with the possibility
of statistically robust associations between the organism on a contaminated
food and that causing infection in a case. Apportionment of blame
industry practices. However, regulatory authorities will need to keep
firm control on their relationship with industry if improvements in
public health are to be sustained, rather than allowing industry to
seek cover in secrecy.
The world population is estimated to grow to 8.4 billion by 2030 in-
creasing pressure on the food supply. It is likely that the international
food and agriculture distribution networks will become even more
complex. Hence, the likelihood of international outbreaks will rise and
improved surveillance will therefore be required. Currently a
number of the methods to detect STEC in foods to satisfy international
regulations (e.g. in the USA (USDA-FSIS, 2011)) do not initially require
collection of isolates but instead detect a series of PCR products (Grant
et al., 2011). There is concern that in the future there may not be a requirement
to obtain an actual isolate. However, it is important that
understand the current and potential risk of human infections. This
would require international commitment. Further, there needs to be
an exchange of information, which can be both company and country
sensitive, about the STEC organisms that are present in foods so that
appropriate action can be taken to minimize risk of disease. This is an
urgent matter that needs to be dealt with in order to ensure confidence
in the food system. Finally, traceability of food products must be as-
sured. There have been recent problems in the UK regarding horsemeat
being used instead of beef (Abbots and Coles, 2013). If it is not possible
to ascertain the origin of our food then it will also be problematic to
determine the origin of pathogens that the food may contain and inter-
vene to protect public health.
It is clear from this paper that WGS is going to be important for ad-
dressing current and future issues associated with STEC. The following
recommendations were made as a result of the workshop and the sub-
sequent synthesis of knowledge contained in the above. They include:
1. Validation and standardization of methods to acquire and process
2. Standardization of databases of sequence data through development
of a “WGSNET” with timelines for implementation. Sequence data
must be open access. Standardization of meta-data on sequenced
isolates (including detailed strain history and where appropriate
environmental origin, experiments associated with the isolate and
epidemiological information) is also required. Wherever possible
this should also be open access; ethical, legal and commercial
considerations must be taken into account, but these should only apply
where absolutely necessary.
3. Routine sequencing of isolates from clinical cases, foods, animals and
the environment to build up databases. This must include retrospec-
tive isolates held in laboratories across the world as well as those
currently being collected.
4. Development of a suite of standardized open access software for
interpretation of genome information (e.g. Stx typing, antibiotic resistance,
generation of SNP markers, source attribution etc.). A precursor
to this is the building of a workforce of bioinformaticians/
5. For both risk assessment and research purposes the integration of
genome sequence information and phenotypic behavior of STEC
(e.g. persistence and virulence) is crucial and should be seen as a
high priority for funding agencies.
6. Development of training programs in these technologies for those
working in clinical settings, epidemiology, food safety and industry.
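As a small illustration of what standardized meta-data (recommendation 2) might involve in practice, the sketch below defines a record for a sequenced isolate and checks it against a few basic rules. The field names, the required fields and the list of allowed source types are assumptions made for this example only; they are not a proposed or existing standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical controlled vocabulary, mirroring the sources named in recommendation 3.
ALLOWED_SOURCE_TYPES = {"clinical", "food", "animal", "environment"}
REQUIRED_FIELDS = ("isolate_id", "collection_date", "country", "source_type")


@dataclass
class IsolateMetadata:
    """Minimal meta-data record for one sequenced isolate (hypothetical schema)."""
    isolate_id: str
    collection_date: str            # expected as an ISO 8601 date, e.g. "2013-07-15"
    country: str
    source_type: str
    serotype: Optional[str] = None
    strain_history: Optional[str] = None
    epidemiological_notes: Optional[str] = None


def validate(record: IsolateMetadata) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not getattr(record, name):
            problems.append(f"missing {name}")
    if record.source_type and record.source_type not in ALLOWED_SOURCE_TYPES:
        problems.append(f"unknown source_type: {record.source_type}")
    if record.collection_date:
        try:
            date.fromisoformat(record.collection_date)
        except ValueError:
            problems.append(f"invalid collection_date: {record.collection_date}")
    return problems


if __name__ == "__main__":
    # Made-up record, for illustration only.
    example = IsolateMetadata("ISO-0001", "2013-07-15", "NL", "food", serotype="O157:H7")
    print(validate(example))
```

Agreeing on such checks in advance would allow incomplete records to be flagged at submission rather than discovered during an investigation.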
both public health and the safety of the food supply. Currently its bene-
fits are being slowly teased out by both government and academic re-
searchers around the world. The next phase will require a coordinated
international approach that addresses the list of recommendations
above to ensure that its potential can be realized in a cost effective
and timely manner.
The OECD for both sponsoring the STEC genomics workshop held in
Charlotte, North Carolina, July 2013 and the open access journal fees.
Abbots, E., Coles, B., 2013. Horsemeat-gate the discursive production of a neoliberal food
scandal. Food Cult. Soc. 16, 535–550.
Abu-Ali, G.S., Manning, S.D., 2011. The Evolution of Foodborne Pathogens. Springer, New
Allison, H.E., 2007. Stx-phages: drivers and mediators of the evolution of STEC and STEC-
like pathogens. Future Microbiol 2, 165–174.
Altmann, M., Wadl, M., Altmann, D., Benzler, J., Eckmanns, T., Krause, G., Spode, A., an der
Heiden, M., 2011. Timeliness of surveillance during outbreak of Shiga toxin-
producing Escherichia coli infection, Germany, 2011. Emerg. Infect. Dis. 17, 1906–1909.
Ballmer, K., Korczak, L.M., Kuhnert, P., Slickers, P., Ehricht, R., Haechler, H., 2007. Fast DNA
serotyping of Escherichia coli by use of an oligonucleotide microarray. J. Clin.
Microbiol. 45, 370–379.
Baquero, F., Tobes, R., 2013. Bloody coli: A Gene Cocktail in Escherichia coli O104:H4. mBio
Barrett, T.J., Lior, H., Green, J.H., Khakhira, R., Wells, J.G., Bell, B.P., Greene, K.D., Lewis, J.,
Griffin, P.M., 1994. Laboratory investigation of a multistate food-borne outbreak of
Escherichia coli O157:H7 by using pulsed-field gel-electrophoresis and phage typing.
J. Clin. Microbiol. 32, 3013–3017.
Bernier, C., Gounon, P., Le Bouguenec, C., 2002. Identification of an aggregative adhesion
fimbria (AAF) type III-encoding operon in enteroaggregative Escherichia coli as a
sensitive probe for detecting the AAF-Encoding operon family. Infect. Immun. 70,
Bertelli, C., Greub, G., 2013. Rapid bacterial genome sequencing: methods and applica-
tions in clinical microbiology. Clin. Microbiol. Infect. 19, 803–813.
Besser, T.E., Shaikh, N., Holt, N.J., Tarr, P.I., Konkel, M.E., Malik-Kale, P., Walsh, C.W.,
Whittam, T.S., Bono, J.L., 2007. Greater diversity of Shiga toxin-encoding bacterio-
phage insertion sites among Escherichia coli O157:H7 isolates from cattle than in
those from humans. Appl. Environ. Microbiol. 73, 671–679.
Beutin, L., Aleksic, S., Zimmermann, S., Gleier, K., 1994. Virulence factors and phenotypical
traits of verotoxigenic strains of Escherichia coli isolated from human patients in
Germany. Med. Microbiol. Immunol. 183, 13–21.
Beutin, L., Hammerl, J.A., Reetz, J., Strauch, E., 2013. Shiga toxin-producing Escherichia coli
strains from cattle as a source of the Stx2a bacteriophages present in enteroaggregative
Escherichia coli O104:H4 strains. Int. J. Med. Microbiol. 303, 595–602.
Bielaszewska, M., Mellmann, A., Zhang, W., Koeck, R., Fruth, A., Bauwens, A., Peters, G.,
Karch, H., 2011. Characterisation of the Escherichia coli strain associated with an out-
break of haemolytic uraemic syndrome in Germany, 2011: a microbiological study.
Lancet Infect. Dis. 11, 671–676.
Boerlin, P., McEwen, S., Boerlin-Petzold, F., Wilson, J., Johnson, R., Gyles, C., 1999. Associa-
tions between virulence factors of Shiga toxin-producing Escherichia coli and disease
in humans. J. Clin. Microbiol. 37, 497–503.
Bono, J.L., Keen, J.E., Clawson, M.L., Durso, L.M., Heaton, M.P., Laegreid, W.W., 2007.
Association of Escherichia coli O157:H7 tir polymorphisms with human infection.
BMC Infect. Dis. 7, 98.
Bono, J.L., Smith, T.P.L., Keen, J.E., Harhay, G.P., McDaneld, T.G., Mandrell, R.E., Jung, W.K.,
Besser, T.E., Gerner-Smidt, P., Bielaszewska, M., Karch, H., Clawson, M.L., 2012.
Phylogeny of Shiga toxin-producing Escherichia coli O157 isolated from cattle and
clinically ill humans. Mol. Biol. Evol. 29, 2047–2062.
Brandt, S.M., King, N., Cornelius, A.J., Premaratne, A., Besser, T.E., On, S.L.W., 2011.
Molecular risk assessment and epidemiological typing of Shiga toxin-producing
Escherichia coli by using a novel PCR binary typing system. Appl. Environ. Microbiol.
Brooks, J.T., Sowers, E.G., Wells, J.G., Greene, K.D., Griffin, P.M., Hoekstra, R.M., Strockbine,
N.A., 2005. Non-O157 Shiga toxin-producing Escherichia coli infections in the United
States, 1983–2002. J. Infect. Dis. 192, 1422–1429.
Bruant, G., Maynard, C., Bekal, S., Gaucher, I., Masson, L., Brousseau, R., Harell, J., 2006.
Development and validation of an oligonucleotide microarray for detection of
multiple virulence and antimicrobial resistance genes in Escherichia coli. Appl. Environ.
Microbiol. 72, 3780–3784.
Brul, S., Bassett, J., Cook, P., Kathariou, S., McClure, P., Jasti, P.R., Betts, R., 2012. ‘Omics’
technologies in quantitative microbial risk assessment. Trends Food Sci. Technol.
Bugarel, M., Beutin, L., Fach, P., 2010. Low-density macroarray targeting non-locus of
enterocyte effacement effectors (nle genes) and major virulence factors of Shiga
toxin-producing Escherichia coli (STEC): a new approach for molecular risk assess-
ment of STEC isolates. Appl. Environ. Microbiol. 76, 203–211.
Bustamante, A.V., Sanso, A.M., Parma, A.E., Lucchesi, P.M.A., 2012. Subtyping of STEC by
MLVA in Argentina. Frontiers in cellular and infection. Microbiology 2.
Buvens, G., De Gheldre, Y., Dediste, A., de Moreau, A., Mascart, G., Simon, A., Allemeersch,
D., Scheutz, F., Lauwers, S., Pierard, D., 2012. Incidence and virulence determinants of
verocytotoxin-producing Escherichia coli infections in the Brussels-capital region,
Belgium, in 2008–2010. J. Clin. Microbiol. 50, 1336–1345.
Canchaya, C., Fournous, G., Chibani-Chennoufi, S., Dillmann, M., Brussow, H., 2003. Phage
as agents of lateral gene transfer. Curr. Opin. Microbiol. 6, 417–424.
Carrillo, C.D., Kruczkiewicz, P., Mutschall, S., Tudor, A., Clark, C., Taboada, E.N., 2012. A
framework for assessing the concordance of molecular typing methods and the
true strain phylogeny of Campylobacter jejuni and C. coli using draft genome sequence
data. Front. Cell. Infect. Microbiol. 2.
Chaudhuri, R.R., Henderson, I.R., 2012. The evolution of the Escherichia coli phylogeny.
Infect. Genet. Evol. 12, 214–226.
Chaudhuri, R.R., Sebaihia, M., Hobman, J.L., Webber, M.A., Leyton, D.L., Goldberg, M.D.,
Cunningham, A.F., Scott-Tucker, A., Ferguson, P.R., Thomas, C.M., Frankel, G., Tang,
C.M., Dudley, E.G., Roberts, I.S., Rasko, D.A., Pallen, M.J., Parkhill, J., Nataro, J.P.,
Thomson, N.R., Henderson, I.R., 2010. Complete genome sequence and comparative
metabolic profiling of the prototypical enteroaggregative Escherichia coli strain 042.
PLoS One 5, e8801.
Christiansson, M., Melin, S., Matussek, A., Lofgren, S., Soderman, J., 2011. MLVA is a
valuable tool in epidemiological investigations of Escherichia coli and for disclosing
multiple carriage. Scand. J. Infect. Dis. 43, 579–586.
Cody, A.J., Clarke, L., Bowler, I.C., Dingle, K.E., 2010. Ciprofloxacin-resistant campylobacteriosis
in the UK. Lancet 376, 1987.
Cook, H., Ussery, D.W., 2013. Sigma factors in a thousand E. coli genomes. Environ.
Microbiol. 15, 3121–3129.
Coombes, B.K., Wickham, M.E., Mascarenhas, M., Gruenheid, S., Finlay, B.B., Karmali, M.A.,
2008. Molecular analysis as an aid to assess the public health risk of non-O157 Shiga
toxin-producing Escherichia coli strains. Appl. Environ. Microbiol. 74, 2153–2160.
Cooper, K.K., Mandrell, R.E., Louie, J.W., Korlach, J., Clark, T.A., Parker, C.T., Huynh, S., Chain,
P.S., Ahmed, S., Carter, M.Q., 2014. Comparative genomics of enterohemorrhagic
Escherichia coli O145:H28 demonstrates a common evolutionary lineage with
Escherichia coli O157:H7. BMC Genomics 15, 17.
Delannoy, S., Beutin, L., Fach, P., 2013. Towards a molecular definition of enterohemorrhagic
Escherichia coli (EHEC): detection of genes located on O island 57 as markers to distin-
guish EHEC from closely related enteropathogenic E. coli strains. J. Clin. Microbiol. 51,
Dong, T., Schellhorn, H.E., 2010. Role of RpoS in virulence of pathogens. Infect. Immun. 78,
Edwards, D.J., Holt, K.E., 2013. Beginner's guide to comparative bacterial genome analysis
using next-generation sequence data. Microb. Inform. Exp. 3, 2.
EFSA, 2013. Scientific Opinion on the evaluation of molecular typing methods for major
food-borne microbiological hazards and their use for attribution modelling, outbreak
investigation and scanning surveillance: Part 1 (evaluation of methods and applica-
tions). EFSA J. 11 (12), 3502 (1-84).
Eppinger, M., Mammel, M.K., LeClerc, J.E., Ravel, J., Cebula, T.A., 2011a. Genome signatures
of Escherichia coli O157:H7 isolates from the bovine host reservoir. Appl. Environ.
Microbiol. 77, 2916–2925.
Eppinger, M., Mammel, M.K., Leclerc, J.E., Ravel, J., Cebula, T.A., 2011b. Genomic anatomy
of Escherichia coli O157:H7 outbreaks. Proc. Natl. Acad. Sci. U. S. A. 108, 20142–20147.
Ercolini, D., 2013. High-throughput sequencing and metagenomics: moving forward in
the culture-independent analysis of food microbial ecology. Appl. Environ. Microbiol.
Escobar-Paramo, P., Clermont, O., Blanc-Potard, A., Bui, H., Le Bouguenec, C., Denamur, E.,
2004. A specific genetic background is required for acquisition and expression of
virulence factors in Escherichia coli. Mol. Biol. Evol. 21, 1085–1094.
Ethelberg, S., Olsen, K., Scheutz, F., Jensen, C., Schiellerup, P., Engberg, J., Petersen, A.,
Olesen, B., Gerner-Smidt, P., Molbak, K., 2004. Virulence factors for hemolytic uremic
syndrome, Denmark. Emerg. Infect. Dis. 10, 842–847.
FAO, 1997. Codex Alimentarius Commission: Procedural Manual. Food and Agriculture
Organisation of the United Nations World Health Organization, Rome, Italy.
Feng, P., Lampel, K., Karch, H., Whittam, T., 1998. Genotypic and phenotypic changes in
the emergence of Escherichia coli O157:H7. J. Infect. Dis. 177, 1750–1753.
Flynn, D., 2013. Letter from the Editor: IT Risks in the Metadata World. Food Safety News,
Frank, C., Werber, D., Cramer, J.P., Askar, M., Faber, M., an der Heiden, M., Bernard, H.,
Fruth, A., Prager, R., Spode, A., Wadl, M., Zoufaly, A., Jordan, S., Kemper, M.J., Follin,
P., Muller, L., King, L.A., Rosner, B., Buchholz, U., Stark, K., Krause, G., HUS Invest
Team, 2011. Epidemic profile of Shiga-Toxin-producing Escherichia coli O104:H4
outbreak in Germany. N. Engl. J. Med. 365, 1771–1780.
Franz, E., van Hoek, A.H.A.M., Bouw, E., Aarts, H.J.M., 2011. Variability of Escherichia coli
O157 strain survival in manure-amended soil in relation to strain origin, virulence
profile, and carbon nutrition profile. Appl. Environ. Microbiol. 77, 8088–8096.
Franz, E., van Hoek, A.H.A.M., van der Wal, F.J., de Boer, A., Zwartkruis-Nahuis, A., van der
Zwaluw, K., Aarts, H.J.M., Heuvelink, A.E., 2012. Genetic features differentiating
bovine, food, and human isolates of Shiga toxin-producing Escherichia coli O157 in
the Netherlands. J. Clin. Microbiol. 50, 772–780.
Friedrich, A., Bielaszewska, M., Zhang, W., Pulz, M., Kuczius, T., Ammon, A., Karch, H., 2002.
Escherichia coli harboring Shiga toxin 2 gene variants: frequency and association with
clinical symptoms. J. Infect. Dis. 185, 74–84.
Fuller, C.A., Pellino, C.A., Flagler, M.J., Strasser, J.E., Weiss, A.A., 2011. Shiga toxin subtypes
display dramatic differences in potency. Infect. Immun. 79, 1329–1337.
Giesecke, J., 2002. Modern Infectious Disease Epidemiology. Arnold, London, UK.
Gonzales, T.K., Kulow, M., Park, D., Kaspar, C.W., Anklam, K.S., Pertzborn, K.M., Kerrish, K.
D., Ivanek, R., Doepfer, D., 2011. A high-throughput open-array qPCR gene panel to
identify, virulotype, and subtype O157 and non-O157 enterohemorrhagic Escherichia
coli. Mol. Cell. Probes 25, 222–230.
Gould, L.H., Mody, R.K., Ong, K.L., Clogher, P., Cronquist, A.B., Garman, K.N., Lathrop, S.,
Medus, C., Spina, N.L., Webb, T.H., White, P.L., Wymore, K., Gierke, R.E., Mahon, B.E.,
Griffin, P.M., Emerging Infect Program FoodNet, 2013. Increased recognition of non-
O157 Shiga toxin-producing Escherichia coli infections in the United States during
2000–2010: epidemiologic features and comparison with E. coli O157 infections.
Foodborne Pathog. Dis. 10, 453–460.
Grad, Y.H., Godfrey, P., Cerquiera, G.C., Mariani-Kurkdjian, P., Gouali, M., Bingen, E., Shea,
T.P., Haas, B.J., Griggs, A., Young, S., Zeng, Q., Lipsitch, M., Waldor, M.K., Weill, F.,
Wortman, J.R., Hanage, W.P., 2013. Comparative genomics of recent Shiga toxin-
producing Escherichia coli O104:H4: short-term evolution of an emerging pathogen.
mBio 4, e00452-12.
Grant, M.A., Hedberg, C., Johnson, R., Harris, J., Logue, C.M., Meng, J., Sofos, J., Dickson, J.S.,
2011. The significance of non-O157 Shiga toxin-producing Escherichia coli in food.
Food Prot. Trends 33–45 (January).
Hasan, N.A., Choi, S.Y., Eppinger, M., Clark, P.W., Chen, A., Alam, M., Haley, B.J., Taviani, E.,
Hine, E., Su, Q., Tallon, L.J., Prosper, J.B., Furth, K., Hoq, M.M., Li, H., Fraser-Liggett, C.M.,
Cravioto, A., Huq, A., Ravel, J., Cebula, T.A., Colwell, R.R., 2012. Genomic diversity of
2010 Haitian cholera outbreak strains. Proc. Natl. Acad. Sci. U. S. A. 109, E2010–E2017.
Havelaar, A.H., Brul, S., de Jong, A., de Jonge, R., Zwietering, M.H., ter Kuile, B.H., 2010.
Future challenges to microbial food safety. Int. J. Food Microbiol. 139, S79–S94.
Hayashi, T., Makino, K., Ohnishi, M., Kurokawa, K., Ishii, K., Yokoyama, K., Han, C., Ohtsubo, E.,
Ogasawara, N., Yasunaga, T., Kuhara, S., Shiba, T., Hattori, M., Shinagawa, H., 2001.
Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and
genomic comparison with a laboratory strain K-12. DNA Res. 8, 11–22.
Hazen, T.H., Sahl, J.W., Fraser, C.M., Donnenberg, M.S., Scheutz, F., Rasko, D.A., 2013.
Refining the pathovar paradigm via phylogenomics of the attaching and effacing
Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 110, 12810–12815.
Herold, S., Karch, H., Schmidt, H., 2004. Shiga toxin-encoding bacteriophages — genomes
in motion. Int. J. Med. Microbiol. 294, 115–121.
Hodges, J., 2012. USDA's New Shiga Toxin-Producing Escherichia coli Policy. pp. 17–20.
Hong, S., Oh, K., Cho, S., Kim, J., Park, M., Lim, H., Lee, B., 2009. Asymptomatic healthy
slaughterhouse workers in South Korea carrying Shiga toxin-producing Escherichia
coli. FEMS Immunol. Med. Microbiol. 56, 41–47.
Imamovic, L., Balleste, E., Jofre, J., Muniesa, M., 2010. Quantification of Shiga toxin-
converting bacteriophages in wastewater and in fecal samples by real-time quantita-
tive PCR. Appl. Environ. Microbiol. 76, 5693–5701.
Jaros, P., Cookson, A.L., Campbell, D.M., Besser, T.E., Shringi, S., Mackereth, G.F., Lim, E.,
Lopez, L., Dufour, M., Marshall, J.C., Baker, M.G., Hathaway, S., Prattley, D.J., French,
N.P., 2013. A prospective case–control and molecular epidemiological study of
human cases of Shiga toxin-producing Escherichia coli in New Zealand. BMC Infect.
Dis. 13, 450.
Johnson, K.E., Thorpe, C.M., Sears, C.L., 2006. The emerging clinical importance of
non-O157 Shiga toxin-producing Escherichia coli. Clin. Infect. Dis. 43, 1587–1595.
Jolley, K.A., Maiden, M.C.J., 2010. BIGSdb: Scalable analysis of bacterial genome variation
at the population level. BMC Bioinformatics 11, 595.
Jones, K.E., Patel, N.G., Levy, M.A., Storeygard, A., Balk, D., Gittleman, J.L., Daszak, P., 2008.
Global trends in emerging infectious diseases. Nature 451, 990–993.
Ju, W., Shen, J., Toro, M., Zhao, S., Meng, J., 2013. Distribution of pathogenicity islands
OI-122, OI-43/48, and OI-57 and a high-pathogenicity island in Shiga toxin-
producing Escherichia coli. Appl. Environ. Microbiol. 79, 3406–3412.
Karch, H., Denamur, E., Dobrindt, U., Finlay, B.B., Hengge, R., Johannes, L., Ron, E.Z.,
Tonjum, T., Sansonetti, P.J., Vicente, M., 2012. The enemy within us: lessons
from the 2011 European Escherichia coli O104:H4 outbreak. EMBO Mol. Med. 4,
Karmali, M., Mascarenhas, M., Shen, S., Ziebell, K., Johnson, S., Reid-Smith, R., Isaac-
Renton, J., Clark, C., Rahn, K., Kaper, J., 2003. Association of genomic O island
122 of Escherichia coli EDL 933 with verocytotoxin-producing Escherichia coli
seropathotypes that are linked to epidemic and/or serious disease. J. Clin. Microbiol.
Karmali, M.A., Gannon, V., Sargeant, J.M., 2010. Verocytotoxin-producing Escherichia coli
(VTEC). Vet. Microbiol. 140, 360–370.
Kim, J., Nietfeldt, J., Benson, A., 1999. Octamer-based genome scanning distinguishes a
unique subpopulation of Escherichia coli O157:H7 strains in cattle. Proc. Natl. Acad.
Sci. U. S. A. 96, 13288–13293.
King, T., Ishihama, A., Kori, A., Ferenci, T., 2004. A regulatory trade-off as a source of strain
variation in the species Escherichia coli. J. Bacteriol. 186, 5614–5620.
Kisand, V., Lettieri, T., 2013. Genome sequencing of bacteria: sequencing, de novo
assembly and rapid analysis using open source tools. BMC Genomics 14, 211.
Kudva, I., Evans, P., Perna, N., Barrett, T., Ausubel, F., Blattner, F., Calderwood, S., 2002.
Strains of Escherichia coli O157:H7 differ primarily by insertions or deletions, not
single-nucleotide polymorphisms. J. Bacteriol. 184, 1873–1879.
Laing, C.R., Buchanan, C., Taboada, E.N., Zhang, Y., Karmali, M.A., Thomas, J.E., Gannon, V.P.,
2009. In silico genomic analyses reveal three distinct lineages of Escherichia coli O157:
H7, one of which is associated with hyper-virulence. BMC Genomics 10, 287.
Laing, C., Buchanan, C., Taboada, E.N., Zhang, Y., Kropinski, A., Villegas, A., Thomas, J.E.,
Gannon, V.P., 2010. Pan-genome sequence analysis using Panseq: an online tool for
the rapid analysis of core and accessory genomic regions. BMC Bioinformatics 11
Litrup, E., Torpdahl, M., Nielsen, E.M., 2007. Multilocus sequence typing performed on
Campylobacter coli isolates from humans, broilers, pigs and cattle originating in
Denmark. J. Appl. Microbiol. 103, 210–218.
Locking, M.E., O'Brien, S.J., Reilly, W.J., Wright, E.M., Campbell, D.M., Coia, J.E., Browning, L.M.,
Ramsay, C.N., 2001. Risk factors for sporadic cases of Escherichia coli O157 infection: the
importance of contact with animal excreta. Epidemiol. Infect. 127, 215–220.
Locking, M.E., Pollock, K.G.J., Allison, L.J., Rae, L., Hanson, M.F., Cowden, J.M., 2011.
Escherichia coli O157 infection and secondary spread, Scotland, 1999–2008. Emerg.
Infect. Dis. 17, 524–527.
Manning, S.D., Motiwala, A.S., Springman, A.C., Qi, W., Lacher, D.W., Ouellette, L.M.,
Mladonicky, J.M., Somsel, P., Rudrik, J.T., Dietrich, S.E., Zhang, W., Swaminathan, B.,
Alland, D., Whittam, T.S., 2008. Variation in virulence among clades of Escherichia
coli O157:H7, associated with disease outbreaks. Proc. Natl. Acad. Sci. U. S. A. 105,
Martinez-Castillo, A., Quiros, P., Navarro, F., Miro, E., Muniesa, M., 2013. Shiga toxin
2-encoding bacteriophages in human fecal samples from healthy individuals. Appl.
Environ. Microbiol. 79, 4862–4868.
Mather, A.E., Reid, S.W.J., Maskell, D.J., Parkhill, J., Fookes, M.C., Harris, S.R., Brown, D.
J., Coia, J.E., Mulvey, M.R., Gilmour, M.W., Petrovska, L., de Pinna, E., Kuroda, M.,
Akiba, M., Izumiya, H., Connor, T.R., Suchard, M.A., Lemey, P., Mellor, D.J.,
Haydon, D.T., Thomson, N.R., 2013. Distinguishable epidemics of multidrug-
resistant Salmonella Typhimurium DT104 in different hosts. Science 341,
E. Franz et al. / International Journal of Food Microbiology 187 (2014) 57–72
Mathusa, E.C., Chen, Y., Enache, E., Hontz, L., 2010. Non-O157 Shiga toxin-producing
Escherichia coli in foods. J. Food Prot. 73, 1721–1736.
McArthur, A.G., Waglechner, N., Nizam, F., Yan, A., Azad, M.A., Baylay, A.J., Bhullar, K.,
Canova, M.J., De Pascale, G., Ejim, L., Kalan, L., King, A.M., Koteva, K., Morar, M.,
Mulvey, M.R., O'Brien, J.S., Pawlowski, A.C., Piddock, L.J.V., Spanogiannopoulos, P.,
Sutherland, A.D., Tang, I., Taylor, P.L., Thaker, M., Wang, W., Yan, M., Yu, T., Wright,
G.D., 2013. The comprehensive antibiotic resistance database. Antimicrob. Agents
Chemother. 57, 3348–3357.
Mellmann, A., Bielaszewska, M., Koeck, R., Friedrich, A.W., Fruth, A., Middendorf, B.,
Harmsen, D., Schmidt, M.A., Karch, H., 2008. Analysis of collection of hemolytic
uremic syndrome-associated enterohemorrhagic Escherichia coli. Emerg. Infect. Dis.
Mellmann, A., Harmsen, D., Cummings, C.A., Zentz, E.B., Leopold, S.R., Rico, A., Prior,
K., Szczepanowski, R., Ji, Y., Zhang, W., McLaughlin, S.F., Henkhaus, J.K., Leopold,
B., Bielaszewska, M., Prager, R., Brzoska, P.M., Moore, R.L., Guenther, S., Rothberg, J.
M., Karch, H., 2011. Prospective genomic characterization of the German
enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation
sequencing technology. PLoS One 6, e22751.
Mellor, G.E., Sim, E.M., Barlow, R.S., D'Astek, B.A., Galli, L., Chinen, I., Rivas, M., Gobius, K.S.,
2012. Phylogenetically related Argentinean and Australian Escherichia coli O157
isolates are distinguished by virulence clades and alternative Shiga toxin 1 and 2
prophages. Appl. Environ. Microbiol. 78, 4724–4731.
Mellor, G.E., Besser, T.E., Davis, M.A., Beavis, B., Jung, W., Smith, H.V., Jennison, A.V., Doyle,
C.J., Chandry, P.S., Gobius, K.S., Fegan, N., 2013. Multilocus genotype analysis of
Escherichia coli O157 isolates from Australia and the United States provides evidence
of geographic divergence. Appl. Environ. Microbiol. 79, 5050–5058.
Mertens, K., Freund, L., Hänsel, C., Melzer, F., Elschner, M.C., 2014. Comparative evaluation
of eleven commercial DNA extraction kits for real-time PCR detection of Bacillus
anthracis spores in spiked dairy samples. Int. J. Food Microbiol. 170, 29–37.
Morse, S.S., 1995. Factors in the emergence of infectious diseases. Emerg. Infect. Dis. 1,
Muellner, P., Pleydell, E., Pirie, R., Baker, M.G., Campbell, D., Carter, P.E., French, N.P., 2013.
Molecular-based surveillance of campylobacteriosis in New Zealand — from source
attribution to genomic epidemiology. Eurosurveillance 18, 37–43.
Mughini Gras, L., Smid, J.H., Wagenaar, J.A., de Boer, A.G., Havelaar, A.H., Friesema, I.H.M.,
French, N.P., Busani, L., van Pelt, W., 2012. Risk factors for campylobacteriosis of
chicken, ruminant, and environmental origin: a combined case–control and source
attribution analysis. PLoS One 7.
Muniesa, M., Schmidt, H., 2014. Shiga toxin-encoding phages: multifunctional gene
ferries. In: Morabito, S. (Ed.), Pathogenic Escherichia coli. Caister Academic Press,
Portland, Oregon, USA.
Muniesa, M., Serra-Moreno, R., Jofre, J., 2004. Free Shiga toxin bacteriophages isolated
from sewage showed diversity although the stx genes appeared conserved. Environ.
Microbiol. 6, 716–725.
Nataro, J., Kaper, J., 1998. Diarrheagenic Escherichia coli. Clin. Microbiol. Rev. 11, 142–201.
Neupane, M., Abu-Ali, G.S., Mitra, A., Lacher, D.W., Manning, S.D., Riordan, J.T., 2011. Shiga
toxin 2 overexpression in Escherichia coli O157:H7 strains associated with severe
human disease. Microb. Pathog. 51, 466–470.
Newell, D.G., Koopmans, M., Verhoef, L., Duizer, E., Aidara-Kane, A., Sprong, H., Opsteegh,
M., Langelaar, M., Threfall, J., Scheutz, F., van der Giessen, J., Kruse, H., 2010. Food-
borne diseases — the challenges of 20 years ago still persist while new ones continue
to emerge. Int. J. Food Microbiol. 139, S3–S15.
Njoroge, J.W., Nguyen, Y., Curtis, M.M., Moreira, C.G., Sperandio, V., 2012. Virulence meets
metabolism: Cra and KdpE gene regulation in enterohemorrhagic Escherichia coli.
mBio 3, e00280-12.
Noller, A., McEllistrem, M., Stine, O., Morris, J., Boxrud, D., Dixon, B., Harrison, L., 2003.
Multilocus sequence typing reveals a lack of diversity among Escherichia coli O157:
H7 isolates that are distinct by pulsed-field gel electrophoresis. J. Clin. Microbiol.
Norman, K.N., Strockbine, N.A., Bono, J.L., 2012. Association of nucleotide polymorphisms
within the O-antigen gene cluster of Escherichia coli O26, O45, O103, O111, O121,
and O145 with serogroups and genetic subtypes. Appl. Environ. Microbiol. 78,
Ogura, Y., Ooka, T., Asadulghani, Terajima, J., Nougayrede, J., Kurokawa, K., Tashiro, K.,
Tobe, T., Nakayama, K., Kuhara, S., Oswald, E., Watanabe, H., Hayashi, T., 2007.
Extensive genomic diversity and selective conservation of virulence-determinants in
enterohemorrhagic Escherichia coli strains of O157 and non-O157 serotypes. Genome
Biol. 8, R138.
Overbeek, R., Olson, R., Pusch, G.D., Olsen, G.J., Davis, J.J., Disz, T., Edwards, R.A., Gerdes, S.,
Parrello, B., Shukla, M., Vonstein, V., Wattam, A.R., Xia, F., Stevens, R., 2014. The SEED
and the rapid annotation of microbial genomes using subsystems technology (RAST).
Nucleic Acids Res. 42, D206–D214.
Pearl, D.L., Louie, M., Chui, L., Dore, K., Grimsrud, K.M., Leedell, D., Martin, S.W., Michel, P.,
Svenson, L.W., McEwen, S.A., 2006. The use of outbreak information in the
interpretation of clustering of reported cases of Escherichia coli O157 in space and time in
Alberta, Canada, 2000–2002. Epidemiol. Infect. 134, 699–711.
Pearl, D.L., Louie, M., Chui, L., Dore, K., Grimsrud, K.M., Martin, S.W., Michel, P., Svenson, L.W.,
McEwen, S.A., 2007. The use of randomization tests to assess the degree of similarity in
PFGE patterns of E. coli O157 isolates from known outbreaks and statistical space-time
clusters. Epidemiol. Infect. 135, 100–109.
Perna, N., Plunkett, G., Burland, V., Mau, B., Glasner, J., Rose, D., Mayhew, G., Evans, P.,
Gregor, J., Kirkpatrick, H., Postal, G., Hackett, J., Klink, S., Boutin, A., Shao, Y., Miller,
L., Grotbeck, E., Davis, N., Lim, A., Dimalanta, E., Potamousis, K., Apodaca, J.,
Anantharaman, T., Lin, J., Yen, G., Schwartz, D., Welch, R., Blattner, F., 2001. Genome
sequence of enterohaemorrhagic Escherichia coli O157:H7 (vol 409, pg 529, 2001).
Nature 410, 240.
Pires, S.M., Evers, E.G., van Pelt, W., Ayers, T., Scallan, E., Angulo, F.J., Havelaar, A., Hald, T.,
Med-Vet-Net Workpackage 28 Working Group, 2009. Attributing the human disease
burden of foodborne infections to specific sources. Foodborne Pathog. Dis. 6,
Pradel, N., Bertin, Y., Martin, C., Livrelli, V., 2008. Molecular analysis of Shiga toxin-
producing Escherichia coli strains isolated from hemolytic-uremic syndrome patients
and dairy samples in France. Appl. Environ. Microbiol. 74, 2118–2128.
Preussel, K., Hoehle, M., Stark, K., Werber, D., 2013. Shiga toxin-producing Escherichia coli
O157 is more likely to lead to hospitalization and death than non-O157 serogroups —
except O104. PLoS One 8, e78180.
Pritchard, J.K., Stephens, M., Donnelly, P., 2000. Inference of population structure using
multilocus genotype data. Genetics 155, 945–959.
Proctor, M., Kurzynski, T., Koschmann, C., Archer, J., Davis, J., 2002. Four strains of
Escherichia coli O157:H7 isolated from patients during an outbreak of disease
associated with ground beef: Importance of evaluating multiple colonies from an
outbreak-associated product. J. Clin. Microbiol. 40, 1530–1533.
Qi, W., Lacher, D., Bumbaugh, A., Hyma, K., Ouellette, L., Large, T., Tarr, C., Whittam, T.,
2004. EcMLST: an online database for multi locus sequence typing of pathogenic
Escherichia coli. IEEE Computer Society, 10662 Los Vaqueros Circle, PO Box 3014,
Los Alamitos, CA 90720-1264 USA.
Rasko, D.A., Rosovitz, M.J., Myers, G.S.A., Mongodin, E.F., Fricke, W.F., Gajer, P., Crabtree, J.,
Sebaihia, M., Thomson, N.R., Chaudhuri, R., Henderson, I.R., Sperandio, V., Ravel, J.,
2008. The pangenome structure of Escherichia coli: Comparative genomic analysis
of E. coli commensal and pathogenic isolates. J. Bacteriol. 190, 6881–6893.
Rasko, D.A., Webster, D.R., Sahl, J.W., Bashir, A., Boisen, N., Scheutz, F., Paxinos, E.E., Sebra,
R., Chin, C., Iliopoulos, D., Klammer, A., Peluso, P., Lee, L., Kislyuk, A.O., Bullard, J.,
Kasarskis, A., Wang, S., Eid, J., Rank, D., Redman, J.C., Steyert, S.R., Frimodt-Moller, J.,
Struve, C., Petersen, A.M., Krogfelt, K.A., Nataro, J.P., Schadt, E.E., Waldor, M.K., 2011.
Origins of the E. coli strain causing an outbreak of hemolytic–uremic syndrome in
Germany. N. Engl. J. Med. 365, 709–717.
Reeves, P.R., Liu, B., Zhou, Z., Li, D., Guo, D., Ren, Y., Clabots, C., Lan, R., Johnson, J.R., Wang,
L., 2011. Rates of mutation and host transmission for an Escherichia coli clone over
3 years. PLoS One 6, e26907.
Reid, S., Herbelin, C., Bumbaugh, A., Selander, R., Whittam, T., 2000. Parallel evolution of
virulence in pathogenic Escherichia coli. Nature 406, 64–67.
Rivas, M., Miliwebsky, E., Chinen, I., Roldan, C., Balbi, L., Garcia, B., Fiorilli, G., Sosa-Estani,
S., Kincaid, J., Rangel, J., Griffin, P., Case-Control Study Group, 2006. Characterization and
epidemiologic subtyping of Shiga toxin-producing Escherichia coli strains isolated
from hemolytic uremic syndrome and diarrhea cases in Argentina. Foodborne
Pathog. Dis. 3, 88–96.
Rohde, H., Qin, J., Cui, Y., Li, D., Loman, N.J., Hentschke, M., Chen, W., Pu, F., Peng, Y., Li, J., Xi,
F., Li, S., Li, Y., Zhang, Z., Yang, X., Zhao, M., Wang, P., Guan, Y., Cen, Z., Zhao, X.,
Christner, M., Kobbe, R., Loos, S., Oh, J., Yang, L., Danchin, A., Gao, G.F., Song, Y., Li, Y.,
Yang, H., Wang, J., Xu, J., Pallen, M.J., Wang, J., Aepfelbacher, M., Yang, R., E. coli O104:H4
Genome Analysis Crowd-Sourcing Consortium, 2011. Open-source genomic analysis of Shiga-
toxin-producing E. coli O104:H4. N. Engl. J. Med. 365, 718–724.
Sahl, J.W., Matalka, M.N., Rasko, D.A., 2012. Phylomark, a tool to identify conserved
phylogenetic markers from whole-genome alignments. Appl. Environ. Microbiol.
Scallan, E., Hoekstra, R.M., Angulo, F.J., Tauxe, R.V., Widdowson, M., Roy, S.L., Jones, J.L.,
Griffin, P.M., 2011. Foodborne illness acquired in the United States—major pathogens.
Emerg. Infect. Dis. 17, 7–15.
Scharff, R.L., 2012. Economic burden from health losses due to foodborne illness in the
United States. J. Food Prot. 75, 123–131.
Scheutz, F., Nielsen, E.M., Frimodt-Moller, J., Boisen, N., Morabito, S., Tozzoli, R.,
Nataro, J.P., Caprioli, A., 2011. Characteristics of the enteroaggregative Shiga
toxin/verotoxin-producing Escherichia coli O104:H4 strain causing the outbreak
of haemolytic uraemic syndrome in Germany, May to June 2011. Eurosurveillance
Schmidt, H., Scheef, J., Morabito, S., Caprioli, A., Wieler, L., Karch, H., 2000. A new Shiga
toxin 2 variant (Stx2f) from Escherichia coli isolated from pigeons. Appl. Environ.
Microbiol. 66, 1205–1208.
Sharma, R., Stanford, K., Louie, M., Munns, K., John, S.J., Zhang, Y., Gannon, V., Chui, L.,
Read, R., Topp, E., McAllister, T., 2009. Escherichia coli O157:H7 lineages in healthy
beef and dairy cattle and clinical human cases in Alberta, Canada. J. Food Prot. 72,
Sheppard, S.K., Dallas, J.F., Strachan, N.J.C., MacRae, M., McCarthy, N.D., Wilson, D.J.,
Gormley, F.J., Falush, D., Ogden, I.D., Maiden, M.C., Forbes, K.J., 2009. Campylobacter
genotyping to determine the source of human infection. Clin. Infect. Dis. 48,
Shringi, S., Schmidt, C., Katherine, K., Brayton, K.A., Hancock, D.D., Besser, T.E., 2012.
Carriage of stx2a differentiates clinical and bovine-biased strains of Escherichia coli
O157. PLoS One 7, e51572.
Silvestro, L., Caputo, M., Blancato, S., Decastelli, L., Fioravanti, A., Tozzoli, R., Morabito, S.,
Caprioli, A., 2004. Asymptomatic carriage of verocytotoxin-producing Escherichia
coli O157 in farm workers in Northern Italy. Epidemiol. Infect. 132, 915–919.
Smith, D.L., Rooks, D.J., Fogg, P.C.M., Darby, A.C., Thomson, N.R., McCarthy, A.J., Allison, H.E.,
2012. Comparative genomics of Shiga toxin encoding bacteriophages. BMC Genomics
Soon, J.M., Seaman, P., Baines, R.N., 2013. Escherichia coli O104:H4 outbreak from sprouted
seeds. Int. J. Hyg. Environ. Health 216, 346–354.
Spinale, J.M., Ruebner, R.L., Copelovitch, L., Kaplan, B.S., 2013. Long-term outcomes of
Shiga toxin hemolytic uremic syndrome. Pediatr. Nephrol. 28, 2097–2105.
Staples, M., Jennison, A.V., Graham, R.M.A., Smith, H.V., 2012. Evaluation of the Meridian
Premier EHEC assay as an indicator of Shiga toxin presence in direct faecal specimens.
Diagn. Microbiol. Infect. Dis. 73, 322–325.