Figure 3 - uploaded by Trudy M Wassenaar
Pan-genome clustering of E. coli (black) and related species (colored), based on the alignment of their variable gene content. The genomes now cluster according to species and a relatedness between  

Source publication
Article
Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publicly available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics trees, and to identify the pan- and core genomes of thi...

Contexts in source publication

Context 1
... the variable fraction contain genes that are present in some, absent in other genomes, a phylogenetic analysis cannot be performed to capture all information. Figure 3 displays a pan-genome clustering tree, based on the gene families that are variably present in the analyzed genomes (gene families comprising singletons were excluded). The hierarchical clustering obtained by this analysis correctly separates the Shigella spp. ...
Context 2
... tree, based on the gene families that are variably present in the analyzed genomes (gene families comprising singletons were excluded). The hierarchical clustering obtained by this analysis correctly separates the Shigella spp. and S. Typhimurium from Escherichia spp. and, within the latter genus, separates E. coli from the other Escherichia spp (Fig. 3). Moreover, all E. coli O157:H7 genomes now cluster together, as do the K12 derivatives (W3110, MG1655, DH1, BW2952, DH10B, and ATCC8739). The strains belonging to phylogenic group B are also positioned in one cluster, to which the non-pathogenic commensal strain HS also seems to belong. All these are avirulent isolates, and it is quite ...
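The clustering described in these contexts can be illustrated with a small sketch. It is not the authors' pipeline: the excerpts do not state the distance measure or linkage method, so Jaccard distance and average linkage are assumptions here, and the genome names and gene-family IDs are hypothetical toy data. The sketch keeps only gene families that are variably present (neither in every genome nor singletons) and then clusters the genomes on those profiles.

```python
from itertools import combinations

# Hypothetical toy data: genome name -> set of gene-family IDs it contains.
genomes = {
    "K12_a": {"f1", "f2", "f3", "f4"},
    "K12_b": {"f1", "f2", "f3", "f5"},
    "O157_a": {"f1", "f6", "f7", "f8"},
    "O157_b": {"f1", "f6", "f7", "f9"},
}


def variable_families(genomes):
    """Families found in more than one genome but not in all (no core, no singletons)."""
    counts = {}
    for families in genomes.values():
        for fam in families:
            counts[fam] = counts.get(fam, 0) + 1
    n = len(genomes)
    return {fam for fam, c in counts.items() if 1 < c < n}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty profiles are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(profiles):
    """Naive agglomerative clustering (assumed method); returns a nested-tuple tree."""
    clusters = [(name, [name]) for name in sorted(profiles)]

    def dist(pair):
        (_, members_a), (_, members_b) = pair
        total = sum(
            jaccard_distance(profiles[x], profiles[y])
            for x in members_a
            for y in members_b
        )
        return total / (len(members_a) * len(members_b))

    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=dist)
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(((left[0], right[0]), left[1] + right[1]))
    return clusters[0][0]


variable = variable_families(genomes)
profiles = {name: fams & variable for name, fams in genomes.items()}
tree = average_linkage(profiles)
print(tree)
```

On the toy data the two hypothetical K12 genomes pair up first, then the two O157 genomes, mirroring the kind of grouping the excerpt reports. A real analysis would use an established clustering library rather than this quadratic loop.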

Citations

... Escherichia coli (E. coli) is a Gram-negative, non-sporing, aerobic and facultative anaerobic, rod-shaped bacterium that belongs to the Enterobacteriaceae family [1][2][3]. It is an intestinal bacterium that is found in warm-blooded animals, humans, water, the environment, and in some instances, in contaminated food [1,[4][5][6]. ...
... coli) is a Gram-negative, non-sporing, aerobic and facultative anaerobic, rod-shaped bacterium that belongs to the Enterobacteriaceae family [1][2][3]. It is an intestinal bacterium that is found in warm-blooded animals, humans, water, the environment, and in some instances, in contaminated food [1,[4][5][6]. This bacterium is highly adaptable and can survive, thrive, and multiply under various environmental conditions [7]. ...
... This bacterium is highly adaptable and can survive, thrive, and multiply under various environmental conditions [7]. E. coli isolates have been subdivided into pathogenic and commensal strains [1,8]. Commensal E. coli can adapt to different hosts and abiotic factors and is able to evolve to become pathogenic [4,9]. ...
Article
Escherichia coli (E. coli) is a Gram-negative, commensal/pathogenic bacteria found in human intestines and the natural environment. Pathogenic E. coli is known as extra-intestinal pathogenic E. coli (ExPEC) or intestinal pathogenic E. coli (InPEC). InPEC E. coli strains are separated into six pathogenic groups, known as enteropathogenic (EPEC), enterotoxigenic (ETEC), enteroinvasive (EIEC), enteroaggregative (EAEC), enterohaemorrhagic (EHEC), and diffusely adherent (DAEC), that have various virulence factors that cause infection. Virulence factors refer to a combination of distinctive accessory traits that affect a broad range of cellular processes in pathogens. There are two important virulence factors that directly interact with cells to cause diarrhoeal diseases within the intestines: adhesion and colonization factors and exotoxins. Virulence factors are crucial for bacteria to overcome the host’s immune system and result in antibiotic resistance. Antibiotics are used to combat the symptoms and duration of infection by pathogenic E. coli. However, the misuse and overuse of antibiotics have led to the global concern of antibiotic resistance. Currently, the antibiotic colistin is the last-resort drug to fight infection caused by this bacterium. Antibiotic resistance can be achieved in two main ways: horizontal gene transfer and mutation in different genes. The genetic basis for developing antibiotic resistance in E. coli occurs through four mechanisms: limiting drug uptake, modification of the drug target, inactivation of the drug, and active efflux of the drug. These mechanisms use different processes to remove the antibiotic from the bacterial cell or prevent the antibiotic from entering the bacterial cell or binding to targets. This prevents drugs from working effectively, and bacteria can acquire antibiotic resistance. E. coli is classified into different phylogenetic groups (A, B1, B2, D1, D2, E, and clade I). 
It is a very versatile bacterium that can easily adapt to different environmental factors. The present review gathered information about the pathogenicity, antimicrobial resistance, and phylogenetics of E. coli. These aspects are interconnected; thus, it will provide information on tracking the spread of pathogenic strains and antibiotic resistance genes of different strains using phylogenetics and how antibiotic resistance genes evolve. Understanding genetic variation in E. coli will help in monitoring and controlling outbreaks and in developing novel antibiotics and treatment. The increasing rate of antibiotic resistance, and the ability of E. coli to evolve rapidly, suggest that in-depth research is needed in these areas.
... In prokaryotes, with more frequent uptake of foreign DNA fragments, genomes are more dynamic. For example in the popular laboratory model organisms Escherichia coli (Yu et al., 2021;Lukjancenko et al., 2010), there are variation between isolates far greater than anything possible within eukaryote species. Amongst 61 strains each with about 4,500 genes, only ~1,000 genes are common to all strains. ...
Preprint
Sex in protists, evolution of mating types, Muller's ratchet, benefits of sex, clonal species, selfing, autogamy, mating types in Fungi, conjugation in ciliates, out-crossing species, genome rearrangements, meiosis toolkit, neutral mutations, immortal cell lines, sexual life cycles
... A majority proportion of bacterial population comprise of E. coli exhibiting enormous amount of both genetic and phenotypic diversity. E. coli is the most diverse bacterial species, with 20% of its genes share among all strains [23]. A group of closely related microbes that are distinguished by a common set of antigens is referred to as serotype. ...
Article
Urinary tract infection (UTI) is a prevalent condition that individuals may experience at least once in their lifetime. It is one of the most common reasons for hospital visits across all age groups, from neonates to adults. The predominant organism causing UTIs is Escherichia (E.) coli, followed by other microorganisms such as Klebsiella pneumoniae, Staphylococcus saprophyticus, Citrobacter spp., Pseudomonas aeruginosa, and Proteus spp. This review focuses on E. coli as the predominant causative agent for UTIs, examining its contribution to the disease burden and antibiotic susceptibility which significantly impact on human health and society. Additionally, we discuss novel approaches to combat this common threat, including the development of bio-markers for UTI treatment, the application of AI, and nanotechnology in medical field to fight against UTIs. We also observe the global distribution of uropathogenic E. coli, with specific attention to India, and highlight the recent trends in drug resistance patterns among the uropathogenic E. coli isolates enabling physicians to administer appropriate antibiotics for UTI treatment.
... E. coli is an extremely diverse bacterial species in which only about 6% of the genes are shared by all strains. The remaining genes, accounting for more than 90%, are variable "accessory genes" that are differentially present in the various E. coli strains [17]. The results of the study revealed a high level of genetic diversity among the E. coli isolates, which were classified into seven distinct phylogenetic groups. ...
... The prominent ST131, which is a highly virulent and extensively antimicrobial-resistant strain that has spread explosively throughout the world [26], was also found in our study. It is known to cause extraintestinal infections, including urinary tract and bloodstream infections [17]. The ST131 isolated in this study carried 28 VAGs, including afa (Dr binding adhesins), iutA (aerobactin receptor), and kpsMT II (group 2 capsule synthesis), which are characteristic of extraintestinal pathogenic E. coli (ExPEC) [18], demonstrating its ability to colonize and persist in the intestine, as confirmed by other studies [19]. ...
Article
The global spread of antimicrobial resistance genes (ARGs) in Escherichia coli is a major public health concern. The aim of this study was to investigate the genomic characteristics of extended-spectrum β-lactamase (ESBL)-producing and third-generation cephalosporin-resistant E. coli from a previously obtained collection of 260 E. coli isolates from fecal samples of patients attending primary healthcare facilities in Addis Ababa and Hossana, Ethiopia. A total of 29 E. coli isolates (19 phenotypically confirmed ESBL-producing and 10 third-generation cephalosporin-resistant isolates) were used. Whole-genome sequencing (NextSeq 2000 system, Illumina) and bioinformatic analysis (using online available tools) were performed to identify ARGs, virulence-associated genes (VAGs), mobile genetic elements (MGEs), serotypes, sequence types (STs), phylogeny and conjugative elements harbored by these isolates. A total of 7 phylogenetic groups, 22 STs, including ST131, and 23 serotypes with different VAGs were identified. A total of 31 different acquired ARGs and 10 chromosomal mutations in quinolone resistance-determining regions (QRDRs) were detected. The isolates harbored diverse types of MGEs, with IncF plasmids being the most prevalent (66.7%). Genetic determinants associated with conjugative transfer were identified in 75.9% of the E. coli isolates studied. In conclusion, the isolates exhibited considerable genetic diversity and showed a high potential for transferability of ARGs and VAGs. Bioinformatic analyses also revealed that the isolates exhibited substantial genetic diversity in phylogenetic groups, sequence types (ST) and serogroups and were harboring a variety of virulence-associated genes (VAGs). Thus, the studied isolates have a high potential for transferability of ARGs and VAGs.
... The relationship between Escherichia coli and Escherichia fergusonii illustrates this issue. E. coli, commonly found in the human gut, plays an essential role in digestion and immune functions, whereas E. fergusonii, although genetically similar to E. coli with a DNA-DNA hybridization similarity of approximately 64% [10], has been less studied and is not well understood [11,12]. Since certain strains of E. coli can be harmful while others can be beneficial, accurate identification and differentiation within the genus Escherichia is critical for the reliability of microbiome research. ...
Article
The gastrointestinal (GI) tract of shrimp, which is comprised of the stomach, hepatopancreas, and intestine, houses microbial communities that play crucial roles in immune defense, nutrient absorption, and overall health. While the intestine's microbiome has been well-studied, there has been limited research investigating the stomach and hepatopancreas. The present study addresses this gap by profiling the bacterial community in these interconnected GI segments of Pacific whiteleg shrimp. To this end, shrimp samples were collected from a local aquaculture farm in South Korea, and 16S rRNA gene amplicon sequencing was performed. The results revealed significant variations in bacterial diversity and composition among GI segments. The stomach and hepatopancreas exhibited higher Proteobacteria abundance, while the intestine showed a more diverse microbiome, including Cyanobacteria, Actinobacteria, Bacteroidetes, Firmicutes, Chloroflexi, and Verrucomicrobia. Genera such as Oceaniovalibus, Streptococcus, Actibacter, Ilumatobacter, and Litorilinea dominated the intestine, while Salinarimonas, Sphingomonas, and Oceaniovalibus prevailed in the stomach and hepatopancreas. It is particularly notable that Salinarimonas, which is associated with nitrate reduction and pollutant degradation, was prominent in the hepatopancreas. Overall, this study provides insights into the microbial ecology of the Pacific whiteleg shrimp's GI tract, thus enhancing our understanding of shrimp health with the aim of supporting sustainable aquaculture practices.
... The pool of conserved core genes is three times smaller than the pools of accessory and cloud genes, which suggests a flexible genome [36]. Other studies have also described the core genome of E. coli as being comparatively small [37][38][39]. However, it is worth emphasizing that the core genome is relative, as the concatenated core would become smaller if more genomes were added to the comparison [40]. ...
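The point that the core genome is relative can be made concrete with a minimal sketch. It assumes each genome is reduced to a set of gene-family IDs (the IDs below are hypothetical toy data, not from any cited study) and tracks the pan-genome (union) and core genome (intersection) as genomes are added one at a time.

```python
def pan_and_core_sizes(gene_sets):
    """Return (pan size, core size) after each successive genome is added."""
    sizes = []
    pan = set()
    core = None
    for genes in gene_sets:
        pan |= genes
        # The core is the running intersection, so it can only stay equal or shrink.
        core = set(genes) if core is None else core & genes
        sizes.append((len(pan), len(core)))
    return sizes


# Hypothetical gene-family sets for four genomes.
toy = [
    {"a", "b", "c", "d"},
    {"a", "b", "c", "e"},
    {"a", "b", "f"},
    {"a", "g"},
]
print(pan_and_core_sizes(toy))  # [(4, 4), (5, 3), (6, 2), (7, 1)]
```

Because the core is an intersection, adding a genome can never enlarge it, which is why reported core sizes depend on how many and which genomes are compared.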
Article
Escherichia coli carrying IncK-blaCMY-2 plasmids mediating resistance to extended-spectrum cephalosporins (ESC) has been frequently described in food-producing animals and in humans. This study aimed to characterize IncK-blaCMY-2-positive ESC-resistant E. coli isolates from poultry production systems in Denmark, Finland, and Germany, as well as from Danish human blood infections, and further compare their plasmids. Whole-genome sequencing (Illumina) of all isolates (n = 46) confirmed the presence of the blaCMY-2 gene. Minimum inhibitory concentration (MIC) testing revealed a resistant phenotype to cefotaxime as well as resistance to ≥3 antibiotic classes. Conjugative transfer of the blaCMY-2 gene confirmed the resistance being on mobile plasmids. Pangenome analysis showed only one-third of the genes being in the core with the remainder being in the large accessory gene pool. Single nucleotide polymorphism (SNP) analysis on sequence type (ST) 429 and 1286 isolates showed between 0–60 and 13–90 SNP differences, respectively, indicating vertical transmission of closely related clones in the poultry production, including among Danish, Finnish, and German ST429 isolates. A comparison of 22 ST429 isolates from this study with 80 ST429 isolates in Enterobase revealed the widespread geographical occurrence of related isolates associated with poultry production. Long-read sequencing of a representative subset of isolates (n = 28) allowed further characterization and comparison of the IncK-blaCMY-2 plasmids with publicly available plasmid sequences. This analysis revealed the presence of highly similar plasmids in ESC-resistant E. coli from Denmark, Finland, and Germany pointing to the existence of common sources. Moreover, the analysis presented evidence of global plasmid transmission and evolution. 
Lastly, our results indicate that IncK-blaCMY-2 plasmids and their carriers had been circulating in the Danish production chain with an associated risk of spreading to humans, as exemplified by the similarity of the clinical ST429 isolate to poultry isolates. Its persistence may be driven by co-selection since most IncK-blaCMY-2 plasmids harbor resistance factors to drugs used in veterinary medicine.
... The 18 kbp plasmid, consisting of 22.3% G + C, encoded only 10 genes, whose biological relevance was elusive (Figure 2; Table 1; Supplementary Table S2). The Aschnera genome, 0.77 Mbp in size, is around 1/5-1/7 of the E. coli genomes of 4-6 Mbp (Lukjancenko et al., 2010). The BUSCO score of A. chinzeii was 81.1%. ...
Article
Insect–microbe endosymbiotic associations are omnipresent in nature, wherein the symbiotic microbes often play pivotal biological roles for their host insects. In particular, insects utilizing nutritionally imbalanced food sources are dependent on specific microbial symbionts to compensate for the nutritional deficiency via provisioning of B vitamins in blood-feeding insects, such as tsetse flies, lice, and bedbugs. Bat flies of the family Nycteribiidae (Diptera) are blood-sucking ectoparasites of bats and shown to be associated with co-speciating bacterial endosymbiont “Candidatus Aschnera chinzeii,” although functional aspects of the microbial symbiosis have been totally unknown. In this study, we report the first complete genome sequence of Aschnera from the bristled bat fly Penicillidia jenynsii. The Aschnera genome consisted of a 748,020 bp circular chromosome and a 18,747 bp circular plasmid. The chromosome encoded 603 protein coding genes (including 3 pseudogenes), 33 transfer RNAs, and 1 copy of 16S/23S/5S ribosomal RNA operon. The plasmid contained 10 protein coding genes, whose biological function was elusive. The genome size, 0.77 Mbp, was drastically reduced in comparison with 4–6 Mbp genomes of free-living γ-proteobacteria. Accordingly, the Aschnera genome was devoid of many important functional genes, such as synthetic pathway genes for purines, pyrimidines, and essential amino acids. On the other hand, the Aschnera genome retained complete or near-complete synthetic pathway genes for biotin (vitamin B7), tetrahydrofolate (vitamin B9), riboflavin (vitamin B2), and pyridoxal 5'-phosphate (vitamin B6), suggesting that Aschnera provides these vitamins and cofactors that are deficient in the blood meal of the host bat fly. 
Similar retention patterns of the synthetic pathway genes for vitamins and cofactors were also observed in the endosymbiont genomes of other blood-sucking insects, such as Riesia of human lice, Arsenophonus of louse flies, and Wigglesworthia of tsetse flies, which may be either due to convergent evolution in the blood-sucking host insects or reflecting the genomic architecture of Arsenophonus-allied bacteria.
... In this paper, we demonstrate that a substantial proportion of Escherichia coli accessory genes can be predicted by the other genes present. E. coli has a large accessory genome (25,26) and occupies a wide range of niches (27). The E. coli pangenome has evolved divergent gene content over time-so much so, that a gene that is horizontally transferred from one E. coli to another will often find itself in a considerably different ensemble genetic background. ...
Article
Pangenomes exhibit remarkable variability in many prokaryotic species, much of which is maintained through the processes of horizontal gene transfer and gene loss. Repeated acquisitions of near-identical homologs can easily be observed across pangenomes, leading to the question of whether these parallel events potentiate similar evolutionary trajectories, or whether the remarkably different genetic backgrounds of the recipients mean that postacquisition evolutionary trajectories end up being quite different. In this study, we present a machine learning method that predicts the presence or absence of genes in the Escherichia coli pangenome based on complex patterns of the presence or absence of other accessory genes within a genome. Our analysis leverages the repeated transfer of genes through the E. coli pangenome to observe patterns of repeated evolution following similar events. We find that the presence or absence of a substantial set of genes is highly predictable from other genes alone, indicating that selection potentiates and maintains gene–gene co-occurrence and avoidance relationships deterministically over long-term bacterial evolution and is robust to differences in host evolutionary history. We propose that at least part of the pangenome can be understood as a set of genes with relationships that govern their likely cohabitants, analogous to an ecosystem’s set of interacting organisms. Our findings indicate that intragenomic gene fitness effects may be key drivers of prokaryotic evolution, influencing the repeated emergence of complex gene–gene relationships across the pangenome.
... Whole genome sequences available of STEC have shown high diversity because of horizontal gene transfer and genomic alterations [7,[32][33][34][35][36][37]. Using comparative genomics, identification of virulence and resistant genes and associated plasmids can be achieved to track pathogenic bacteria that pose as a public health threat. ...
Article
Escherichia coli O157:H7 is a foodborne pathogen that has been linked to global disease outbreaks. These diseases include hemorrhagic colitis and hemolytic uremic syndrome. It is vital to know the features that make this strain pathogenic to understand the development of disease outbreaks. In the current study, a comparative genomic analysis was carried out to determine the presence of structural and functional features of O157:H7 strains obtained from 115 National Center for Biotechnology Information database. These strains of interest were analysed in the following programs: BLAST Ring Image Generator, PlasmidFinder, ResFinder, VirulenceFinder, IslandViewer 4 and PHASTER. Five strains (ECP19-198, ECP19-798, F7508, F8952, H2495) demonstrated a great homology with Sakai because of a few regions missing. Five resistant genes were identified, however, Macrolide-associated resistance gene mdf(A) was commonly found in all genomes. Majority of the strains (97%) were positive for 15 of the virulent genes (espA, espB, espF, espJ, gad, chuA, eae, iss, nleA, nleB, nleC, ompT, tccP, terC and tir). The plasmid analysis demonstrated that the IncF group was the most prevalent in the strains analysed. The prophage and genomic island analysis showed a distribution of bacteriophages and genomic islands respectively. The results indicated that structural and functional features of the many O157:H7 strains differ and may be a result of obtaining mobile genetic elements via horizontal gene transfer. Understanding the evolution of O157:H7 strains pathogenicity in terms of their structural and functional features will enable the development of detection and control of transmission strategies.
... Notable exceptions are D. chrysanthemi [1], D. oryzae [7] and D. dianthicola, for which analysis of more diverse strains [15] revealed a core genome of only around 3000 protein families. This is, however, still very large if compared to, for example, E. coli, for which, using less stringent conditions (50% identity on 50% of the length of the proteins), the core genome was estimated to comprise only about 1500 orthologous genes [33]. ...
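The "50% identity on 50% of the length" criterion mentioned above can be sketched as a simple filter on pairwise protein alignments. The excerpt does not say which protein's length the coverage is measured against, so this sketch assumes the longer of the two; the function name and parameters are hypothetical.

```python
def passes_50_50(identical, aln_len, len_a, len_b,
                 min_identity=0.5, min_coverage=0.5):
    """True if an alignment meets the identity and coverage thresholds.

    identical: number of identical aligned positions
    aln_len:   alignment length in residues
    len_a/b:   full lengths of the two proteins
    Coverage is computed against the longer protein (an assumption).
    """
    if aln_len <= 0 or max(len_a, len_b) <= 0:
        return False
    identity = identical / aln_len
    coverage = aln_len / max(len_a, len_b)
    return identity >= min_identity and coverage >= min_coverage


print(passes_50_50(60, 100, 150, 180))  # True: 60% identity, ~56% coverage
print(passes_50_50(60, 100, 150, 400))  # False: only 25% coverage
```

Looser thresholds like these merge more proteins into each family, which tends to raise the estimated core size relative to stricter cut-offs; that is the comparison the excerpt is drawing.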
Article
Bacterial diversity analyses often suffer from a bias due to sampling only from a limited number of hosts or narrow geographic locations. This was the case for the phytopathogenic species Dickeya solani, whose members were mainly isolated from a few hosts–potato and ornamentals–and from the same geographical area–Europe and Israel, which are connected by seed trade. Most D. solani members were clonal with the notable exception of the potato isolate RNS05.1.2A and two related strains that are clearly distinct from other D. solani genomes. To investigate if D. solani genomic diversity might be broadened by analysis of strains isolated from other environments, we analysed new strains isolated from ornamentals and from river water as well as strain CFBP 5647 isolated from tomato in the Caribbean island Guadeloupe. While water strains were clonal to RNS05.1.2A, the Caribbean tomato strain formed a third clade. The genomes of the three clades are highly syntenic; they shared almost 3900 protein families, and clade-specific genes were mainly included in genomic islands of extrachromosomal origin. Our study thus revealed both broader D. solani diversity with the characterisation of a third clade isolated in Latin America and a very high genomic conservation between clade members.