Zamin Iqbal's research while affiliated with EMBL-EBI and other places
What is this page?
This page lists the scientific contributions of an author, who either does not have a ResearchGate profile, or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to create a record of this author's body of work. We create such pages to advance our goal of creating and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (193)
While the malaria parasite P. falciparum has low average genome-wide diversity levels, likely due to its recent introduction from a gorilla-infecting ancestor (~10,000-50,000 years ago), some genes display extremely high diversity levels. In particular, certain proteins expressed on the surface of human red-blood-cell-infecting merozoites (merozoit...
Universal access to drug susceptibility testing for newly diagnosed tuberculosis patients is recommended. Access to culture-based diagnostics remains limited, and targeted molecular assays are vulnerable to emerging resistance mutations. Improved protocols for direct-from-sputum Mycobacterium tuberculosis sequencing would accelerate access to compr...
Outbreak strains of Mycobacterium tuberculosis are promising candidates as targets in the search for intrinsic determinants of transmissibility, as they are responsible for many cases with sustained transmission; however, the use of low-resolution typing methods and restricted geographical investigations represent flaws in assessing the success of...
The antibiotic Bedaquiline (BDQ) is a key component of new WHO regimens for drug resistant tuberculosis (TB) but predicting BDQ resistance (BDQ-R) from genotypes remains challenging. We analysed a collection (n=505) of Mycobacterium tuberculosis from two high prevalence areas in South Africa (Cape Town and Johannesburg, 2019-2020), and found 53 ind...
Background
Mycobacterium tuberculosis whole-genome sequencing (WGS) has been widely used for genotypic drug susceptibility testing (DST) and outbreak investigation. For both applications, Illumina technology is used by most public health laboratories; however, Nanopore technology developed by Oxford Nanopore Technologies has not been thoroughly eva...
Background
Universal access to drug susceptibility testing for newly diagnosed tuberculosis patients is recommended. Access to culture-based diagnostics remains limited and targeted molecular assays are vulnerable to emerging resistance conferring mutations. Improved sample preparation protocols for direct-from-sputum sequencing of Mycobacterium tu...
Background
Viet Nam has high rates of antimicrobial resistance (AMR) but little capacity for genomic surveillance. This study used whole genome sequencing to examine the prevalence and transmission of three key AMR pathogens in two intensive care units (ICUs) in Hanoi, Viet Nam.
Methods
A prospective surveillance study of all adults admitted to IC...
Background:
Multidrug-resistant (MDR) Mycobacterium tuberculosis complex (MTBC) strains are a serious health problem in India, also contributing to one-fourth of the global MDR tuberculosis (TB) burden. About 36% of the MDR MTBC strains are reported fluoroquinolone (FQ) resistant leading to high pre-extensively drug-resistant (pre-XDR) and XDR-TB...
The emergence of drug-resistant tuberculosis is a major global public health concern that threatens the ability to control the disease. Whole-genome sequencing as a tool to rapidly diagnose resistant infections can transform patient treatment and clinical practice. While resistance mechanisms are well understood for some drugs, there are likely man...
The Comprehensive Resistance Prediction for Tuberculosis: an International Consortium (CRyPTIC) presents here a data compendium of 12,289 Mycobacterium tuberculosis global clinical isolates, all of which have undergone whole-genome sequencing and have had their minimum inhibitory concentrations to 13 antitubercular drugs measured in a single assay....
There are many short-read variant-calling tools, with different strengths and weaknesses. We present a tool, Minos, which combines outputs from arbitrary variant callers, increasing recall without loss of precision. We benchmark on 62 samples from three bacterial species and an outbreak of 385 Mycobacterium tuberculosis samples. Minos also enables...
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function, and even anthropogenic perturbations such as the widespread use of antimicrobials. Whilst these archives are rich in data, considerable processing is required before a biological question can be addressed. Here, we have assembled...
Background
Outbreak strains are good candidates to look for intrinsic transmissibility as they are responsible for a large number of cases with sustained transmission. However, assessment of the success of long-lived outbreak strains has been flawed by the use of low-resolution typing methods and restricted geographical investigations. We now have...
Viral sequence data from clinical samples frequently contain contaminating human reads, which must be removed prior to sharing for legal and ethical reasons. To enable host read removal for SARS-CoV-2 sequencing data on low-specification laptops, we developed ReadItAndKeep, a fast lightweight tool for Illumina and nanopore data that only keeps read...
Background: Molecular diagnostics are considered the most promising route to achieving rapid, universal drug susceptibility testing for Mycobacterium tuberculosis complex (MTBC). We aimed to generate a WHO-endorsed catalogue of mutations to serve as a global standard for interpreting molecular information for drug resistance prediction. Methods: A c...
Background
Mycobacterium tuberculosis whole-genome sequencing (WGS) using Illumina technology has been widely adopted for genotypic drug susceptibility testing (DST) and outbreak investigation. Oxford Nanopore Technologies is reported to have higher error rates but has not been thoroughly evaluated for these applications.
Methods
We analyse 151 is...
Background
Molecular diagnostics are considered the most promising route to achievement of rapid, universal drug susceptibility testing for Mycobacterium tuberculosis complex (MTBC). We aimed to generate a WHO-endorsed catalogue of mutations to serve as a global standard for interpreting molecular information for drug resistance prediction.
Method...
Background: Healthcare-associated infections (HCAIs) affect the most vulnerable persons in society and are increasingly difficult to treat in the face of mounting antimicrobial resistance (AMR). We used whole-genome sequencing (WGS) to retrospectively analyse carbapenemase-producing Gram negative bacteria from a single hospital in the United Kingdo...
Viral sequence data from clinical samples frequently contain human contamination, which must be removed prior to sharing for legal and ethical reasons. To enable host read removal for SARS-CoV-2 sequencing data on low-specification laptops, we developed ReadItAndKeep, a fast lightweight tool for Illumina and nanopore data that only keeps reads matc...
Motivation
Short-read whole genome sequencing (WGS) is a vital tool for clinical applications and basic research. Genetic divergence from the reference genome, repetitive sequences, and sequencing bias reduce the performance of variant calling using short-read alignment, but the loss in recall and specificity has not been adequately characterized....
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function and even anthropogenic activities such as the widespread use of antimicrobials. However, these data consist of genomes assembled with different tools and levels of quality checking, and of large volumes of completely unprocessed r...
Short-read variant calling for bacterial genomics is a mature field, and there are many widely used software tools. Different underlying approaches (e.g. pileup, local or global assembly, paired-read use, haplotype use) lend each tool different strengths, especially when considering non-SNP (single nucleotide polymorphism) variation or potentially di...
We present pandora, a novel pan-genome graph structure and algorithms for identifying variants across the full bacterial pan-genome. As much bacterial adaptability hinges on the accessory genome, methods which analyze SNPs in just the core genome have unsatisfactory limitations. Pandora approximates a sequenced genome as a recombinant of reference...
Genome graphs allow very general representations of genetic variation; depending on the model and implementation, variation at different length-scales (single nucleotide polymorphisms (SNPs), structural variants) and on different sequence backgrounds can be incorporated with different levels of transparency. We implement a model which handles this...
Background
Multidrug-resistant Mycobacterium tuberculosis (Mtb) is a significant global public health threat. Genotypic resistance prediction from Mtb DNA sequences offers an alternative to laboratory-based drug-susceptibility testing. User-friendly and accurate resistance prediction tools are needed to enable public health and clinical practitio...
Shigella sonnei is the most common agent of shigellosis in high-income countries, and causes a significant disease burden in low- and middle-income countries. Antimicrobial resistance is increasingly common in all settings. Whole genome sequencing (WGS) is increasingly utilised for S. sonnei outbreak investigation and surveillance, but comparison o...
Background: Short-read whole genome sequencing (WGS) is a vital tool for clinical applications and basic research. Genetic divergence from the reference genome, repetitive sequences, and sequencing bias, reduce the performance of variant calling using short-read alignment, but the loss in recall and specificity has not been adequately characterized...
Introduction
Multidrug-resistant Mycobacterium tuberculosis (Mtb) is a significant global public health threat. Genotypic resistance prediction from Mtb DNA sequences offers an alternative to laboratory-based drug-susceptibility testing. User-friendly and accurate resistance prediction tools are needed to enable public health and clinical practit...
Bedaquiline (BDQ) and clofazimine (CFZ) are core drugs for treatment of multidrug-resistant tuberculosis (MDR-TB); however, our understanding of the resistance mechanisms for these drugs is sparse, which is hampering rapid molecular diagnostics. To address this, we employed a unique approach using experimental evolution, protein modelling, genome se...
The open sharing of genomic data provides an incredibly rich resource for the study of bacterial evolution and function, and even anthropogenic activities such as the widespread use of antimicrobials. Whilst these archives are rich in data, considerable processing is required before biological questions can be addressed. Here, we assembled and char...
Tuberculosis (TB) is an ancient disease affecting a plethora of domestic and wild animals, including humans. In primates, TB can cause severe multisystemic disease. The prevalence of TB in lemurs within Madagascar is unknown; the most recent documented case occurred in 1973 (1). Reverse zoonotic transmission of TB can occur when nonhuman primates a...
Background: Standard approaches to characterising genetic variation revolve around mapping reads to a reference genome and describing variants in terms of differences from the reference; this is based on the assumption that these differences will be small and provides a simple coordinate system. However this fails, and the coordinates break down, w...
Background
Vietnam has high rates of antimicrobial resistance (AMR) but limited capacity for genomic surveillance. This study used whole genome sequencing (WGS) to examine the prevalence and transmission of three key AMR pathogens in two intensive care units in Hanoi, Vietnam.
Methods
A prospective surveillance study of all adults admitted to inte...
Background
Bacterial genomes follow a U-shaped frequency distribution whereby most genomic loci are either rare (accessory) or common (core); the union of these is the pan-genome. The alignable fraction of two genomes from a single species can be low (e.g. 50-70%), such that no single reference genome can access all single nucleotide polymorphisms...
Shigella sonnei is the most common agent of shigellosis in high-income countries, and causes a significant disease burden in low- and middle-income countries. Antimicrobial resistance is increasingly common in all settings. Whole genome sequencing (WGS) is increasingly utilised for S. sonnei outbreak investigation and surveillance, but comparison of...
The characterization of de novo mutations in regions of high sequence and structural diversity from whole-genome sequencing data remains highly challenging. Complex structural variants tend to arise in regions of high repetitiveness and low complexity, challenging both de novo assembly, in which short reads do not capture the long-range context req...
Two billion people are infected with Mycobacterium tuberculosis, leading to 10 million new cases of active tuberculosis and 1.5 million deaths annually. Universal access to drug susceptibility testing (DST) has become a World Health Organization priority. We previously developed a software tool, Mykrobe predictor, which provided offline species ide...
Two billion people are infected with Mycobacterium tuberculosis, leading to 10 million new cases of active tuberculosis and 1.5 million deaths annually. Universal access to drug susceptibility testing (DST) has become a World Health Organization priority. We previously developed a software tool, Mykrobe predictor, which provided offline species i...
We present COBS, a COmpact Bit-sliced Signature index, which is a cross-over between an inverted index and Bloom filters. Our target application is to index k-mers of DNA samples or q-grams from text documents and process approximate pattern matching queries on the corpus with a user-chosen coverage threshold. Query results may contain a number of...
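As a rough illustration of the idea this abstract describes, the sketch below builds a toy bit-sliced Bloom-filter index over k-mers and answers a query with a user-chosen coverage threshold. It is not the COBS implementation: the class name `BitSlicedIndex`, the SHA-256-based hashing, and the default sizes are all assumptions chosen for readability. Row `r` of the index is an integer whose bit `j` is set when document `j` set position `r` in its Bloom filter, so a single row lookup answers one hash probe for every document at once.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def positions(term, size, num_hashes):
    """Hash a term to num_hashes row indices in [0, size) (illustrative scheme)."""
    out = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{term}".encode()).digest()
        out.append(int.from_bytes(digest[:8], "big") % size)
    return out


class BitSlicedIndex:
    """Toy index: one Bloom filter per document, stored row-wise (bit-sliced)."""

    def __init__(self, documents, k, size=1024, num_hashes=3):
        self.k, self.size, self.num_hashes = k, size, num_hashes
        self.names = list(documents)
        self.rows = [0] * size  # rows[r] has bit j set if document j set position r
        for doc_id, name in enumerate(self.names):
            for term in kmers(documents[name], k):
                for pos in positions(term, size, num_hashes):
                    self.rows[pos] |= 1 << doc_id

    def query(self, pattern, threshold):
        """Return {document: fraction of query k-mers reported present}."""
        terms = kmers(pattern, self.k)
        if not terms:
            raise ValueError("pattern is shorter than k")
        counts = [0] * len(self.names)
        for term in terms:
            mask = -1  # all bits set; AND-ing the rows narrows it down
            for pos in positions(term, self.size, self.num_hashes):
                mask &= self.rows[pos]
            for doc_id in range(len(self.names)):
                if (mask >> doc_id) & 1:
                    counts[doc_id] += 1
        return {
            name: count / len(terms)
            for name, count in zip(self.names, counts)
            if count / len(terms) >= threshold
        }
```

Because Bloom filters never produce false negatives, a document that truly contains every query k-mer always passes a threshold of 1.0; false positives are possible, which matches the abstract's remark that query results may contain some spurious hits.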
The characterization of de novo mutations in regions of high sequence and structural diversity from whole genome sequencing data remains highly challenging. Complex structural variants tend to arise in regions of high repetitiveness and low complexity, challenging both de novo assembly, where short-reads do not capture the long-range context requir...
We present COBS, a compact bit-sliced signature index, which is a cross-over between an inverted index and Bloom filters. Our target application is to index k-mers of DNA samples or q-grams from text documents and process approximate pattern matching queries on the corpus with a user-chosen coverage threshold. Query results may contain a number...
New antibiotics are urgently needed to combat rising rates of resistance against all existing classes of antimicrobials. We highlight key issues that complicate the prediction of resistance evolution in the real world and outline the ways in which these can be overcome.
The clinical phenotype of zoonotic tuberculosis and its contribution to the global burden of disease are poorly understood and probably underestimated. This shortcoming is partly because of the inability of currently available laboratory and in silico tools to accurately identify all subspecies of the Mycobacterium tuberculosis complex (MTBC). We p...
Exponentially increasing amounts of unprocessed bacterial and viral genomic sequence data are stored in the global archives. The ability to query these data for sequence search terms would facilitate both basic research and applications such as real-time genomic epidemiology and surveillance. However, this is not possible with current methods. To s...
BACKGROUND The World Health Organization recommends drug-susceptibility testing of Mycobacterium tuberculosis complex for all patients with tuberculosis to guide treatment decisions and improve outcomes. Whether DNA sequencing can be used to accurately predict profiles of susceptibility to first-line antituberculosis drugs has not been clear. METHO...
Background: In principle, whole genome sequencing (WGS) can predict phenotypic resistance directly from genotype, replacing laboratory-based tests. However, the contribution of different bioinformatics methods to genotype-phenotype discrepancies has not been systematically explored to date.
Methods: We compared three WGS-based bioinformatics meth...
Colistin represents one of the few available drugs for treating infections caused by carbapenem-resistant Enterobacteriaceae. As such, the recent plasmid-mediated spread of the colistin resistance gene mcr-1 poses a significant public health threat, requiring global monitoring and surveillance. Here, we characterize the global distribution of mcr-1...
Motivation:
The de Bruijn graph is a simple and efficient data structure that is used in many areas of sequence analysis including genome assembly, read error correction and variant calling. The data structure has a single parameter k, is straightforward to implement and is tractable for large genomes with high sequencing depth. It also enables re...
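To make the data structure concrete, here is a minimal sketch of a de Bruijn graph built from reads, assuming the common convention in which nodes are k-mers and an edge joins two k-mers that occur consecutively in a read (overlapping by k-1 bases). The function name `build_de_bruijn` is hypothetical and the sketch omits what real tools add, such as reverse complements, coverage counts, and error cleaning.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        # Consecutive k-mers in a read overlap by k-1 bases.
        for i in range(len(read) - k):
            left = read[i:i + k]
            right = read[i + 1:i + k + 1]
            graph[left].add(right)
    return dict(graph)


if __name__ == "__main__":
    graph = build_de_bruijn(["ACGTT", "CGTA"], k=3)
    for node in sorted(graph):
        print(node, "->", sorted(graph[node]))
```

In this toy input the node `CGT` has two successors, which is how a variant site or sequencing error shows up as a branch in the graph.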
This study aimed to assess the feasibility of using the Oxford Nanopore Technologies (ONT) MinION long-read sequencer in reconstructing fully closed plasmid sequences from eight Enterobacteriaceae isolates of six different species with plasmid populations of varying complexity. Species represented were Escherichia coli, Klebsiella pneumoniae, Citro...
Purpose:
Speed of bloodstream infection diagnosis is vital to reduce morbidity and mortality. Whole genome sequencing (WGS) performed directly from liquid blood culture could provide single-assay species and antibiotic susceptibility prediction; however, high inhibitor and human cell/DNA concentrations limit pathogen recovery. We develop a method...
The clinical phenotype of zoonotic tuberculosis, its contribution to the global burden of disease and prevalence are poorly understood and probably underestimated. This is partly because currently available laboratory and in silico tools have not been calibrated to accurately identify all subspecies of the Mycobacterium tuberculosis complex ( Mtbc...
For all ontologies showing enrichment in within-patient BD-class variants, we identified the genes with variants contributing to the signal.
We counted the number of protein-altering variants in these genes within patients, and compared to the number in long-term asymptomatic carriers. p-Values calculated using Fisher’s exact test. *Variant totals...
Bacteria responsible for the greatest global mortality colonize the human microbiota far more frequently than they cause severe infections. Whether mutation and selection among commensal bacteria are associated with infection is unknown. We investigated de novo mutation in 1163 Staphylococcus aureus genomes from 105 infected patients with nose colo...
List of all variants found within patients with S. aureus infections, location on shared reference (MRSA252), or position and reference genome name and accession number if variant could not be localized on MRSA252.
Each variant is described by the alleles found, its location in gene, the predicted effect on gene product and the location of the vari...
List of all cultures included in the site, the site of infection (and any known source if bloodstream), number of isolates sequenced from each site, ST or CC by in silico MLST, number of variants found at each site and the mean pair-wise difference comparing isolates.
Neutrality indices show signals of adaptation among the genes, gene ontologies and expression pathways most significantly enriched for protein-altering B-class variants.
Neutrality indices (NIs, 41,42) were calculated as the odds ratio of the number of protein-altering to synonymous variants among B-class versus C/D-class variants. These tests are...
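The odds-ratio definition above can be written out as a short calculation. This is a sketch under stated assumptions: the function name `neutrality_index` and the counts in the usage example are hypothetical, not values from the study, and no significance test is included.

```python
def neutrality_index(altering_b, synonymous_b, altering_cd, synonymous_cd):
    """Odds ratio of protein-altering to synonymous variants, B-class vs C/D-class."""
    if synonymous_b == 0 or altering_cd == 0 or synonymous_cd == 0:
        raise ValueError("odds ratio is undefined when a denominator count is zero")
    odds_b = altering_b / synonymous_b
    odds_cd = altering_cd / synonymous_cd
    return odds_b / odds_cd


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(neutrality_index(30, 10, 20, 20))
```

A value above 1 indicates that protein-altering variants are over-represented among B-class variants relative to C/D-class variants under this definition.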
List of all variants found within long term asymptomatic carriers, location on shared reference (MRSA252), or position and reference genome name and accession number if variant was not localized on MRSA252.
Each variant is described by the alleles found, its location in gene and the predicted effect on gene product.
Genome sequencing of pathogens is now ubiquitous in microbiology, and the sequence archives are effectively no longer searchable for arbitrary sequences. Furthermore, the exponential increase of these archives is likely to be further spurred by automated diagnostics. To unlock their use for scientific research and real-time surveillance we have com...
Motivation:
Correct and rapid determination of Mycobacterium tuberculosis (MTB) resistance against available tuberculosis (TB) drugs is essential for the control and management of TB. Conventional molecular diagnostic tests assume that the presence of any well-studied single nucleotide polymorphism is sufficient to cause resistance, which yields...