BMC Genomics publishes original research articles in all aspects of gene mapping, sequencing and analysis, functional genomics, and proteomics.

  • Impact factor
  • 5-year impact
  • Cited half-life
  • Immediacy index
  • Eigenfactor
  • Article influence
  • Website
    BMC Genomics website
  • Other titles
    BMC genomics, Genomics
  • ISSN
  • OCLC
  • Material type
    Document, Periodical, Internet resource
  • Document type
    Internet Resource, Computer File, Journal / Magazine / Newspaper

Publications in this journal

  • ABSTRACT: Background: Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS can also reduce labor and cost on a per-nucleotide basis and indeed on a per-sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results: Twenty-four samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth of coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios that were greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 (SNP) mtGenome variants observed per sample for the 24 samples analyzed. The total of 1237 (SNP) variants was concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8% and 100%. Conclusions: In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth of coverage variation and strand bias were identified but generally were infrequent and did not impact the reliability of variant calls. Multiplexing of samples, which can improve throughput and reduce cost per sample analyzed, was demonstrated. Overall, the results of this study, based on orthogonal concordance testing and phylogenetic scrutiny, supported that whole mtGenome sequence data with high accuracy can be obtained using the PGM platform.
    BMC Genomics 01/2015;
  • ABSTRACT: Background: Biological nitrogen fixation, with an emphasis on the legume-rhizobia symbiosis, is a key process for agriculture and the environment, allowing the replacement of nitrogen fertilizers and reducing water pollution by nitrate as well as the emission of greenhouse gases. Soils contain numerous strains belonging to the bacterial genus Bradyrhizobium, which establish symbioses with a variety of legumes. However, due to the high conservation of Bradyrhizobium 16S rRNA genes (considered the backbone of the taxonomy of prokaryotes), few species have been delineated. The multilocus sequence analysis (MLSA) methodology, which includes analysis of housekeeping genes, has been shown to be promising and powerful for defining bacterial species, and, in this study, it was applied to Bradyrhizobium species, increasing our understanding of the diversity of nitrogen-fixing bacteria. Description: Classification of bacteria of agronomic importance is relevant to biodiversity, as well as to biotechnological manipulation to improve agricultural productivity. We propose the construction of an on-line database that will provide information and tools using MLSA to improve phylogenetic and taxonomic characterization of Bradyrhizobium, allowing the comparison of genomic sequences with those of type and representative strains of each species. Conclusion: A database for the taxonomic and phylogenetic identification of the Bradyrhizobium genus, using MLSA, will facilitate the use of available biological data through an intuitive web interface. Sequences stored in the on-line database can be compared simply and quickly with multiple sequences of other strains through multiple alignment algorithms and computational routines integrated into the database. The proposed database and software tools are available at, and can be used, free of charge, by researchers worldwide to classify Bradyrhizobium strains; the database and software can be applied to replicate the experiments presented in this study as well as to generate new experiments. The next step will be expansion of the database to include other rhizobial species.
    BMC Genomics 01/2015;
  • ABSTRACT: Background: Identification of differentially expressed genes from transcriptomic studies is one of the most common mechanisms to identify tumor biomarkers. This approach, however, is not well suited to identifying interactions between genes whose protein products potentially influence each other, which limits its power to identify the molecular wiring of tumor cells dictating response to a drug. Because signal transduction pathways are not linear but highly interlinked, the biological response they drive may be better described by the relative amount of their components and their functional relationships than by their individual, absolute expression. Methods: Gene expression microarray data for 109 tumor cell lines with known sensitivity to the death ligand cytokine tumor necrosis factor-related apoptosis-inducing ligand (TRAIL) were used to identify genes with potential functional relationships determining responsiveness to TRAIL-induced apoptosis. The machine learning technique Random Forest in the statistical environment "R" with backward elimination was used to identify the key predictors of TRAIL sensitivity, and differentially expressed genes were identified using the software GeneSpring. Gene co-regulation and statistical interaction were assessed with q-order partial correlation analysis and non-rejection rate. Biological (functional) interactions among the co-acting genes were studied with Ingenuity network analysis. Prediction accuracy was assessed by calculating the area under the receiver operating characteristic curve using an independent data set. Results: We show that the gene panel identified could predict TRAIL sensitivity with a very high degree of sensitivity and specificity (AUC = 0.84). The genes in the panel are co-regulated and at least 40% of them functionally interact in signal transduction pathways that regulate cell death and cell survival, cellular differentiation and morphogenesis. Importantly, only 12% of the TRAIL-predictor genes were differentially expressed, highlighting the importance of functional interactions in predicting the biological response. Conclusions: The advantage of co-acting gene clusters is that this analysis does not depend on differential expression, but is able to incorporate direct and indirect gene interactions as well as tissue- and cell-specific characteristics. This approach (1) identified a descriptor of TRAIL sensitivity which performs significantly better as a predictor of TRAIL sensitivity than any previously reported gene signatures, (2) identified potential novel regulators of TRAIL-responsiveness and (3) provided a systematic view highlighting fundamental differences between the molecular wiring of sensitive and resistant cell types.
    BMC Genomics 12/2014; 15:1144.
  • ABSTRACT: Background: Although profiling of RNA in single cells has broadened our understanding of development, cancer biology and mechanisms of disease dissemination, it requires the development of reliable and flexible methods. Here we demonstrate that the EpiStem RNA-Amp™ methodology reproducibly generates microgram amounts of cDNA suitable for RNA-Seq, RT-qPCR arrays and microarray analysis. Results: Initial experiments compared amplified cDNA generated by three commercial RNA amplification protocols (Miltenyi μMACS™ SuperAmp™, NuGEN Ovation® One-Direct System and EpiStem RNA-Amp™) applied to single-cell-equivalent levels of RNA (25-50 pg) using Affymetrix arrays. The EpiStem RNA-Amp™ kit exhibited the highest sensitivity and was therefore chosen for further testing. A comparison of Affymetrix array data from RNA-Amp™ cDNA generated from single MCF7 and MCF10A cells to reference controls of unamplified cDNA revealed a high degree of concordance. To assess the flexibility of the amplification system, single-cell RNA-Amp™ cDNA was also analysed using RNA-Seq and high-density qPCR, and showed strong cross-platform correlations. To exemplify the approach, we used the system to analyse RNA profiles of small populations of rare cancer initiating cells (CICs) derived from a NSCLC patient-derived xenograft. RNA-Seq analysis was able to identify transcriptional differences in distinct subsets of CICs, with one group potentially enriched for metastasis formation. Pathway analysis revealed that the distinct transcriptional signatures demonstrated in the CIC subpopulations were significantly correlated with published stem-cell and epithelial-mesenchymal transition signatures. Conclusions: The combined results confirm the sensitivity and flexibility of the RNA-Amp™ method and demonstrate the suitability of the approach for identifying clinically relevant signatures in rare, biologically important cell populations.
    BMC Genomics 12/2014; 15:1129.
  • BMC Genomics 12/2014;
  • ABSTRACT: Background: During the past two decades, avian influenza A H9N2 viruses have spread geographically and ecologically in China. Other than its current role in causing outbreaks in poultry and sporadic human infections by direct transmission, H9N2 virus could also serve as a progenitor for novel human avian influenza viruses including H5N1, H7N9 and H10N8. Hence, H9N2 virus is becoming a notable threat to public health. However, despite the multiple lineages and genotypes detected by previous studies, the migration dynamics of the H9N2 virus in China are unclear. Increasing such knowledge would help us better prevent and control the spread of H9N2, as well as other potentially threatening viruses, across China. The objectives of this study were to determine the source, migration patterns, and demographic history of the avian influenza A H9N2 virus that circulated in China. Results: Using a Bayesian phylogeographic framework, we showed that the H9N2 virus in mainland China may have originated from the Hong Kong Special Administrative Region (SAR). Southern China, most likely Guangdong province, acts as the primary epicentre for multiple H9N2 strains spreading across the whole country, and eastern China, most likely Jiangsu province, acts as an important secondary source to seed outbreaks. Our demographic inference suggests that during the long-term migration process, H9N2 evolved into multiple diverse lineages and then experienced a selective sweep, which reduced its genetic diversity. Importantly, such a selective sweep may pose a greater threat to public health because novel strains confer higher fitness advantages than the strains being replaced and could generate new viruses through reassortment. Conclusion: Our analyses indicate that migratory birds, poultry trade and transportation have all contributed to the spread of the H9N2 virus in China. The ongoing migration and evolution of H9N2, which poses a constant threat to the human population, highlights the need for more comprehensive surveillance of wild birds and for the enhancement of biosafety in China's poultry industry.
    BMC Genomics 12/2014;
  • ABSTRACT: Background: Fimbriae are bacterial cell surface organelles involved in the pathogenesis of many bacterial species, including Gallibacterium anatis, in which an F17-like fimbria of the chaperone-usher (CU) family was recently shown to be an important virulence factor and vaccine candidate. To reveal the distribution and variability of CU fimbriae, 22 genomes of the avian host-restricted bacteria Gallibacterium spp. were investigated. Fimbrial clusters were classified using phylogeny-based and conserved domain (CD) distribution-based approaches. To characterize the fimbriae in depth, evolutionary analysis and in vitro expression of the most prevalent fimbrial clusters were performed. Results: Overall, 48 CU fimbriae were identified in the genomes of the examined Gallibacterium isolates. All fimbriae were assigned to the γ4 clade of the CU fimbriae of Gram-negative bacteria and were organized in four-gene clusters encoding a putative major fimbrial subunit, a chaperone, an usher and a fimbrial adhesin. Five fimbrial clusters (Flf-Flf4) and eight conserved domain groups were defined to accommodate the identified fimbriae. Although the number of different fimbrial clusters in individual Gallibacterium genomes was low, there was substantial amino acid sequence variability in the major fimbrial subunit and the adhesin proteins. The distribution of CDs among fimbrial clusters, analysis of their flanking regions, and evolutionary comparison of the strains revealed that Gallibacterium fimbrial clusters likely underwent evolutionary divergence resulting in highly host-adapted and antigenically variable fimbriae. In vitro, only the fimbrial subunit FlfA was expressed in most Gallibacterium strains encoding this protein. The absence or scarce expression of the two other common fimbrial subunits (Flf1A and Flf3A) indicates that their expression may require other in vitro or in vivo conditions. Conclusions: This is the first approach establishing a systematic fimbria classification system within Gallibacterium spp., which indicates a species-wide distribution of γ4 CU fimbriae among a diverse collection of Gallibacterium isolates. The in vitro expression of only one out of up to three fimbriae present in the individual genomes suggests that fimbriae expression in Gallibacterium is highly regulated. This information is important for future attempts to understand the role of Gallibacterium fimbriae in pathogenesis, and may prove useful for improved control of Gallibacterium infections in chickens.
    BMC Genomics 12/2014; 15:1093.
  • ABSTRACT: The phytoplasma-borne disease flavescence dorée is still a threat to European viticulture, despite mandatory control measures and prophylaxis against the leafhopper vector. Given the economic importance of grapevine, it is essential to find alternative strategies to contain the spread, in order to possibly reduce the current use of harmful insecticides. Further studies of the pathogen, the vector and the mechanisms of phytoplasma-host interactions could improve our understanding of the disease. In this work, RNA-Seq technology followed by three de novo assembly strategies was used to provide the first comprehensive transcriptomic landscape of flavescence dorée phytoplasma (FD) infecting field-grown Vitis vinifera leaves. With an average of 8300 FD-mapped reads per library, we assembled 347 sequences, corresponding to 215 annotated genes, and identified 10 previously unannotated genes, 15 polycistronic transcripts and three genes supposedly localized in the gaps of the FD92 draft genome. Furthermore, we improved the annotation of 44 genes with the addition of 5′/3′ untranslated regions. Functional classification revealed that the most expressed genes were either related to translation and protein biosynthesis or encoded hypothetical proteins with unknown function. Some of these hypothetical proteins were predicted to be secreted, so they could be bacterial effectors with a potential role in modulating the interaction with the host plant. Interestingly, qRT-PCR validation of the RNA-Seq expression values confirmed that a group II intron represented the FD genomic region with the highest expression during grapevine infection. This mobile element may contribute to the genomic plasticity that is necessary for the phytoplasma to increase its fitness and endorse host-adaptive strategies. The RNA-Seq technology was successfully applied for the first time to analyse the FD global transcriptome profile during grapevine infection. Our results provided new insights into the transcriptional organization and gene structure of FD. This may represent the starting point for the application of high-throughput sequencing technologies to study differential expression in FD and in other phytoplasmas with an unprecedented resolution.
    BMC Genomics 12/2014; 15:1088.