ABSTRACT: Background
Bacterial operons are considerably more complex than previously thought; in particular, their components are defined dynamically rather than statically, as had been assumed. Here we present a computational study of the landscape of the transcriptional units (TUs) of E. coli K12, as revealed by the available genomic and transcriptomic data, providing a new understanding of the complexity of the TUs encoded in the E. coli K12 genome as a whole.
Results and conclusion
Our main findings are that (i) different TUs may overlap with each other by sharing common genes, giving rise to clusters of overlapping TUs (TUCs) along the genomic sequence; (ii) the intergenic regions in front of the first gene of each TU tend to have more conserved sequence motifs than those of the other genes inside the TU, suggesting that each TU has its own promoter; (iii) the terminators associated with the 3’ ends of TUCs are Rho-independent substantially more often than the terminators of TUs that end inside a TUC; and (iv) the functional relatedness of adjacent gene pairs in individual TUs is higher than that in TUCs, suggesting that individual TUs are more basic functional units than TUCs.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0805-8) contains supplementary material, which is available to authorized users.
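Finding (i) above, that TUs sharing genes form clusters (TUCs), can be illustrated with a minimal sketch. It assumes each TU is given simply as a list of gene names and treats a TUC as a connected component of TUs linked by shared genes; the function name `cluster_tus` and the example data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def cluster_tus(tus):
    """Group TUs that share at least one gene (directly or transitively).

    `tus` maps a TU name to its list of genes. Returns a sorted list of
    clusters, each cluster being a sorted list of TU names.
    """
    # Union-find over TU names.
    parent = {name: name for name in tus}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link every TU to the first TU seen containing the same gene.
    first_tu_with_gene = {}
    for name, genes in tus.items():
        for gene in genes:
            if gene in first_tu_with_gene:
                union(first_tu_with_gene[gene], name)
            else:
                first_tu_with_gene[gene] = name

    clusters = defaultdict(set)
    for name in tus:
        clusters[find(name)].add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example: TU1-TU3 overlap through shared genes, TU4 stands alone.
example_tus = {
    "TU1": ["geneA", "geneB", "geneC"],
    "TU2": ["geneB", "geneC"],
    "TU3": ["geneC", "geneD"],
    "TU4": ["geneE"],
}
print(cluster_tus(example_tus))
```

In this sketch a TU that shares no gene with any other TU forms a cluster of size one; whether the authors count such singletons as TUCs is not stated in the abstract.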
ABSTRACT: Sheep red blood cells (SRBCs) have long been used as a model antigen for eliciting systemic immune responses, yet the basis for their adjuvant activity has been unknown. Here, we show that SRBCs failed to engage the inhibitory mouse SIRPα receptor on splenic CD4(+) dendritic cells (DCs), and this failure led to DC activation. Removal of the SIRPα ligand, CD47, from self-RBCs was sufficient to convert them into an adjuvant for adaptive immune responses. DC capture of Cd47(-/-) RBCs and DC activation occurred within minutes in a Src-family-kinase- and CD18-integrin-dependent manner. These findings provide an explanation for the adjuvant mechanism of SRBCs and reveal that splenic DCs survey blood cells for missing self-CD47, a process that might contribute to detecting and mounting immune responses against pathogen-infected RBCs.
ABSTRACT: The grade of a cancer is a measure of the cancer's malignancy level, and the stage of a cancer refers to the size of the cancer and the extent to which it has spread. Here we present a computational method for prediction of gene signatures and blood/urine protein markers for breast cancer grades and stages based on RNA-seq data, which are retrieved from the TCGA breast cancer dataset and cover 111 pairs of disease and matching adjacent noncancerous tissues with pathologist-assigned stages and grades. By applying a differential expression and an SVM-based classification approach, we found that 324 and 227 genes in cancer have their expression levels consistently up-regulated vs. their matching controls in a grade- and stage-dependent manner, respectively. By using these genes, we predicted a 9-gene panel as a gene signature for distinguishing poorly differentiated from moderately and well differentiated breast cancers, and a 19-gene panel as a gene signature for discriminating between the moderately and well differentiated breast cancers. Similarly, a 30-gene panel and a 21-gene panel are predicted as gene signatures for distinguishing advanced stage (stages III-IV) from early stage (stages I-II) cancer samples and for distinguishing stage II from stage I samples, respectively. We expect that these gene panels can be used as gene-expression signatures for cancer grade and stage classification. In addition, of the 324 grade-dependent genes, 188 and 66 encode proteins that are predicted to be blood-secretory and urine-excretory, respectively; and of the 227 stage-dependent genes, 123 and 51 encode proteins predicted to be blood-secretory and urine-excretory, respectively. We anticipate that some combinations of these blood and urine proteins could serve as markers for monitoring breast cancer at specific grades and stages through blood and urine tests.
PLoS ONE 09/2015; 10(9):e0138213. DOI:10.1371/journal.pone.0138213 · 3.23 Impact Factor
ABSTRACT: Triterpenoids are multifunctional secondary metabolites in plants, but little information is available on the actual yield, optimal extraction method and pharmacologic activity of triterpenoids from Jatropha curcas leaves (TJL). Hence, response surface methodology (RSM) was used to optimize the extraction parameters. The effects of three independent variables, namely liquid-to-solid ratio, ethanol concentration and extraction time, on TJL yield were investigated. TJL obtained by silica column chromatography was tested against bacterial and fungal species relevant to oral disease and wounds through broth microdilution. Antioxidant activity was assessed using the 2,2-diphenyl-1-picrylhydrazyl and 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) assays.
ABSTRACT: Gastric cancer is one of the most prevalent and aggressive cancers worldwide, and its molecular mechanism remains largely elusive. Here we report the genomic landscape of human primary gastric adenocarcinoma, based on the complete genome sequences of five pairs of cancer and matching normal samples. In total, 103,464 somatic point mutations, including 407 non-synonymous ones, were identified, and the most recurrent mutations were harbored by mucins (MUC3A and MUC12) and transcription factors (ZNF717, ZNF595 and TP53). We detected 679 genomic rearrangements, which affect 355 protein-coding genes, and 76 genes show copy number changes. By mapping the boundaries of the rearranged regions to the folded three-dimensional structure of human chromosomes, we determined that 79.6% of the chromosomal rearrangements happen among DNA fragments in close spatial proximity, especially when the two endpoints are in a similar replication phase. We present evidence that microhomology-mediated break-induced replication was a mechanism behind ~40.9% of the identified genomic changes in gastric tumors. Our data analyses revealed potential integrations of Helicobacter pylori DNA into the gastric cancer genomes. Overall, a large set of novel genomic variations was detected in these gastric cancer genomes, which may be essential to the study of the genetic basis and molecular mechanism of gastric tumorigenesis.
International Journal of Cancer 12/2014; 137(1). DOI:10.1002/ijc.29352 · 5.09 Impact Factor
ABSTRACT: Background
Jatropha curcas is a rich reservoir of pharmaceutically active terpenoids. More than 25 terpenoids have been isolated from this plant, and their activities are anti-bacterial, anti-fungal, anti-cancer, insecticidal, rodenticidal, cytotoxic and molluscicidal. But not much is known about the pathway involved in the biosynthesis of terpenoids. The present investigation describes the cloning, characterization and subcellular localization of isopentenyl diphosphate isomerase (IPI) gene from J. curcas. IPI is one of the rate limiting enzymes in the biosynthesis of terpenoids, catalyzing the crucial interconversion of isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).
A full-length JcIPI cDNA consisting of 1355 bp was cloned. It encoded a protein of 305 amino acids. Analysis of deduced amino acid sequence predicted the presence of conserved active sites, metal binding sites and the NUDIX motif, which were consistent with other IPIs. Phylogenetic analysis indicated a significant evolutionary relatedness with Ricinus communis. Southern blot analysis showed the presence of an IPI multigene family in J. curcas. Comparative expression analysis of tissue specific JcIPI demonstrated the highest transcript level in flowers. Abiotic factors could induce the expression of JcIPI. Subcellular distribution showed that JcIPI was localized in chloroplasts.
This is the first report of cloning and characterization of IPI from J. curcas. Our study will be of significant interest to understanding the regulatory role of IPI in the biosynthesis of terpenoids, although its function still needs further confirmation.
ABSTRACT: A computational analysis of genome-scale transcriptomic data collected on ∼1,700 tissue samples of three cancer types (breast carcinoma, colon adenocarcinoma and lung adenocarcinoma) revealed that each tissue consists of (at least) two major subpopulations of cancer cells with different capabilities to handle fluctuating O2 levels. The two populations have distinct genomic and transcriptomic characteristics, one accelerating its proliferation under hypoxic conditions and the other proliferating faster with higher O2 levels, referred to as the hypoxia and the reoxygenation subpopulations, respectively. The proportions of the two subpopulations within a cancer tissue change as the average O2 level changes. They both contribute to cancer development but in a complementary manner. The hypoxia subpopulation tends to have higher proliferation rates than the reoxygenation one as well as higher apoptosis rates; and it is largely responsible for the acidic environment that enables tissue invasion and provides protection against attacks from T-cells. In comparison, the reoxygenation subpopulation generates new extracellular matrices in support of further growth of the tumor and strengthens cell-cell adhesion to provide scaffolds to keep all the cells connected. This subpopulation also serves as the major source of growth factors for tissue growth. These data and observations strongly suggest that these two major subpopulations within each tumor work together in a conjugative relationship to allow the tumor to overcome stresses associated with the constantly changing O2 level due to repeated growth and angiogenesis. The analysis results not only provide new insights into the population dynamics within a tumor but also have implications for our understanding of possible causes of different cancer phenotypes such as diffuse versus more tightly connected tumor tissues.
ABSTRACT: Pancreatic cancer is the deadliest of all cancers, with the worst outcome and a poor survival rate. Chemotherapy with gemcitabine works well for early-stage cancer, but becomes ineffective for advanced-stage cancer. As such, there is a dire need for new approaches to treat this cancer. The metabolism of tumor cells is very different from that of normal cells. In particular, the differences in amino acid metabolism are gaining increasing attention in cancer biology. Selective amino acid transporters are upregulated in cancer in response to the increased demands for amino acids in tumor cells. Such tumor-selective amino acid transporters are logical druggable targets for cancer therapy: pharmacologic blockade of such upregulated transporters would lead to cell death selectively in tumor cells by depriving the tumor cells of essential nutrients. With this in mind, we analyzed 8 different publicly available microarray datasets in Gene Expression Omnibus for the amino acid transporters that are upregulated
Cancer Research 10/2014; 74(19 Supplement):4340. DOI:10.1158/1538-7445.AM2014-4340 · 9.33 Impact Factor
ABSTRACT: Essential proteins are those that are indispensable to cellular survival and development. Existing methods for essential protein identification generally rely on knock-out experiments and/or the relative density of their interactions (edges) with other proteins in a Protein-Protein Interaction (PPI) network. Here, we present a computational method, called EW, to first rank protein-protein interactions in terms of their Edge Weights, and then identify sub-PPI-networks consisting of only the highly-ranked edges and predict their proteins as essential proteins. We have applied this method to publicly-available PPI data on Saccharomyces cerevisiae (Yeast) and Escherichia coli (E. coli) for essential protein identification, and demonstrated that EW achieves better performance than the state-of-the-art methods in terms of the precision-recall and Jackknife measures. The highly-ranked protein-protein interactions by our prediction tend to be biologically significant in both the Yeast and E. coli PPI networks. Further analyses on systematically perturbed Yeast and E. coli PPI networks through randomly deleting edges demonstrate that the proposed method is robust and the top-ranked edges tend to be more associated with known essential proteins than the lowly-ranked edges.
PLoS ONE 09/2014; 9(9):e108716. DOI:10.1371/journal.pone.0108716 · 3.23 Impact Factor
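The rank-then-select idea in the EW abstract above can be sketched as follows. The abstract does not say how edge weights are computed, so this sketch assumes a stand-in weight (the number of neighbors the two endpoints share); the function `predict_essential`, its `top_k` parameter and the toy network are hypothetical and should not be read as the published method.

```python
from collections import defaultdict


def predict_essential(edges, top_k):
    """Rank PPI edges by a weight and return the proteins on the top_k edges.

    ASSUMPTION: the weight used here is the count of common neighbors of
    the two endpoints. It is only a placeholder for the (unspecified)
    edge weight of the EW method.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    def weight(edge):
        u, v = edge
        return len(neighbors[u] & neighbors[v])

    # Highest-weight edges first; sorted() is stable, so ties keep input order.
    ranked = sorted(edges, key=weight, reverse=True)
    top_edges = ranked[:top_k]

    # Proteins in the sub-network formed by the top-ranked edges.
    return {protein for edge in top_edges for protein in edge}


# Hypothetical toy network: A, B and C form a triangle; D and E hang off it.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
print(sorted(predict_essential(toy_edges, top_k=3)))
```

In the toy network the three triangle edges each have one common neighbor and the two tail edges have none, so the triangle proteins are the ones selected.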
ABSTRACT: The availability of a large number of sequenced bacterial genomes facilitates in-depth studies about why genes (operons) in a bacterial genome are globally organized the way they are. We have previously discovered that (the relative) transcription-activation frequencies among different biological pathways encoded in a genome have a dominating role in the global arrangement of operons. One complicating factor in such a study is that some operons may be involved in multiple pathways with different activation frequencies. A quantitative model has been developed that captures this information, which tends to be minimized by the current global arrangement of operons in a bacterial (and archaeal) genome compared to possible alternative arrangements. A study is carried out here using this model on a collection of 52 closely related E. coli genomes, which revealed interesting new insights about how bacterial genomes evolve to optimally adapt to their environments through adjusting the (relative) genomic locations of the encoding operons of biological pathways once their utilization and hence transcription activation frequencies change, to maintain the above energy-efficiency property. More specifically we observed that it is the frequencies of the transcription activation of pathways relative to those of the other encoded pathways in an organism as well as the variation in the activation frequencies of a specific pathway across the related genomes that play a key role in the observed commonalities and differences in the genomic organizations of genes (and operons) encoding specific pathways across different genomes.
ABSTRACT: Germinal center (GC) B cell-like diffuse large B cell lymphoma (GCB-DLBCL) is a common malignancy yet the signaling pathways deregulated and the factors leading to its systemic dissemination are poorly defined [1,2]. Work in mice showed that sphingosine-1-phosphate receptor-2 (S1PR2), a Gα12 and Gα13 coupled receptor, promotes growth regulation and local confinement of GC B cells [3,4]. Recent GCB-DLBCL deep sequencing studies have revealed mutations in a large number of genes in this cancer, including in GNA13 (encoding Gα13) and S1PR2 [5-7]. Here we show using in vitro and in vivo assays that GCB-DLBCL associated mutations occurring in S1PR2 frequently disrupt the receptor's Akt and migration inhibitory functions. Gα13-deficient mouse GC B cells and human GCB-DLBCL cells were unable to suppress pAkt and migration in response to S1P, and Gα13-deficient mice developed GC B cell-derived lymphoma. GC B cells, unlike most lymphocytes, are tightly confined in lymphoid organs and do not recirculate. Remarkably, deficiency in Gα13, but not S1PR2, led to GC B cell dissemination into lymph and blood. GCB-DLBCL cell lines frequently carried mutations in the Gα13 effector ARHGEF1, and Arhgef1-deficiency also led to GC B cell dissemination. The incomplete phenocopy of Gα13- and S1PR2-deficiency led us to discover that P2RY8, an orphan receptor that is mutated in GCB-DLBCL and another GC B cell-derived malignancy, Burkitt lymphoma (BL), also represses GC B cell growth and promotes confinement via Gα13. These findings identify a Gα13-dependent pathway that exerts dual actions in suppressing growth and blocking dissemination of GC B cells that is frequently disrupted in GC B cell-derived lymphoma.
ABSTRACT: A challenge in phylogenetic inference of gene trees is how to properly sample a large pool of homologous sequences to derive a good representative subset of sequences. Such a need arises in various applications, e.g. when (1) accuracy-oriented phylogenetic reconstruction methods may not be able to deal with a large pool of sequences due to their high demand in computing resources; (2) applications analyzing a collection of gene trees may prefer to use trees with fewer operational taxonomic units (OTUs), for instance for the detection of horizontal gene transfer events by identifying phylogenetic conflicts; and (3) the pool of available sequences is biased towards extensively studied species. In the past, the creation of subsamples often relied on manual selection. Here we present an Automated sequence-Sampling method for improving the Taxonomic diversity of gene phylogenetic trees, AST, to obtain representative sequences that maximize the taxonomic diversity of the sampled sequences. To demonstrate the effectiveness of AST, we have applied it to four problems, namely, inference of the evolutionary histories of the small ribosomal subunit protein S5 of E. coli, 16S ribosomal RNAs and glycosyl-transferase gene family 8, and a study of ancient horizontal gene transfers from bacteria to plants. Our results show that the resolution of our computational results is almost as good as that of manual inference by domain experts, hence making the tool generally useful to phylogenetic studies by non-phylogeny specialists. The program is available at http://csbl.bmb.uga.edu/~zhouchan/AST.php.
PLoS ONE 06/2014; 9(6):e98844. DOI:10.1371/journal.pone.0098844 · 3.23 Impact Factor
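The goal stated in the AST abstract above, picking a subset of sequences that spreads across taxa rather than piling up on well-studied species, can be illustrated with a simple round-robin sampler. This is a sketch under assumptions, not the AST algorithm: it assumes each sequence carries a lineage tuple, groups sequences at one chosen rank, and takes one sequence per group in turn. The names `sample_diverse`, `rank` and the example records are hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def sample_diverse(sequences, n, rank=0):
    """Pick up to n sequence IDs, cycling through taxonomic groups.

    `sequences` is a list of (seq_id, lineage) pairs, where lineage is a
    tuple of taxon names from broad to narrow. `rank` selects which level
    of the lineage defines a group (ASSUMPTION: all lineages are at least
    rank + 1 levels deep).
    """
    groups = defaultdict(list)
    for seq_id, lineage in sequences:
        groups[lineage[rank]].append(seq_id)

    picked = []
    # Each "layer" holds the i-th sequence of every group (None if exhausted),
    # so every group contributes one sequence before any contributes two.
    for layer in zip_longest(*groups.values()):
        for seq_id in layer:
            if seq_id is not None and len(picked) < n:
                picked.append(seq_id)
    return picked


# Hypothetical pool that is biased towards one phylum.
pool = [
    ("s1", ("Proteobacteria", "Escherichia")),
    ("s2", ("Proteobacteria", "Escherichia")),
    ("s3", ("Proteobacteria", "Salmonella")),
    ("s4", ("Firmicutes", "Bacillus")),
    ("s5", ("Actinobacteria", "Streptomyces")),
]
print(sample_diverse(pool, n=3))
```

With three picks, the sample covers all three phyla even though three of the five input sequences come from a single phylum.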
ABSTRACT: The Clostridium genus of bacteria contains the most widely studied biofuel-producing organisms such as Clostridium thermocellum and also some human pathogens, plus a few less characterized strains. Here, we present a comparative genomic analysis of 40 fully sequenced clostridial genomes, paying particular attention to the biomass-degrading ones. Our analysis indicates that some of the Clostridium botulinum strains may have been incorrectly classified in the current taxonomy and hence should be renamed according to the 16S ribosomal RNA (rRNA) phylogeny. A core-genome analysis suggests that only 169 orthologous gene groups are shared by all the strains, and the strain-specific gene pool consists of 22,668 genes, which is consistent with the fact that these bacteria live in very diverse environments and have evolved a very large number of strain-specific genes to adapt to different environments. Across the 40 genomes, 1.4–5.8% of genes fall into the carbohydrate active enzyme (CAZyme) families, and 20 out of the 40 genomes may encode cellulosomes, with each genome having 1 to 76 genes bearing cellulosome-related modules such as dockerins and cohesins. A phylogenetic footprinting analysis identified cis-regulatory motifs that are enriched in the promoters of the CAZyme genes, giving rise to 32 statistically significant motif candidates.
BioEnergy Research 06/2014; 7(4). DOI:10.1007/s12155-014-9486-9 · 3.54 Impact Factor
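The core-genome and strain-specific counts in the Clostridium abstract above rest on a simple set computation once genes have been assigned to orthologous groups. The sketch below assumes that assignment is already done and that each strain is represented by the set of group identifiers it contains; `core_and_specific` and the example identifiers are hypothetical.

```python
from collections import Counter


def core_and_specific(genomes):
    """Return (core groups, strain-specific groups per strain).

    `genomes` maps a strain name to the set of orthologous-group IDs found
    in that strain. The core is the intersection over all strains; a group
    is strain-specific if it occurs in exactly one strain.
    """
    group_sets = [set(groups) for groups in genomes.values()]
    core = set.intersection(*group_sets) if group_sets else set()

    # Number of strains in which each group occurs.
    occurrences = Counter(group for groups in group_sets for group in groups)
    specific = {
        strain: {group for group in set(groups) if occurrences[group] == 1}
        for strain, groups in genomes.items()
    }
    return core, specific


# Hypothetical example with three strains.
example_genomes = {
    "strainA": {"og1", "og2", "og3"},
    "strainB": {"og1", "og2", "og4"},
    "strainC": {"og1", "og5"},
}
core, specific = core_and_specific(example_genomes)
print(sorted(core))
print({strain: sorted(groups) for strain, groups in specific.items()})
```

Groups found in some but not all strains, such as `og2` here, fall into neither category.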