ABSTRACT: BACKGROUND: Somatic mutation-calling based on DNA from matched tumor-normal patient samples is one of the key tasks carried out by many cancer genome projects. One such large-scale project is The Cancer Genome Atlas (TCGA), which is now routinely compiling catalogs of somatic mutations from hundreds of paired tumor-normal DNA exome-sequence data sets. Nonetheless, mutation calling is still very challenging. TCGA benchmark studies revealed that even relatively recent mutation callers from major centers showed substantial discrepancies. Evaluating the mutation callers or understanding the sources of discrepancies is not straightforward, since for most tumor studies, validation data based on independent whole-exome DNA sequencing are not available; only partial validation data exist for a selected (ascertained) subset of sites. RESULTS: To provide guidelines for comparing outputs from multiple callers, we have analyzed two sets of mutation-calling data from the TCGA benchmark studies and their partial validation data. Various aspects of the mutation-calling outputs were explored to characterize the discrepancies in detail. To assess the performances of multiple callers, we introduce four approaches utilizing the external sequence data to varying degrees, ranging from having independent DNA-seq pairs, RNA-seq for tumor samples only, the original exome-seq pairs only, or none of those. CONCLUSIONS: Our analyses provide guidelines for visualizing and understanding the discrepancies among the outputs from multiple callers. Furthermore, applying the four evaluation approaches to the whole exome data, we illustrate the challenges and highlight the various circumstances that require extra caution in assessing the performances of multiple callers.
ABSTRACT: The transcription factor inhibitor of DNA binding (Id)2 modulates T cell fate decisions, but the molecular mechanism underpinning this regulation is unclear. In this study we show that loss of Id2 cripples effector differentiation and instead programs CD8(+) T cells to adopt a memory fate with increased Eomesodermin and Tcf7 expression. We demonstrate that Id2 restrains CD8(+) T cell memory differentiation by inhibiting E2A-mediated direct activation of Tcf7 and that Id2 expression level mirrors T cell memory recall capacity. As a result of the defective effector differentiation, Id2-deficient CD8(+) T cells fail to induce sufficient Tbx21 expression to generate short-lived effector CD8(+) T cells. Our findings reveal that the Id2/E2A axis orchestrates T cell differentiation through the induction or repression of downstream transcription factors essential for effector and memory T cell differentiation.
The Journal of Immunology 03/2013; · 5.52 Impact Factor
ABSTRACT: Gene expression profiling using microarrays and xenograft transplants of human cancer cell lines are both popular tools to investigate human cancer. However, the undefined degree of cross hybridization between the mouse and human genomes hinders the use of microarrays to characterize gene expression of both the host and the cancer cell within the xenograft. Since an increasingly recognized aspect of cancer is the host response (or cancer-stroma interaction), we describe here a bioinformatic manipulation of the Affymetrix profiling that allows interrogation of the gene expression of both the mouse host and the human tumour. Evidence of microenvironmental regulation of epithelial mesenchymal transition of the tumour component in vivo is resolved against a background of mesenchymal gene expression. This tool could allow deeper insight into the mechanism of action of anti-cancer drugs, as novel drug efficacy is typically tested in xenograft systems.
ABSTRACT: Statistical matters form an integral part of a metabolomics experiment. In this chapter we describe several important aspects in the analysis of metabolomics data such as the removal of unwanted variation and the identification of differentially abundant metabolites, along with a number of other essential statistical considerations.
Methods in molecular biology (Clifton, N.J.) 01/2013; 1055:291-307.
ABSTRACT: Dark-grown seedlings exhibit skotomorphogenic development. Genetic and molecular evidence indicates that a quartet of Arabidopsis Phytochrome (phy)-Interacting bHLH Factors (PIF1, 3, 4, and 5) are critically necessary for maintaining this developmental state and that light activation of phy induces a switch to photomorphogenic development by inducing rapid degradation of the PIFs. Here, using integrated ChIP-seq and RNA-seq analyses, we have identified genes that are direct targets of PIF3 transcriptional regulation, exerted by sequence-specific binding to G-box (CACGTG) or PBE-box (CACATG) motifs in the target promoters genome-wide. In addition, expression analysis of selected genes in this set, in all triple pif-mutant combinations, provides evidence that the PIF quartet members collaborate to generate an expression pattern that is the product of a mosaic of differential transcriptional responsiveness of individual genes to the different PIFs and of differential regulatory activity of individual PIFs toward the different genes. Together with prior evidence that all four PIFs can bind to G-boxes, the data suggest that this collective activity may be exerted via shared occupancy of binding sites in target promoters.
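As an illustration of the sequence motifs named in the abstract above, the following is a minimal sketch of scanning a promoter sequence for G-box (CACGTG) and PBE-box (CACATG) sites on both strands. The function names and the example sequence are hypothetical; this is a plain string search, not the ChIP-seq peak or motif analysis used in the study.

```python
import re

# Motif sequences as given in the abstract.
MOTIFS = {"G-box": "CACGTG", "PBE-box": "CACATG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(promoter):
    """Return sorted (position, motif name, strand) hits in a promoter.

    Positions are 0-based offsets on the given (forward) strand.
    A palindromic motif such as the G-box is reported once, as '+'.
    """
    seq = promoter.upper()
    hits = []
    for name, motif in MOTIFS.items():
        patterns = {motif: "+"}
        rc = reverse_complement(motif)
        if rc != motif:
            patterns[rc] = "-"
        for pattern, strand in patterns.items():
            # A lookahead lets re.finditer report overlapping matches.
            for match in re.finditer(f"(?={pattern})", seq):
                hits.append((match.start(), name, strand))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical promoter fragment, for illustration only.
    for hit in find_motifs("ttCACGTGaaCATGTGcc"):
        print(hit)
```

The G-box is its own reverse complement, so only the PBE-box needs a separate minus-strand pattern (CATGTG).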
ABSTRACT: When both parasite species are co-endemic, Plasmodium vivax incidence peaks in younger children compared to P. falciparum. To identify differences in the number of blood stage infections of these species and its potential link to acquisition of immunity, we have estimated the molecular force of blood-stage infection of P. vivax (molFOB, i.e. the number of genetically distinct blood-stage infections over time), and compared it to previously reported values for P. falciparum.
P. vivax molFOB was estimated by high resolution genotyping parasites in samples collected over 16 months in a cohort of 264 Papua New Guinean children living in an area highly endemic for P. falciparum and P. vivax. In this cohort, P. vivax episodes decreased three-fold over the age range of 1-4.5 years.
On average, children acquired 14.0 new P. vivax blood-stage clones/child/year-at-risk. While the incidence of clinical P. vivax illness was strongly associated with molFOB (incidence rate ratio (IRR) = 1.99, 95% confidence interval (CI95) [1.80, 2.19]), molFOB did not change with age. The incidence of P. vivax showed a faster decrease with age in children with high exposure (IRR = 0.49, CI95 [0.38, 0.64], p < 0.001) than in those with low exposure (IRR = 0.63, CI95 [0.43, 0.93], p = 0.02).
P. vivax molFOB is considerably higher than P. falciparum molFOB (5.5 clones/child/year-at-risk). The high number of P. vivax clones that infect children in early childhood contributes to the rapid acquisition of immunity against clinical P. vivax malaria.
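The unit "clones/child/year-at-risk" used above is a rate: newly acquired clones divided by follow-up time at risk. A minimal sketch of that arithmetic follows, assuming per-child clone counts and years at risk are already available. The function name and the input numbers are hypothetical, and the abstract does not describe how time at risk was derived in the study.

```python
def mol_fob(new_clones, years_at_risk):
    """Pooled rate of newly acquired clones per child per year at risk.

    new_clones    -- number of new clones detected for each child
    years_at_risk -- follow-up time at risk (in years) for each child
    """
    if len(new_clones) != len(years_at_risk):
        raise ValueError("one follow-up time is needed per child")
    total_time = sum(years_at_risk)
    if total_time <= 0:
        raise ValueError("total time at risk must be positive")
    return sum(new_clones) / total_time


if __name__ == "__main__":
    # Made-up values for three children, for illustration only.
    print(mol_fob([14, 7, 21], [1.0, 0.5, 1.5]))
```

Pooling events over pooled time, rather than averaging per-child rates, keeps children with short follow-up from dominating the estimate.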
ABSTRACT: mir-17-92, a potent polycistronic oncomir, encodes six mature miRNAs with complex modes of interactions. In the Eμ-myc Burkitt's lymphoma model, mir-17-92 exhibits potent oncogenic activity by repressing c-Myc-induced apoptosis, primarily through its miR-19 components. Surprisingly, mir-17-92 also encodes the miR-92 component that negatively regulates its oncogenic cooperation with c-Myc. This miR-92 effect is, at least in part, mediated by its direct repression of Fbw7, which promotes the proteasomal degradation of c-Myc. Thus, overexpressing miR-92 leads to aberrant c-Myc increase, imposing a strong coupling between excessive proliferation and p53-dependent apoptosis. Interestingly, miR-92 antagonizes the oncogenic miR-19 miRNAs; and such functional interaction coordinates proliferation and apoptosis during c-Myc-induced oncogenesis. This miR-19:miR-92 antagonism is disrupted in B-lymphoma cells that favor a greater increase of miR-19 over miR-92. Altogether, we suggest a new paradigm whereby the unique gene structure of a polycistronic oncomir confers an intricate balance between oncogene and tumor suppressor crosstalk. DOI: http://dx.doi.org/10.7554/eLife.00822.001
ABSTRACT: Metabolomics research often requires the use of multiple analytical platforms, batches of samples, and laboratories, any of which can introduce a component of unwanted variation. In addition, every experiment is subject to within-platform and other experimental variation, which often includes unwanted biological variation. Such variation must be removed in order to focus on the biological information of interest. We present a broadly applicable method for the removal of unwanted variation arising from various sources for the identification of differentially abundant metabolites, and hence, for the systematic integration of data on the same quantities from different sources. We illustrate the versatility and the performance of the approach in four applications, and show that it has several advantages over the existing normalisation methods.
ABSTRACT: Emerging evidence suggests that Argonaute (Ago)/Piwi proteins have diverse functions in the nucleus and cytoplasm, but the molecular mechanisms employed in the nucleus remain poorly defined. The Tetrahymena thermophila Ago/Piwi protein Twi12 is essential for growth and functions in the nucleus. Twi12-bound small RNAs (sRNAs) are 3' tRNA fragments that contain modified bases and thus are attenuated for base pairing to targets. We show that Twi12 assembles an unexpected complex with the nuclear exonuclease Xrn2. Twi12 functions to stabilize and localize Xrn2, as well as to stimulate its exonuclease activity. Twi12 function depends on sRNA binding, which is required for its nuclear import. Depletion of Twi12 or Xrn2 induces a cellular ribosomal RNA processing defect known to result from limiting Xrn2 activity in other organisms. Our findings suggest a role for an Ago/Piwi protein and 3' tRNA fragments in nuclear RNA metabolism.
ABSTRACT: MOTIVATION: Protein signaling networks play a key role in cellular function, and their dysregulation is central to many diseases, including cancer. Shedding light on signaling network topology in specific contexts, such as cancer, requires interrogation of multiple proteins through time and statistical approaches to make inferences regarding network structure. RESULTS: In this study, we use dynamic Bayesian networks to make inferences regarding network structure and thereby generate testable hypotheses. We incorporate existing biology using informative network priors, weighted objectively by an empirical Bayes approach, and exploit a connection between variable selection and network inference to enable exact calculation of posterior probabilities of interest. The approach is computationally efficient and essentially free of user-set tuning parameters. Results on data where the true, underlying network is known place the approach favorably relative to existing approaches. We apply these methods to reverse-phase protein array time-course data from a breast cancer cell line (MDA-MB-468) to predict signaling links that we independently validate using targeted inhibition. The methods proposed offer a general approach by which to elucidate molecular networks specific to biological context, including, but not limited to, human cancers. AVAILABILITY: http://mukherjeelab.nki.nl/DBN (code and data). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT: Developments in microarray and high throughput sequencing (HTS) technologies have resulted in a rapid expansion of research into epigenomic changes that occur in normal development and in the progression of disease, such as cancer. Not surprisingly, copy number variation (CNV) has a direct effect on HTS read densities and can therefore bias differential detection results. We have developed a flexible approach called ABCD-DNA (Affinity Based Copy-number-aware Differential quantitative DNA sequencing analyses) that integrates CNV and other systematic factors directly into the differential enrichment engine.
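To make the bias described above concrete, here is a minimal sketch of why copy number matters for read densities: a region present in four copies is expected to yield about twice the reads of a two-copy region even when the per-copy enrichment is identical. The functions, parameter names, and numbers below are hypothetical illustrations of a generic copy-number scaling; they are not the ABCD-DNA implementation, whose details the abstract does not give.

```python
import math


def cn_adjusted_density(read_count, library_size, copy_number, reference_copies=2):
    """Read density per sequenced read, rescaled to a reference copy number.

    Assumption for illustration: expected reads scale linearly with the
    number of copies of the region present in the sample.
    """
    if library_size <= 0 or copy_number <= 0 or reference_copies <= 0:
        raise ValueError("library size and copy numbers must be positive")
    cn_factor = copy_number / reference_copies
    return read_count / (library_size * cn_factor)


def log_offset(library_size, copy_number, reference_copies=2):
    """Log-scale offset combining sequencing depth and copy number.

    An offset of this form could be supplied to a count-based regression
    so that depth and copy number are absorbed before testing for
    differential enrichment.
    """
    return math.log(library_size) + math.log(copy_number / reference_copies)


if __name__ == "__main__":
    # Made-up counts: an amplified region (4 copies) versus a diploid one.
    amplified = cn_adjusted_density(200, 1_000_000, copy_number=4)
    diploid = cn_adjusted_density(100, 1_000_000, copy_number=2)
    print(math.isclose(amplified, diploid))
```

Without the copy-number factor, the amplified region in this example would look twofold enriched purely because of the amplification.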
ABSTRACT: Glucocorticoids elicit a variety of biological responses in skeletal muscle, including inhibiting protein synthesis and insulin-stimulated glucose uptake and promoting proteolysis. Thus, excess or chronic glucocorticoid exposure leads to muscle atrophy and insulin resistance. Glucocorticoids propagate their signal mainly through glucocorticoid receptors (GR), which, upon binding to ligands, translocate to the nucleus and bind to genomic glucocorticoid response elements to regulate the transcription of nearby genes. Using a combination of chromatin immunoprecipitation sequencing and microarray analysis, we identified 173 genes in mouse C2C12 myotubes that have GR-binding regions in or near them and whose expression is regulated by glucocorticoids. Eight of these genes encode proteins known to regulate distinct signaling events in insulin/insulin-like growth factor 1 pathways. We found that overexpression of p85α, one of these eight genes, caused a decrease in C2C12 myotube diameters, mimicking the effect of glucocorticoids. Moreover, reducing p85α expression by RNA interference in C2C12 myotubes significantly compromised the ability of glucocorticoids to inhibit Akt and p70 S6 kinase activity and reduced glucocorticoid induction of insulin receptor substrate 1 phosphorylation at serine 307. This phosphorylation is associated with insulin resistance. Furthermore, decreasing p85α expression abolished glucocorticoid inhibition of protein synthesis and compromised glucocorticoid-induced reduction of cell diameters in C2C12 myotubes. Finally, a glucocorticoid response element was identified in the p85α GR-binding regions. In summary, our studies identified GR-regulated transcriptional networks in myotubes and showed that p85α plays a critical role in glucocorticoid-induced insulin resistance and muscle atrophy in C2C12 myotubes.
Proceedings of the National Academy of Sciences 06/2012; 109(28):11160-5. · 9.74 Impact Factor
ABSTRACT: Genotyping Plasmodium falciparum parasites in longitudinal studies provides a robust approach to estimating force of infection (FOI) in the presence of superinfections. The molecular parameter (mol)FOI, defined as the number of new P. falciparum clones acquired over time, describes basic malaria epidemiology and is suitable for measuring outcomes of interventions. This study was designed to test whether (mol)FOI influenced the risk of clinical malaria episodes and how far (mol)FOI reflected environmental determinants of transmission, such as seasonality and small-scale geographical variation or effects of insecticide-treated nets (ITNs). Two hundred sixty-four children 1-3 y of age from Papua New Guinea were followed over 16 mo. Individual parasite clones were tracked longitudinally by genotyping. On average, children acquired 5.9 (SD 9.6) new P. falciparum infections per child per y. (mol)FOI showed a pronounced seasonality, was strongly reduced in children using ITNs (incidence rate ratio, 0.49; 95% confidence interval, [0.38, 0.61]), increased with age, and significantly varied within villages (P = 0.001). The acquisition of new parasite clones was the major factor determining the risk of clinical illness (incidence rate ratio, 2.12; 95% confidence interval, [1.93, 2.31]). Adjusting for individual differences in (mol)FOI completely explained spatial variation, age trends, and the effect of ITN use. This study highlights the suitability of (mol)FOI as a measure of individual exposure and its central role in malaria epidemiology. It has substantial advantages over entomological measures in studies of transmission patterns, and could be used in analyses of host variation in susceptibility, in field efficacy trials of novel interventions or vaccines, and for evaluating intervention effects.
Proceedings of the National Academy of Sciences 06/2012; 109(25):10030-5. · 9.74 Impact Factor
ABSTRACT: Sequencing by hybridization to oligonucleotides has evolved into an inexpensive, reliable and fast technology for targeted sequencing. Hundreds of human genes can now be sequenced within a day using a single hybridization to a resequencing microarray. However, several issues inherent to these arrays (e.g. cross-hybridization, variable probe/target affinity) cause sequencing errors and have prevented more widespread applications. We developed an R package for resequencing microarray data analysis that integrates a novel statistical algorithm, sequence robust multi-array analysis (SRMA), for rare variant detection with high sensitivity (false negative rate, FNR 5%) and accuracy (false positive rate, FPR 1×10⁻⁵). The SRMA package consists of five modules for quality control, data normalization, single array analysis, multi-array analysis and output analysis. The entire workflow is efficient and identifies rare DNA single nucleotide variations and structural changes such as gene deletions with high accuracy and sensitivity. AVAILABILITY: http://cran.r-project.org/, http://odin.mdacc.tmc.edu/~wwang7/SRMAIndex.html
ABSTRACT: Suppressors of cytokine signaling (SOCS) proteins function as negative regulators of cytokine signaling and are involved in fine tuning the immune response. The structure and role of the SH2 domains and C-terminal SOCS box motifs of the SOCS proteins are well characterized, but the long N-terminal domains of SOCS4-7 remain poorly understood. Here, we present bioinformatic analyses of the N-terminal domains of the mammalian SOCS proteins, which indicate that these domains of SOCS4, 5, 6, and 7 are largely disordered. We have also identified a conserved region of about 70 residues in the N-terminal domains of SOCS4 and 5 that is predicted to be more ordered than the surrounding sequence. The conservation of this region can be traced as far back as lower vertebrates. As conserved regions with increased structural propensity that are located within long disordered regions often contain molecular recognition motifs, we expressed the N-terminal conserved region of mouse SOCS4 for further analysis. This region, mSOCS4₈₆₋₁₅₅, has been characterized by circular dichroism and nuclear magnetic resonance spectroscopy, both of which indicate that it is predominantly unstructured in aqueous solution, although it becomes helical in the presence of trifluoroethanol. The high degree of sequence conservation of this region across different species and between SOCS4 and SOCS5 nonetheless implies that it has an important functional role, and presumably this region adopts a more ordered conformation in complex with its partners. The recombinant protein will be a valuable tool in identifying these partners and defining the structures of these complexes.
Proteins Structure Function and Bioinformatics 03/2012; 80(3):946-57. · 3.34 Impact Factor
ABSTRACT: Breast cancers are comprised of molecularly distinct subtypes that may respond differently to pathway-targeted therapies now under development. Collections of breast cancer cell lines mirror many of the molecular subtypes and pathways found in tumors, suggesting that treatment of cell lines with candidate therapeutic compounds can guide identification of associations between molecular subtypes, pathways, and drug response. In a test of 77 therapeutic compounds, nearly all drugs showed differential responses across these cell lines, and approximately one third showed subtype-, pathway-, and/or genomic aberration-specific responses. These observations suggest mechanisms of response and resistance and may inform efforts to develop molecular assays that predict clinical response.
Proceedings of the National Academy of Sciences 02/2012; 109(8):2724-9. · 9.74 Impact Factor