Using GOstats to test gene lists for GO term association

Fred Hutchinson Cancer Research Center, Program in Computational Biology, 1100 Fairview Avenue North, P.O. Box 19024, Seattle, WA 98109, USA.
Bioinformatics (Impact Factor: 4.62). 02/2007; 23(2):257-8. DOI: 10.1093/bioinformatics/btl567
Source: PubMed

ABSTRACT Functional analyses based on the association of Gene Ontology (GO) terms to genes in a selected gene list are useful bioinformatic tools, and the GOstats package has been widely used to perform such computations. In this paper we report significant improvements and extensions, such as support for conditional testing.
We discuss the capabilities of GOstats, a Bioconductor package written in R that allows users to test GO terms for over- or under-representation using either a classical hypergeometric test or a conditional hypergeometric test that uses the relationships among GO terms to decorrelate the results.
GOstats is available as an R package from the Bioconductor project:

  • Source
    ABSTRACT: Protein phosphorylation plays a central role in creating a highly dynamic network of interacting proteins that reads and responds to signals from growth factors in the cellular microenvironment. Cells of the neural crest employ multiple signaling mechanisms to control migration and differentiation during development. It is known that defects in these mechanisms cause neuroblastoma, but how multiple signaling pathways interact to govern cell behavior is unknown. In a phosphoproteomic study of neuroblastoma cell lines and cell fractions, including endosomes and detergent-resistant membranes, 1622 phosphorylated proteins were detected, including more than half of the receptor tyrosine kinases in the human genome. Data were analyzed using a combination of graph theory and pattern recognition techniques that resolve data structure into networks that incorporate statistical relationships and protein-protein interaction data. Clusters of proteins in these networks are indicative of functional signaling pathways. The analysis indicates that receptor tyrosine kinases are functionally compartmentalized into distinct collaborative groups distinguished by activation and intracellular localization of SRC-family kinases, especially FYN and LYN. Changes in intracellular localization of activated FYN and LYN were observed in response to stimulation of the receptor tyrosine kinases, ALK and KIT. The results suggest a mechanism to distinguish signaling responses to activation of different receptors, or combinations of receptors, that govern the behavior of the neural crest, which gives rise to neuroblastoma.
    PLoS Computational Biology 04/2015; 11(4):e1004130. DOI:10.1371/journal.pcbi.1004130 · 4.83 Impact Factor
  • Source
    ABSTRACT: Background: Polycystic ovarian syndrome (PCOS) is a spectrum of heterogeneous disorders of reproduction and metabolism in women with potential systemic sequelae such as diabetes and obesity. Although PCOS is believed to be caused by genetic abnormalities, the genetic background that can be associated with PCOS phenotypes remains unclear due to the complexity of the trait. In this study, we used a rat model which exhibits reproductive and metabolic abnormalities similar to human PCOS to unravel the molecular mechanisms underlying this complex syndrome. Methods: Female Sprague–Dawley rats were randomly assigned to DHT and control (CTL) groups. Rats in the DHT group were implanted with a silicone capsule continuously releasing 83 μg 5α-dihydrotestosterone (DHT) per day for 12 weeks to mimic the hyperandrogenic state in women with PCOS. The animals were euthanized at 15 weeks of age, the pairs of ovaries were excised, and the ovarian cortex tissues were used for gene expression analysis. Total RNA from the ovarian cortex was amplified, labeled and hybridized to the Affymetrix GeneChip® Rat Genome 230 2.0 Array. A linear model system for microarray data analysis was used to identify genes affected in DHT-treated rat ovaries, and the molecular pathways of those genes were analyzed using the Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis tool. Results: A total of 573 gene transcripts, including CPA1, CDH1, INSL3, AMH, ALDH1B1, INHBA, CYP17A1, RBP4, GAS6, GAS7 and GATA4, were activated, while 430 others, including HSD17B7, HSD3B6, STAR, HMGCS1, HMGCR, CYP51, CYP11A1 and CYP19A1, were repressed in DHT-treated ovaries. Functional annotation of the dysregulated genes revealed biosynthesis and metabolism of steroids, cholesterol and lipids to be the top functions enriched among the repressed genes. However, cell differentiation/proliferation, transcriptional regulation, neurogenesis, cell adhesion and blood vessel development processes were enriched among the activated genes. Conclusion: The dysregulation of genes associated with biosynthesis and metabolism of steroids, cholesterol and lipids, and with cell differentiation/proliferation, in DHT-treated ovaries could be a molecular clue for abnormal steroidogenesis, estrous cycle irregularity, abnormal folliculogenesis, anovulation and lipid metabolism in PCOS patients.
    Journal of Ovarian Research 04/2015; 8. DOI:10.1186/s13048-015-0151-5 · 2.03 Impact Factor
  • Source
    ABSTRACT: Patients suffering from cancer are often treated with a range of chemotherapeutic agents, but the treatment efficacy varies greatly between patients. Based on the recent popularisation of regularised regression models, the goal of this study was to establish workflows for pharmacogenomic predictors of response to standard multidrug regimens using baseline gene expression data and origin-specific cell lines. The proposed workflows are tested on diffuse large B-cell lymphoma treated with R-CHOP first-line therapy. First, B-cell cancer cell lines were tested successively for resistance towards the chemotherapeutic components of R-CHOP: cyclophosphamide (C), doxorubicin (H), and vincristine (O). Second, baseline gene expression data were obtained for each cell line before treatment. Third, regularised multivariate regression models with cross-validated tuning parameters were used to generate classifier- and predictor-based resistance gene signatures (REGS) for the combination and the individual chemotherapeutic drugs C, H, and O. Fourth, each developed REGS was used to assign resistance levels to individual patients in three clinical cohorts. Both classifier- and predictor-based REGS for the combination CHO were of prognostic value. For patients classified as resistant towards CHO, the risk of progression was 2.33 (95% CI: 1.6, 3.3) times greater than for those classified as sensitive. Similarly, an increase in the predicted CHO resistance index of 10 was related to a 22% (9%, 36%) increased risk of progression. Furthermore, the REGS classifier performed significantly better than the REGS predictor. The regularised multivariate regression models provide a flexible workflow for drug resistance studies with promising potential. However, the gene expressions defining the REGSs should be functionally validated and correlated to known biomarkers to improve understanding of the molecular mechanisms of drug resistance.
    BMC Cancer 12/2015; 15(1):235. DOI:10.1186/s12885-015-1237-6 · 3.32 Impact Factor


Seth Falcon