SIEGE: Smoking Induced Epithelial Gene Expression Database

Bioinformatics Program, College of Engineering, Boston University, 44 Cummington Street, Boston, MA 02215, USA.
Nucleic Acids Research (Impact Factor: 8.81). 02/2005; 33(Database issue):D573-9. DOI: 10.1093/nar/gki035
Source: PubMed

ABSTRACT The SIEGE (Smoking Induced Epithelial Gene Expression) database is a clinical resource for compiling and analyzing gene expression data from epithelial cells of the human intra-thoracic airway. This database supports a translational research study whose goal is to profile the changes in airway gene expression that are induced by cigarette smoke. RNA is isolated from airway epithelium obtained at bronchoscopy from current-, former- and never-smoker subjects, and hybridized to Affymetrix HG-U133A Genechips, which measure the level of expression of approximately 22,500 human transcripts. The microarray data generated, along with relevant patient information, are uploaded to SIEGE by study administrators using the database's web interface. PERL-coded scripts integrated with SIEGE perform various quality control functions including the processing, filtering and formatting of stored data. The R statistical package is used to import database expression values and execute a number of statistical analyses including t-tests, correlation coefficients and hierarchical clustering. Values from all statistical analyses can be queried through CGI-based tools and web forms found on the 'Search' section of the database website. Query results are embedded with graphical capabilities as well as with links to other databases containing valuable gene resources, including Entrez Gene, GO, Biocarta, GeneCards, dbSNP and the NCBI Map Viewer.
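
The abstract names t-tests among the analyses run on stored expression values. As a rough illustration of that step only, the sketch below compares two smoking-status groups per probe set. It is written in Python, whereas the abstract states that SIEGE uses R, and it is not SIEGE's code. The probe names, the expression values and the choice of Welch's unequal-variance variant are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    # Squared standard error of each group mean (sample variance / n)
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical log2 expression values, invented for illustration only
expression = {
    "probe_A": {"current": [9.1, 9.4, 9.0, 9.6], "never": [7.2, 7.5, 7.1, 7.4]},
    "probe_B": {"current": [5.0, 5.2, 4.9, 5.1], "never": [5.1, 5.0, 5.2, 4.9]},
}


def rank_probes(data):
    """Rank probe sets by |t| for current versus never smokers."""
    results = []
    for probe, groups in data.items():
        t, df = welch_t(groups["current"], groups["never"])
        results.append((probe, t, df))
    return sorted(results, key=lambda r: abs(r[1]), reverse=True)


for probe, t, df in rank_probes(expression):
    print(f"{probe}\tt={t:.2f}\tdf={df:.1f}")
```

Across roughly 22,500 transcripts, a ranking like this would normally be followed by a multiple-testing correction, which the sketch leaves out.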

  • Source
    ABSTRACT: Healthy volunteers (n=50) were enrolled for studying the variation of gene expression induced by smoking in peripheral lymphocytes. RNAs from smokers (>3 cigarettes/day, n=20) and passive smokers (exposed to tobacco smoke >3 h/day, n=10) were hybridized versus a reference pool obtained by mixing equal amounts of RNA from 20 nonsmokers, and gene expression was analyzed using DNA microarrays containing 13,971 oligos. Principal component analysis showed that 99.7% of gene expression variability was related to plasma cotinine, age, and DNA oxidation damage. SAM and GenMAPP/MAPPFinder analyses showed that smokers, compared to nonsmokers, had 129 down-regulated and 87 up-regulated genes, whereas passive smokers, compared to nonsmokers, had 44 down-regulated and 159 up-regulated genes, mainly involved in pathways associated with the activation of defensive responses. Hierarchical cluster analysis identified two distinct clusters of smokers, characterized by different oxidative DNA damage: smokers with high DNA oxidation damage, compared to smokers with low DNA oxidation damage, had a large number (150) of down-regulated genes, mainly associated with xenobiotic metabolism, DNA damage and repair, inflammatory responses, lymphocyte activation, and cytokine activity, suggesting a reduced cellular response to toxic agents in this subset of smokers that could lead to an increased DNA oxidation damage.
    Free Radical Biology and Medicine 08/2007; 43(3):415-22. DOI:10.1016/j.freeradbiomed.2007.04.018 · 5.71 Impact Factor
  • Source
    ABSTRACT: Whole transcriptome shotgun sequencing (RNA-Seq) is a useful tool for analyzing the transcriptome of a biological sample. With appropriate statistical and bioinformatic processing, this platform is capable of identifying significant differences in gene expression within the transcriptome and permits pathway and network analyses to determine how these genes interact biologically. In this study, we examined gene expression in two lung adenocarcinoma cell lines (H358 and A549) that were treated with transforming growth factor-β (TGF-β) as a model for induction of the epithelial-to-mesenchymal transition (EMT), commonly associated with disease progression. We performed this study in order to illustrate a workflow for identifying interesting genes and processes that are regulated early in EMT and to determine their gene pathway/network relationships and regulation. With this, we identified 137 upregulated and 32 downregulated genes common to both cell lines after TGF-β treatment that represent components of multiple canonical pathways and biological networks associated with the induction of EMT. These findings were also verified against reposited Affymetrix U133a expression profiles from multiple trials examining metastatic progression in patient cohorts (n = 731 total) to further establish the clinical relevance and translational significance of the model system. Together, these findings help validate the relevance of the TGF-β model for the study of EMT and provide new insights into early events in EMT.
    Cancer informatics 01/2014; 13(Suppl 5):129-40. DOI:10.4137/CIN.S14073
  • Source
    ABSTRACT: The retina is a multi-layered sensory tissue that lines the back of the eye and acts at the interface of input light and visual perception. Its main function is to capture photons and convert them into electrical impulses that travel along the optic nerve to the brain where they are turned into images. It consists of neurons, nourishing blood vessels and different cell types, of which neural cells predominate. Defects in any of these cells can lead to a variety of retinal diseases, including age-related macular degeneration, retinitis pigmentosa, Leber congenital amaurosis and glaucoma. Recent progress in genomics and microarray technology provides extensive opportunities to examine alterations in retinal gene expression profiles during development and diseases. However, there is no specific database that deals with retinal gene expression profiling. In this context we have built RETINOBASE, a dedicated microarray database for retina. RETINOBASE is a microarray relational database, analysis and visualization system that allows simple yet powerful queries to retrieve information about gene expression in retina. It provides access to gene expression meta-data and offers significant insights into gene networks in retina, resulting in better hypothesis framing for biological problems that can subsequently be tested in the laboratory. Public and proprietary data are automatically analyzed with 3 distinct methods, RMA, dChip and MAS5, then clustered using 2 different K-means and 1 mixture models method. Thus, RETINOBASE provides a framework to compare these methods and to optimize the retinal data analysis. RETINOBASE has three different modules, "Gene Information", "Raw Data System Analysis" and "Fold change system Analysis" that are interconnected in a relational schema, allowing efficient retrieval and cross comparison of data. 
    Currently, RETINOBASE contains datasets from 28 different microarray experiments performed in 5 different model systems: drosophila, zebrafish, rat, mouse and human. The database is supported by a platform that is designed to easily integrate new functionalities and is also frequently updated. The results obtained from various biological scenarios can be visualized, compared and downloaded. The results of a case study are presented that highlight the utility of RETINOBASE. Overall, RETINOBASE provides efficient access to the global expression profiling of retinal genes from different organisms under various conditions.
    BMC Genomics 02/2008; 9:208. DOI:10.1186/1471-2164-9-208 · 4.04 Impact Factor
