ABSTRACT: The signaling cascade of the transcription factor vitamin D receptor (VDR) is triggered by its specific ligand 1α,25-dihydroxyvitamin D3 (1α,25(OH)2D3). In this study we demonstrate that in THP-1 human monocytic leukemia cells 87.4% of the 1,034 most prominent genome-wide VDR binding sites co-localize with loci of open chromatin. At 165 of them 1α,25(OH)2D3 strongly increases chromatin accessibility, and at a further 217 sites it has weaker effects. Interestingly, VDR binding sites in 1α,25(OH)2D3-responsive chromatin regions are far more often composed of direct repeats with 3 intervening nucleotides (DR3s) than those in ligand-insensitive regions. DR3-containing VDR sites are enriched in the neighborhood of genes that are involved in controlling cellular growth, while non-DR3 VDR binding is often found close to genes related to immunity. Using six early VDR target genes as examples, we show that the slope of their 1α,25(OH)2D3-induced transcription correlates with the basal chromatin accessibility of their major VDR binding regions. However, the chromatin loci controlling these genes are indistinguishable in their VDR association kinetics. Taken together, ligand-responsive chromatin loci represent dynamically regulated contact points of VDR with the genome, from where it controls early 1α,25(OH)2D3 target genes.
Biochimica et Biophysica Acta 10/2013; · 4.66 Impact Factor
ABSTRACT: The basic helix-loop-helix protein BHLHE40 functions as a transcriptional repressor and is involved in the control of cellular growth, development and circadian rhythms. Using genome-wide data on vitamin D receptor (VDR) location, open chromatin and histone modification, backed up by gene-specific mRNA expression studies, we show that the human BHLHE40 gene is dynamically up-regulated by the VDR ligand 1α,25-dihydroxyvitamin D3 (1α,25(OH)2D3) and down-regulated by the histone deacetylase inhibitor trichostatin A. The VDR binding site is located 1.7 kb upstream of the transcription start site of the BHLHE40 gene, and the chromatin at this genomic site is significantly opened by treatment with 1α,25(OH)2D3. The staircase-style fluctuations in BHLHE40 mRNA accumulation relate to the short half-life of the gene's mRNA of 0.9 h. The identification of the widely expressed BHLHE40 gene as a primary VDR target may explain secondary effects of 1α,25(OH)2D3 on BHLHE40-responding genes.
The Journal of steroid biochemistry and molecular biology 07/2013; 136:62-67. · 2.66 Impact Factor
ABSTRACT: The liver X receptors (LXRs) are oxysterol sensing nuclear receptors with multiple effects on metabolism and immune cells. However, the complete genome-wide cistrome of LXR in cells of human origin has not yet been provided.
We performed ChIP-seq in phorbol myristate acetate-differentiated THP-1 cells (macrophage-type) after stimulation with the potent synthetic LXR ligand T0901317 (T09). Microarray gene expression analysis was performed in the same cellular model. We identified 1357 genome-wide LXR locations (FDR < 1%), of which 526 were observed after T09 treatment. De novo analysis of LXR binding sequences identified a DR4-type element as the major motif. On mRNA level T09 up-regulated 1258 genes and repressed 455 genes. Our results show that LXR actions are focused on 112 genomic regions that contain up to 11 T09 target genes per region under the control of highly stringent LXR binding sites with individual constellations for each region. We could confirm that LXR controls lipid metabolism and transport and observed a strong association with apoptosis-related functions.
This first report on genome-wide binding of LXR in a human cell line provides new insights into the transcriptional network of LXR and its target genes with their link to physiological processes, such as apoptosis. The gene expression microarray and sequence data have been submitted collectively to the NCBI Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession number GSE28319.
ABSTRACT: A global understanding of the actions of the nuclear hormone 1α,25-dihydroxyvitamin D3 (1α,25(OH)2D3) and its vitamin D receptor (VDR) requires a genome-wide analysis of VDR binding sites. In THP-1 human monocytic leukemia cells we identified by ChIP-seq 2340 VDR binding locations, of which 1171 and 520 occurred uniquely with and without 1α,25(OH)2D3 treatment, respectively, while 649 were common. De novo identified direct repeat spaced by 3 nucleotides (DR3)-type response elements (REs) were strongly associated with the ligand-responsiveness of VDR occupation. Only 20% of the VDR peaks diminishing most after ligand treatment have a DR3-type RE, in contrast to 90% for the most growing peaks. Ligand treatment revealed 638 1α,25(OH)2D3 target genes enriched in gene ontology categories associated with immunity and signaling. From the 408 upregulated genes, 72% showed VDR binding within 400 kb of their transcription start sites (TSSs), while this applied only for 43% of the 230 downregulated genes. The VDR loci showed considerable variation in gene regulatory scenarios ranging from a single VDR location near the target gene TSS to very complex clusters of multiple VDR locations and target genes. In conclusion, ligand binding shifts the locations of VDR occupation to DR3-type REs that surround its target genes and occur in a large variety of regulatory constellations.
Nucleic Acids Research 08/2011; 39(21):9181-93. · 8.28 Impact Factor
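The counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the variable names are invented here, the numbers are copied from the abstract, and the gene counts derived from the percentages are approximations because the percentages are presumably rounded.

```python
# Counts as quoted in the abstract (not recomputed from the ChIP-seq data).
with_ligand_only = 1171
without_ligand_only = 520
common = 649
total_sites = with_ligand_only + without_ligand_only + common

upregulated = 408
downregulated = 230
total_targets = upregulated + downregulated

# Approximate numbers of genes with VDR binding within 400 kb of the TSS,
# implied by the quoted percentages (72% and 43%).
up_with_vdr = round(0.72 * upregulated)
down_with_vdr = round(0.43 * downregulated)

print(total_sites, total_targets, up_with_vdr, down_with_vdr)
```

The three site categories add up to the reported 2340 locations, and the up- and downregulated genes add up to the reported 638 target genes.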
ABSTRACT: A major challenge in genomic research is identifying significant biological processes and generating new hypotheses from large gene sets. Gene sets often consist of multiple separate biological pathways, controlled by distinct regulatory mechanisms. Many of these pathways and the associated regulatory mechanisms might be obscured by a large number of other significant processes and thus not identified as significant by standard gene set enrichment analysis tools.
We present a novel method called Independent Enrichment Analysis (IEA) and the software TAFFEL, which eases the task by clustering genes into subgroups using Gene Ontology categories and transcription regulators. IEA indicates transcriptional regulators putatively controlling biological functions in the studied condition.
We demonstrate that the developed method and the TAFFEL tool give new insight into the analysis of differentially expressed genes and can generate novel hypotheses. Our comparison to other popular methods showed that the IEA method implemented in TAFFEL can find important biological phenomena that are not reported by other methods.
ABSTRACT: Small ubiquitin-related modifiers (SUMOs) are important regulator proteins. Caenorhabditis elegans contains a single SUMO ortholog, SMO-1, necessary for the reproduction of C. elegans. In this study, we constructed transgenic C. elegans strains expressing human SUMO-1 under the control of pan-neuronal (aex-3) or pan-muscular (myo-4) promoter and SUMO-2 under the control of myo-4 promoter. Interestingly, muscular overexpression of SUMO-1 or -2 resulted in morphological changes of the posterior part of the nematode. Movement, reproduction and aging of C. elegans were perturbed by the overexpression of SUMO-1 or -2. Genome-wide expression analyses revealed that several genes encoding components of SUMOylation pathway and ubiquitin-proteasome system were upregulated in SUMO-overexpressing nematodes. Since muscular overexpression of SMO-1 also brought up reproductive and mobility perturbations, our results imply that the phenotypes were largely due to an excess of SUMO, suggesting that a tight control of SUMO levels is important for the normal development of multicellular organisms.
Cellular and Molecular Life Sciences CMLS 01/2011; 68(19):3219-32. · 5.62 Impact Factor
ABSTRACT: Segmentation aims to separate homogeneous areas from the sequential data, and plays a central role in data mining. It has applications ranging from finance to molecular biology, where bioinformatics tasks such as genome data analysis are active application fields. In this paper, we present a novel application of segmentation in locating genomic regions with coexpressed genes. We aim at automated discovery of such regions without requirement for user-given parameters. In order to perform the segmentation within a reasonable time, we use heuristics. Most of the heuristic segmentation algorithms require some decision on the number of segments. This is usually accomplished by using asymptotic model selection methods like the Bayesian information criterion. Such methods are based on some simplification, which can limit their usage. In this paper, we propose a Bayesian model selection to choose the most proper result from heuristic segmentation. Our Bayesian model presents a simple prior for the segmentation solutions with various segment numbers and a modified Dirichlet prior for modeling multinomial data. We show with various artificial data sets in our benchmark system that our model selection criterion has the best overall performance. The application of our method in yeast cell-cycle gene expression data reveals potential active and passive regions of the genome.
IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM 01/2010; 7(1):37-49. · 2.25 Impact Factor
ABSTRACT: The analysis of over-represented functional classes in a list of genes is one of the most essential bioinformatics research topics. Typical examples of such lists are the differentially expressed genes from transcriptional analysis, which need to be linked to functional information represented in the Gene Ontology (GO). Despite the importance of this procedure, there is little work on consistent evaluation of various GO analysis methods. In particular, there is no literature on creating benchmark datasets for GO analysis tools.
We propose a methodology for the evaluation of GO analysis tools, which consists of creating gene lists with a selected signal level and a selected number of independent over-represented classes. The methodology starts with a real-life GO data matrix, and therefore the generated datasets have similar features to real positive datasets. The user can select the signal level for over-representation, the number of independent positive classes in the dataset, and the size of the final gene list. We present the use of the effective number and various normalizations while embedding the signal into a selected class or classes, and the use of binary correlation to ensure that the selected signal classes are independent of each other. The usefulness of the generated datasets is demonstrated by comparing different GO class ranking and GO clustering methods.
The presented methods aid the development and evaluation of GO analysis methods as they enable thorough testing with different signal types and different signal levels. As an example, our comparisons reveal clear differences between compared GO clustering and GO de-correlation methods. The implementation is coded in Matlab and is freely available at the dedicated website http://ekhidna.biocenter.helsinki.fi/users/petri/public/POSGODA/POSGODA.html.
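For readers unfamiliar with over-representation analysis, the following minimal sketch shows one common way to score a functional class in a gene list, a one-sided hypergeometric test. The abstract above does not state which statistic the evaluated tools use, so the choice of test, the function name `hypergeom_upper_tail` and the toy numbers are assumptions made purely for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the class,
    n: genes in the list, k: list genes annotated to the class.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


# Hypothetical toy example: 20 background genes, 5 in the class,
# a list of 4 genes of which 2 belong to the class.
p_value = hypergeom_upper_tail(k=2, n=4, K=5, N=20)
print(p_value)
```

A small p-value would suggest that the class appears in the list more often than expected by chance; in practice the result would also need correction for testing many classes.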
ABSTRACT: The interactions and functions of protein inhibitors of activated STAT (PIAS) proteins are not restricted to the signal transducers and activators of transcription (STATs), but PIAS1, -2, -3 and -4 interact with and regulate a variety of distinct proteins, especially transcription factors. Although the majority of PIAS-interacting proteins are prone to modification by small ubiquitin-related modifier (SUMO) proteins and the PIAS proteins have the capacity to promote the modification as RING-type SUMO ligases, they do not function solely as SUMO E3 ligases. Instead, their effects are often independent of their Siz/PIAS (SP)-RING finger, but dependent on their capability to noncovalently interact with SUMOs or DNA through their SUMO-interacting motif and scaffold attachment factor-A/B, acinus and PIAS domain, respectively. Here, we present an overview of the cellular regulation by PIAS proteins and propose that many of their functions are due to their capability to mediate and facilitate SUMO-linked protein assemblies.
Cellular and Molecular Life Sciences CMLS 07/2009; 66(18):3029-41. · 5.62 Impact Factor
ABSTRACT: One essential step in the massive analysis of transcriptomic profiles is the calculation of the correlation coefficient, a value used to select pairs of genes with similar or inverse transcriptional profiles across a large fraction of the biological conditions examined. Until now, the choice between the two available methods for calculating the coefficient has been dictated mainly by technological considerations. Specifically, in analyses based on double-channel techniques, researchers have been required to use covariation correlation, i.e. the correlation between gene expression changes measured between several pairs of biological conditions, expressed for example as fold-change. In contrast, in analyses of single-channel techniques scientists have been restricted to the use of coexpression correlation, i.e. correlation between gene expression levels. To our knowledge, nobody has ever examined the possible benefits of using covariation instead of coexpression in massive analyses of single channel microarray results.
We describe here how single-channel techniques can be treated like double-channel techniques and used to generate both gene expression changes and covariation measures. We also present a new method that allows the calculation of both positive and negative correlation coefficients between genes. First, we perform systematic comparisons between two given biological conditions and classify, for each comparison, genes as increased (I), decreased (D), or not changed (N). As a result, the original series of n gene expression level measures assigned to each gene is replaced by an ordered string of n(n-1)/2 symbols, e.g. IDDNNIDID....DNNNNNNID, with the length of the string corresponding to the number of comparisons. In a second step, positive and negative covariation matrices (CVM) are constructed by calculating statistically significant positive or negative correlation scores for any pair of genes by comparing their strings of symbols.
This new method, applied to four different large data sets, has allowed us to construct distinct covariation matrices with similar properties. We have also developed a technique to translate these covariation networks into graphical 3D representations and found that the local assignation of the probe sets was conserved across the four chip set models used which encompass three different species (humans, mice, and rats). The application of adapted clustering methods succeeded in delineating six conserved functional regions that we characterized using Gene Ontology information.
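The symbol-string encoding described in the abstract above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the fixed difference `threshold` used to call a change, the function names and the toy expression values are all assumptions, and the statistical significance scoring mentioned in the abstract is replaced here by plain counts of concordant and discordant comparisons.

```python
from itertools import combinations


def change_string(levels, threshold=1.0):
    """Encode every pairwise comparison between conditions as I, D or N."""
    symbols = []
    for a, b in combinations(range(len(levels)), 2):
        diff = levels[b] - levels[a]
        if diff > threshold:
            symbols.append("I")  # increased
        elif diff < -threshold:
            symbols.append("D")  # decreased
        else:
            symbols.append("N")  # not changed
    return "".join(symbols)


def covariation_counts(s1, s2):
    """Count comparisons where two genes change in the same or opposite direction."""
    same = sum(1 for x, y in zip(s1, s2) if x == y and x != "N")
    opposite = sum(1 for x, y in zip(s1, s2) if {x, y} == {"I", "D"})
    return same, opposite


# Hypothetical expression levels for two genes across n = 4 conditions.
gene_a = [1.0, 3.0, 3.2, 0.5]
gene_b = [4.0, 1.0, 1.5, 4.2]

string_a = change_string(gene_a)
string_b = change_string(gene_b)
counts = covariation_counts(string_a, string_b)
print(string_a, string_b, counts)
```

With n = 4 conditions each string has n(n-1)/2 = 6 symbols, matching the string length given in the abstract; in this toy case the two genes change in opposite directions in every informative comparison, which would feed a negative covariation score.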
ABSTRACT: We present POXO, a comprehensive tool series to discover transcription factor binding sites from co-expressed genes (www.bioinfo.biocenter.helsinki.fi/poxo). POXO manages tasks such as functional evaluation and grouping of genes, sequence retrieval, pattern discovery and pattern verification. It also allows users to tailor analytical pipelines from these tools with single mouse clicks. One typical pipeline of POXO begins by examining the biological functions that a set of co-expressed genes is involved in. In this examination, the functional coherence of the gene set is evaluated and representative functions are associated with the gene set. This examination can also be used to group genes into functionally similar subsets, if several biological processes are affected in the experiment. The next step in the pipeline is to discover over-represented nucleotide patterns in the upstream sequences of the selected gene sets. This makes it possible to investigate whether the genes are co-regulated by common cis-elements. If over-represented patterns are found, similar ones can then be clustered together and verified. The performance of POXO is demonstrated by analysing expression data from pathogen-treated Arabidopsis thaliana. In this example, POXO detected activated gene sets and suggested transcription factors responsible for their regulation.
Nucleic Acids Research 08/2006; 34(Web Server issue):W534-40. · 8.28 Impact Factor
ABSTRACT: Alpha-synuclein-containing cellular inclusions are a hallmark of Parkinson Disease, Lewy Body Dementia, and Multiple System Atrophy. A genome-wide expression screen was performed in C. elegans overexpressing both wild-type and A53T human alpha-synuclein. 433 genes were up- and 67 genes down-regulated by statistical and fold change (> or <2) criteria. Gene ontology (GO) categories within the regulated gene lists indicated over-representation of development and reproduction, mitochondria, catalytic activity, and histone groups. Seven genes (pdr-1, ubc-7, pas-5, pas-7, pbs-4, RPT2, PSMD9) with function in the ubiquitin-proteasome system and 35 mitochondrial function genes were up-regulated. Nine genes that form histones H1, H2B, and H4 were down-regulated. These results demonstrate the effects of alpha-synuclein on proteasome and mitochondrial complex gene expression and provide further support for the role of these complexes in mediating neurotoxicity. The results also indicate an effect on nuclear protein genes that suggests a potential new avenue for investigation.
Neurobiology of Disease 07/2006; 22(3):477-86. · 5.62 Impact Factor
ABSTRACT: High-throughput methods of the genome era produce vast amounts of data in the form of gene lists. These lists are large and difficult to interpret without advanced computational or bioinformatic tools. Most existing methods analyse a gene list as a single entity although it is comprised of multiple gene groups associated with separate biological functions. Therefore it is imperative to define and visualize gene groups with unique functionality within gene lists.
In order to analyse the functional heterogeneity within a gene list, we have developed a method that clusters genes into groups with homogeneous functionalities. The method uses Non-negative Matrix Factorization (NMF) to create several clustering results with varying numbers of clusters. The obtained clustering results are combined into a simple graphical presentation showing the functional groups over-represented in the analyzed gene list. We demonstrate its performance on two data sets and show results that improve upon existing methods. The comparison also shows that our method creates a more simplified view that aids in discovery of biological themes within the list and discards less informative classes from the results.
The presented method and associated software are useful for the identification and interpretation of biological functions associated with gene lists and are especially useful for the analysis of large lists.
ABSTRACT: Microarray technologies are rapidly becoming available for new species including teleost fishes. We constructed a rainbow trout cDNA microarray targeted at the identification of genes which are differentially expressed in response to environmental stressors. This platform included clones from normalized and subtracted libraries and genes selected through functional annotation. The present study focused on time-course comparisons of stress responses in the brain and kidney and the identification of a set of genes which are diagnostic for stress response.
Fish were stressed with handling and samples were collected 1, 3 and 5 days after the first exposure. Gene expression profiles were analysed in terms of Gene Ontology categories. Stress affected different functional groups of genes in the tissues studied. Mitochondria, extracellular matrix and endopeptidases (especially collagenases) were the major targets in kidney. Stress response in brain was characterized by dramatic temporal alterations. Metal ion binding proteins, glycolytic enzymes and motor proteins were induced transiently, whereas expression of genes involved in stress and immune response, cell proliferation and growth, signal transduction and apoptosis, protein biosynthesis and folding changed in a reciprocal fashion. Despite dramatic differences between tissues and time-points, we were able to identify a group of 48 genes that showed strong correlation of expression profiles (Pearson |r| > 0.65) in 35 microarray experiments and were regulated by stress. We evaluated the performance of the clone sets used for preparation of the microarray. Overall, the number of differentially expressed genes was markedly higher among ESTs than among genes selected through Gene Ontology annotations; however, 63% of stress-responsive genes were from this group.
1. Stress responses in fish brain and kidney are different in function and time-course. 2. Identification of stress-regulated genes provides the possibility for measuring stress responses in various conditions and further search for the functionally related genes.
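The correlation criterion mentioned in the abstract above can be illustrated with a short sketch that keeps genes whose expression profile correlates strongly, in either direction, with a reference profile. The abstract does not say what the profiles were correlated against, so the use of a single `reference` profile, the reading of the cutoff as an absolute value of 0.65, the function names and the toy data are all assumptions for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def strongly_correlated(profiles, reference, cutoff=0.65):
    """Return names of genes with |r| above the cutoff against the reference."""
    return [
        name
        for name, profile in profiles.items()
        if abs(pearson_r(profile, reference)) > cutoff
    ]


# Hypothetical expression profiles across four experiments.
reference = [1.0, 2.0, 3.0, 4.0]
profiles = {
    "geneA": [2.0, 4.0, 6.0, 8.0],    # moves with the reference
    "geneB": [4.0, 3.0, 2.0, 1.0],    # moves against the reference
    "geneC": [1.0, -1.0, 1.0, -1.0],  # weakly related to the reference
}

selected = strongly_correlated(profiles, reference)
print(selected)
```

Using the absolute value keeps both positively and negatively correlated genes, which is why the anti-correlated toy gene passes the filter alongside the positively correlated one.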
ABSTRACT: We used a high-density cDNA microarray to study the responses of rainbow trout fry to sublethal ranges of beta-naphthoflavone, cadmium, carbon tetrachloride, and pyrene. The differentially expressed genes were grouped by the functional categories of Gene Ontology. A number of classes showed significantly different responses to the studied compounds, such as cell cycle, apoptosis, signal transduction, oxidative stress, subcellular and extracellular structures, protein biosynthesis, and modification. Cluster analysis separated responses to the contaminants at low and medium doses, whereas at high levels the adaptive reactions were masked by a general unspecific response to toxicity. We found enhanced expression of many mitochondrial proteins as well as genes involved in metabolism of metal ions and protein biosynthesis. In parallel, genes related to stress and immune response, signal transduction, and nucleotide metabolism were down-regulated. We performed computer-assisted analyses of Medline abstracts retrieved for each compound, which helped us to identify the expected and novel findings.
Biochemical and Biophysical Research Communications 08/2004; 320(3):745-53. · 2.41 Impact Factor