Fig 2 - uploaded by Jin Chen
Source publication
Living cells are realized by complex gene expression programs that are moderated by regulatory proteins called transcription factors (TFs). The TFs control the differential expression of target genes in the context of transcriptional regulatory networks (TRNs), either individually or in groups. Deciphering the mechanisms of how the TFs control the...
Contexts in source publication
Context 1
... Probability. The initial emission probabilities of the states in the HMM are set equally (if we do not have any prior information), or they can be determined by prior knowledge obtained from TF perturbation experiments. In addition, we use uniform transition probabilities (1/32), if there is an arrow in Fig. 2. Emission and Transition Probabilities. We train the HMM model with discretized gene expression data.
We discretized the gene expression levels as upregulation or downregulation, instead of using absolute absence or presence, by comparing the expression changes between two consecutive timepoints to determine whether there was an increase (+1) or a decrease (−1) [19]. For a regulatory module with multiple target genes, all the gene expression changes are used sequentially for HMM training. In an mRNA transcriptional process, the expression change of a TF usually precedes the change of its targets. Therefore, we adopt the concept of time lagging in the HMM training process [20]. At timepoint n, the input to the HMM is the set of gene expression changes of the TFs from timepoint n − l to n, where l is a user-defined time lag that captures the effects of the TFs at the earlier timepoints (n − l, ..., n) on the target gene at timepoint n. To update the emission probabilities properly, we design a set of constraints that relate the gene expression patterns of a TF and its target to the regulatory interaction model (see Table 2). The emission probability of a state (active/non-active) can then be updated with Eq. (2), where b and b′ are the outputs of state k and E_k(b′) is the probability of emitting output b′ from state k [14]. The final emission probability of each state reflects the likelihood of the state being active or inactive. In an HMM, a path is a sequence of states that follows the Markov chain of hidden states, in which the probability of a state depends only on the previous state. In this way, we are able to consider the regulatory effects of the two TFs simultaneously, unlike previous works. The Viterbi algorithm is applied to effectively find the most probable path [13,15]. The final path contains two states, one for each TF's TF-target interaction model. For example, a path "AS-RN" means that the first TF is activator sufficient and the second TF is repressor necessary for the same target genes.
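The discretization and time-lag steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `discretize_changes` and `lagged_inputs`, the sample expression values, and the use of 0 for an unchanged level are all hypothetical (the text names only the +1 and −1 outcomes).

```python
import numpy as np

def discretize_changes(levels):
    """Compare consecutive timepoints: +1 for an increase, -1 for a decrease.

    Assumption: an unchanged level is encoded as 0; the source text only
    names the +1 and -1 outcomes.
    """
    diffs = np.diff(np.asarray(levels, dtype=float))
    return np.sign(diffs).astype(int)

def lagged_inputs(tf_changes, lag):
    """For each timepoint n >= lag, collect the TF changes from n - lag to n."""
    tf_changes = np.asarray(tf_changes)
    return [tf_changes[n - lag : n + 1] for n in range(lag, len(tf_changes))]

# Hypothetical TF expression levels at five timepoints.
tf_levels = [1.0, 1.4, 0.9, 0.9, 1.6]
tf_changes = discretize_changes(tf_levels)
windows = lagged_inputs(tf_changes, lag=1)

print(tf_changes.tolist())             # [1, -1, 0, 1]
print([w.tolist() for w in windows])   # [[1, -1], [-1, 0], [0, 1]]
```

With `lag=0` each window holds a single change, which corresponds to the no-lag setting (l = 0).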
Note that the HMM is a probabilistic model which cannot assume combined events or none of the events to occur. Therefore, the model described above does not include a state for "Neither" or "Both Necessary and Sufficient (N+S)". To infer neither/N+S regulatory models, a post-processing step is required. We use the distribution of the coefficient of variation (CV) of all the emission probabilities to determine whether a regulatory interaction model can be N+S or neither: if none of the probabilities is significant, the model outputs "neither"; if the probabilities of both the N and S states are significant, and if there is a significant difference between the probabilities of the two states, TRIM outputs the more significant state; otherwise, our model outputs N+S. We show an illustrative example of our HMM in Table 3. In this example, TF1 and TF2 regulate the same target gene g. For simplicity, no time lag is used in the example (l = 0). Given the gene expression changes of the two TFs and the target gene, we can infer the regulatory interaction models for both of the TF-target pairs using the HMM as follows. We first initialize the active emission probabilities of all the states equally to 0.5. At time t0, as none of the expression changes is significant, nothing is done. At time t1, the downregulation of both TF2 and g triggers the active emission probability of state AN of TF2 (Table 2, Row 10). The active emission probability of state (TF2, AN) is updated by adding the new frequency and then normalizing, i.e., (0.5 + 1)/2 = 0.75, while the inactive emission probability of (TF2, AN) is 0.25. Meanwhile, the active emission probabilities of all other states ...
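The arithmetic of the update at time t1 can be reproduced with a short Python sketch. Eq. (2) is not shown in this excerpt, so the code mirrors only the worked numbers from the example; the function name `update_active_emission` and the `triggered` flag are hypothetical, and how the remaining states are treated at t1 is left out because the excerpt is truncated there.

```python
def update_active_emission(p_active, triggered):
    """Return (active, inactive) emission probabilities after one timepoint.

    Assumption: a triggered state adds a frequency of 1 to its active
    probability and renormalizes by 2, as in the worked example.
    An untriggered state is returned unchanged, matching the t0 case.
    """
    if not triggered:
        return p_active, 1.0 - p_active
    new_active = (p_active + 1.0) / 2.0
    return new_active, 1.0 - new_active

# State (TF2, AN) starts at 0.5 and is triggered at t1.
active, inactive = update_active_emission(0.5, triggered=True)
print(active, inactive)  # 0.75 0.25
```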
Context 2
... In the next step (Fig. 1, step 2), we design a new HMM model to infer the regulatory interaction models for the TF-target interactions in every regulatory module detected above. For the regulatory modules with two TFs, we run the HMM model directly. For the regulatory modules with a single TF, we add a dummy TF with its expression value constantly zero. Some researchers have pointed out that designing an HMM model is a sort of art [14]. In the following text, we describe the details of how we design the structure, set the initial probabilities, and develop the updating method for emission probabilities and transition probabilities for our HMM for 2-TF collaborative regulatory modules. Structure. To model 2-TF collaborative modules, our HMM consists of two TFs, TF1 and TF2, where each TF has four states (i.e., AS, AN, RS, and RN), as shown in Fig. 2. Each state emits two possible outputs, active or inactive. One can view a state as a representation of whether a particular regulatory interaction model for an individual TF-target interaction is valid (active) or invalid (inactive). In the training process, if one of the four states of a TF emits an active output, the ...
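As a rough illustration of the state space described in the Structure paragraph, the following Python sketch enumerates the hidden states and the labels a decoded two-state path could take. The variable names are hypothetical, the transition arrows of Fig. 2 are not modeled because the figure is not reproduced here, and the expansion of the state abbreviations in the comments is inferred from the "AS-RN" example given elsewhere on this page.

```python
from itertools import product

# Assumed expansions: activator/repressor x sufficient/necessary.
INTERACTION_MODELS = ("AS", "AN", "RS", "RN")
TFS = ("TF1", "TF2")
OUTPUTS = ("active", "inactive")  # each state emits one of these two outputs

# One hidden state per (TF, interaction model) pair: 2 x 4 = 8 states.
states = [(tf, model) for tf in TFS for model in INTERACTION_MODELS]

# A decoded path assigns one model to each TF, written like "AS-RN".
path_labels = ["-".join(pair) for pair in product(INTERACTION_MODELS, repeat=2)]

print(len(states))       # 8
print(len(path_labels))  # 16
```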
Similar publications
A key unanswered question in plant biology is how a plant regulates metabolism to maximize performance across an array of biotic and abiotic environmental stresses. In this study, we addressed the potential breadth of transcriptional regulation that can alter accumulation of the defensive glucosinolate metabolites in Arabidopsis. A systematic yea...
Cold acclimation is an important adaptive response of plants from temperate regions to increase their freezing tolerance after being exposed to low nonfreezing temperatures. The three CBF genes are well known to be involved in cold acclimation. As the three CBF genes are linked tandemly in the Arabidopsis genome, it is almost impossible to obtain c...
Pathogen attack leads to transcriptional changes and metabolic modifications allowing the establishment of appropriate plant defences. Transcription factors (TFs) are key players in plant innate immunity. Notably, ethylene response factor (ERF) TFs are integrators of hormonal pathways and are directly responsible for the transcriptional regulation...
Citations
... A prominent direction for addressing this problem is using computational data mining approaches for the analysis of high-throughput biological data, such as gene expression data [1][2][3][4]. In particular, analysis methods have been developed to infer regulatory interactions from transcriptome data [5][6][7][8][9][10][11][12][13][14]. These regulatory interactions link regulators, such as transcription factors and kinases, to their targets and may include the regulatory type of the interaction, which indicates whether there is an activating (positive) or inhibitory (negative) association between the interactor pair. ...
Knowledge of interaction types in biological networks is important for understanding the functional organization of the cell. Currently information-based approaches are widely used for inferring gene regulatory interactions from genomics data, such as gene expression profiles; however, these approaches do not provide evidence about the regulation type (positive or negative sign) of the interaction.
This paper describes a novel algorithm, "Signing of Regulatory Networks" (SIREN), which can infer the regulatory type of interactions in a known gene regulatory network (GRN) given corresponding genome-wide gene expression data. To assess our new approach, we applied it to three different benchmark gene regulatory networks, including Escherichia coli, prostate cancer, and an in silico constructed network. Our new method has approximately 68, 70, and 100 percent accuracy, respectively, for these networks. To showcase the utility of SIREN algorithm, we used it to predict previously unknown regulation types for 454 interactions related to the prostate cancer GRN.
SIREN is an efficient algorithm with low computational complexity; hence, it is applicable to large biological networks. It can serve as a complementary approach for a wide range of network reconstruction methods that do not provide information about the interaction type.
... In our previous research [19], a Hidden Markov model was developed to relate gene expression patterns to regulatory interactions, in order to solve a relatively simpler subproblem that considers only two TFs. To predict regulatory interactions for all possible collaborative TFs, we propose an algorithm called "mTRIM" (multiple Transcriptional Regulatory Interaction Mechanism) in this paper. ...
... mTRIM was applied on two independently-constructed yeast transcriptional regulatory networks (the Harbison dataset [15] and the Reimand dataset [12]) to identify regulatory interactions. For performance comparison, DREM v3.0 [17] and TRIM [19] were both applied on the same datasets. We did not compare mTRIM with Yeang's method [3] because the latter's objective is to build a reliable TRN instead of predicting regulatory interactions. ...
... In these experiments, yeast cells were first synchronized to the same cell cycle stage, released from synchronization, and then the total RNA samples were taken at even intervals for a period of time (Table SI in Additional file 1). In order to decide whether a gene is significantly up or down regulated, a gene expression change cutoff of 0.35 was applied (the same threshold used in [19]). ...
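A cutoff-based discretization of this kind might look like the Python sketch below. It is a hedged illustration only: the function name `discretize_with_cutoff` and the sample values are hypothetical, and the excerpt does not say whether the 0.35 cutoff applies to raw differences or log ratios, or whether the boundary is inclusive, so both choices here are assumptions.

```python
import numpy as np

def discretize_with_cutoff(changes, cutoff=0.35):
    """Label each expression change as +1 (up), -1 (down), or 0 (not significant).

    Assumption: a change counts as significant when its magnitude is at
    least `cutoff`; the source gives only the cutoff value itself.
    """
    changes = np.asarray(changes, dtype=float)
    labels = np.zeros(changes.shape, dtype=int)
    labels[changes >= cutoff] = 1
    labels[changes <= -cutoff] = -1
    return labels

# Hypothetical expression changes between consecutive timepoints.
labels = discretize_with_cutoff([0.50, -0.10, -0.40, 0.20])
print(labels.tolist())  # [1, 0, -1, 0]
```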
Living cells are realized by complex gene expression programs that are moderated by regulatory proteins called transcription factors (TFs). The TFs control the differential expression of target genes in the context of transcriptional regulatory networks (TRNs), either individually or in groups. Deciphering the mechanisms of how the TFs control the expression of target genes is a challenging task, especially when multiple TFs collaboratively participate in the transcriptional regulation.
We model the underlying regulatory interactions in terms of the directions (activation or repression) and their logical roles (necessary and/or sufficient) with a modified association rule mining approach, called mTRIM. The experiment on Yeast discovered 670 regulatory interactions, in which multiple TFs express their functions on common target genes collaboratively. The evaluation on yeast genetic interactions, TF knockouts and a synthetic dataset shows that our algorithm is significantly better than the existing ones.
mTRIM is a novel method to infer TF collaborations in transcriptional regulation networks. mTRIM is available at http://www.msu.edu/~jinchen/mTRIM.
This paper proposes a novel multi-Laplacian prior (MLP) and augmented Lagrangian method (ALM) approach for gene interactions and putative transcription factors (TFs) identification from time-course gene microarray data. It employs a non-linear time-varying auto-regressive (N-TVAR) model and the Maximum-A-Posteriori-Probability method for incorporating the multi-Laplacian prior and the continuity constraint. The MLP allows connections to/from a gene to be better preserved for putative TF identification in non-stationarity gene regulatory network as compared with conventional L1-based penalties. Moreover, the ALM allows the resultant non-smooth L1-based penalties to be decoupled from the remaining smooth terms, so that the former and latter can be efficiently solved using a low-complexity proximity operator and smooth optimization technique, respectively. Synthetic and real time-course gene microarray datasets are tested to evaluate the performance of the proposed method. Experimental results show that the proposed method gives better accuracy and higher computational speed than our previous work using smoothed approximation. Moreover, its performance, without the use of ChIP-chip data, is found to be highly comparable with other state-of-the-art methods integrating both ChIP-chip and gene microarray data. It suggests that the proposed method may serve as a useful exploratory tool for putative TF identification with reduced experimental cost.
Exploring the complex interactive mechanism in a Gene Regulatory Network (GRN) developed using transcriptome data obtained from standard microarray and/or RNA-seq experiments helps us to understand the triggering factors in cancer research. The Transcription Factor (TF) genes generate protein complexes which affect the transcription of various target genes. However, considering the mode of regulation in a time frame such transcriptional activities are dependent on some specific activation time points only. It is also crucial to check whether the regulating capabilities are uniform across varied conditions, especially when periodicity is a big issue. In this context, we propose an algorithm called RIFT which helps to monitor the temporal differential regulatory pattern of a Differentially Expressed (DE) target gene either by a TF gene or a group of TF genes from a large time series (TS) data. We have tested our algorithm on HeLa cell cycle data and compared the result with its most advanced state of the art counterpart proposed so far. As our algorithm yields up stringent mode and target specific significant valid TF genes for a DE gene, we can expect to have new forms of genetic interactions.
Identifying condition-specific co-expressed gene groups is critical for gene functional and regulatory analysis. However, given that genes with critical functions (such as transcription factors) may not co-express with their target genes, it is insufficient to uncover gene functional associations only from gene expression data. In this paper, we propose a novel integrative biclustering approach to build high quality biclusters from gene expression data, and to identify critical missing genes in biclusters based on Gene Ontology as well. Our approach delivers a complete inter- and intra-bicluster functional relationship, thus provides biologists a clear picture for gene functional association study. We experimented with the Yeast cell cycle and Arabidopsis cold-response gene expression datasets. Experimental results show that a clear inter- and intra-bicluster relationship is identified, and the biological significance of the biclusters is considerably improved.