-
ABSTRACT: Some splicing isoform-specific transcriptional regulation is related to disease, so detecting disease-specific splice variation is the first step toward finding disease-specific transcriptional regulation. The Affymetrix Human Exon 1.0 ST Array measures exon-level expression profiles that are suitable for finding differentially expressed exons on a genome-wide scale. However, exon arrays produce massive datasets that are too large to handle and analyze on a personal computer.
We have developed ExonMiner, the first all-in-one web service for analyzing exon array data to detect transcripts that have significantly different splicing patterns in two cell types, e.g. normal and cancer cells. ExonMiner performs the following analyses: (1) data normalization, (2) statistical analysis based on two-way ANOVA, (3) identification of transcripts with significantly different splice patterns, (4) efficient visualization based on heatmaps and barplots, and (5) meta-analysis to detect exon-level biomarkers. We implemented ExonMiner on a supercomputer system in order to perform genome-wide analysis of the more than 300,000 transcripts in exon array data, which has the potential to reveal aberrant splice variations in cancer cells as exon-level biomarkers.
ExonMiner is well suited to the analysis of exon array data and requires no software installation other than a web browser. Users only need to access the ExonMiner URL, http://ae.hgc.jp/exonminer. A full exon array dataset can be analyzed within hours using statistical analysis with a sound theoretical basis that finds aberrant splice variants as biomarkers.
BMC Bioinformatics 12/2008; 9:494. · 2.75 Impact Factor
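The two-way ANOVA step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a balanced design (the same number of replicates in every exon-by-condition cell) and log-scale intensities; the function name `interaction_f` and the data layout are hypothetical and are not taken from ExonMiner. The idea is that a large exon-by-condition interaction means the exon profile of a transcript is not simply shifted between conditions, which is what a splicing difference would look like.

```python
from statistics import fmean


def interaction_f(y):
    """F statistic for the exon x condition interaction of one transcript.

    y[e][c] is a list of replicate log-intensities for exon e under
    condition c. Assumes a balanced design with at least two replicates
    and non-zero residual variance.
    """
    n_exons, n_conds, n_reps = len(y), len(y[0]), len(y[0][0])

    # Cell, row (exon), column (condition), and grand means.
    cell = [[fmean(y[e][c]) for c in range(n_conds)] for e in range(n_exons)]
    exon_mean = [fmean(cell[e]) for e in range(n_exons)]
    cond_mean = [fmean(cell[e][c] for e in range(n_exons)) for c in range(n_conds)]
    grand = fmean(exon_mean)

    # Interaction sum of squares: departure from an additive exon + condition model.
    ss_int = n_reps * sum(
        (cell[e][c] - exon_mean[e] - cond_mean[c] + grand) ** 2
        for e in range(n_exons)
        for c in range(n_conds)
    )
    # Residual sum of squares: replicate scatter around each cell mean.
    ss_err = sum(
        (v - cell[e][c]) ** 2
        for e in range(n_exons)
        for c in range(n_conds)
        for v in y[e][c]
    )
    df_int = (n_exons - 1) * (n_conds - 1)
    df_err = n_exons * n_conds * (n_reps - 1)
    return (ss_int / df_int) / (ss_err / df_err), df_int, df_err


# Two exons, two conditions, two replicates each (made-up numbers).
parallel = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # same shift for both exons
changed = [[[1, 2], [3, 4]], [[5, 6], [5, 6]]]   # only the first exon shifts
print(interaction_f(parallel)[0], interaction_f(changed)[0])
```

A p-value would follow from the F distribution with `df_int` and `df_err` degrees of freedom, for example via `scipy.stats.f.sf` if SciPy is available.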
-
Yosuke Hatanaka,
Masao Nagasaki,
Rui Yamaguchi,
Takeshi Obayashi,
Kazuyuki Numata,
André Fujita,
Teppei Shimamura,
Yoshinori Tamada,
Seiya Imoto,
Kengo Kinoshita,
Kenta Nakai,
Satoru Miyano
ABSTRACT: We report various transcription factor binding sites (TFBSs) that are conserved among co-expressed genes in human promoter regions, using expression and genomic data. Assuming that similar promoter structure induces similar transcriptional regulation, and hence similar expression profiles, we compared promoter structure similarity between co-expressed genes. Comprehensive TFBS predictions for all human genes were conducted for 19,777 promoter regions around the transcription start sites (TSSs) given by DBTSS, and promoter similarity searches were conducted among the co-expressed gene data provided by the newly developed COXPRESdb. Combining position weight matrix (PWM) motif prediction with a bootstrap method, we found that 7,313 genes have at least one statistically significant conserved TFBS. We also applied basket analysis to search for combinatorial activities of those conserved TFBSs.
Genome informatics. International Conference on Genome Informatics 02/2008; 20:212-21.
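As a rough illustration of PWM-based motif prediction with a resampling-based significance check, the sketch below scores every window of a promoter sequence against a log-odds PWM and compares the best score with scores from shuffled copies of the same sequence. The shuffling null is a simple stand-in chosen for illustration and is not necessarily the bootstrap procedure used in the paper; the function names, the uniform background, and the pseudocount are all assumptions.

```python
import math
import random

BASES = "ACGT"


def log_odds_pwm(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts (list of dicts) into log2-odds scores."""
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((col.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def best_hit(pwm, seq):
    """Return (score, start) of the highest-scoring window; seq uses only ACGT."""
    width = len(pwm)
    return max(
        (sum(pwm[i][seq[s + i]] for i in range(width)), s)
        for s in range(len(seq) - width + 1)
    )


def empirical_p(pwm, seq, n_shuffles=200, seed=0):
    """Fraction of shuffled sequences whose best hit is at least as good."""
    rng = random.Random(seed)
    observed = best_hit(pwm, seq)[0]
    bases = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        if best_hit(pwm, bases)[0] >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Toy TATA-like motif built from made-up counts.
tata = log_odds_pwm([{"T": 8}, {"A": 8}, {"T": 8}, {"A": 8}])
score, start = best_hit(tata, "GGCTATAAGG")
print(start, round(score, 2), empirical_p(tata, "GGCTATAAGG"))
```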
-
ABSTRACT: For learning Bayesian network structure from data, order-based algorithms such as the K2 algorithm are widely used. In this paper, we consider the problem of constructing the order of nodes in such algorithms based on prior knowledge of gene networks. However, in many cases the prior knowledge is given as a partial order of genes, and we need to extend the order-based algorithm to a partial order-based one. By extending our prior work, we propose an efficient partial order-based algorithm for estimating gene networks based on Bayesian networks. The computational complexity of the proposed algorithm is shown.
2008 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2008, 3-5 November 2008, Philadelphia, Pennsylvania, USA; 01/2008
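To make the gap between a partial order and the total order that an order-based algorithm expects concrete, the sketch below derives one total order consistent with a set of "upstream of" constraints. This is only the naive baseline of picking a single linear extension, not the partial order-based algorithm proposed in the paper; the gene names and the `linear_extension` helper are hypothetical. It relies on `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def linear_extension(genes, precedes):
    """Return one total order of genes consistent with the partial order.

    precedes is an iterable of (upstream, downstream) pairs taken from
    prior knowledge. graphlib.CycleError is raised if the pairs
    contradict each other.
    """
    predecessors = {g: set() for g in genes}
    for upstream, downstream in precedes:
        predecessors[downstream].add(upstream)
    return list(TopologicalSorter(predecessors).static_order())


# g4 is unconstrained, so it may appear anywhere in the result.
order = linear_extension(
    ["g1", "g2", "g3", "g4"],
    [("g1", "g3"), ("g2", "g3")],
)
print(order)
```

Many total orders are usually compatible with the same prior knowledge, and fixing just one of them discards that freedom, which is one motivation for working with the partial order directly.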
-
Mining and Learning with Graphs, MLG 2007, Firenze, Italy, August 1-3, 2007, Proceedings; 01/2007
-
Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2007, October 14-17, 2007, Harvard Medical School, Boston, MA, USA; 01/2007
-
ABSTRACT: Estimation of gene networks based on microarray gene expression data is an important problem in systems biology. In this paper we use Bayesian networks as a mathematical model for reverse-engineering gene networks from microarray data. Structural learning of Bayesian networks is known to be an NP-hard problem, so heuristic algorithms are needed to find better network structures. Recently, several algorithms have been proposed to estimate the optimal Bayesian network structure, but the number of genes that can be included in the network is limited to about 30 or fewer. To apply the Bayesian network approach to drug target gene discovery, we need to consider gene networks with several hundred genes, and therefore need more efficient algorithms for learning Bayesian network structure from observed data. In this paper we propose an efficient structural learning algorithm for Bayesian networks by extending the K2 algorithm, one of the standard learning algorithms for Bayesian networks. We conduct Monte Carlo simulations to examine the effectiveness of the proposed algorithm in comparison with the greedy hill-climbing algorithm. We also show an application of the proposed algorithm to yeast gene network estimation.
Proceedings of the 7th IEEE International Conference on Bioinformatics and Bioengineering, BIBE 2007, October 14-17, 2007, Harvard Medical School, Boston, MA, USA; 01/2007
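For readers unfamiliar with K2, a minimal sketch of the standard algorithm (Cooper and Herskovits) for discrete data follows. It is the baseline that the abstract above says is being extended, not the proposed extension itself. The data layout (rows of integer states), the `arity` list, and the `max_parents` default are assumptions made for illustration.

```python
import math
from collections import Counter


def k2_log_score(data, child, parents, arity):
    """Log of the K2 metric for one node given a candidate parent set."""
    r = arity[child]
    n_ij = Counter()   # rows per parent configuration j
    n_ijk = Counter()  # rows per (parent configuration j, child state k)
    for row in data:
        j = tuple(row[p] for p in parents)
        n_ij[j] += 1
        n_ijk[(j, row[child])] += 1
    # log[(r-1)! / (N_ij + r - 1)!] per configuration, plus log(N_ijk!) per cell.
    # Unobserved configurations and states contribute zero and can be skipped.
    score = sum(math.lgamma(r) - math.lgamma(n + r) for n in n_ij.values())
    score += sum(math.lgamma(n + 1) for n in n_ijk.values())
    return score


def k2(data, order, arity, max_parents=2):
    """Greedy parent selection; parents may only come from earlier in order."""
    parents = {v: [] for v in order}
    for pos, child in enumerate(order):
        best = k2_log_score(data, child, parents[child], arity)
        candidates = set(order[:pos])
        while candidates and len(parents[child]) < max_parents:
            top_score, top = max(
                (k2_log_score(data, child, parents[child] + [c], arity), c)
                for c in candidates
            )
            if top_score <= best:
                break  # no single added parent improves the score
            best = top_score
            parents[child].append(top)
            candidates.remove(top)
    return parents


# Toy data: variable 1 copies variable 0, variable 2 is independent of both.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
print(k2(rows, order=[0, 1, 2], arity=[2, 2, 2]))
```

The result depends on the node order supplied, which is why the choice of order matters for algorithms of this family.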
-
ABSTRACT: Alternative splicing is an important regulatory mechanism that generates multiple mRNA transcripts, which are translated into functionally diverse proteins. According to current studies, aberrant transcripts due to splicing mutations are known to account for 15% of genetic diseases. Understanding the regulatory mechanism of alternative splicing is therefore essential for identifying potential biomarkers for several types of human disease. Most recently, the advent of the GeneChip Human Exon 1.0 ST Array has enabled us to measure genome-wide expression profiles of over one million exons. With this new microarray platform, analysis of functional gene expression can be extended to detect not only differentially expressed genes but also sets of specific splicing events that are differentially observed between two or more experimental conditions, e.g. tumor and normal control cells. In this study, we address the statistical problem of identifying differentially observed splicing variations from exon expression profiles. The proposed method is organized as follows: (1) data preprocessing to remove systematic biases from the probe intensities; (2) whole-transcript analysis with the analysis of variance (ANOVA) to identify a set of loci that give rise to alternative splicing related to a certain disease. We test the proposed statistical approach on exon expression profiles of colorectal carcinoma. Its applicability is verified and discussed in relation to existing biological knowledge. This paper intends to highlight the potential role of statistical analysis of all-exon microarray data. Our work is an important first step toward the development of more advanced statistical technology. Supplementary information and materials are available from http://bonsai.ims.u-tokyo.ac.jp/~yoshidar/IBSB2006_ExonArray.htm.
Genome informatics. International Conference on Genome Informatics 02/2006; 17(1):88-99.