An improved scoring scheme for predicting glycan structures from gene expression data.

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan.
Genome informatics. International Conference on Genome Informatics 02/2007; 18:237-46. DOI: 10.1142/9781860949920_0023
Source: PubMed


The prediction of glycan structures from the gene expression of glycosyltransferases (GTs) is a challenging new area in computational biology, since the biosynthesis of glycan chains is under the control of GT expression. In this paper we develop a new method for predicting glycan structures from gene expression data. The proposed method has two main original aspects. First, we increase the number of predictable glycan structure candidates by estimating missing glycans from a global glycan structure map, which enables us to predict new glycan structures that are not stored in the database. Second, we propose a more general scoring scheme that uses real-valued gene expression intensity directly rather than converting it into binary information. We applied the proposed method to predicting cancer-specific glycan structures from gene expression profiles of patients with acute lymphocytic leukemia (ALL) and acute myelocytic leukemia (AML). We confirmed that several of the predicted glycan structures correspond to known cancer-specific glycan structures reported in the literature, and that our method outperforms the previous methods at a statistically significant level.
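
The abstract does not give the scoring formula, so the following is only a minimal sketch of why real-valued intensities can be more informative than binarized ones. Everything in it is an assumption made for illustration: the gene names, the expression values, the threshold, the representation of a glycan as the list of GTs it requires, and the use of a simple mean as the score. It is not the scoring scheme of the paper.

```python
from statistics import mean

def binary_score(required_gts, expression, threshold):
    """Fraction of required GT genes whose expression exceeds the threshold."""
    return mean(1.0 if expression[g] > threshold else 0.0 for g in required_gts)

def real_valued_score(required_gts, expression):
    """Mean expression intensity of the required GT genes (illustrative choice)."""
    return mean(expression[g] for g in required_gts)

# Hypothetical normalized expression intensities (not data from the paper).
expression = {"GT_A": 0.9, "GT_B": 0.8, "GT_C": 0.55, "GT_D": 0.1}

# Hypothetical candidate glycans, each listed with the GTs its linkages need.
candidates = {
    "glycan_1": ["GT_A", "GT_B"],
    "glycan_2": ["GT_A", "GT_C"],
    "glycan_3": ["GT_A", "GT_D"],
}

THRESHOLD = 0.5  # assumed cutoff for calling a gene "expressed"

binary = {name: binary_score(gts, expression, THRESHOLD)
          for name, gts in candidates.items()}
real = {name: real_valued_score(gts, expression)
        for name, gts in candidates.items()}

# Rank candidates from highest to lowest real-valued score.
ranking = sorted(candidates, key=lambda name: real[name], reverse=True)

for name in ranking:
    print(f"{name}: binary={binary[name]:.2f} real-valued={real[name]:.3f}")
```

In this toy setting glycan_1 and glycan_2 tie under the binary score because all of their required GTs pass the threshold, while the real-valued score separates them; that loss of resolution is the kind of information a binary conversion discards.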

  • ABSTRACT: While much knowledge about glycans has accumulated, most of the primary data for both experimental and bioinformatics analyses are information on glycan structures and glycan-related genes. Therefore, it is necessary to aggregate all of this chemical and genomic information in an effort to predict complex cellular processes and to understand organism behaviors at a higher level. Following this perspective, we have been developing KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa et al. 2008) to include three basic glycan resources: (1) GLYCAN, a database of glycan structures; (2) the glycosyltransferase and glycosidase reaction library; and (3) glycan-related pathways (Hashimoto et al. 2006). KEGG also includes three useful tools: (4) the Composite Structure Map (CSM), a map illustrating all possible variations in glycan structures within organisms; (5) KegDraw, an intuitive drawing tool for chemical structures; and (6) KCaM, a glycan search and alignment tool (Aoki et al. 2004). All resources are available at KEGG (
    Handbook of Glycomics, 12/2007: pages 441-444;
  • PLoS Computational Biology 06/2008; 4(5):e1000075. DOI:10.1371/journal.pcbi.1000075 · 4.62 Impact Factor

  • Seikagaku. The Journal of Japanese Biochemical Society 12/2008; 80(11):1038-41. · 0.04 Impact Factor