An improved scoring scheme for predicting glycan structures from gene expression data.

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan.
Genome informatics. International Conference on Genome Informatics 02/2007; 18:237-46. DOI: 10.1142/9781860949920_0023
Source: PubMed


The prediction of glycan structures from gene expression of glycosyltransferases (GTs) is a challenging new area in computational biology, motivated by the fact that the biosynthesis of glycan chains is under the control of GT expression. In this paper, we developed a new method for predicting glycan structures from gene expression data. The proposed method has two main original aspects. First, we increase the number of predictable glycan structure candidates by estimating missing glycans from a global glycan structure map, which enables us to predict new glycan structures that are not stored in the database. Second, we propose a more general scoring scheme that uses real-valued gene expression intensity directly rather than converting it into binary information. We applied the proposed method to predict cancer-specific glycan structures from gene expression profiles of patients with acute lymphocytic leukemia (ALL) and acute myelocytic leukemia (AML). We confirmed that several of the predicted glycan structures correspond to known cancer-specific glycan structures reported in the literature, and that our method outperforms previous methods at a statistically significant level.
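The contrast between binarized and real-valued scoring can be illustrated with a minimal sketch. This is not the scoring scheme of the paper, whose formula the abstract does not give. It only assumes that each candidate glycan is associated with a set of GTs needed to build it, and it scores a candidate by the mean expression of those GTs. The glycan names, gene names, threshold, and both scoring functions are hypothetical.

```python
from statistics import mean

# Hypothetical mapping: glycan candidate -> GT genes assumed necessary to build it.
REQUIRED_GTS = {
    "glycan_A": ["GT1", "GT2"],
    "glycan_B": ["GT2", "GT3"],
}

def binary_score(expression, gts, threshold=1.0):
    """Fraction of required GTs whose expression exceeds a cutoff (binarized)."""
    return mean(1.0 if expression[g] > threshold else 0.0 for g in gts)

def real_valued_score(expression, gts):
    """Mean expression intensity of the required GTs (no thresholding).

    Assumption for illustration only; not the formula used in the paper.
    """
    return mean(expression[g] for g in gts)

def rank_glycans(expression, scorer):
    """Return candidate names sorted from highest to lowest score."""
    scores = {name: scorer(expression, gts) for name, gts in REQUIRED_GTS.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical expression intensities for one sample.
sample = {"GT1": 1.2, "GT2": 1.5, "GT3": 8.0}
print(rank_glycans(sample, real_valued_score))
```

With this sample, every GT lies above the cutoff, so the binarized score cannot separate the two candidates, while the real-valued score ranks the candidate that depends on the strongly expressed GT3 first.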

  • ABSTRACT: Glycoproteins for treating human diseases have revolutionized the health care industry. However, controlling glycosylation has been a challenge as small variations in glycan structure can be responsible for significant changes in key therapeutic properties. Manipulation of glycan biosynthesis can be particularly complex since the process is not directly encoded on the genome but depends on multiple variables such as enzymes’ activity, selectivity, localization, expression host, and process parameters and conditions. Furthermore, a particular glycoprotein may include many different glycan structures due to differences in processing that occur for each individual molecule. The present chapter focuses on experimental and computational approaches to direct N-glycosylation in expression systems for generation of biotherapeutics of superior value. Glycoengineering-based manipulations of glycan structures using glycosyltransferases, modification of precursor biosynthetic pathways, and predictions of glycosylation patterns using mathematical models are described including examples from the literature as a means of optimizing glycoform distributions in cells.
  • ABSTRACT: While much knowledge about glycans has accumulated, most of the primary data for both experimental and bioinformatics analyses are information on glycan structures and glycan-related genes. Therefore, it is necessary to aggregate all of this chemical and genomic information in an effort to predict complex cellular processes and to understand organism behaviors at a higher level. Following this perspective, we have been developing KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa et al. 2008) to include three basic glycan resources: (1) GLYCAN, a database of glycan structures; (2) the glycosyltransferase and the glycosidase reaction library; and (3) glycan-related pathways (Hashimoto et al. 2006). KEGG also includes three useful tools: (4) the Composite Structure Map (CSM), a map illustrating all possible variations in glycan structures within organisms; (5) KegDraw, an intuitive drawing tool for chemical structures; and (6) KCaM, a glycan search and alignment tool (Aoki et al. 2004). All resources are available at KEGG (
    Handbook of Glycomics, 12/2007: pages 441-444;
  • PLoS Computational Biology 06/2008; 4(5):e1000075. DOI:10.1371/journal.pcbi.1000075 · 4.62 Impact Factor