A GO-driven semantic similarity measure for quantifying the biological relatedness of gene products.

Intelligent Decision Technologies 01/2009; 3:239-248. DOI: 10.3233/IDT-2009-0059
Source: DBLP

ABSTRACT Advances in biological experiments, such as DNA microarrays, have produced large multidimensional data sets for examination and retrospective analysis. Scientists, however, rely heavily on existing biomedical knowledge to fully analyze and comprehend such data sets. Our proposed framework relies on the Gene Ontology to integrate a priori biomedical knowledge into traditional data analysis approaches. We explore the impact of considering each aspect of the Gene Ontology individually when quantifying the biological relatedness between gene products. We discuss two figure-of-merit scores for quantifying the pair-wise biological relatedness between gene products and the intra-cluster biological coherency of groups of gene products. Finally, we perform cluster deterioration simulation experiments on a well-scrutinized Saccharomyces cerevisiae data set consisting of hybridization measurements. The results illustrate a strong correlation between the devised cluster coherency figure of merit and the randomization of cluster membership.
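The cluster deterioration experiment mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define its figure-of-merit scores, so everything below is an assumption made for illustration: pair-wise relatedness is stood in for by the Jaccard overlap of annotation term sets, cluster coherency is taken as the mean pair-wise score within a cluster, and deterioration is simulated by swapping members between clusters at random. The function names, gene names, and term labels are hypothetical and do not come from the paper.

```python
import random
from itertools import combinations


def pairwise_relatedness(terms_a, terms_b):
    """Jaccard overlap of two annotation term sets (assumed stand-in measure)."""
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


def cluster_coherency(cluster, annotations):
    """Mean pair-wise relatedness over all gene pairs in one cluster."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    total = sum(pairwise_relatedness(annotations[a], annotations[b])
                for a, b in pairs)
    return total / len(pairs)


def mean_coherency(clusters, annotations):
    """Average coherency across all clusters."""
    return sum(cluster_coherency(c, annotations) for c in clusters) / len(clusters)


def deteriorate(clusters, n_swaps, rng):
    """Copy the clusters, then exchange one random member between two
    distinct, randomly chosen clusters, n_swaps times."""
    noisy = [list(c) for c in clusters]
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(noisy)), 2)
        a = rng.randrange(len(noisy[i]))
        b = rng.randrange(len(noisy[j]))
        noisy[i][a], noisy[j][b] = noisy[j][b], noisy[i][a]
    return noisy


# Hypothetical toy annotations: placeholder labels, not real GO identifiers.
annotations = {
    "g1": {"term_a", "term_b"}, "g2": {"term_a", "term_b"}, "g3": {"term_a", "term_b"},
    "g4": {"term_c", "term_d"}, "g5": {"term_c", "term_d"}, "g6": {"term_c", "term_d"},
}
clusters = [["g1", "g2", "g3"], ["g4", "g5", "g6"]]

rng = random.Random(0)
for n_swaps in (0, 1, 2, 4):
    noisy = deteriorate(clusters, n_swaps, rng)
    print(n_swaps, round(mean_coherency(noisy, annotations), 3))
```

Under these assumptions, the undisturbed clusters score 1.0, and the score drops once members are exchanged, which is the kind of relationship between coherency and randomized membership that the abstract reports.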

  • Source
    ABSTRACT: Previous research has explored the relationships between Gene Ontology-based similarity and gene expression profiles in the mammalian brain. However, little attention has been paid to the spatial location of a gene's expression. Gene expression maps, which contain spatial information about the expression of genes in the mouse brain, are obtained by combining voxelation and microarrays. Based on the hypothesis that genes with similar expression maps may have similar functions, we propose an approach to identify pair-wise gene functional similarities from gene expression maps. We treat pairs of genes from an original dataset as samples, with features extracted from the expression maps and labels given by the functional similarity of each pair, and use them to explore the relationship between the similarity of gene maps and the similarity of gene functions. We restrict the dataset to genes that are associated with previously detected functional expression profiles to strengthen the relationship. We use AdaBoost, coupled with our proposed weak classifier, to analyze the dataset and predict the functional similarities. The experimental results show that functional similarity increases as the similarity of gene expression maps increases. The boosting analysis can predict the functional similarities between genes to a certain degree. The weights of the features in the model indicate which features are significant for this prediction. These findings can potentially assist biologists by providing helpful clues for predicting gene functions.
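The boosting setup described in this abstract can be sketched as follows. The abstract does not specify the authors' weak classifier or their features, so this sketch swaps in a plain decision stump and two made-up features per gene pair; it is textbook AdaBoost under those assumptions, not the authors' method. Summing each stump's vote weight per feature is likewise an assumed way of reading "weights of the features". All names and data values are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) for the
    decision stump with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[f] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def stump_predict(stump, row):
    """Predict +1 or -1 for one sample with one stump."""
    _, f, thr, pol = stump
    return pol if row[f] >= thr else -pol


def adaboost(X, y, rounds):
    """Train AdaBoost with decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        # Clamp the error so the logarithm below stays finite.
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Up-weight misclassified samples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, row) for alpha, s in model)
    return 1 if score >= 0 else -1


def feature_weights(model, n_features):
    """Sum of stump vote weights per feature (assumed importance measure)."""
    totals = [0.0] * n_features
    for alpha, (_, f, _, _) in model:
        totals[f] += alpha
    return totals


# Hypothetical gene-pair samples: [map similarity, unrelated feature].
X = [[0.9, 0.2], [0.8, 0.7], [0.7, 0.1], [0.3, 0.6], [0.2, 0.3], [0.1, 0.8]]
# +1 = functionally similar pair, -1 = dissimilar pair (made-up labels).
y = [1, 1, 1, -1, -1, -1]

model = adaboost(X, y, rounds=3)
print([predict(model, row) for row in X])
print(feature_weights(model, 2))
```

In this toy data only the first feature separates the two classes, so it accumulates all of the vote weight, which mirrors how per-feature weights might point to the features that matter for the prediction.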
