Extracting Metadata from Biological Experimental Data.
ABSTRACT: Automatically extracting metadata from an experiment's dataset is an important step toward efficiently integrating that dataset with data available in public bioinformatics data sources. Metadata extracted from the experiment's dataset can be stored in databases and used to verify data extracted from other experiments' datasets. It also lets the biologist keep track of the dataset so that it can be easily retrieved later. The extracted metadata can be mined to discover useful knowledge, and it can be integrated with other information through a domain ontology to reveal hidden relationships. An experiment's dataset may contain several kinds of metadata that can be used to add semantic value to linked data. This paper describes an approach for extracting metadata from an experiment's dataset. The approach has been used in a preliminary investigation of aging across species.
- Available from: Kara Dolinski
ABSTRACT: Genomic sequencing is no longer a novelty, but gene function annotation remains a key challenge in modern biology. A variety of functional genomics experimental techniques are available, from classic methods such as affinity precipitation to advanced high-throughput techniques such as gene expression microarrays. In the future, more disparate methods will be developed, further increasing the need for integrated computational analysis of data generated by these studies. We address this problem with MAGIC (Multisource Association of Genes by Integration of Clusters), a general framework that uses formal Bayesian reasoning to integrate heterogeneous types of high-throughput biological data (such as large-scale two-hybrid screens and multiple microarray analyses) for accurate gene function prediction. The system formally incorporates expert knowledge about relative accuracies of data sources to combine them within a normative framework. MAGIC provides a belief level with its output that allows the user to vary the stringency of predictions. We applied MAGIC to Saccharomyces cerevisiae genetic and physical interactions, microarray, and transcription factor binding sites data and assessed the biological relevance of gene groupings using Gene Ontology annotations produced by the Saccharomyces Genome Database. We found that by creating functional groupings based on heterogeneous data types, MAGIC improved accuracy of the groupings compared with microarray analysis alone. We describe several of the biological gene groupings identified. Proceedings of the National Academy of Sciences 08/2003; 100(14):8348-53. · 9.74 Impact Factor
- ABSTRACT: A report on Barnett International's 4th annual Bioinformatics and Data Integration conference, Philadelphia, USA, 7-8 March 2002. Genome Biology 08/2002; 3(8):REPORTS4027. · 10.30 Impact Factor
- Endocrinology 07/2002; 143(6):1983-9. · 4.72 Impact Factor