Article
Mining DNA sequences to predict sites which mutations cause genetic diseases.
Knowledge-Based Systems (impact factor: 2.42). 01/2002; 15(4):225-233.
Source: DBLP
Article: Point mutations within 663-666 bp of intron 6 of the human TDO2 gene, associated with a number of psychiatric disorders, damage the YY-1 transcription factor binding site.
ABSTRACT: Single base mutations G-->A at position 663 and G-->T at position 666 of intron 6 of the human tryptophan oxygenase gene (TDO2) are associated with a variety of psychiatric disorders [Comings, D.E. et al. (1996) Pharmacogenetics 6, 307-318]. Binding of rat liver nuclear extract proteins to synthetic double-strand oligonucleotides corresponding to three allelic states of the region between 651 bp and 680 bp of human TDO2 intron 6 has been studied by gel shift assay. It has been demonstrated that to each allelic state of the region there corresponds a specific set of proteins that interacts with it. With the aid of computer analysis and using specific anti-YY-1 antibodies it has been shown that both mutations damage the YY-1 transcription factor binding site.
FEBS Letters 11/1999; 462(1-2):85-8. · 3.54 Impact Factor
Article: Automated extraction of information in molecular biology.
ABSTRACT: We review data mining techniques in molecular biology, specifically those that extract information from the scientific literature itself. As more of the biological literature is published electronically, there is an opportunity, and even a need, to automatically summarize the literature in a customized way, for example by associating keywords to a topic. These keywords can be extracted from relevant publications. The process of keyword extraction can be automated and optimized to keep literature pointers automatically up-to-date or to filter relevant information from the literature. To illustrate these points, OMIM (Online Mendelian Inheritance in Man), a database of human inherited diseases, was linked to the literature and keywords were derived that covered distinct aspects such as genetic information on the one hand and disease-specific protein and phenotypic information on the other. They were used to extract information that is helpful for keeping entries about disease up-to-date.
FEBS Letters 07/2000; 476(1-2):12-7. · 3.54 Impact Factor
Article: Mining of biological data II: assessing data structure and class homogeneity by cluster analysis.
ABSTRACT: An important step in data analysis is class assignment which is usually done on the basis of a macroscopic phenotypic or bioprocess characteristic, such as high vs low growth, healthy vs diseased state, or high vs. low productivity. Unfortunately, such an assignment may lump together samples, which when derived from a more detailed phenotypic or bioprocess description are dissimilar, giving rise to models of lower quality and predictive power. In this paper we present a clustering algorithm for data preprocessing which involves the identification of fundamentally similar lots on the basis of the extent of similarity among the system variables. The algorithm combines aspects of cluster analysis and principal component analysis by applying agglomerative clustering methods to the first principal component of the system data matrix. As part of a rational strategy for developing empirical models, this technique selects lots (samples) which are most appropriate for inclusion in a training set by analyzing multivariate data homogeneity. Samples with similar data structures are identified and grouped together into distinct clusters. This knowledge is used in the formation of potential training sets. Additionally, this technique can identify atypical lots, i.e., samples that are not simply outliers but exhibit the general properties of one class but have been given the assignment of the other. The method is presented along with examples from its application to fermentation data sets.
Metabolic Engineering 08/2000; 2(3):228-38. · 5.61 Impact Factor
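The idea in the abstract above, agglomerative clustering applied to the first principal component of the data matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (first_pc_scores, agglomerate_1d, cluster_lots), the use of single linkage, and the fixed number of clusters are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def first_pc_scores(X):
    """Project the mean-centered rows of X onto the first principal component."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]


def agglomerate_1d(scores, n_clusters):
    """Single-linkage agglomerative clustering of 1-D scores (assumed linkage)."""
    clusters = [[i] for i in range(len(scores))]
    while len(clusters) > n_clusters:
        best = None
        # Find the pair of clusters with the smallest single-linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(scores[i] - scores[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    labels = np.empty(len(scores), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


def cluster_lots(X, n_clusters):
    """Hypothetical helper: cluster samples (lots) by their first-PC scores."""
    return agglomerate_1d(first_pc_scores(X), n_clusters)
```

Under these assumptions, comparing the returned cluster labels with the originally assigned classes would be one way to flag lots whose data structure disagrees with their class assignment.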
Keywords
data mining system
disease-associated experimental data
fourth step
genetic disease penetration
genetic disease penetrations
genetic diseases
genomic DNA
last step
second step
single nucleotide polymorphism