-
Weidong Mao
ABSTRACT: High-throughput single nucleotide polymorphism (SNP) genotyping technologies have made massive genotype data, covering a large number of individuals, publicly available. The accessibility of genetic data makes genome-wide association studies for complex diseases possible. Susceptibility to complex diseases can be predicted through analysis of the genetic data, helping prospective patients make informed decisions. With the development of DNA microarray techniques, it is possible to access human genetic information related to specific diseases, but most disease association studies are based on genotypes. This paper uses a combinatorial method to analyze haplotype data for Crohn's disease and to search for disease-associated factors in given case/control samples. A linear programming based method has been applied to publicly available genotype data on Crohn's disease for an association study and achieved a promising result.
Bioinformatics and Biomedical Engineering, 2008. ICBBE 2008. The 2nd International Conference on; 06/2008
-
2006 IEEE International Conference on Granular Computing, GrC 2006, Atlanta, Georgia, USA, May 10-12, 2006; 01/2006
-
ABSTRACT: Although there exist many phasing methods for unrelated adults or pedigrees, phasing and missing data recovery for data representing family trios is lagging behind. This paper is an attempt to fill this gap by considering the following problem. Given a set of genotypes partitioned into family trios, find for each trio a quartet of parent/offspring haplotypes that explains the trio without recombinations and recovers the SNP values missing from the given genotype data. Our contributions include: formulating the pure-parsimony trio phasing without recombinations and the trio missing data recovery problems; proposing new greedy and integer linear programming based solution methods; and extensive experimental validation of the proposed methods, showing an advantage over previously known methods.
International Journal of Bioinformatics Research and Applications 02/2005; 1(2):221-9.
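The trio constraint described in this abstract can be illustrated with a minimal sketch. It is not the paper's greedy or integer linear programming method; it only shows the per-SNP deductions that a no-recombination trio allows. The genotype encoding (0 and 1 for the two homozygous states, 2 for heterozygous) and the function names `phase_site` and `phase_trio` are assumptions made for illustration, and missing values are not handled.

```python
from typing import List, Optional, Sequence, Tuple

HET = 2  # assumed code for a heterozygous genotype; 0 and 1 are homozygous


def phase_site(father: int, mother: int, child: int) -> Optional[Tuple[int, int, int, int]]:
    """Phase one SNP of a trio.

    Returns (father transmitted, father untransmitted,
             mother transmitted, mother untransmitted),
    or None when all three genotypes are heterozygous (ambiguous site).
    """
    if child != HET:
        # A homozygous child received the same allele from both parents.
        ft = mt = child
    elif father != HET:
        # A homozygous father fixes his transmitted allele.
        ft, mt = father, 1 - father
    elif mother != HET:
        # A homozygous mother fixes her transmitted allele.
        mt, ft = mother, 1 - mother
    else:
        return None

    # A homozygous parent can only transmit the allele it carries.
    for parent, transmitted in ((father, ft), (mother, mt)):
        if parent != HET and parent != transmitted:
            raise ValueError("genotypes violate Mendelian inheritance")

    fu = father if father != HET else 1 - ft
    mu = mother if mother != HET else 1 - mt
    return ft, fu, mt, mu


def phase_trio(
    father: Sequence[int], mother: Sequence[int], child: Sequence[int]
) -> List[List[Optional[int]]]:
    """Phase every SNP of a trio; ambiguous sites are left as None."""
    haplotypes: List[List[Optional[int]]] = [[], [], [], []]
    for f, m, c in zip(father, mother, child):
        site = phase_site(f, m, c)
        alleles = site if site is not None else (None, None, None, None)
        for hap, allele in zip(haplotypes, alleles):
            hap.append(allele)
    return haplotypes


if __name__ == "__main__":
    # Toy genotypes, made up for illustration only.
    for hap in phase_trio([0, 2, 2, 1], [1, 1, 2, 2], [2, 2, 2, 1]):
        print(hap)
```

Sites where father, mother, and child are all heterozygous stay unresolved in this sketch; choosing among the remaining possibilities is where an optimization criterion such as pure parsimony would come in.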
-
ABSTRACT: Recent improvements in the accessibility of high-throughput genotyping have brought a great deal of attention to disease association and susceptibility studies. This paper explores the possibility of applying combinatorial methods to disease susceptibility prediction. The proposed combinatorial methods, as well as standard statistical methods, are applied to publicly available genotype data on Crohn's disease and autoimmune disorders to predict susceptibility to these diseases. The quality of a susceptibility prediction algorithm is assessed using leave-one-out and leave-many-out tests: the disease status of one or several individuals is predicted and compared to their actual disease status, which is initially hidden from the algorithm. The best prediction rate achieved by the proposed algorithms is 77.78% for Crohn's disease and 64.99% for autoimmune disorders, respectively.
Engineering in Medicine and Biology Society, 2005. IEEE-EMBS 2005. 27th Annual International Conference of the; 02/2005
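The leave-one-out test mentioned in this abstract can be sketched as follows. The predictor used here, a nearest-neighbor rule under Hamming distance, is a stand-in chosen for illustration and is not one of the paper's combinatorial methods; the function names and the toy data are likewise assumptions.

```python
from typing import Callable, List, Sequence

Genotype = Sequence[int]
Predictor = Callable[[List[Genotype], List[int], Genotype], int]


def hamming(a: Genotype, b: Genotype) -> int:
    """Number of SNP positions at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_predict(
    train_x: List[Genotype], train_y: List[int], query: Genotype
) -> int:
    """Stand-in predictor: copy the status of the closest training genotype."""
    best = min(range(len(train_x)), key=lambda i: hamming(train_x[i], query))
    return train_y[best]


def leave_one_out_accuracy(
    xs: Sequence[Genotype], ys: Sequence[int], predict: Predictor
) -> float:
    """Hide each individual's status in turn, predict it, and score the hits."""
    xs, ys = list(xs), list(ys)
    correct = 0
    for i in range(len(xs)):
        # Train on everyone except individual i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


if __name__ == "__main__":
    # Toy genotypes and case (1) / control (0) labels, made up for illustration.
    genotypes = [[0, 0, 0], [0, 0, 1], [1, 1, 1], [1, 1, 0]]
    status = [0, 0, 1, 1]
    print(leave_one_out_accuracy(genotypes, status, nearest_neighbor_predict))
```

A leave-many-out variant would hold out several individuals per round instead of one; any predictor with the same three-argument signature can be passed in place of the nearest-neighbor rule.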
-
Conference proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference 02/2005; 1:224-7.
-
Computational Science - ICCS 2005, 5th International Conference, Atlanta, GA, USA, May 22-25, 2005, Proceedings, Part II; 01/2005
-
Weidong Mao
ABSTRACT: The accessibility of high-throughput biology data has brought a great deal of attention to disease association studies. High-density maps of single nucleotide polymorphisms (SNPs), as well as massive genotype data with large numbers of individuals and SNPs, have become publicly available. To date, most analysis of the new data has been undertaken by the statistics community. In this dissertation, we pursue a different line of attack on genetic susceptibility to complex disease, one rooted in the computer science community, with an emphasis on design rather than analytical methodology. The main goal of disease association analysis is to identify gene variations contributing to the risk of and/or susceptibility to a particular disease. There are two main steps in susceptibility analysis: (i) haplotyping of the population and (ii) predicting the genetic susceptibility to diseases. Although there exist many phasing methods for step (i), phasing and missing data recovery for data representing family trios is lagging behind, and most disease association studies are based on family trios. This study is devoted to the problem of assessing accumulated information in order to predict genotype susceptibility to complex diseases with significantly high accuracy and statistical power. The dissertation proposes two new solution methods for step (i), one greedy and one based on integer linear programming. We also propose several universal and ad hoc methods for step (ii). The quality of the susceptibility prediction algorithms has been assessed using leave-one-out and leave-many-out tests and shown to be statistically significant based on randomization tests. The prediction of disease status can also be viewed as an integrated risk factor. A combinatorial prediction complexity measure has been proposed for case/control studies.
The best prediction rate achieved by the proposed algorithms is 69.5% for Crohn's disease and 61.3% for autoimmune disorder, respectively, which are significantly higher than those achieved by universal prediction methods such as Support Vector Machine (SVM) and known statistical methods.
Text (Thesis). System requirements: PC, World Wide Web browser and PDF reader. Title from title screen. Alex Zelikovsky, committee chair; Andrey Perelygin, Robert Harrison, Anu Bourgeois, committee members. Electronic text (138 p. : ill. (some col.)) : digital, PDF file. Thesis (Ph. D.)--Georgia State University, 2006. Includes bibliographical references (p. 133-138). Description based on contents viewed July 9, 2007.