Publications (9)
Article: Small Ancestry Informative Marker panels for complete classification between the original four HapMap populations.
Damrongrit Setsirichok, Theera Piroonratana, Anunchai Assawamakin, Touchpong Usavanarong, Chanin Limwongse, Waranyu Wongseree, Chatchawit Aporntewan, Nachol Chaiyaratana
ABSTRACT: A protocol for the identification of Ancestry Informative Markers (AIMs) from genome-wide Single Nucleotide Polymorphism (SNP) data is proposed. The protocol consists of three main steps: identification of potential positive selection regions via F(ST) extremity measurement, SNP screening via two-stage attribute selection and classification model construction using a Naïve Bayes classifier. The two-stage attribute selection is composed of a newly developed round robin Symmetrical Uncertainty (SU) ranking technique and a wrapper embedded with a Naïve Bayes classifier. The protocol has been applied to the HapMap Phase II data. Two AIM panels, which consist of 10 and 16 SNPs that lead to complete classification between CEU, CHB, JPT and YRI populations, are identified. Moreover, the panels are at least four times smaller than those reported in previous studies. The results suggest that the protocol could be useful in a scenario involving a larger number of populations.
International Journal of Data Mining and Bioinformatics 01/2012; 6(6):651-74. · 0.43 Impact Factor
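As an illustration of the Symmetrical Uncertainty (SU) measure named in the abstract above, the following is a minimal sketch that scores each SNP against the population label using the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). It is not the authors' round robin ranking technique, whose details are not given here; the function names, the genotype coding (0/1/2) and the dictionary layout are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Standard SU: 2 * I(X;Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy     # I(X;Y)
    return 2.0 * mutual_info / (hx + hy)


def rank_snps(genotypes, labels):
    """Rank SNPs by SU against the population label, highest first.

    genotypes: hypothetical dict mapping SNP name -> list of genotype codes.
    labels:    list of population labels, one per sample.
    """
    scores = {
        name: symmetrical_uncertainty(column, labels)
        for name, column in genotypes.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_snps({"snp_a": [0, 0, 2, 2], "snp_b": [0, 1, 0, 1]}, ["CEU", "CEU", "YRI", "YRI"])` places `snp_a` first, because its genotype separates the two toy populations while `snp_b` is independent of the label.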
Conference Proceeding: Identification of Ancestry Informative Markers from Chromosome-Wide Single Nucleotide Polymorphisms Using Symmetrical Uncertainty Ranking.
Theera Piroonratana, Waranyu Wongseree, Touchpong Usavanarong, Anunchai Assawamakin, Chanin Limwongse, Nachol Chaiyaratana
20th International Conference on Pattern Recognition, ICPR 2010, Istanbul, Turkey, 23-26 August 2010; 01/2010
Chapter: An Omnibus Permutation Test on Ensembles of Two-Locus Analyses for the Detection of Purely Epistatic Multi-locus Interactions
Waranyu Wongseree, Anunchai Assawamakin, Theera Piroonratana, Saravudh Sinsomros, Chanin Limwongse, Nachol Chaiyaratana
ABSTRACT: Purely epistatic multi-locus interactions cannot generally be detected via single-locus analysis in case-control studies of complex diseases. Recently, many two-locus and multi-locus analysis techniques have been shown to be promising for epistasis detection. However, exhaustive multi-locus analysis requires prohibitively large computational efforts when problems involve large-scale or genome-wide data. Furthermore, there is no explicit proof that a combination of multiple two-locus analyses can lead to the correct identification of multi-locus interactions. 2LOmb, which performs an omnibus permutation test on ensembles of two-locus analyses, is proposed. The algorithm consists of four main steps: two-locus analysis, a permutation test, global p-value determination and a progressive search for the best ensemble. 2LOmb is benchmarked against a set association approach, a correlation-based feature selection technique and a tuned ReliefF technique. The simulation results from multi-locus interaction problems indicate that 2LOmb has a low false-positive error. Moreover, 2LOmb has the best performance in terms of the ability to identify all causative single nucleotide polymorphisms (SNPs), which signifies a high detection power. 2LOmb is subsequently applied to type 1 and type 2 diabetes mellitus (T1D and T2D) data sets, which are obtained as a part of the UK genome-wide genetic epidemiology study by the Wellcome Trust Case Control Consortium. After primarily screening for SNPs that are located within or near candidate genes and exhibit no marginal single-locus effects, the T1D and T2D data sets are reduced to 2,359 SNPs from 350 genes and 7,065 SNPs from 370 genes, respectively. The 2LOmb search reveals that 28 SNPs in 21 genes are associated with T1D while 11 SNPs in four genes are associated with T2D. The findings provide an alternative explanation for the aetiology of T1D and T2D in a UK population.
Keywords: bioinformatics; epistasis; genetic association study; genetic epidemiology; machine learning
12/2009: pages 493-502
Article: Detecting purely epistatic multi-locus interactions by an omnibus permutation test on ensembles of two-locus analyses.
Waranyu Wongseree, Anunchai Assawamakin, Theera Piroonratana, Saravudh Sinsomros, Chanin Limwongse, Nachol Chaiyaratana
ABSTRACT: Purely epistatic multi-locus interactions cannot generally be detected via single-locus analysis in case-control studies of complex diseases. Recently, many two-locus and multi-locus analysis techniques have been shown to be promising for epistasis detection. However, exhaustive multi-locus analysis requires prohibitively large computational efforts when problems involve large-scale or genome-wide data. Furthermore, there is no explicit proof that a combination of multiple two-locus analyses can lead to the correct identification of multi-locus interactions. The proposed 2LOmb algorithm performs an omnibus permutation test on ensembles of two-locus analyses. The algorithm consists of four main steps: two-locus analysis, a permutation test, global p-value determination and a progressive search for the best ensemble. 2LOmb is benchmarked against an exhaustive two-locus analysis technique, a set association approach, a correlation-based feature selection (CFS) technique and a tuned ReliefF (TuRF) technique. The simulation results indicate that 2LOmb produces a low false-positive error. Moreover, 2LOmb has the best performance in terms of the ability to identify all causative single nucleotide polymorphisms (SNPs) and a low number of output SNPs in purely epistatic two-, three- and four-locus interaction problems. The interaction models constructed from the 2LOmb outputs via a multifactor dimensionality reduction (MDR) method are also included for the confirmation of epistasis detection. 2LOmb is subsequently applied to a type 2 diabetes mellitus (T2D) data set, which is obtained as a part of the UK genome-wide genetic epidemiology study by the Wellcome Trust Case Control Consortium (WTCCC). After primarily screening for SNPs that are located within or near 372 candidate genes and exhibit no marginal single-locus effects, the T2D data set is reduced to 7,065 SNPs from 370 genes.
The 2LOmb search in the reduced T2D data reveals that four intronic SNPs in PGM1 (phosphoglucomutase 1), two intronic SNPs in LMX1A (LIM homeobox transcription factor 1, alpha), two intronic SNPs in PARK2 (Parkinson disease (autosomal recessive, juvenile) 2, parkin) and three intronic SNPs in GYS2 (glycogen synthase 2 (liver)) are associated with the disease. The 2LOmb result suggests that there is no interaction between each pair of the identified genes that can be described by purely epistatic two-locus interaction models. Moreover, there are no interactions between these four genes that can be described by purely epistatic multi-locus interaction models with marginal two-locus effects. The findings provide an alternative explanation for the aetiology of T2D in a UK population. An omnibus permutation test on ensembles of two-locus analyses can detect purely epistatic multi-locus interactions with marginal two-locus effects. The study also reveals that SNPs from large-scale or genome-wide case-control data which are discarded after single-locus analysis detects no association can still be useful for genetic epidemiology studies.
BMC Bioinformatics 09/2009; 10:294. · 2.75 Impact Factor
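To make the idea of a permutation test on a two-locus analysis concrete, here is a minimal sketch for a single SNP pair. It is not the 2LOmb algorithm: the ensemble construction, global p-value determination and progressive search described in the abstract are not reproduced. The choice of a Pearson chi-square statistic on the joint genotype table, and all function and parameter names, are assumptions made for this example.

```python
import random
from collections import Counter


def chi_square(cells, status):
    """Pearson chi-square statistic for a cells-by-status contingency table."""
    n = len(status)
    cell_counts = Counter(cells)
    status_counts = Counter(status)
    joint = Counter(zip(cells, status))
    stat = 0.0
    for cell, n_cell in cell_counts.items():
        for s, n_status in status_counts.items():
            expected = n_cell * n_status / n
            stat += (joint[(cell, s)] - expected) ** 2 / expected
    return stat


def two_locus_permutation_p(snp1, snp2, status, n_perm=1000, seed=0):
    """Permutation p-value for the association between a SNP pair and status.

    The case/control labels are shuffled n_perm times, and the p-value is the
    fraction of shuffles whose statistic reaches the observed one (with the
    usual +1 correction so the estimate is never exactly zero).
    """
    rng = random.Random(seed)
    cells = list(zip(snp1, snp2))  # joint two-locus genotype per sample
    observed = chi_square(cells, status)
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square(cells, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

On toy data in which disease status is the exclusive-or of two binary SNPs, `chi_square(snp1, status)` and `chi_square(snp2, status)` are both zero while the two-locus permutation p-value is small, which mirrors the abstract's point that purely epistatic interactions are invisible to single-locus analysis.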
Conference Proceeding: An Omnibus Permutation Test on Ensembles of Two-Locus Analyses for the Detection of Purely Epistatic Multi-locus Interactions.
Waranyu Wongseree, Anunchai Assawamakin, Theera Piroonratana, Saravudh Sinsomros, Chanin Limwongse, Nachol Chaiyaratana
Neural Information Processing, 16th International Conference, ICONIP 2009, Bangkok, Thailand, December 1-5, 2009, Proceedings, Part II; 01/2009