Article

A hybrid clustering and graph based algorithm for tagSNP selection

Soft Computing (impact factor: 1.88). 04/2012; 13(12):1143-1151. DOI:10.1007/s00500-009-0419-z

ABSTRACT TagSNP selection, which aims to select a small subset of informative single nucleotide polymorphisms (SNPs) to represent
the whole SNP set, plays an important role in current genomic research. It cuts the cost of genotyping by filtering out a large
number of redundant SNPs, and it can accelerate genome-wide disease association studies. In this paper, we propose a new hybrid
method called CMDStagger that combines ideas from clustering and graph algorithms to find a minimum set of tagSNPs. The proposed
algorithm uses linkage disequilibrium association and haplotype diversity to reduce the information loss in tagSNP selection,
and it imposes no block-partition restriction. The approach is tested on eight benchmark datasets from HapMap and chromosome 5q31.
Experimental results show that the algorithm reduces the selection time and obtains fewer tagSNPs while keeping high prediction
accuracy, which indicates that the method performs better than previous ones.
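
To make the general idea of LD-based tagSNP selection concrete, the following is a minimal sketch and not the CMDStagger algorithm itself, whose details the abstract does not give. It assumes phased haplotypes coded as 0/1 alleles, measures pairwise linkage disequilibrium with r^2, links SNPs whose r^2 reaches a threshold, and then greedily picks tags until every SNP is covered. The function names, the 0.8 threshold, and the greedy covering strategy are all illustrative assumptions.

```python
from itertools import combinations


def r_squared(a, b):
    """Pairwise LD (r^2) between two biallelic SNPs.

    a and b are equal-length lists of 0/1 alleles over the same haplotypes.
    """
    n = len(a)
    pa = sum(a) / n                                  # frequency of allele 1 at SNP a
    pb = sum(b) / n                                  # frequency of allele 1 at SNP b
    pab = sum(x & y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0                                   # a monomorphic SNP carries no LD signal
    d = pab - pa * pb
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Greedy tag selection on an LD graph (illustrative, not CMDStagger).

    haplotypes: list of rows, each row a list of 0/1 alleles, one per SNP.
    Returns the indices of the chosen tag SNPs in selection order.
    """
    n_snps = len(haplotypes[0])
    cols = [[row[j] for row in haplotypes] for j in range(n_snps)]

    # Each SNP covers itself plus every SNP it is in strong LD with.
    neighbors = {j: {j} for j in range(n_snps)}
    for i, j in combinations(range(n_snps), 2):
        if r_squared(cols[i], cols[j]) >= threshold:
            neighbors[i].add(j)
            neighbors[j].add(i)

    uncovered = set(range(n_snps))
    tags = []
    while uncovered:
        # Pick the SNP covering the most uncovered SNPs; ties go to the lowest index.
        best = max(range(n_snps), key=lambda j: (len(neighbors[j] & uncovered), -j))
        tags.append(best)
        uncovered -= neighbors[best]
    return tags
```

A greedy cover of this kind is not guaranteed to find the minimum tag set; it is shown only because it is short and easy to follow.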

Keywords

benchmark datasets
block partition
chromosome 5q31
Experimental results
genome-wide disease association
haplotype diversity
HapMap
ideas
information loss
informative single nucleotide polymorphisms
linkage disequilibrium association
new hybrid method
previous ones
selection time
small subset
tagSNPs
whole large SNP