Novel generating protective single nucleotide polymorphism barcode for breast cancer using particle swarm optimization

Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan.
Cancer epidemiology 09/2009; 33(2):147-54. DOI: 10.1016/j.canep.2009.07.001
Source: PubMed


High-throughput single nucleotide polymorphism (SNP) genotyping generates a huge amount of SNP data in genome-wide association studies. Simultaneous analysis of multiple SNP interactions associated with many diseases and cancers is essential, but it remains computationally challenging.
In this study, we propose an odds ratio-based binary particle swarm optimization (OR-BPSO) method to evaluate the risk of breast cancer.
BPSO identifies combinations of SNPs with their corresponding genotypes, called SNP barcodes, that show the maximal difference in occurrence between the control and breast cancer groups. A specific SNP barcode with an optimized fitness value was identified among seven SNP combinations within one minute. The identified SNP barcodes with the best performance between control and breast cancer groups were found to be control-dominant, suggesting that these SNP barcodes may be protective against breast cancer. After statistical analysis, these control-dominant SNP barcodes underwent odds ratio analysis to quantify their association with the risk of breast cancer.
This study proposes an effective high-speed method to analyze SNP-SNP interactions in breast cancer association studies.
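
To make the notion of an SNP barcode and its odds ratio concrete, the following is a minimal sketch, not the authors' implementation. It assumes each sample is a mapping from SNP identifier to genotype and that a barcode is a set of SNP/genotype pairs that must all match. The function names (`matches`, `barcode_counts`, `odds_ratio`) and the toy SNP labels are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: SNP identifier -> genotype, e.g. {"snpA": "GG"}
Genotypes = Dict[str, str]


def matches(sample: Genotypes, barcode: Genotypes) -> bool:
    """True if the sample carries every SNP/genotype pair in the barcode."""
    return all(sample.get(snp) == genotype for snp, genotype in barcode.items())


def barcode_counts(samples: List[Genotypes], barcode: Genotypes) -> Tuple[int, int]:
    """Return (samples carrying the barcode, samples not carrying it)."""
    carriers = sum(1 for sample in samples if matches(sample, barcode))
    return carriers, len(samples) - carriers


def odds_ratio(cases: List[Genotypes], controls: List[Genotypes],
               barcode: Genotypes) -> float:
    """Odds ratio of the barcode from the 2x2 table (case/control x carrier/non-carrier)."""
    a, b = barcode_counts(cases, barcode)      # cases: carriers, non-carriers
    c, d = barcode_counts(controls, barcode)   # controls: carriers, non-carriers
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: zero cell in the denominator")
    return (a * d) / (b * c)


# Toy data (fabricated for illustration only, not from the study).
barcode = {"snpA": "GG", "snpB": "CT"}
carrier = {"snpA": "GG", "snpB": "CT"}
non_carrier = {"snpA": "GT", "snpB": "CT"}
cases = [carrier, non_carrier, non_carrier, non_carrier]
controls = [carrier, carrier, carrier, non_carrier]

toy_or = odds_ratio(cases, controls, barcode)
print(toy_or)
```

In this toy example the barcode occurs more often among controls than among cases, so the odds ratio falls below 1, which is the pattern the abstract describes as control-dominant.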



Available from: Yu-Huei Cheng,
  • Source
    • "However, the possible SNP-SNP interactions of ORAI1 gene associated with breast cancer were not addressed. Different computational analyses have been introduced to examine SNP-SNP interaction in many association studies [14,16-23]. Genetic algorithm (GA) is potential for feature selection for genome-wide scale datasets [24] and may apply to compute the difference between case and control groups to identify good models from the huge SNP combinations as well as tagSNP selection [25]. "
    ABSTRACT: ORAI1 channels play an important role for breast cancer progression and metastasis. Previous studies indicated the strong correlation between breast cancer and individual single nucleotide polymorphisms (SNPs) of ORAI1 gene. However, the possible SNP-SNP interaction of ORAI1 gene was not investigated. To develop the complex analyses of SNP-SNP interaction, we propose a genetic algorithm (GA) to detect the model of breast cancer association between five SNPs (rs12320939, rs12313273, rs7135617, rs6486795 and rs712853) of ORAI1 gene. For individual SNPs, the differences between case and control groups in five SNPs of ORAI1 gene were not significant. In contrast, GA-generated SNP models show that 2-SNP (rs12320939-GT/rs6486795-CT), 3-SNP (rs12320939-GT/rs12313273-TT/rs6486795-TC), 5-SNP (rs12320939-GG/rs12313273-TC/rs7135617-TT/rs6486795-TT/rs712853-TT) have higher risks for breast cancer in terms of odds ratio analysis (1.357, 1.689, and 13.148, respectively). Taken together, the cumulative effects of SNPs of ORAI1 gene in breast cancer association study were well demonstrated in terms of GA-generated SNP models.
    Cancer Cell International 03/2014; 14(1):29. DOI:10.1186/1475-2867-14-29 · 2.77 Impact Factor
  • Source
    • "However, SNP–SNP interaction analysis requires to simultaneously evaluate the complex interactions for all tested SNPs and thus computational help is needed (Chang et al., 2009; Chuang et al., 2012a, 2012b; Lane et al., 2012; Steen 2012; Yang et al., 2009, 2011, 2012). Genetic algorithm (GA) is an evolutionary algorithm loosely based on processes of biological evolution; its main components are comprised by encoding schemes, a fitness evaluation, population initialization, selection operation, and a crossover operation. "
    ABSTRACT: Background and aims: Single nucleotide polymorphism (SNP) interaction analysis can simultaneously evaluate the complex SNP interactions present in complex diseases. However, it is less commonly applied to evaluate the predisposition of chronic dialysis and its computational analysis remains challenging. In this study, we aimed to improve the analysis of SNP-SNP interactions within the mitochondrial D-loop in chronic dialysis. Material & method: The SNP-SNP interactions between 77 reported SNPs within the mitochondrial D-loop in chronic dialysis study were evaluated in terms of SNP barcodes (different SNP combinations with their corresponding genotypes). We propose a genetic algorithm (GA) to generate SNP barcodes. The χ² values were then calculated by the occurrences of the specific SNP barcodes and their non-specific combinations between cases and controls. Results: Each SNP barcode (2- to 7-SNP) with the highest value in the χ² test was regarded as the best SNP barcode (11.304 to 23.310; p < 0.001). The best GA-generated SNP barcodes (2- to 7-SNP) were significantly associated with chronic dialysis (odds ratio [OR] = 1.998 to 3.139; p < 0.001). The order of influence for SNPs was the same as the order of their OR values for chronic dialysis in terms of 2- to 7-SNP barcodes. Conclusion: Taken together, we propose an effective algorithm to address the SNP-SNP interactions and demonstrated that many non-significant SNPs within the mitochondrial D-loop may play a role in jointed effects to chronic dialysis susceptibility.
    Mitochondrial DNA 06/2013; 25(3). DOI:10.3109/19401736.2013.796513 · 1.21 Impact Factor
  • Source
    • "At present, artificial intelligence (AI) algorithms are rarely used to identify SNP-SNP interaction combinations. Although some methods have previously been used to identify SNP combinations (e.g., MDR [7], machine learning [26], particle swarm optimization (PSO) [27], and genetic algorithms (GA) [12], these methods can still be improved upon. MDR, for example, has three distinct disadvantages. "
    ABSTRACT: Background: Single nucleotide polymorphisms (SNPs) in genes derived from distinct pathways are associated with a breast cancer risk. Identifying possible SNP-SNP interactions in genome-wide case–control studies is an important task when investigating genetic factors that influence common complex traits; the effects of SNP-SNP interaction need to be characterized. Furthermore, observations of the complex interplay (interactions) between SNPs for high-dimensional combinations are still computationally and methodologically challenging. An improved branch and bound algorithm with feature selection (IBBFS) is introduced to identify SNP combinations with a maximal difference of allele frequencies between the case and control groups in breast cancer, i.e., the high/low risk combinations of SNPs. Results: A total of 220 real case and 334 real control breast cancer data are used to test IBBFS and identify significant SNP combinations. We used the odds ratio (OR) as a quantitative measure to estimate the associated cancer risk of multiple SNP combinations to identify the complex biological relationships underlying the progression of breast cancer, i.e., the most likely SNP combinations. Experimental results show the estimated odds ratio of the best SNP combination with genotypes is significantly smaller than 1 (between 0.165 and 0.657) for specific SNP combinations of the tested SNPs in the low risk groups. In the high risk groups, predicted SNP combinations with genotypes are significantly greater than 1 (between 2.384 and 6.167) for specific SNP combinations of the tested SNPs. Conclusions: This study proposes an effective high-speed method to analyze SNP-SNP interactions in breast cancer association studies. A number of important SNPs are found to be significant for the high/low risk group. They can thus be considered a potential predictor for breast cancer association.
    Journal of Clinical Bioinformatics 02/2013; 3(1):4. DOI:10.1186/2043-9113-3-4
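
The mitochondrial D-loop abstract listed above scores each SNP barcode by a chi-square value computed from how often the barcode occurs in cases versus controls. A minimal sketch of that kind of score is the Pearson chi-square statistic for a 2x2 table, shown below. Whether the cited study applied a continuity correction is not stated, so this sketch assumes none. The function name `chi_square_2x2` and the counts are hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

                      barcode   other combinations
        cases            a              b
        controls         c              d
    """
    n = a + b + c + d
    row_cases, row_controls = a + b, c + d
    col_barcode, col_other = a + c, b + d
    denominator = row_cases * row_controls * col_barcode * col_other
    if denominator == 0:
        raise ValueError("chi-square undefined: an empty row or column")
    return n * (a * d - b * c) ** 2 / denominator


# Toy counts (fabricated for illustration only, not from any cited study).
score = chi_square_2x2(10, 20, 20, 10)
print(round(score, 3))
```

A search procedure such as a genetic algorithm could use a statistic of this kind as its fitness value, keeping the barcode with the highest score for each barcode size.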