Silke Szymczak

Universität zu Lübeck, Lübeck, Schleswig-Holstein, Germany

Publications (20) · 143.52 Total impact

  • Article: Adaptive linear rank tests for eQTL studies.
    ABSTRACT: Expression quantitative trait loci (eQTL) studies are performed to identify single-nucleotide polymorphisms that modify average expression values of genes, proteins, or metabolites, depending on the genotype. As expression values are often not normally distributed, statistical methods for eQTL studies should be valid and powerful in these situations. Adaptive tests are promising alternatives to standard approaches, such as the analysis of variance or the Kruskal-Wallis test. In a two-stage procedure, skewness and tail length of the distributions are estimated and used to select one of several linear rank tests. In this study, we compare two adaptive tests that were proposed in the literature using extensive Monte Carlo simulations of a wide range of different symmetric and skewed distributions. We derive a new adaptive test that combines the advantages of both literature-based approaches. The new test does not require the user to specify a distribution. It is slightly less powerful than the locally most powerful rank test for the correct distribution and at least as powerful as the maximin efficiency robust rank test. We illustrate the application of all tests using two examples from different eQTL studies. Copyright © 2012 John Wiley & Sons, Ltd.
    Statistics in Medicine 08/2012; · 1.88 Impact Factor
  • Article: Deregulation of a distinct set of microRNAs is associated with transformation of gastritis into MALT lymphoma.
    ABSTRACT: The mechanisms underlying the transformation from chronic Helicobacter pylori gastritis to gastric extranodal marginal zone lymphoma (MALT lymphoma) are poorly understood. This study aims to identify microRNAs that might be involved in the process of neoplastic transformation. We generated microRNA signatures by RT-PCR in 68 gastric biopsy samples representing normal mucosa, gastritis, suspicious lymphoid infiltrates, and overt MALT lymphoma according to Wotherspoon criteria. Analyses revealed a total of 41 microRNAs that were significantly upregulated (n = 33) or downregulated (n = 8) in succession from normal mucosa to gastritis and to MALT lymphoma. While some of these merely reflect the presence of lymphocytes (e.g. miR-566 and miR-212) or H. pylori infection (e.g. miR-155 and let7f), a distinct set of five microRNAs (miR-150, miR-550, miR-124a, miR-518b and miR-539) was shown to be differentially expressed in gastritis as opposed to MALT lymphoma. This differential expression might therefore indicate a central role of these microRNAs in the process of malignant transformation.
    Archiv für Pathologische Anatomie und Physiologie und für Klinische Medicin 03/2012; 460(4):371-7. · 2.49 Impact Factor
  • Article: Integrating genome-wide genetic variations and monocyte expression data reveals trans-regulated gene modules in humans.
    ABSTRACT: One major expectation from the transcriptome in humans is to characterize the biological basis of associations identified by genome-wide association studies. So far, few cis expression quantitative trait loci (eQTLs) have been reliably related to disease susceptibility. Trans-regulating mechanisms may play a more prominent role in disease susceptibility. We analyzed 12,808 genes detected in at least 5% of circulating monocyte samples from a population-based sample of 1,490 European unrelated subjects. We applied a method of extraction of expression patterns (independent component analysis) to identify sets of co-regulated genes. These patterns were then related to 675,350 SNPs to identify major trans-acting regulators. We detected three genomic regions significantly associated with co-regulated gene modules. Association of these loci with multiple expression traits was replicated in Cardiogenics, an independent study in which expression profiles of monocytes were available in 758 subjects. The locus 12q13 (lead SNP rs11171739), previously identified as a type 1 diabetes locus, was associated with a pattern including two cis eQTLs, RPS26 and SUOX, and 5 trans eQTLs, one of which (MADCAM1) is a potential candidate for mediating T1D susceptibility. The locus 12q24 (lead SNP rs653178), which has demonstrated extensive disease pleiotropy, including type 1 diabetes, hypertension, and celiac disease, was associated with a pattern strongly correlating with blood pressure level. The strongest trans eQTL in this pattern was CRIP1, a known marker of cellular proliferation in cancer. The locus 12q15 (lead SNP rs11177644) was associated with a pattern driven by two cis eQTLs, LYZ and YEATS4, and including 34 trans eQTLs, several of them tumor-related genes. This study shows that a method exploiting the structure of co-expressions among genes can help identify genomic regions involved in trans regulation of sets of genes and can provide clues for understanding the mechanisms linking genome-wide association loci to disease.
    PLoS Genetics 12/2011; 7(12):e1002367. · 8.69 Impact Factor
  • Article: Protein profiling of genomic instability in endometrial cancer.
    ABSTRACT: DNA aneuploidy has been identified as a prognostic factor in the majority of epithelial malignancies. We aimed at identifying ploidy-associated protein expression in endometrial cancer of different prognostic subgroups. Comparison of gel electrophoresis-based protein expression patterns between normal endometrium (n = 5), diploid (n = 7), and aneuploid (n = 7) endometrial carcinoma detected 121 ploidy-associated protein forms, 42 differentially expressed between normal endometrium and diploid endometrioid carcinomas, 37 between diploid and aneuploid endometrioid carcinomas, and 41 between diploid endometrioid and aneuploid uterine papillary serous cancer. Proteins were identified by mass spectrometry and evaluated by Ingenuity Pathway Analysis. Targets were confirmed by liquid chromatography/mass spectrometry. Mass spectrometry identified 41 distinct polypeptides and pathway analysis resulted in high-ranked networks with vimentin and Nf-κB as central nodes. These results identify ploidy-associated protein expression differences that overrule histopathology-associated expression differences and emphasize particular protein networks in genomic stability of endometrial cancer.
    Cellular and Molecular Life Sciences CMLS 07/2011; 69(2):325-33. · 6.57 Impact Factor
  • Article: Influence of sex and genetic variability on expression of X-linked genes in human monocytes.
    ABSTRACT: In humans, the fraction of X-linked genes with higher expression in females has been estimated to be 5% from microarray studies, a proportion lower than the 25% of genes thought to escape X inactivation. We analyzed 715 X-linked transcripts in circulating monocytes from 1,467 subjects and found an excess of female-biased transcripts on the X compared to autosomes (9.4% vs 5.5%, p<2×10(-5)). Among the genes not previously known to escape inactivation, the most significant one was EFHC2 whose 20% of variability was explained by sex. We also investigated cis expression quantitative trait loci (eQTLs) by analyzing 15,703 X-linked SNPs. The frequency and magnitude of X-linked cis eQTLs were quite similar in males and females. Few genes exhibited a stronger genetic effect in females than in males (ARSD, DCX, POLA1 and ITM2A). These genes would deserve further investigation since they may contribute to sex pathophysiological differences.
    Genomics 07/2011; 98(5):320-6. · 3.02 Impact Factor
  • Article: A genome-wide association study identifies LIPA as a susceptibility gene for coronary artery disease.
    ABSTRACT: eQTL analyses are important to improve the understanding of genetic association results. We performed a genome-wide association and global gene expression study to identify functionally relevant variants affecting the risk of coronary artery disease (CAD). In a genome-wide association analysis of 2078 CAD cases and 2953 control subjects, we identified 950 single-nucleotide polymorphisms (SNPs) that were associated with CAD at P<10(-3). Subsequent in silico and wet-laboratory replication stages and a final meta-analysis of 21 428 CAD cases and 38 361 control subjects revealed a novel association signal at chromosome 10q23.31 within the LIPA (lysosomal acid lipase A) gene (P=3.7×10(-8); odds ratio, 1.1; 95% confidence interval, 1.07 to 1.14). The association of this locus with global gene expression was assessed by genome-wide expression analyses in the monocyte transcriptome of 1494 individuals. The results showed a strong association of this locus with expression of the LIPA transcript (P=1.3×10(-96)). An assessment of LIPA SNPs and transcript with cardiovascular phenotypes revealed an association of LIPA transcript levels with impaired endothelial function (P=4.4×10(-3)). The use of data on genetic variants and the addition of data on global monocytic gene expression led to the identification of the novel functional CAD susceptibility locus LIPA, located on chromosome 10q23.31. The respective eSNPs associated with CAD strongly affect LIPA gene expression level, which was related to endothelial dysfunction, a precursor of CAD.
    Circulation Cardiovascular Genetics 05/2011; 4(4):403-12. · 6.11 Impact Factor
  • Article: HDAC2 and TXNL1 distinguish aneuploid from diploid colorectal cancers.
    ABSTRACT: DNA aneuploidy has been identified as a prognostic factor for epithelial malignancies. Further understanding of the translation of DNA aneuploidy into protein expression will help to define novel biomarkers to improve therapies and prognosis. DNA ploidy was assessed by image cytometry. Comparison of gel-electrophoresis-based protein expression patterns of three diploid and four aneuploid colorectal cancer cell lines detected 64 ploidy-associated proteins. Proteins were identified by mass spectrometry and subjected to Ingenuity Pathway Analysis resulting in two overlapping high-ranked networks maintaining Cellular Assembly and Organization, Cell Cycle, and Cellular Growth and Proliferation. CAPZA1, TXNL1, and HDAC2 were significantly validated by Western blotting in cell lines and the latter two showed expression differences also in clinical samples using a tissue microarray of normal mucosa (n=19), diploid (n=31), and aneuploid (n=47) carcinomas. The results suggest that distinct protein expression patterns, affecting TXNL1 and HDAC2, distinguish aneuploid with poor prognosis from diploid colorectal cancers.
    Cellular and Molecular Life Sciences CMLS 02/2011; 68(19):3261-74. · 6.57 Impact Factor
  • Article: A trans-acting locus regulates an anti-viral expression network and type 1 diabetes risk.
    ABSTRACT: Combined analyses of gene networks and DNA sequence variation can provide new insights into the aetiology of common diseases that may not be apparent from genome-wide association studies alone. Recent advances in rat genomics are facilitating systems-genetics approaches. Here we report the use of integrated genome-wide approaches across seven rat tissues to identify gene networks and the loci underlying their regulation. We defined an interferon regulatory factor 7 (IRF7)-driven inflammatory network (IDIN) enriched for viral response genes, which represents a molecular biomarker for macrophages and which was regulated in multiple tissues by a locus on rat chromosome 15q25. We show that Epstein-Barr virus induced gene 2 (Ebi2, also known as Gpr183), which lies at this locus and controls B lymphocyte migration, is expressed in macrophages and regulates the IDIN. The human orthologous locus on chromosome 13q32 controlled the human equivalent of the IDIN, which was conserved in monocytes. IDIN genes were more likely to associate with susceptibility to type 1 diabetes (T1D), a macrophage-associated autoimmune disease, than randomly selected immune response genes (P = 8.85 × 10(-6)). The human locus controlling the IDIN was associated with the risk of T1D at single nucleotide polymorphism rs9585056 (P = 7.0 × 10(-10); odds ratio, 1.15), which was one of five single nucleotide polymorphisms in this region associated with EBI2 (GPR183) expression. These data implicate IRF7 network genes and their regulatory locus in the pathogenesis of T1D.
    Nature 09/2010; 467(7314):460-4. · 36.28 Impact Factor
  • Article: Genetics and Beyond – The Transcriptome of Human Monocytes and Disease Susceptibility
    ABSTRACT: Background: Variability of gene expression in human may link gene sequence variability and phenotypes; however, non-genetic variations, alone or in combination with genetics, may also influence expression traits and have a critical role in physiological and disease processes. Methodology/Principal Findings: To get better insight into the overall variability of gene expression, we assessed the transcriptome of circulating monocytes, a key cell involved in immunity-related diseases and atherosclerosis, in 1,490 unrelated individuals and investigated its association with >675,000 SNPs and 10 common cardiovascular risk factors. Out of 12,808 expressed genes, 2,745 expression quantitative trait loci were detected (P
    PLoS ONE 05/2010; 5(5). · 4.09 Impact Factor
  • Article: Genetics and beyond--the transcriptome of human monocytes and disease susceptibility.
    ABSTRACT: Variability of gene expression in human may link gene sequence variability and phenotypes; however, non-genetic variations, alone or in combination with genetics, may also influence expression traits and have a critical role in physiological and disease processes. To get better insight into the overall variability of gene expression, we assessed the transcriptome of circulating monocytes, a key cell involved in immunity-related diseases and atherosclerosis, in 1,490 unrelated individuals and investigated its association with >675,000 SNPs and 10 common cardiovascular risk factors. Out of 12,808 expressed genes, 2,745 expression quantitative trait loci were detected (P<5.78x10(-12)), most of them (90%) being cis-modulated. Extensive analyses showed that associations identified by genome-wide association studies of lipids, body mass index or blood pressure were rarely compatible with a mediation by monocyte expression level at the locus. At a study-wide level (P<3.9x10(-7)), 1,662 expression traits (13.0%) were significantly associated with at least one risk factor. Genome-wide interaction analyses suggested that genetic variability and risk factors mostly acted additively on gene expression. Because of the structure of correlation among expression traits, the variability of risk factors could be characterized by a limited set of independent gene expressions which may have biological and clinical relevance. For example expression traits associated with cigarette smoking were more strongly associated with carotid atherosclerosis than smoking itself. This study demonstrates that the monocyte transcriptome is a potent integrator of genetic and non-genetic influences of relevance for disease pathophysiology and risk assessment.
    PLoS ONE 01/2010; 5(5):e10693. · 4.09 Impact Factor
  • Article: Machine learning in genome-wide association studies.
    ABSTRACT: Recently, genome-wide association studies have substantially expanded our knowledge about genetic variants that influence the susceptibility to complex diseases. Although standard statistical tests for each single-nucleotide polymorphism (SNP) separately are able to capture main genetic effects, different approaches are necessary to identify SNPs that influence disease risk jointly or in complex interactions. Experimental and simulated genome-wide SNP data provided by the Genetic Analysis Workshop 16 afforded an opportunity to analyze the applicability and benefit of several machine learning methods. Penalized regression, ensemble methods, and network analyses resulted in several new findings while known and simulated genetic risk variants were also identified. In conclusion, machine learning approaches are promising complements to standard single- and multi-SNP analysis methods for understanding the overall genetic architecture of complex human diseases. However, because they are not optimized for genome-wide SNP data, improved implementations and new variable selection procedures are required.
    Genetic Epidemiology 11/2009; 33 Suppl 1:S51-7. · 3.44 Impact Factor
  • Article: Stress sensitivity is increased in transgenic rats with low brain angiotensinogen.
    ABSTRACT: AT(1) blockers attenuate hypothalamo-pituitary-adrenal (HPA) axis reactivity in hypertension independently of their potency to lower blood pressure. A reduced pituitary sensitivity to CRH and a downregulation of hypothalamic CRH expression have been suggested to influence HPA axis activity during chronic AT(1) blockade. This study was aimed at confirming the role of central angiotensin II in regulating HPA reactivity by using the transgenic rat TGR(ASrAOGEN), a model featuring low levels of brain angiotensinogen. Different stress tests were performed to determine HPA reactivity in TGR(ASrAOGEN) and appropriate controls. In TGR(ASrAOGEN), blood pressure was diminished compared to controls. The corticosterone response to a CRH or ACTH challenge and a forced swim test was more distinct in TGR(ASrAOGEN) than it was in controls and occurred independently of a concurrent enhancement in ACTH. Using quantitative real-time PCR, we found increased mRNA levels of melanocortin 2 (Mc2r) and AT(2) receptors (Agtr2) in the adrenals of TGR(ASrAOGEN), whereas mRNA levels of Crh, Pomc, and AT(1) receptors (Agtr1) remained unchanged in hypothalami and pituitary glands. Since stress responses were increased rather than attenuated in TGR(ASrAOGEN), we conclude that the reduced HPA reactivity during AT(1) blockade could not be mimicked in a specific transgenic rat model featuring a centrally inactivated renin-angiotensin-aldosterone system. The ACTH independency of the enhanced corticosterone release during CRH test and the enhanced corticosterone response to ACTH rather indicates an adrenal mechanism. The upregulation of adrenal MC2 and AT(2) receptors seems to be involved in the stimulated facilitation of adrenal corticosterone release for effectuating the stimulated stress responses.
    Journal of Endocrinology 10/2009; 204(1):85-92. · 3.55 Impact Factor
  • Article: Genetic Analysis Workshop 16: Strategies for genome-wide association study analyses.
    BMC proceedings 01/2009; 3 Suppl 7:S1.
  • Article: ACPA: automated cluster plot analysis of genotype data.
    ABSTRACT: Genome-wide association studies have become standard in genetic epidemiology. Analyzing hundreds of thousands of markers simultaneously imposes some challenges for statisticians. One issue is the problem of multiplicity, which has been compared with the search for the needle in a haystack. To reduce the number of false-positive findings, a number of quality filters such as exclusion of single-nucleotide polymorphisms (SNPs) with a high missing fraction are employed. Another filter is exclusion of SNPs for which the calling algorithm had difficulties in assigning the genotypes. The only way to do this is the visual inspection of the cluster plots, also termed signal intensity plots, but this approach is often neglected. We developed an algorithm ACPA (automated cluster plot analysis), which performs this task automatically for autosomal SNPs. It is based on counting samples that lie too close to the cluster of a different genotype; SNPs are excluded when a certain threshold is exceeded. We evaluated ACPA using 1,000 randomly selected quality controlled SNPs from the Framingham Heart Study data that were provided for the Genetic Analysis Workshop 16. We compared the decision of ACPA with the decision made by two independent readers. We achieved a sensitivity of 88% (95% CI: 81%-93%) and a specificity of 86% (95% CI: 83%-89%). In a screening setting in which one aims at not losing any good SNP, we achieved 99% (95% CI: 98%-100%) specificity and still detected every second low-quality SNP.
    BMC proceedings 01/2009; 3 Suppl 7:S58.
  • Article: Evaluation of single-nucleotide polymorphism imputation using random forests.
    ABSTRACT: Genome-wide association studies (GWAS) have helped to reveal genetic mechanisms of complex diseases. Although commonly used genotyping technology enables us to determine up to a million single-nucleotide polymorphisms (SNPs), causative variants are typically not genotyped directly. A favored approach to increase the power of genome-wide association studies is to impute the untyped SNPs using more complete genotype data of a reference population. Random forests (RF) provides an internal method for replacing missing genotypes. A forest of classification trees is used to determine similarities of probands regarding their genotypes. These proximities are then used to impute genotypes of untyped SNPs. We evaluated this approach using genotype data of the Framingham Heart Study provided as Problem 2 for Genetic Analysis Workshop 16 and the Caucasian HapMap samples as reference population. Our results indicate that RFs are faster but less accurate than alternative approaches for imputing untyped SNPs.
    BMC proceedings 01/2009; 3 Suppl 7:S65.
  • Article: Genomewide association analysis of coronary artery disease.
    ABSTRACT: Modern genotyping platforms permit a systematic search for inherited components of complex diseases. We performed a joint analysis of two genomewide association studies of coronary artery disease. We first identified chromosomal loci that were strongly associated with coronary artery disease in the Wellcome Trust Case Control Consortium (WTCCC) study (which involved 1926 case subjects with coronary artery disease and 2938 controls) and looked for replication in the German MI [Myocardial Infarction] Family Study (which involved 875 case subjects with myocardial infarction and 1644 controls). Data on other single-nucleotide polymorphisms (SNPs) that were significantly associated with coronary artery disease in either study (P<0.001) were then combined to identify additional loci with a high probability of true association. Genotyping in both studies was performed with the use of the GeneChip Human Mapping 500K Array Set (Affymetrix). Of thousands of chromosomal loci studied, the same locus had the strongest association with coronary artery disease in both the WTCCC and the German studies: chromosome 9p21.3 (SNP, rs1333049) (P=1.80x10(-14) and P=3.40x10(-6), respectively). Overall, the WTCCC study revealed nine loci that were strongly associated with coronary artery disease (P<1.2x10(-5) and less than a 50% chance of being falsely positive). In addition to chromosome 9p21.3, two of these loci were successfully replicated (adjusted P<0.05) in the German study: chromosome 6q25.1 (rs6922269) and chromosome 2q36.3 (rs2943634). The combined analysis of the two studies identified four additional loci significantly associated with coronary artery disease (P<1.3x10(-6)) and a high probability (>80%) of a true association: chromosomes 1p13.3 (rs599839), 1q41 (rs17465637), 10q11.21 (rs501120), and 15q22.33 (rs17228212). We identified several genetic loci that, individually and in aggregate, substantially affect the risk of development of coronary artery disease.
    New England Journal of Medicine 08/2007; 357(5):443-53. · 53.30 Impact Factor
  • Article: Genetic association studies for gene expressions: permutation-based mutual information in a comparison with standard ANOVA and as a novel approach for feature selection.
    ABSTRACT: Mutual information (MI) is a robust nonparametric statistical approach for identifying associations between genotypes and gene expression levels. Using the data of Problem 1 provided for the Genetic Analysis Workshop 15, we first compared a quantitative MI (Tsalenko et al. 2006 J Bioinform Comput Biol 4:259-4) with the standard analysis of variance (ANOVA) and the nonparametric Kruskal-Wallis (KW) test. We then proposed a novel feature selection approach using MI in a classification scenario to address the small n - large p problem and compared it with a feature selection that relies on an asymptotic χ² distribution. In both applications, we used a permutation-based approach for evaluating the significance of MI. Substantial discrepancies in significance were observed between MI, ANOVA, and KW that can be explained by different empirical distributions of the data. In contrast to ANOVA and KW, MI detects shifts in location when the data are non-normally distributed, skewed, or contaminated with outliers. ANOVA but not MI is often significant if one genotype with a small frequency had a remarkable difference in the average gene expression level relative to the other two genotypes. MI depends on genotype frequencies and cannot detect these differences. In the classification scenario, we show that our novel approach for feature selection identifies a smaller list of markers with higher accuracy compared to the standard method. In conclusion, permutation-based MI approaches provide reliable and flexible statistical frameworks which seem to be well suited for data that are non-normal, skewed, or have an otherwise peculiar distribution. They merit further methodological investigation.
    BMC proceedings 02/2007; 1 Suppl 1:S9.
  • Article: Genome-wide association analyses of expression phenotypes.
    ABSTRACT: A number of issues arise when analyzing the large amount of data from high-throughput genotype and expression microarray experiments, including design and interpretation of genome-wide association studies of expression phenotypes. These issues were considered by contributions submitted to Group 1 of the Genetic Analysis Workshop 15 (GAW15), which focused on the association of quantitative expression data. These contributions evaluated diverse hypotheses, including those relevant to cancer and obesity research, and used various analytic techniques, many of which were derived from information theory. Several observations from these reports stand out. First, one needs to consider the genetic model of the trait of interest and carefully select which single nucleotide polymorphisms and individuals are included early in the design stage of a study. Second, by targeting specific pathways when analyzing genome-wide data, one can generate more interpretable results than agnostic approaches. Finally, for datasets with small sample sizes but a large number of features like the Genetic Analysis Workshop 15 dataset, machine learning approaches may be more practical than traditional parametric approaches.
    Genetic Epidemiology 02/2007; 31 Suppl 1:S7-S11. · 3.44 Impact Factor
  • Article: Picking single-nucleotide polymorphisms in forests.
    ABSTRACT: With the development of high-throughput single-nucleotide polymorphism (SNP) technologies, the vast number of SNPs in smaller samples poses a challenge to the application of classical statistical procedures. A possible solution is to use a two-stage approach for case-control data in which, in the first stage, a screening test selects a small number of SNPs for further analysis. The second stage then estimates the effects of the selected variables using logistic regression (logReg). Here, we introduce a novel approach in which the selection of SNPs is based on the permutation importance estimated by random forests (RFs). For this, we used the simulated data provided for the Genetic Analysis Workshop 15 without knowledge of the true model. The data set was randomly split into a first and a second data set. In the first stage, RFs were grown to pre-select the 37 most important variables, and these were reduced to 32 variables by haplotype tagging. In the second stage, we estimated parameters using logReg. The highest effect estimates were obtained for five simulated loci. We detected smoking, gender, and the parental DR alleles as covariates. After correction for multiple testing, we identified two out of four genes simulated with a direct effect on rheumatoid arthritis risk and all covariates without any false positive. We showed that a two-staged approach with a screening of SNPs by RFs is suitable to detect candidate SNPs in genome-wide association studies for complex diseases.
    BMC proceedings 02/2007; 1 Suppl 1:S59.
  • Article: Genomewide Association Analysis of Coronary Artery Disease
    ABSTRACT: This paper was published as New England Journal of Medicine, 2007, 357 (5), pp. 443-453. Copyright © 2007 Massachusetts Medical Society. It is available from http://content.nejm.org/cgi/content/abstract/357/5/443. DOI: 10.1056/NEJMoa072366. Background - Modern genotyping platforms permit a systematic search for inherited components of complex diseases. We performed a joint analysis of two genomewide association studies of coronary artery disease. Methods - We first identified chromosomal loci that were strongly associated with coronary artery disease in the Wellcome Trust Case Control Consortium (WTCCC) study (which involved 1926 case subjects with coronary artery disease and 2938 controls) and looked for replication in the German MI [Myocardial Infarction] Family Study (which involved 875 case subjects with myocardial infarction and 1644 controls). Data on other single-nucleotide polymorphisms (SNPs) that were significantly associated with coronary artery disease in either study (P<0.001) were then combined to identify additional loci with a high probability of true association. Genotyping in both studies was performed with the use of the GeneChip Human Mapping 500K Array Set (Affymetrix). Results - Of thousands of chromosomal loci studied, the same locus had the strongest association with coronary artery disease in both the WTCCC and the German studies: chromosome 9p21.3 (SNP, rs1333049) (P=1.80x10(-14) and P=3.40x10(-6), respectively). Overall, the WTCCC study revealed nine loci that were strongly associated with coronary artery disease (P<1.2x10(-5) and less than a 50% chance of being falsely positive). In addition to chromosome 9p21.3, two of these loci were successfully replicated (adjusted P<0.05) in the German study: chromosome 6q25.1 (rs6922269) and chromosome 2q36.3 (rs2943634). The combined analysis of the two studies identified four additional loci significantly associated with coronary artery disease (P<1.3x10(-6)) and a high probability (>80%) of a true association: chromosomes 1p13.3 (rs599839), 1q41 (rs17465637), 10q11.21 (rs501120), and 15q22.33 (rs17228212). Conclusions - We identified several genetic loci that, individually and in aggregate, substantially affect the risk of development of coronary artery disease.