Genome-wide genotyping in Parkinson's disease and neurologically normal controls: first stage analysis and public release of data

Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA.
The Lancet Neurology (Impact Factor: 21.82). 12/2006; 5(11):911-6. DOI: 10.1016/S1474-4422(06)70578-6
Source: PubMed

ABSTRACT Several genes underlying rare monogenic forms of Parkinson's disease have been identified over the past decade. Despite evidence for a role for genetics in sporadic Parkinson's disease, few common genetic variants have been unequivocally linked to this disorder. We sought to identify any common genetic variability exerting a large effect in risk for Parkinson's disease in a population cohort and to produce publicly available genome-wide genotype data that can be openly mined by interested researchers and readily augmented by genotyping of additional repository subjects.
We did genome-wide, single-nucleotide-polymorphism (SNP) genotyping of publicly available samples from a cohort of Parkinson's disease patients (n=267) and neurologically normal controls (n=270). More than 408,000 unique SNPs were used from the Illumina Infinium I and HumanHap300 assays.
We have produced around 220 million genotypes in 537 participants. These raw genotype data have been publicly posted and as such are the first publicly accessible high-density SNP data outside of the International HapMap Project. We also provide here the results of genotype and allele association tests.
We generated publicly available genotype data for Parkinson's disease patients and controls so that these data can be mined and augmented by other researchers to identify common genetic variability that results in minor and moderate risk for disease.
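The allele association test mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes a standard Pearson chi-square test (1 degree of freedom, no continuity correction) on a 2x2 table of allele counts; the abstract does not specify the exact test statistic the authors used. The function name `allelic_chi_square`, the allele counts, and the Bonferroni threshold are hypothetical illustrations, not values from the study. The block also checks the scale of the dataset: roughly 408,000 SNPs in 537 participants gives about 219 million genotypes, consistent with the "around 220 million" reported.

```python
import math


def allelic_chi_square(case_counts, control_counts):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    case_counts and control_counts are (minor_allele_count, major_allele_count).
    Returns (chi_square_statistic, p_value, odds_ratio).
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    # Shortcut formula for a 2x2 table, without continuity correction
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Scale check from the abstract: SNPs x participants (267 cases + 270 controls)
total_genotypes = 408_000 * (267 + 270)

# Hypothetical allele counts for one SNP: 267 cases carry 534 alleles,
# 270 controls carry 540 alleles (two alleles per participant).
chi2, p, odds = allelic_chi_square((180, 354), (140, 400))

# Assumption: a Bonferroni correction over ~408,000 tests, one conventional
# choice that is not stated in the abstract.
bonferroni_threshold = 0.05 / 408_000
```

With this many tests, a nominally small P-value for a single SNP can still fall far short of a multiple-testing threshold, which is one reason a cohort of this size is better suited to detecting large effects than minor or moderate ones.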

  • Source
    ABSTRACT: Imputation is a commonly used technique that exploits linkage disequilibrium to infer missing genotypes in genetic datasets, using a well-characterized reference population. While there is agreement that the reference population has to match the ethnicity of the query dataset, it is common practice to use the same reference to impute genotypes for a wide variety of phenotypes. We hypothesized that using a reference composed of samples with a different phenotype than the query dataset would introduce imputation bias. To test this hypothesis we used GWAS datasets from Amyotrophic Lateral Sclerosis (ALS), Parkinson Disease (PD), and Crohn's Disease (CD). First, we masked and then performed imputation of 100 disease-associated markers and 100 non-associated markers from each study. Two references for imputation were used in parallel: one consisting of healthy controls and another consisting of patients with the same disease. We assessed the discordance (imprecision) and bias (inaccuracy) of imputation by comparing predicted genotypes to those assayed by SNP-chip. We also assessed the bias on the observed effect size when the predicted genotypes were used in a GWAS. When healthy controls were used as reference for imputation, a significant bias was observed, particularly in the disease-associated markers. Using cases as reference significantly attenuated this bias. For nearly all markers, the direction of the bias favored the non-risk allele. In GWAS of the three diseases (with healthy controls from the 1000 Genomes Project as reference), the mean OR for disease-associated markers obtained by imputation was lower than that obtained using the original assayed genotypes. We found that the bias is inherent to imputation, as using different methods did not alter the results. In conclusion, imputation is a powerful method to predict genotypes and estimate genetic risk for GWAS. However, a careful choice of reference population is needed to minimize biases inherent to this approach.
    Frontiers in Genetics 02/2015; 6. DOI:10.3389/fgene.2015.00030
  • Source
    ABSTRACT: Recent GWASs have implicated many novel SNPs in the development of Parkinson's disease (PD). Single nucleotide polymorphism (SNP) rs2046571 of HAS2 (encoding hyaluronan synthase 2) was reported to have a marginal association with PD. Herein, we conducted a case-control study to evaluate the possible association between SNP rs2046571 and PD in a Chinese population. All subjects (1043 PD patients and 1044 normal controls) were successfully genotyped using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) analysis. No statistically significant difference in genotype frequency was observed between cases and controls (p=0.074), between early-onset or late-onset cases and controls (p=0.264 and p=0.120, respectively), or between male cases and controls (p=0.108). Surprisingly, however, there was a marginally significant difference in genotype frequency between female cases and controls (p=0.042). Our findings suggest that rs2046571 of HAS2 has a marginal association with PD in the Chinese population.
    Neuroscience Letters 07/2013; 552. DOI:10.1016/j.neulet.2013.07.031 · 2.06 Impact Factor
  • Source
    ABSTRACT: Genome-wide association studies (GWAS) have become the method of choice for identifying disease susceptibility genes in common disease genetics research. Despite successes in these studies, much of the heritability remains unexplained due to lack of power and low resolution. High-density genotyping arrays can now screen more than 5 million genetic markers. As a result, multiple comparisons have become an important issue, especially in the era of next-generation sequencing. We propose a two-stage maximal segmental score (MSS) procedure that uses region-specific empirical P-values to identify genomic segments most likely harboring the disease gene. We develop scoring systems based on Fisher's P-value combining method to convert locus-specific significance levels into region-specific scores. In simulations, MSS increased the power to detect genetic association compared with conventional methods when the type I error rate was held at 5%. We demonstrated the application of MSS on a publicly available case-control dataset of Parkinson's disease and replicated the findings in the literature. MSS provides an efficient exploratory tool for high-density association data in the current era of next-generation sequencing. R source code to implement the MSS procedure is freely available at
    Genetic Epidemiology 09/2012; 36(6):594-601. DOI:10.1002/gepi.21652 · 2.95 Impact Factor
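Fisher's P-value combining method, which the last abstract above uses to turn locus-specific significance levels into region-specific scores, can be sketched in Python as follows. This is only the combining step under the textbook assumption of independent tests; it is not the MSS procedure itself, whose segmentation and empirical P-values are not described here. The function name `fisher_combined_p` and the example P-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, for k independent tests.
    Returns (statistic, combined_p_value). Each P-value must be > 0.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one P-value")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    # For even degrees of freedom (2k), the chi-square survival function
    # has a closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical locus-specific P-values for three SNPs in one region
region_statistic, region_p = fisher_combined_p([0.04, 0.10, 0.03])
```

Adjacent SNPs in linkage disequilibrium are not independent, so the analytic combined P-value from this sketch should be read as illustrative only.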