Genome-wide genotyping in Parkinson’s disease and neurologically normal controls: first stage analysis and public release of data. Lancet Neurol 5:911-916
ABSTRACT: Several genes underlying rare monogenic forms of Parkinson's disease have been identified over the past decade. Despite evidence for a role for genetics in sporadic Parkinson's disease, few common genetic variants have been unequivocally linked to this disorder. We sought to identify any common genetic variability exerting a large effect on risk for Parkinson's disease in a population cohort and to produce publicly available genome-wide genotype data that can be openly mined by interested researchers and readily augmented by genotyping of additional repository subjects.
We did genome-wide, single-nucleotide-polymorphism (SNP) genotyping of publicly available samples from a cohort of Parkinson's disease patients (n=267) and neurologically normal controls (n=270). More than 408,000 unique SNPs were used from the Illumina Infinium I and HumanHap300 assays.
We have produced around 220 million genotypes in 537 participants. These raw genotype data have been publicly released and, as such, are the first publicly accessible high-density SNP data outside of the International HapMap Project. We also provide here the results of genotype and allele association tests.
We generated publicly available genotype data for Parkinson's disease patients and controls so that these data can be mined and augmented by other researchers to identify common genetic variability that results in minor and moderate risk for disease.
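The abstract mentions allele association tests without describing how they were run. As a rough illustration only, the sketch below shows one common form of such a test: a 2x2 chi-square test on allele counts derived from genotype counts. The function names and all genotype counts are hypothetical and are not taken from the study; only the group sizes (267 cases, 270 controls) mirror the abstract.

```python
import math


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts (AA, AB, BB) into allele counts (A, B)."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def allelic_chi_square(case_genotypes, control_genotypes):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Each argument is a hypothetical (AA, AB, BB) genotype-count tuple.
    Returns (chi2, p_value).
    """
    table = [
        list(allele_counts(*case_genotypes)),
        list(allele_counts(*control_genotypes)),
    ]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("test is undefined for an empty group or a monomorphic SNP")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up genotype counts for one SNP: 267 cases and 270 controls.
chi2, p = allelic_chi_square((100, 120, 47), (130, 110, 30))
print(f"chi2 = {chi2:.3f}, p = {p:.4g}")
```

With several hundred thousand SNPs tested, a per-SNP p-value from a test like this would still need correction for multiple comparisons before being interpreted.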
- Source available from: Sergio E Baranzini
- "Quality controlled, genotype-level data from three previously published independent case-control GWAS in individuals of European ancestry in Amyotrophic Lateral Sclerosis (ALS), Crohn's Disease (CD), and Parkinson's Disease (PD) were obtained from dbGAP (Mailman et al., 2007) (Supplementary Table S1). In CD (Rioux et al., 2007) and PD (Fung et al., 2006), cases and controls were matched by sex, age (or year of birth), and ancestry (Rioux et al., 2007; Simon-Sanchez et al., 2009). For ALS cases (Schymick et al., 2007), a sample from neurologically normal controls (Simon-Sanchez et al., 2007) were matched for age and gender and ancestry (Schymick et al., 2007). "
ABSTRACT: Imputation is a commonly used technique that exploits linkage disequilibrium to infer missing genotypes in genetic datasets, using a well-characterized reference population. While there is agreement that the reference population has to match the ethnicity of the query dataset, it is common practice to use the same reference to impute genotypes for a wide variety of phenotypes. We hypothesized that using a reference composed of samples with a different phenotype than the query dataset would introduce imputation bias. To test this hypothesis we used GWAS datasets from Amyotrophic Lateral Sclerosis (ALS), Parkinson Disease (PD), and Crohn's Disease (CD). First, we masked and then performed imputation of 100 disease-associated markers and 100 non-associated markers from each study. Two references for imputation were used in parallel: one consisting of healthy controls and another consisting of patients with the same disease. We assessed the discordance (imprecision) and bias (inaccuracy) of imputation by comparing predicted genotypes to those assayed by SNP-chip. We also assessed the bias on the observed effect size when the predicted genotypes were used in a GWAS study. When healthy controls were used as reference for imputation, a significant bias was observed, particularly in the disease-associated markers. Using cases as reference significantly attenuated this bias. For nearly all markers, the direction of the bias favored the non-risk allele. In GWAS studies of the three diseases (with healthy reference controls from the 1000 genomes as reference), the mean OR for disease-associated markers obtained by imputation was lower than that obtained using original assayed genotypes. We found that the bias is inherent to imputation as using different methods did not alter the results. In conclusion, imputation is a powerful method to predict genotypes and estimate genetic risk for GWAS. 
However, a careful choice of reference population is needed to minimize biases inherent to this approach.
Frontiers in Genetics 02/2015; 6. DOI:10.3389/fgene.2015.00030
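The abstract above compares imputed genotypes with assayed ones in terms of discordance and bias but does not give formulas. The sketch below is one plausible reading, not the authors' method: genotypes are assumed to be coded as the number of risk alleles (0, 1, or 2), discordance is taken as the fraction of mismatched calls, and bias as the mean signed difference (imputed minus assayed). The function names and the toy genotype vectors are hypothetical.

```python
def discordance(assayed, imputed):
    """Fraction of samples whose imputed genotype differs from the assayed one."""
    if len(assayed) != len(imputed) or not assayed:
        raise ValueError("genotype lists must be non-empty and of equal length")
    mismatches = sum(1 for a, i in zip(assayed, imputed) if a != i)
    return mismatches / len(assayed)


def dosage_bias(assayed, imputed):
    """Mean signed difference in risk-allele count (imputed minus assayed).

    Under the assumed coding, a negative value means imputation
    under-calls the risk allele, i.e. it favors the non-risk allele.
    """
    if len(assayed) != len(imputed) or not assayed:
        raise ValueError("genotype lists must be non-empty and of equal length")
    return sum(i - a for a, i in zip(assayed, imputed)) / len(assayed)


# Toy data for one masked marker: risk-allele counts per sample.
assayed = [2, 1, 0, 1, 2]
imputed = [1, 1, 0, 0, 2]
print(discordance(assayed, imputed), dosage_bias(assayed, imputed))
```

Keeping the two measures separate matters here: random imputation errors raise discordance without necessarily shifting the mean, whereas a systematic drift toward one allele shows up in the signed bias.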
- "Despite this finding, data generated in whole-genome association studies should be interpreted cautiously because of the inevitably high false-positive rate that occurs whenever several hundred thousand tests are undertaken on the same dataset. For example, Maraganore et al. [20] identified 13 putative SNPs that seemed to be associated with Parkinson's disease, but none have been confirmed in subsequent independent studies [21-25]."
ABSTRACT: The underlying cause of myasthenia gravis (MG) is unknown, although it likely involves a genetic component. However, no common genetic variants have been unequivocally linked to autoimmune MG. We sought to identify the genetic variants associated with an increased or decreased risk of developing MG in samples from a Korean Multicenter MG Cohort. To determine new genetic targets related to autoimmune MG, a whole-genome single nucleotide polymorphism (SNP) analysis was conducted using an Axiom™ Genome-Wide ASI 1 Array, comprising 598,375 SNPs, and samples from 109 MG patients and 150 neurologically normal controls. In total, 641 SNPs from five case-control associations showed p-values of less than 10⁻⁵. From regional analysis, we selected seven candidate genes (RYR3, CACNA1S, SLAMF1, SOX5, FHOD3, GABRB1, and SACS) for further analysis. The present study suggests that a few genetic polymorphisms, such as in RYR3, CACNA1S, and SLAMF1, might be related to autoimmune MG. Our findings also encourage further studies, particularly confirmatory studies with larger samples, to validate and analyze the association between these SNPs and autoimmune MG.
Yonsei Medical Journal 05/2014; 55(3):660-8. DOI:10.3349/ymj.2014.55.3.660 · 1.29 Impact Factor
- "Initially, genome-wide, SNP genotyping of these samples was carried out in 267 PD subjects and 270 controls, and later extended to include genotyping in 939 PD cases and 802 controls. This collection was included in the first stage study by Fung et al., and the expanded study by Simon-Sanchez et al. [57,58]. A total of 7,943 SNPs (stage I) were selected for further analysis, with p value < 0.01, from raw data comprising total of 453,217 SNPs."
ABSTRACT: Alzheimer's disease (AD) is a leading genetically complex and heterogeneous disorder that is influenced by both genetic and environmental factors. The underlying risk factors for this disorder remain largely unclear. In recent years, high-throughput methodologies, such as genome-wide linkage analysis (GWL), genome-wide association (GWA) studies, and genome-wide expression profiling (GWE), have led to the identification of several candidate genes associated with AD. However, because their findings lack consistency, an integrative approach is warranted. Here, we have designed a rank-based gene prioritization approach involving convergent analysis of multi-dimensional data and protein-protein interaction (PPI) network modelling. Our approach integrates three different AD datasets (GWL, GWA, and GWE) to identify overlapping candidate genes, which are ranked using a novel cumulative rank score (SR) based method and then prioritized using clusters derived from the PPI network. The SR for each gene is calculated by adding the ranks assigned to that gene, based on either p value or score, in the three datasets. This analysis yielded 108 plausible AD genes. Network modelling by creating a PPI network using proteins encoded by these genes and their direct interactors resulted in a layered network of 640 proteins. Clustering of these proteins further helped us in identifying 6 significant clusters with 7 proteins (EGFR, ACTB, CDC2, IRAK1, APOE, ABCA1 and AMPH) forming the central hub nodes. Functional annotation of the 108 genes revealed their role in several biological activities such as neurogenesis, regulation of MAP kinase activity, response to calcium ion, and endocytosis, paralleling the AD-specific attributes. Finally, 3 potential biochemical biomarkers were found from the overlap of the 108 AD proteins with proteins from the CSF and plasma proteomes. EGFR and ACTB were found to be the two most significant AD risk genes.
With the assumption that common genetic signals obtained from different methodological platforms might serve as more robust AD risk markers than candidates identified using a single-dimension approach, we demonstrated an integrated genomic convergence approach for disease candidate gene prioritization from heterogeneous data sources linked to AD.
BMC Genomics 03/2014; 15(1):199. DOI:10.1186/1471-2164-15-199 · 3.99 Impact Factor
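The cumulative rank score (SR) described in the abstract above, a sum of per-dataset ranks for genes found in all three datasets, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' implementation: the gene names and values are placeholders, rank 1 is assumed to be the strongest evidence, linkage scores are assumed to be "higher is better" while p-values are "lower is better", and ties are broken by input order.

```python
def rank_genes(values, higher_is_better=False):
    """Rank genes within one dataset; rank 1 is the strongest evidence.

    `values` maps gene name to a p-value (lower is better) or a score
    (higher is better). Ties keep their input order (an assumption).
    """
    ordered = sorted(values, key=values.get, reverse=higher_is_better)
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}


def cumulative_rank_scores(rankings):
    """Sum per-dataset ranks for genes present in every dataset."""
    shared = set.intersection(*(set(r) for r in rankings))
    return {gene: sum(r[gene] for r in rankings) for gene in shared}


# Placeholder inputs; none of these numbers come from the study.
gwl_scores = {"GENE_A": 3.2, "GENE_B": 1.1, "GENE_C": 2.5}  # linkage scores
gwa_pvalues = {"GENE_A": 1e-6, "GENE_B": 1e-3, "GENE_C": 1e-4, "GENE_D": 1e-8}
gwe_pvalues = {"GENE_A": 0.01, "GENE_B": 0.001, "GENE_C": 0.04}

rankings = [
    rank_genes(gwl_scores, higher_is_better=True),
    rank_genes(gwa_pvalues),
    rank_genes(gwe_pvalues),
]
scores = cumulative_rank_scores(rankings)

# A lower SR means the gene ranks consistently near the top across datasets.
prioritized = sorted(scores, key=lambda gene: (scores[gene], gene))
print(prioritized, scores)
```

In this sketch a gene missing from any dataset (GENE_D here) is dropped from the final list but still occupies a rank within the dataset where it appears; whether the original method ranks before or after taking the overlap is not stated in the abstract.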