ABSTRACT: Objective: To determine whether distinct single nucleotide polymorphisms (SNPs) within the glutamate receptor ionotropic NMDA 1 gene (GRIN1) are associated with NMDA receptor (NMDAR) encephalitis and whether these same variants are associated with variability in the clinical presentation and course of affected patients. Methods: We performed clinical follow-up on 48 patients with NMDAR encephalitis and NMDAR autoantibodies detected in serum or CSF. All RefSeq GRIN1 coding exons were sequenced in 39 Caucasian-European patients, and the frequencies of SNPs were compared with those of an ethnically similar population using a case-control study design. Predetermined clinical variables were compared between patients with and without identified SNPs.
ABSTRACT: The "winner's curse" is a subtle and difficult problem in interpretation of genetic association, in which association estimates from large-scale gene detection studies are larger in magnitude than those from subsequent replication studies. This is practically important because use of a biased estimate from the original study will yield an underestimate of sample size requirements for replication, leaving the investigators with an underpowered study. Motivated by investigation of the genetics of type 1 diabetes complications in a longitudinal cohort of participants in the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) Genetics Study, we apply a bootstrap resampling method in analysis of time to nephropathy under a Cox proportional hazards model, examining 1,213 single-nucleotide polymorphisms (SNPs) in 201 candidate genes custom genotyped in 1,361 white probands. Among 15 top-ranked SNPs, bias reduction in log hazard ratio estimates ranges from 43.1% to 80.5%. In simulation studies based on the observed DCCT/EDIC genotype data, genome-wide bootstrap estimates for false-positive SNPs and for true-positive SNPs with low-to-moderate power are closer to the true values than uncorrected naïve estimates, but tend to overcorrect SNPs with high power. This bias-reduction technique is generally applicable for complex trait studies including quantitative, binary, and time-to-event traits.
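The selection effect described in this abstract can be illustrated with a small simulation. The sketch below is not the bootstrap correction used in the study; it only shows why estimates for top-ranked markers are inflated. All names and parameter values (`true_effect`, `se`, `n_snps`, `n_top`) are hypothetical, and every simulated marker is assumed to share the same true effect.

```python
import random
import statistics


def simulate_winners_curse(true_effect=0.1, se=0.1, n_snps=1000, n_top=15, seed=0):
    """Draw naive effect estimates for n_snps markers that all share one
    true effect, then compare the average over all markers with the
    average magnitude among the n_top markers ranked by |z|."""
    rng = random.Random(seed)
    # Each naive estimate is the true effect plus Gaussian sampling noise.
    estimates = [rng.gauss(true_effect, se) for _ in range(n_snps)]
    # Rank by absolute z-statistic, as a detection scan would.
    ranked = sorted(estimates, key=lambda b: abs(b / se), reverse=True)
    top = [abs(b) for b in ranked[:n_top]]
    return statistics.mean(estimates), statistics.mean(top)


mean_all, mean_top = simulate_winners_curse()
print(f"mean estimate, all markers: {mean_all:.3f}")
print(f"mean |estimate|, top-ranked: {mean_top:.3f}")
```

Averaged over all markers the naive estimate is close to the true effect, whereas the top-ranked subset is systematically larger in magnitude, which is the bias a correction method aims to reduce.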
ABSTRACT: A variant (rs1495741) in the gene for the N-acetyltransferase 2 (NAT2) protein is associated with skin intrinsic fluorescence (SIF), a noninvasive measure of advanced glycation end products and other fluorophores in the skin. Because NAT2 is involved in caffeine metabolism, we aimed to determine whether caffeine consumption is associated with SIF and whether rs1495741 is associated with SIF independently of caffeine.
SIF was measured in 1,181 participants with type 1 diabetes from the Epidemiology of Diabetes Interventions and Complications study. Two measures of SIF were used: SIF1, using a 375-nm excitation light-emitting diode (LED), and SIF14 (456-nm LED). Food frequency questionnaires were used to estimate mean caffeine intake. To establish replication, we examined a second type 1 diabetes cohort.
Higher caffeine intake was significantly associated with higher SIF1 (P = 2 × 10^-32) and SIF14 (P = 7 × 10^-31) and accounted for 4% of the variance in each after adjusting for covariates. When analyzed together, caffeine intake and rs1495741 both remained highly significantly associated with SIF1 and SIF14. Mean caffeinated coffee intake was also positively associated with SIF1 (P = 9 × 10^-12) and SIF14 (P = 4 × 10^-12), but no association was observed for decaffeinated coffee intake. Finally, caffeine was also positively associated with SIF1 and SIF14 (P < 0.0001) in the replication cohort.
Caffeine contributes to SIF. The effect of rs1495741 on SIF appears to be partially independent of caffeine consumption. Because SIF and coffee intake are each associated with cardiovascular disease, our findings suggest that accounting for coffee and/or caffeine intake may improve risk prediction models for SIF and cardiovascular disease in individuals with diabetes.
ABSTRACT: Studies have shown oxidized low-density lipoprotein to be associated with the incidence of proliferative retinopathy and other complications of type 1 diabetes mellitus. Because low-risk interventions are available to modify oxidized low-density lipoprotein, it is important to examine the relationships between this factor and the incidence of proliferative retinopathy and of macular edema, 2 important causes of visual impairment in people with type 1 diabetes.
To determine the association of oxidized low-density lipoprotein with the worsening of diabetic retinopathy and the incidence of proliferative retinopathy and of macular edema.
Of 996 participants with type 1 diabetes in the Wisconsin Epidemiologic Study of Diabetic Retinopathy, 730 were examined up to 4 times (1990-1992, 1994-1996, 2005-2007, and 2012-2014) over 24 years and had assays of oxidized low-density lipoprotein and fundus photographs gradable for diabetic retinopathy and macular edema. Analyses started July 2014 and ended February 2015.
Worsening of diabetic retinopathy, incidence of proliferative diabetic retinopathy, and incidence of macular edema as assessed via grading of color stereo film fundus photographs. The levels of oxidized low-density lipoprotein collected from serum samples at the time of each examination were measured in 2013 and 2014 from frozen serum.
The cohort at baseline had a mean (SD) level of oxidized low-density lipoprotein of 30.0 (8.5) U/L. While adjusting for duration of diabetes, glycated hemoglobin A1c level, and other factors, we found that neither the level of oxidized low-density lipoprotein at the beginning of a period nor the change in it over a certain period was associated with the incidence of proliferative diabetic retinopathy (hazard ratio [HR], 1.11 [95% CI, 0.91-1.35], P = .30; odds ratio [OR], 1.77 [95% CI, 0.99-3.17], P = .06), the incidence of macular edema (HR, 1.04 [95% CI, 0.83-1.29], P = .74; OR, 1.08 [95% CI, 0.44-2.61], P = .87), or the worsening of diabetic retinopathy (HR, 0.94 [95% CI, 0.83-1.07], P = .34; OR, 1.32 [95% CI, 0.83-2.09], P = .24).
Our findings do not provide evidence for a relationship between increasing levels of serum oxidized low-density lipoprotein and the incidence of macular edema or the worsening of diabetic retinopathy in persons with type 1 diabetes. The potential increase in the HR for incident proliferative retinopathy, with an increase in oxidized low-density lipoprotein level over the preceding period, warrants further investigation of this relationship.
ABSTRACT: Typical data in a microbiome study consist of the operational taxonomic unit (OTU) counts that have the characteristic of excess zeros, which are often ignored by investigators. In this paper, we compare the performance of different competing methods to model data with zero inflated features through extensive simulations and application to a microbiome study. These methods include standard parametric and non-parametric models, hurdle models, and zero inflated models. We examine varying degrees of zero inflation, with or without dispersion in the count component, as well as different magnitude and direction of the covariate effect on structural zeros and the count components. We focus on the assessment of type I error, power to detect the overall covariate effect, measures of model fit, and bias and effectiveness of parameter estimations. We also evaluate the abilities of model selection strategies using Akaike information criterion (AIC) or Vuong test to identify the correct model. The simulation studies show that hurdle and zero inflated models have well controlled type I errors, higher power, better goodness of fit measures, and are more accurate and efficient in the parameter estimation. Besides that, the hurdle models have similar goodness of fit and parameter estimation for the count component as their corresponding zero inflated models. However, the estimation and interpretation of the parameters for the zero components differs, and hurdle models are more stable when structural zeros are absent. We then discuss the model selection strategy for zero inflated data and implement it in a gut microbiome study of > 400 independent subjects.
PLoS ONE 07/2015; 10(7). DOI:10.1371/journal.pone.0129606 · 3.23 Impact Factor
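The difference between the zero components of the two model families compared in this abstract can be made concrete with their probability mass functions. The sketch below assumes a Poisson count component without covariates or overdispersion, which is simpler than the models in the paper; the function and parameter names (`zip_pmf`, `hurdle_pmf`, `pi`, `p0`, `lam`) are illustrative only.

```python
import math


def poisson_pmf(k, lam):
    """Ordinary Poisson probability of observing count k."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi the count is a structural
    zero, otherwise it is drawn from Poisson(lam), which can itself be zero."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, p0, lam):
    """Hurdle Poisson: all zeros come from one process with probability p0;
    positive counts follow a zero-truncated Poisson(lam)."""
    if k == 0:
        return p0
    return (1.0 - p0) * poisson_pmf(k, lam) / (1.0 - math.exp(-lam))


print(zip_pmf(0, 0.3, 2.0), hurdle_pmf(0, 0.3, 2.0))
```

In the zero-inflated form, zeros arise from two sources (structural and sampling), so `pi` is not the total zero probability. In the hurdle form, `p0` is the total zero probability, which is why the zero-component parameters of the two models are interpreted differently.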
ABSTRACT: Background
In silico models have recently been created in order to predict which genetic variants are more likely to contribute to the risk of a complex trait given their functional characteristics. However, there has been no comprehensive review as to which type of predictive accuracy measures and data visualization techniques are most useful for assessing these models.
We assessed the performance of the models for predicting risk using various methodologies, some of which include: receiver operating characteristic (ROC) curves, histograms of classification probability, and the novel use of the quantile-quantile plot. These measures have variable interpretability depending on factors such as whether the dataset is balanced in terms of numbers of genetic variants classified as risk variants versus those that are not.
We conclude that the area under the curve (AUC) is a suitable starting place, and for models with similar AUCs, violin plots are particularly useful for examining the distribution of the risk scores.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-1616-z) contains supplementary material, which is available to authorized users.
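As a rough illustration of the AUC measure discussed in this abstract, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen risk variant receives a higher score than a randomly chosen non-risk variant. The function name `auc` and the example scores are hypothetical and do not come from the paper.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison: the fraction of
    (positive, negative) pairs in which the positive scores higher,
    with ties counted as one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores for risk and non-risk variants.
print(auc([0.8, 0.4], [0.6, 0.2]))  # 0.75
```

Because this pairwise definition depends only on rankings within each pair, two models can share an AUC while producing very different score distributions, which is consistent with the abstract's suggestion to inspect the distributions when AUCs are similar.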
ABSTRACT: It has been suggested that inflammatory bowel disease (IBD) is due to a genetically determined abnormal interaction between gut immune responses and gut bacteria. To determine whether the composition and diversity of the gut microbiota are associated with host genetic makeup, we assessed the stool microbiome in a cohort of 918 first-degree relatives (FDRs) of Crohn's disease (CD) patients.
The V4 hypervariable regions of 16S rRNA were sequenced from bacterial DNA extracted from the stool of 918 unrelated healthy Caucasian FDRs. MiSeq sequences were processed using PANDAseq and the QIIME pipeline. Non-chimeric sequences were clustered into operational taxonomic units (OTUs) at 97.0% sequence identity using USEARCH and GreenGenes. Single nucleotide polymorphisms (SNPs) were determined with the HumanCoreEXOME chip with imputations using IMPUTE2 2.3.0 to the 1000 Genomes panel of March 2012. Associations between SNPs and phylotypes were estimated using linear regression adjusting for the total number of reads, age, sex, and the first three genetic principal components. Raw SNP p-values of <10^-5 are presented.
Overall, dominant phyla in these samples were Firmicutes (relative abundance of 64.2% ± 14.1, mean ± SD), Bacteroidetes (26.9% ± 15.0), and Actinobacteria (5.0% ± 5.2). A total of 3,727,707 high-quality genetic markers were imputed. rs2882345, located in a non-coding region of chr 7, was significantly associated with bacterial diversity (p=4.8×10^-7). No associations were observed among the 163 SNPs associated with IBD. However, a total of 146 SNPs were associated with the relative abundance of microbial phyla (p<10^-6). The strongest signal observed was located on chr 12 with rs145366794 associated with the relative abundance of Firmicutes (p=6.2×10^-9); however, its minor allele frequency was low (1.6%). SNPs in LINC01446 and CAPRIN1 were associated with the relative abundance of Actinobacteria (p=5.9×10^-6), SNPs in PHLPP1 and MKL1 were associated with Bacteroidetes (p=5.9×10^-6 and p=3.9×10^-6), SNPs in RAB27B and PM20D1 as well as non-coding regions of chr 1 and 18 with Firmicutes (p=8.0×10^-6), and SNPs in MET were associated with Proteobacteria (p=1.5×10^-6).
These results indicate that host genetic polymorphisms are associated with differences in intestinal microbiota diversity and composition at the phylum level in a cohort of healthy FDRs. These results differ from prior studies of microbiota in patients with CD, potentially indicating that genetic associations with microbiota are difficult to evaluate in the context of established inflammation. It remains to be shown whether any of these genetic associations with microbiome differences are related to the risk of developing Crohn's disease. Our results represent the largest study evaluating the association between host genetics and the microbiota in asymptomatic individuals.
ABSTRACT: The Genetics, Environmental, Microbial Project is a multicenter study assessing etiological factors in Crohn's disease by studying healthy first-degree relatives (FDRs) of individuals affected by Crohn's disease. We aimed to evaluate the contribution of genetic, microbial, and environmental factors to the determination of intestinal permeability in healthy FDRs.
Intestinal permeability (IP) was assessed using the lactulose-mannitol ratio (LacMan ratio). FDRs were genotyped for 167 inflammatory bowel disease-associated single nucleotide polymorphisms. The taxonomic profile of the fecal microbiota was determined by Illumina MiSeq pyrosequencing of 16S ribosomal RNA. The associations of LacMan ratio with demographic factors, inflammatory bowel disease-associated single nucleotide polymorphisms, and the fecal microbiota were assessed.
One hundred ninety-six white FDRs were included. Eleven percent of FDRs had an elevated LacMan ratio (≥0.03). A multivariate analysis demonstrated that younger subjects and nonsmokers had higher LacMan ratios, P = 3.62 × 10 and P = 0.03, respectively. The LacMan ratio was not significantly heritable, H2r, 0.13, P = 0.13. There was no association between any of the 167 inflammatory bowel disease-associated risk variants and LacMan ratio nor was there a correlation between fecal microbial composition and the LacMan ratio.
We did not find LacMan ratio to be significantly heritable suggesting that the contribution of genetic factors to the determination of intestinal permeability in healthy FDRs is modest. Environmental factors, such as smoking, are likely more important determinants. The effect of age on intestinal barrier function has been underappreciated.
ABSTRACT: Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to an association study of various complication outcomes related to type 1 diabetes. This article has supplementary material online.
Journal of Computational and Graphical Statistics 01/2015; · 1.22 Impact Factor
ABSTRACT: We investigated the association of signals from previous GWAS and candidate gene meta-analyses for diabetic retinopathy (DR) or nephropathy (DN), as well as an EPO variant in meta-analyses of severe (SDR) and mild diabetic retinopathy (MDR). Meta-analyses of SDR (≥severe non-proliferative diabetic retinopathy (NPDR) or history of panretinal photocoagulation) and MDR (≥mild NPDR), defined based on seven-field stereoscopic fundus photographs, were performed in two well-characterized type 1 diabetes (T1D) cohorts: the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC, n = 1,304) and Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR, n = 603). Among 34 previous signals for DR, after controlling for multiple testing, no association was replicated in our meta-analyses. rs1571942 and rs12219125 at the PLXDC2 locus showed nominally significant (P < 0.05) association with SDR in the same direction as the previous report, as did rs1801282 in the PPARG gene with MDR. Among 55 loci previously associated with DN, three showed suggestive associations with SDR in our study without maintaining significance after correction for multiple testing. Of particular interest, rs1617640 (EPO) was not significantly associated with DR status, combined SDR–DN phenotype, time to SDR or time to DN (all P > 0.05). Lack of replication of previous DR hits and EPO despite reasonable statistical power implies that many of these may be false positives. Consistent with pleiotropy, we provide suggestive collective evidence for association between DR and variants previously associated with DN without reaching statistical significance at any single locus.
Human Genetics 12/2014; 134(2). DOI:10.1007/s00439-014-1517-2 · 4.82 Impact Factor
ABSTRACT: The QT interval, an electrocardiographic measure reflecting myocardial repolarization, is a heritable trait. QT prolongation is a risk factor for ventricular arrhythmias and sudden cardiac death (SCD) and could indicate the presence of the potentially lethal mendelian long-QT syndrome (LQTS). Using a genome-wide association and replication study in up to 100,000 individuals, we identified 35 common variant loci associated with QT interval that collectively explain ∼8-10% of QT-interval variation and highlight the importance of calcium regulation in myocardial repolarization. Rare variant analysis of 6 new QT interval-associated loci in 298 unrelated probands with LQTS identified coding variants not found in controls but of uncertain causality and therefore requiring validation. Several newly identified loci encode proteins that physically interact with other recognized repolarization proteins. Our integration of common variant association, expression and orthogonal protein-protein interaction screens provides new insights into cardiac electrophysiology and identifies new candidate genes for ventricular arrhythmias, LQTS and SCD.
ABSTRACT: Copy number variation has emerged as an important cause of phenotypic variation, particularly in relation to some complex disorders. Autism spectrum disorder (ASD) is one such disorder, in which evidence is emerging for an etiological role for some rare penetrant de novo and rare inherited copy number variants (CNVs). De novo variation, however, does not always explain the familial nature of ASD, leaving a gap in our knowledge concerning the heritable genetic causes of this disorder. Extended pedigrees, in which several members have ASD, provide an opportunity to investigate inherited genetic risk factors. In this current study, we recruited 19 extended ASD pedigrees, and, using the Illumina HumanOmni2.5 BeadChip, conducted genome-wide CNV interrogation. We found no definitive evidence of an etiological role for segregating CNVs in these pedigrees, and no evidence that linkage signals in these pedigrees are explained by segregating CNVs. However, a small number of putative de novo variants were transmitted from BAP parents to their ASD offspring, and evidence emerged for a rare duplication CNV at 11p13.3 harboring two putative 'developmental/neuropsychiatric' susceptibility gene(s), GSTP1 and NDUFV1.
Human Genetics 11/2014; 134(2). DOI:10.1007/s00439-014-1513-6 · 4.82 Impact Factor
ABSTRACT: Significant evidence exists for the association between copy number variants (CNVs) and Autism Spectrum Disorder (ASD); however, most of this work has focused solely on the diagnosis of ASD. There is limited understanding of the impact of CNVs on the 'sub-phenotypes' of ASD. The objective of this paper is to evaluate associations between CNVs in differentially brain expressed (DBE) genes or genes previously implicated in ASD/intellectual disability (ASD/ID) and specific sub-phenotypes of ASD. The sample consisted of 1590 cases of European ancestry from the Autism Genome Project (AGP) with a diagnosis of an ASD and at least one rare CNV impacting any gene and a core set of phenotypic measures, including symptom severity, language impairments, seizures, gait disturbances, intelligence quotient (IQ) and adaptive function, as well as paternal and maternal age. Classification analyses using a non-parametric recursive partitioning method (random forests) were employed to define sets of phenotypic characteristics that best classify the CNV-defined groups. There was substantial variation in the classification accuracy of the two sets of genes. The best variables for classification were verbal IQ for the ASD/ID genes, paternal age at birth for the DBE genes and adaptive function for de novo CNVs. CNVs in the ASD/ID list were primarily associated with communication and language domains, whereas CNVs in DBE genes were related to broader manifestations of adaptive function. To our knowledge, this is the first study to examine the associations between sub-phenotypes and CNVs genome-wide in ASD. This work highlights the importance of examining the diverse sub-phenotypic manifestations of CNVs in ASD, including the specific features, comorbid conditions and clinical correlates of ASD that comprise underlying characteristics of the disorder.
Molecular Psychiatry advance online publication, 25 November 2014; doi:10.1038/mp.2014.150.
ABSTRACT: We propose a novel and easy-to-implement joint location-scale association
testing procedure that can account for complex genetic architecture without
explicitly modeling interaction effects, and is suitable for large-scale
whole-genome scans and meta-analyses. We focus on Fisher's method and use it to
combine evidence from the standard location test and the more recent scale
test, and we describe its use for single-variant, gene-set and pathway