Accounting for Linkage in Family-Based Tests of Association with Missing Parental Genotypes

Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA.
The American Journal of Human Genetics (Impact Factor: 10.99). 12/2003; 73(5):1016-26. DOI: 10.1086/378779
Source: PubMed

ABSTRACT In studies of complex diseases, a common paradigm is to conduct association analysis at markers in regions identified by linkage analysis, to attempt to narrow the region of interest. Family-based tests for association based on parental transmissions to affected offspring are often used in fine-mapping studies. However, for diseases with late onset, parental genotypes are often missing. Without parental genotypes, family-based tests either compare allele frequencies in affected individuals with those in their unaffected siblings or use siblings to infer missing parental genotypes. An example of the latter approach is the score test implemented in the computer program TRANSMIT. The inference of missing parental genotypes in TRANSMIT assumes that transmissions from parents to affected siblings are independent, which is appropriate when there is no linkage. However, using computer simulations, we show that, when the marker and disease locus are linked and the data set consists of families with multiple affected siblings, this assumption leads to a bias in the score statistic under the null hypothesis of no association between the marker and disease alleles. This bias leads to an inflated type I error rate for the score test in regions of linkage. We present a novel test for association in the presence of linkage (APL) that correctly infers missing parental genotypes in regions of linkage by estimating identity-by-descent parameters, to adjust for correlation between parental transmissions to affected siblings. In simulated data, we demonstrate the validity of the APL test under the null hypothesis of no association and show that the test can be more powerful than the pedigree disequilibrium test and family-based association test. As an example, we compare the performance of the tests in a candidate-gene study in families with Parkinson disease.

Available from: Eden R Martin, Jul 01, 2015
  • Source
    ABSTRACT: We performed a gene-smoking interaction analysis using families from an early-onset coronary artery disease cohort (GENECARD). This analysis was focused on validating and expanding results from previous studies implicating single nucleotide polymorphisms (SNPs) on chromosome 3 in smoking-mediated coronary artery disease. We analyzed 430 SNPs on chromosome 3 and identified 16 SNPs that showed a gene-smoking interaction at P < 0.05 using association in the presence of linkage-ordered subset analysis, a method that uses permutations of the data to empirically estimate the strength of the association signal. Seven of the 16 SNPs were in the Rho-GTPase pathway indicating a 1.87-fold enrichment for this pathway. A meta-analysis of gene-smoking interactions in three independent studies revealed that rs9289231 in KALRN had a Fisher's combined P value of 0.0017 for the interaction with smoking. In a gene-based meta-analysis KALRN had a P value of 0.026. Finally, a pathway-based analysis of the association results using WebGestalt revealed several enriched pathways including the regulation of the actin cytoskeleton pathway as defined by the Kyoto Encyclopedia of Genes and Genomes.
    Human Genetics 08/2013; 132(12). DOI:10.1007/s00439-013-1339-7 · 4.52 Impact Factor
  • Source
    ABSTRACT: Recently, interest has been increasing in genetic association studies using several closely linked loci. The HAP-TDT method, which uses case-parents trios, is powerful for such a task. However, it is not uncommon in practice that one parent is missing for some reason, such as late onset. The case-parents trios are thus reduced to case-parent pairs. Discarding such data could lead to a severe loss of power. In this paper, we propose the HAP-1-TDT method based on case-parent pairs to detect haplotype/disease association. A permutation-based randomisation technique is devised to assess the significance of the test statistic. Furthermore, the combined statistic HAP-C-TDT is developed to use case-parents trios and case-parent pairs jointly. These test statistics can be applied to either phase-known or phase-unknown data. A number of simulation studies are conducted to investigate the validity of the proposed tests; these studies show that the statistics are robust to population structure. Using several disease genes from the literature, we illustrate that incorporating case-parent pairs into an association study leads to a noticeable power gain. Moreover, our simulation results suggest that our method has better size and power than UNPHASED. Finally, in simulated scenarios where there are only a few SNPs and risk is determined by two haplotypes that are complementary or near-complementary, our method has better power than TRIMM.
    Annals of Human Genetics 05/2010; 74(3):263-74. DOI:10.1111/j.1469-1809.2010.00563.x · 1.93 Impact Factor
  • Source
    ABSTRACT: A broad region of chromosome 10 (chr10) has engendered continued interest in the etiology of late-onset Alzheimer disease (LOAD) from both linkage and candidate gene studies. However, there is extensive heterogeneity on chr10. We combined linkage analysis and gene expression data using the concept of genomic convergence, which suggests that genes showing positive results across multiple different data types are more likely to be involved in AD. We identified and examined 28 genes on chr10 for association with AD in a Caucasian case-control dataset of 506 cases and 558 controls with substantial clinical information. The cases were all LOAD (minimum age at onset ≥ 60 years). Both single-marker and haplotypic associations were tested in the overall dataset and in 8 subsets defined by age, gender, ApoE and clinical status. PTPLA showed allelic, genotypic and haplotypic association in the overall dataset. SORCS1 was significant in the overall dataset (p=0.0025) and most significant in the female subset (allelic association p=0.00002; a 3-locus haplotype had p=0.0005). The odds ratio for SORCS1 in the female subset was 1.7 (p<0.0001). SORCS1 is an interesting candidate gene involved in the Abeta pathway. Therefore, genetic variations in PTPLA and SORCS1 may be associated with, and have a modest effect on, the risk of AD by affecting the Abeta pathway. Replication of the effect of these genes in different study populations, a search for susceptibility variants, and functional studies of these genes are necessary to better understand the roles of these genes in Alzheimer disease.
    Human Mutation 01/2009; 30(3):463-71. DOI:10.1002/humu.20953 · 5.05 Impact Factor