Sibship analysis of associations between SNP haplotypes and a continuous trait with application to mammographic density.
ABSTRACT Haplotype-based association studies have been proposed as a powerful comprehensive approach to identify causal genetic variation underlying complex diseases. Data comparisons within families offer the additional advantage of dealing naturally with complex sources of noise, confounding and population stratification. Two problems encountered when investigating associations between haplotypes and a continuous trait using data from sibships are (i) the need to define within-sibship comparisons for sibships of size greater than two and (ii) the difficulty of resolving the joint distribution of haplotype pairs within sibships in the absence of parental genotypes. We therefore propose first a method of orthogonal transformation of both outcomes and exposures that allows the decomposition of between- and within-sibship regression effects when sibship size is greater than two. We conducted a simulation study, which confirmed that analysis using all members of a sibship is statistically more powerful than methods based on cross-sectional analysis or on subsets of sib-pairs. Second, we propose a simple permutation approach to avoid errors of inference due to the within-sibship correlation of any errors in haplotype assignment. These methods were applied to investigate the association between mammographic density (MD), a continuously distributed and heritable risk factor for breast cancer, and single nucleotide polymorphisms (SNPs) and haplotypes from the VDR gene using data from a study of 430 twins and sisters. We found evidence of association between MD and a 4-SNP VDR haplotype. In conclusion, our proposed method retains the benefits of the between- and within-pair analysis for pairs of siblings and can be implemented in standard software.
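The abstract does not say which orthogonal transformation is used. As an illustration only, the sketch below assumes a Helmert transformation, one common choice, which maps the n values in a sibship to one scaled sibship mean (the between-sibship component) and n-1 mutually orthogonal contrasts (the within-sibship components). The function names, the trait values and the haplotype copy counts are all hypothetical.

```python
import math

def helmert_matrix(n):
    """Return an n x n orthonormal Helmert matrix as a list of rows.

    Row 0 is the scaled mean (all entries 1/sqrt(n)); rows 1..n-1 are
    within-group contrasts that each sum to zero.
    """
    rows = [[1.0 / math.sqrt(n)] * n]
    for k in range(1, n):
        scale = 1.0 / math.sqrt(k * (k + 1))
        # First k entries are +scale, entry k is -k*scale, the rest are 0.
        rows.append([scale] * k + [-k * scale] + [0.0] * (n - k - 1))
    return rows

def transform(values):
    """Apply the Helmert transform to one sibship's values.

    Returns (between, within): the scaled sibship mean and the list of
    n-1 within-sibship contrasts.
    """
    h = helmert_matrix(len(values))
    out = [sum(w * v for w, v in zip(row, values)) for row in h]
    return out[0], out[1:]

# Hypothetical sibship of three sisters (assumed values, not study data):
# a continuous trait and the number of copies of a haplotype carried.
trait = [30.0, 42.0, 36.0]
copies = [0, 2, 1]
trait_between, trait_within = transform(trait)
copies_between, copies_within = transform(copies)
```

Under this assumption, the same transform is applied to both outcome and exposure. Pooling the within-sibship contrasts across sibships and regressing outcome contrasts on exposure contrasts without an intercept would give a within-sibship effect estimate, while the scaled means carry the between-sibship information. For a sibship of size two the single contrast is proportional to the sib-pair difference.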
- Available from: Scott G Wilson
ABSTRACT: A common design in family-based association studies consists of siblings without parents. Several methods have been proposed for analysis of sibship data, but they mostly do not allow for missing data, such as haplotype phase or untyped markers. On the other hand, general methods for nuclear families with missing data are computationally intensive when applied to sibships, since every family has missing parents that could have many possible genotypes. We propose a computationally efficient model for sibships by conditioning on the sets of alleles transmitted into the sibship by each parent. This means that the likelihood can be written only in terms of transmitted alleles and we do not have to sum over all possible untransmitted alleles when they cannot be deduced from the siblings. The model naturally accommodates missing data and admits standard theory of estimation, testing, and inclusion of covariates. Our model is quite robust to population stratification and can test for association in the presence of linkage. We show that our model has similar power to FBAT for single marker analysis and improved power for haplotype analysis. Compared to summing over all possible untransmitted alleles, we achieve similar power with considerable reductions in computation time.
Annals of Human Genetics 01/2011; 75(3):428-38. · 2.22 Impact Factor