Ignoring Intermarker Linkage Disequilibrium Induces False-Positive Evidence of Linkage for Consanguineous Pedigrees when Genotype Data Is Missing for Any Pedigree Member
ABSTRACT Missing genotype data can increase false-positive evidence for linkage when either parametric or nonparametric analysis is carried out ignoring intermarker linkage disequilibrium (LD). Huang et al. previously demonstrated that no bias occurs in this situation for affected sib-pairs with unrelated parents when either both parents are genotyped or, when parental genotypes are missing, genotype data is available for two additional unaffected siblings. However, this is not the case for autosomal recessive consanguineous pedigrees, where missing genotype data for any pedigree member within a consanguinity loop can increase false-positive evidence of linkage. False-positive evidence for linkage is further increased when cryptic consanguinity is present. The amount of false-positive evidence for linkage, and which family members aid in its reduction, is highly dependent on which family members are genotyped. When parental genotype data is available, the false-positive evidence for linkage is usually not as strong as when parental genotype data is unavailable. For a pedigree with an affected proband whose first-cousin parents have been genotyped, further reduction in the false-positive evidence of linkage can be obtained by including genotype data from additional affected siblings of the proband or genotype data from the proband's sibling-grandparents. When parental genotypes are unavailable, false-positive evidence for linkage can be reduced by including genotype data from either unaffected siblings of the proband or the proband's married-in-grandparents in the analysis.
- ABSTRACT: SNP maps are becoming the gold standard for genetic markers, even for linkage analyses. However, because of the density of SNPs on most high-throughput platforms, the resulting significant linkage disequilibrium (LD) can bias classical nonparametric multipoint linkage analyses. This problem may be even stronger in population isolates where LD can extend over larger distances and with a more stochastic pattern. We investigate the issue of linkage analysis with SNPs from the Affymetrix 500K GeneChip array in extended families from the isolated Hutterite population. We minimized LD between SNPs by two methods based on an LD block pattern (Merlin and SNPLINK) and by MASEL, a new algorithm that we propose to select SNP subsets with minimum LD and with no prior hypothesis about the LD pattern. Simulations, performed using the real LD pattern observed in the Hutterite population, show that sizeable inflation of linkage statistics persists when LD between SNPs is minimized by Merlin and SNPLINK. Inflation of linkage statistics is better controlled with MASEL. In this population, it may be difficult to extract from standard GeneChip arrays a SNP map without LD-driven bias that is more informative than a dense microsatellite map. Human Heredity 05/2009; 68(2):87-97. DOI:10.1159/000212501 · 1.64 Impact Factor
- ABSTRACT: Familial pulmonary arterial hypertension (FPAH) is a rare, autosomal-dominant, inherited disease with low penetrance. Mutations in the bone morphogenetic protein receptor 2 (BMPR2) have been identified in at least 70% of FPAH patients. However, the lifetime penetrance of these BMPR2 mutations is 10% to 20%, suggesting that genetic and/or environmental modifiers are required for disease expression. Our goal in this study was to identify genetic loci that may influence FPAH expression in BMPR2 mutation carriers. We performed a genome-wide linkage scan in 15 FPAH families segregating for BMPR2 mutations. We used a dense single-nucleotide polymorphism (SNP) array and a novel multi-scan linkage procedure that provides increased power and precision for the localization of linked loci. We observed linkage evidence in four regions: 3q22 (median log of the odds [LOD] = 3.43), 3p12 (median LOD = 2.35), 2p22 (median LOD = 2.21), and 13q21 (median LOD = 2.09). When used in conjunction with the non-parametric bootstrap, our approach yields high resolution for identifying candidate gene regions containing putative BMPR2-interacting genes. Imputation of the disease model by LOD-score maximization indicates that the 3q22 locus alone predicts most FPAH cases in BMPR2 mutation carriers, providing strong evidence that BMPR2 and the 3q22 locus interact epistatically. Our findings suggest that genotypes at loci in the newly identified regions, especially at 3q22, could improve FPAH risk prediction in FPAH families. We also suggest other targets for therapeutic intervention. The Journal of heart and lung transplantation: the official publication of the International Society for Heart Transplantation 10/2009; 29(2):174-80. DOI:10.1016/j.healun.2009.08.022 · 5.61 Impact Factor
- ABSTRACT: Current linkage studies detect and localize trait loci using genotypes sampled at hundreds of thousands of single nucleotide polymorphisms (SNPs). Such data should provide precise estimates of trait location once linkage has been established. However, correlations between nearby SNPs can distort the information about trait location. Traditionally, when faced with this dilemma, three approaches have been used: (1) ignore the correlation; (2) approximate the correlation; or (3) analyze a single, approximately uncorrelated subset of the original dense data. Here, we examine and test a simple and efficient estimator of trait location that averages location estimates across random subsamples of the original dense data. Based on pairwise estimates of correlation, we ensure that the SNPs within each subsample are approximately uncorrelated. In addition, we use the nonparametric bootstrap procedure to compute narrow, high-resolution candidate gene regions (i.e., confidence intervals for the true trait location). Using simulated data, we show that the three existing approaches to dense SNP linkage analysis (described above) can yield biased and/or inefficient estimation depending on the underlying correlation structure. With respect to mean squared error, our estimator outperforms the third approach, and is as good as, and usually better than, the first and second approaches. Relative to the third approach, our estimator led to a 47.5% reduction in the candidate gene region length based on the analysis of 15 hypertension families genotyped at approximately 500,000 SNPs. The method we developed will be an important tool for constructing high-resolution candidate gene regions that could ultimately aid in targeting regions for sequencing projects. Human Heredity 12/2009; 69(3):152-9. DOI:10.1159/000267995 · 1.64 Impact Factor
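The subsampling estimator described in the last abstract (average the trait-location estimates obtained from random subsamples of approximately uncorrelated SNPs) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The greedy selection rule, the `max_r2` threshold of 0.1, and the `estimate_location` callback (a stand-in for whatever linkage analysis returns a location estimate for a given SNP subset) are all assumptions made for illustration, and the bootstrap confidence-interval step is omitted.

```python
import random
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (allele counts 0/1/2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A monomorphic SNP has no variance; treat it as uncorrelated.
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def random_low_ld_subset(genotypes, max_r2, rng):
    """Visit SNPs in random order and keep a SNP only if its pairwise r^2
    with every SNP already kept is at most max_r2 (assumed selection rule)."""
    order = list(range(len(genotypes)))
    rng.shuffle(order)
    kept = []
    for j in order:
        if all(r_squared(genotypes[j], genotypes[k]) <= max_r2 for k in kept):
            kept.append(j)
    return sorted(kept)


def averaged_location(genotypes, estimate_location, n_subsamples=100,
                      max_r2=0.1, seed=0):
    """Average the location estimates obtained from random low-LD subsamples.

    estimate_location is a hypothetical user-supplied function that takes a
    list of SNP indices and returns a trait-location estimate.
    """
    rng = random.Random(seed)
    estimates = [
        estimate_location(random_low_ld_subset(genotypes, max_r2, rng))
        for _ in range(n_subsamples)
    ]
    return mean(estimates)
```

Here `genotypes` is a list with one genotype vector per SNP. Comparing each candidate against every SNP already kept is quadratic in the number of SNPs, so with dense arrays one would likely restrict the comparison to nearby markers.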