Haplotype sharing correlation analysis using family data: a comparison with family-based association test in the presence of allelic heterogeneity.

Department of Biostatistics, City of Hope National Medical Center, Duarte, California 91010-3000, USA.
Genetic Epidemiology (Impact Factor: 2.95). 08/2004; 27(1):43-52. DOI: 10.1002/gepi.20005
Source: PubMed

ABSTRACT: The haplotype-sharing correlation (HSC) method for association analysis using family data is revisited by introducing a permutation procedure for estimating region-wise significance at each marker on a study segment. In simulation studies, the HSC method has a correct type I error rate in both unstructured and structured populations. The HSC signals on disease segments occur in the vicinity of a true disease locus on a restricted region without recombination hotspots. However, the peak signal may not pinpoint the true disease location in a small region with dense markers. The HSC method is shown to have higher power than single- and multilocus family-based association test (FBAT) methods when the true disease locus is unobserved among the study markers, and especially under conditions of weak linkage disequilibrium and multiple ancestral disease alleles. These simulation results suggest that the HSC method has the capacity to identify true disease-associated segments under allelic heterogeneity that go undetected by the FBAT method, which compares allelic or haplotypic frequencies.
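
The abstract does not spell out the permutation procedure, so the following is only a minimal sketch of one common way to obtain region-wise (multiplicity-adjusted) p-values from permutations: each marker's observed statistic is compared with the null distribution of the maximum statistic over all markers on the segment. The function name `regionwise_pvalues`, the convention that larger statistics are more extreme, and the toy numbers are assumptions for illustration; they are not taken from the paper, and the HSC statistic itself is not computed here.

```python
from typing import List, Sequence


def regionwise_pvalues(
    observed: Sequence[float],
    permuted: Sequence[Sequence[float]],
) -> List[float]:
    """Max-statistic permutation p-values (illustrative, not the paper's code).

    observed: one test statistic per marker on the study segment.
    permuted: one row per permutation replicate, each row holding the
              per-marker statistics recomputed on permuted data.
    Assumes larger statistics indicate stronger evidence of association.
    """
    if not permuted:
        raise ValueError("at least one permutation replicate is required")

    # Null distribution of the segment-wide maximum statistic.
    max_null = [max(row) for row in permuted]
    n_perm = len(max_null)

    # Add-one correction keeps the estimated p-value strictly positive.
    return [
        (1 + sum(m >= obs for m in max_null)) / (n_perm + 1)
        for obs in observed
    ]


# Toy usage with made-up statistics for three markers and three replicates.
toy_observed = [0.5, 3.0, 1.0]
toy_permuted = [
    [0.2, 1.1, 0.4],
    [0.9, 0.3, 0.8],
    [0.1, 0.6, 2.0],
]
toy_pvalues = regionwise_pvalues(toy_observed, toy_permuted)
```

Comparing against the segment-wide maximum, rather than each marker's own null distribution, is what makes the resulting p-value region-wise in this sketch: it accounts for having tested every marker on the segment. In a family-based setting the permutation scheme would also have to respect pedigree structure, which this sketch leaves to whatever code produces `permuted`.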

  • ABSTRACT: Haplotype sharing analysis is a well-established option for the investigation of the etiology of complex diseases. The statistical power of haplotype association methods depends strongly on how the information of unobserved haplotypes can be captured by multilocus genotypes. In this study we combine an entropy-based marker selection algorithm (EMS) with a haplotype sharing-based Mantel statistics into a new algorithm. Genetic markers are iteratively selected by their multilocus linkage disequilibrium (LD), which is assessed by a normalized entropy difference. The initial marker set is gradually enlarged to increase the available information on the amount of sharing around a potential susceptibility marker. Markers are rejected from joint phasing if they do not increase the multilocus LD. In simulated candidate gene studies, the Mantel statistics combined with the new EMS performs as well as or better at detecting the disease single nucleotide polymorphism (or, in indirect association analysis, its flanking markers) than the Mantel statistics without selection of markers prior to haplotype estimation and the Mantel statistics using sliding windows of size five. It is therefore appealing to apply our selection approach for haplotype-based association analysis, since marker selection driven by the observed data avoids both the arbitrary choice of markers when using a fixed window size and the estimation of haplotype block structure.
    Genetic Epidemiology 02/2010; 34(4):354-63. · 2.95 Impact Factor
  • ABSTRACT: With the advent of novel sequencing technologies, interest in the identification of rare variants that influence common traits has increased rapidly. Standard statistical methods, such as the Cochran-Armitage trend test or logistic regression, fail in this setting for the analysis of unrelated subjects because of the rareness of the variants. Recently, various alternative approaches have been proposed that circumvent the rareness problem by collapsing rare variants in a defined genetic region or sets of regions. We provide an overview of these collapsing methods for association analysis and discuss the use of permutation approaches for significance testing of the data-adaptive methods.
    Genetic Epidemiology 01/2011; 35 Suppl 1:S12-7. · 2.95 Impact Factor
  • ABSTRACT: Haplotype-based approaches have been extensively studied for case-control association mapping in recent years. It has been shown that haplotype methods can provide more consistent results than single-locus approaches, especially in cases where causal variants are not typed. Improved power has been observed by clustering similar or rare haplotypes into groups to reduce the degrees of freedom of association tests. For family-based association studies, one commonly used strategy is the Transmission Disequilibrium Test (TDT), which examines the imbalanced transmission of alleles/haplotypes to affected and normal children. Many extensions have been developed to deal with general pedigrees and continuous traits. In this paper, we propose a new haplotype-based association method for family data that is different from the TDT framework. Our approach (termed F_HapMiner) is based on our previous experience with haplotype inference from pedigree data and haplotype-based association mapping. It first infers diplotype pairs of each individual in each pedigree assuming no recombination within a family. A phenotype score is then defined for each founder haplotype. Finally, F_HapMiner applies a clustering algorithm to those founder haplotypes based on their similarities and identifies haplotype clusters that show significant associations with diseases/traits. We have performed extensive simulations based on realistic assumptions to evaluate the effectiveness of the proposed approach by considering different factors such as allele frequency, linkage disequilibrium (LD) structure, disease model and sample size. Comparisons with single-locus and haplotype-based TDT methods demonstrate that our approach consistently outperforms the TDT-based approaches regardless of disease models, local LD structures or allele/haplotype frequencies. We present a novel haplotype-based association approach using family data. Experimental results demonstrate that it achieves significantly higher power than TDT-based approaches.
    BMC Bioinformatics 01/2010; 11 Suppl 1:S45. · 2.67 Impact Factor