Ruzong Fan

Texas A&M University, College Station, TX, USA

Publications (22) · 67.35 Total impact

  • Article: A powerful score test to detect positive selection in genome-wide scans.
    ABSTRACT: One of the surest signatures of recent positive selection is a local elevation of advantageous allele frequency and linkage disequilibrium (LD). We proposed to detect such hitchhiking effects by using extended stretches of homozygosity as a surrogate indicator of recent positive selection. An extended haplotype-based homozygosity score test (EHHST) was developed to detect excess homozygosity. The EHHST conditioned on existing LD and it tested the haplotype version of the Hardy-Weinberg equilibrium. Compared with existing popular tests, which usually lack clear distribution, the EHHST is asymptotically normal, which makes analysis and applications easier. In particular, the EHHST facilitates the computation of an asymptotic P-value instead of an empirical P-value, using simulations. We evaluated by simulation that the EHHST led to appropriate false-positive rates, and it had higher or similar power as the existing popular methods. The method was applied to HapMap Phase II data. We were able to replicate previous findings of strong positive selection in 17 autosome genomic regions out of 20 reported candidates. On the basis of high EHHST values and population differentiations, we identified 15 new candidate regions that could undergo recent selection.
    European Journal of Human Genetics: EJHG 05/2010; 18(10):1148-59. · 3.56 Impact Factor
  • Article: Quantitative trait loci for exercise training responses in FVB/NJ and C57BL/6J mice.
    ABSTRACT: The genetic factors determining the magnitude of the response to exercise training are poorly understood. The aim of this study was to identify quantitative trait loci (QTL) associated with adaptation to exercise training in a cross between FVB/NJ (FVB) and C57BL/6J (B6) mice. Mice completed an exercise performance test before and after a 4-wk treadmill running program, and changes in exercise capacity, expressed as work (kg.m), were calculated. Changes in work in F(2) mice averaged 1.51 +/- 0.08 kg.m (94.3 +/- 7.3%), with a range of -1.67 to +4.55 kg.m. All F(2) mice (n = 188) were genotyped at 20-cM intervals with 103 single nucleotide polymorphisms (SNPs), and genomewide linkage scans were performed for pretraining, posttraining, and change in work. Significant QTL for pretraining work were located on chromosomes 14 at 4.0 cM [3.72 logarithm of odds (LOD)] and 19 at 34.4 cM (3.63 LOD). For posttraining work significant QTL were located on chromosomes 3 at 60 cM (4.66 LOD) and 14 at 26 cM (4.99 LOD). Suggestive QTL for changes in work were found on chromosomes 11 at 44.6 cM (2.30 LOD) and 14 at 36 cM (2.25 LOD). When pretraining work was used as a covariate, a potential QTL for change in work was identified on chromosome 6 at 68 cM (3.56 LOD). These data indicate that one or more QTL determine exercise capacity and training responses in mice. Furthermore, these data suggest that the genes that determine pretraining work and training responses may differ.
    Physiological Genomics 09/2009; 40(1):15-22. · 2.73 Impact Factor
  • Article: Transmission of surfactant protein variants and haplotypes in children hospitalized with respiratory syncytial virus.
    ABSTRACT: Severity of lung injury with respiratory syncytial virus (RSV) infection is variable and may be related to genetic variations. This preliminary report describes a prospective, family-based association study of children hospitalized secondary to RSV, aimed to determine whether intragenic and other haplotypes of surfactant proteins (SP)-A and SP-D are transmitted disproportionately from parents to offspring with RSV disease. Genomic DNA was genotyped for several SP-A and SP-D single nucleotide polymorphisms (SNPs). Transmission disequilibrium test analysis was used to determine transmission of variants and haplotypes from parents to affected offspring. Three hundred seventy-five individuals were studied, including 148 children with active RSV disease and one or both parents. The SP-A2 intragenic haplotype 1A was found to be protective (p = 0.013). The SP-D SNP DA160_A may possibly be an "at-risk" marker (p = 0.0058). Additional two- and three-marker haplotypes were associated with severe RSV disease, with two being protective (DA11_T/DA160_G and DA160_G/SP-A2 1A/SP-A1 6A). We conclude that there may be associations between SP-A and SP-D and RSV disease. Further study is required to determine whether these variants can be used to target a high-risk patient population in clinical trials aimed at reducing either the symptoms of acute infection or long-term pulmonary sequelae.
    Pediatric Research 04/2009; 66(1):70-3. · 2.70 Impact Factor
  • Article: A genome-wide association scan for rheumatoid arthritis data by Hotelling's T2 tests.
    ABSTRACT: We performed a genome-wide association scan on the North American Rheumatoid Arthritis Consortium (NARAC) data using Hotelling's T2 tests, i.e., TH based on allele coding and TG based on genotype coding. The objective was to identify associations between single-nucleotide polymorphisms (SNPs) or markers and rheumatoid arthritis. In specific candidate gene regions, we evaluated the performance of Hotelling's T2 tests. Then Hotelling's T2 tests were used as a tool to identify new regions that contain SNPs showing strong associations with disease. As expected, the strongest association evidence was found in the region of the HLA-DRB1 locus on chromosome 6. In the region of the TRAF1-C5 genes, we identified two SNPs, rs2900180 and rs3761847, with the largest and the second largest TH and TG scores among all SNPs on chromosome 9. We also identified one SNP, rs2476601, in the region of the PTPN22 gene that had the largest TH score and the second largest TG score among all SNPs on chromosome 1. In addition, SNPs with the largest TH score on each chromosome were identified. These SNPs may be located in the regions of genes that have modest effects on rheumatoid arthritis. These regions deserve further investigation.
    BMC Proceedings 01/2009; 3 Suppl 7:S6.
  • Article: Bivariate combined linkage and association mapping of quantitative trait loci.
    ABSTRACT: In this paper, bivariate/multivariate variance component models are proposed for high-resolution combined linkage and association mapping of quantitative trait loci (QTL), based on combinations of pedigree and population data. Suppose that a quantitative trait locus is located in a chromosome region that exerts pleiotropic effects on multiple quantitative traits. In the region, multiple markers such as single nucleotide polymorphisms are typed. Two regression models, "genotype effect model" and "additive effect model", are proposed to model the association between the markers and the trait locus. The linkage information, i.e., recombination fractions between the QTL and the markers, is modeled in the variance and covariance matrix. By analytical formulae, we show that the "genotype effect model" can be used to model the additive and dominant effects simultaneously; the "additive effect model" only takes care of additive effect. Based on the two models, F-test statistics are proposed to test association between the QTL and markers. By analytical power analysis, we show that bivariate models can be more powerful than univariate models. For moderate-sized samples, the proposed models lead to correct type I error rates; and so the models are reasonably robust. As a practical example, the method is applied to analyze the genetic inheritance of rheumatoid arthritis for the data of The North American Rheumatoid Arthritis Consortium, Problem 2, Genetic Analysis Workshop 15, which confirms the advantage of the proposed bivariate models.
    Genetic Epidemiology 08/2008; 32(5):396-412. · 3.44 Impact Factor
  • Article: Combined linkage and association mapping of quantitative trait loci with missing completely at random genotype data.
    ABSTRACT: In genetics study, the genotypes or phenotypes can be missing due to various reasons. In this paper, the impact of missing genotypes is investigated for high resolution combined linkage and association mapping of quantitative trait loci (QTL). We assume that the genotype data are missing completely at random (MCAR). Two regression models, "genotype effect model" and "additive effect model", are proposed to model the association between the markers and the trait locus. If the marker genotype is not missing, the model is exactly the same as those of our previous study, i.e., the number of genotype or allele is used as weight to model the effect of the genotype or allele in single marker case. If the marker genotype is missing, the expected number of genotype or allele is used as weight to model the effect of the genotype or allele. By analytical formulae, we show that the "genotype effect model" can be used to model the additive and dominance effects simultaneously, and the "additive effect model" can only be used to model the additive effect. Based on the two models, F-test statistics are proposed to test association between the QTL and markers. The non-centrality parameter approximations of F-test statistics are derived to calculate power and to compare power, which show that the power of the F-tests is reduced due to the missingness. By simulation study, we show that the two models have reasonable type I error rates for a dataset of moderate sample size. However, the type I error rates can be very slightly inflated if all individuals with missing genotypes are removed from analysis. Hence, the proposed method can help to get correct type I error rates although it does not improve power. As a practical example, the method is applied to analyze the angiotensin-1 converting enzyme (ACE) data.
    Behavior Genetics 06/2008; 38(3):316-36. · 2.52 Impact Factor
  • Article: Using linkage and association to identify and model genetic effects: summary of GAW15 Group 4
    ABSTRACT: Group 4 at Genetic Analysis Workshop 15 focused on methods that exploited both linkage and association information to map disease loci. All contributions considered the dichotomous trait of rheumatoid arthritis, using either affected sibpairs and/or unrelated controls. While one contribution investigated linkage and association approaches separately in genome-wide analyses, the remaining others focused on joint linkage and association methods in specific genomic regions. The latter contributions proposed new methods and/or examined existing methods that addressed whether one or more polymorphisms partially or fully explained a linkage signal, particularly the methods proposed by Li et al. that are implemented in the computer program Linkage and Association Modeling in Pedigrees (LAMP). Using simulated SNP data under linkage peaks, several contributions found that existing family-based association approaches such as those of Martin et al. and Lake et al. had power similar to LAMP and to several methods proposed by the contributors for testing that a single nucleotide polymorphism partially explains a linkage peak. In evaluating methods for identifying if a polymorphism or a set of polymorphisms fully accounted for a linkage signal, several contributions found that it was important to understand that these methods may be subject to low power in some situations and thus, a non-significant result was not necessarily indicative of the polymorphism(s) being fully responsible for the linkage signal. Finally, modeling the disease using association evidence conditional on linkage may improve understanding of the etiology of disease. Genet. Epidemiol. 31 (Suppl. 1):S34–S42, 2007. © 2007 Wiley-Liss, Inc.
    Genetic Epidemiology 11/2007; 31(S1):S34 - S42. · 3.44 Impact Factor
  • Article: Haplotypes of the surfactant protein genes A and D as susceptibility factors for the development of respiratory distress syndrome.
    ABSTRACT: Polymorphisms of genes are transmitted together in haplotypes, which can be used in the study of the development of complex diseases such as respiratory distress syndrome (RDS). The surfactant proteins (SPs) play important roles in lung function, and genetic variants of these proteins have been linked with lung diseases, including RDS. To determine whether haplotypes of SP-A and SP-D are transmitted disproportionately from parents to offspring with RDS, we hypothesized that previously unstudied genetic haplotypes of these SP genes are associated with the development of RDS. DNA was collected from 132 families of neonates with RDS. Genotyping was performed, and haplotype transmission from parent to offspring was determined by transmission disequilibrium test. The two-marker SP-D/SP-A haplotype DA160_A/SP-A2 1A(1) is protective against the development of RDS (p = 0.035). Four three- and four-marker haplotypes containing one or both loci from the significant two-marker haplotype are also protective against the development of RDS. These data identify protective haplotypes against RDS and support findings related to SP genetic differences in children who develop RDS. Study of haplotypes in complex diseases with both genetic and environmental risk factors may lead to better understanding of these types of diseases.
    Acta Paediatrica 08/2007; 96(7):985-9. · 2.07 Impact Factor
  • Article: Family-based association tests suggest linkage between surfactant protein B (SP-B) (and flanking region) and respiratory distress syndrome (RDS): SP-B haplotypes and alleles from SP-B-linked loci are risk factors for RDS.
    ABSTRACT: Genetic variants of surfactant protein B (SP-B) have been associated with respiratory distress syndrome (RDS) in the prematurely born infant. We wished to determine linkage between RDS and SP-B single nucleotide polymorphisms (SNPs) [-18 (A/C), 1013 (A/C), 1580 (C/T), and 9306 (A/G)] or SP-B-linked microsatellite [(D2S388, D2S2232, (AAGG)n, and GATA41E01 (or D2S1331)] loci and identify susceptibility or protective alleles and haplotypes. We genotyped 132 families consisting of one or two parents and at least one child affected with RDS and performed biallelic and multiallelic family-based association test (FBAT) analysis, and extended transmission disequilibrium test (ETDT). ETDT analysis identified the microsatellite SP-B-linked loci (except D2S2232) to be linked to RDS. One allele from each of these three marker loci contributes to the risk of RDS. Multiallelic FBAT analysis detected a signal of linkage for the region of the four SNP loci. Three haplotypes within this region contribute to RDS risk. Although no other region showed significant linkage as judged by multiallelic FBAT, biallelic FBAT analysis revealed three potential susceptibility haplotypes formed by two to four loci within the SP-B and SP-B-linked microsatellite region. Each haplotype included GATA41E01, which was identified by ETDT analysis to be linked to RDS. We conclude that SP-B or SP-B-linked loci are linked to RDS and certain alleles or haplotypes are susceptibility or protective factors for the development of RDS in infants born prematurely.
    Pediatric Research 05/2006; 59(4 Pt 1):616-21. · 2.70 Impact Factor
  • Article: High-resolution association mapping of quantitative trait loci: a population-based approach.
    Ruzong Fan, Jeesun Jung, Lei Jin
    ABSTRACT: In this article, population-based regression models are proposed for high-resolution linkage disequilibrium mapping of quantitative trait loci (QTL). Two regression models, the "genotype effect model" and the "additive effect model," are proposed to model the association between the markers and the trait locus. The marker can be either diallelic or multiallelic. If only one marker is used, the method is similar to a classical setting by Nielsen and Weir, and the additive effect model is equivalent to the haplotype trend regression (HTR) method by Zaykin et al. If two/multiple marker data with phase ambiguity are used in the analysis, the proposed models can be used to analyze the data directly. By analytical formulas, we show that the genotype effect model can be used to model the additive and dominance effects simultaneously; the additive effect model takes care of the additive effect only. On the basis of the two models, F-test statistics are proposed to test association between the QTL and markers. By a simulation study, we show that the two models have reasonable type I error rates for a data set of moderate sample size. The noncentrality parameter approximations of F-test statistics are derived to make power calculation and comparison. By a simulation study, it is found that the noncentrality parameter approximations of F-test statistics work very well. Using the noncentrality parameter approximations, we compare the power of the two models with that of the HTR. In addition, a simulation study is performed to make a comparison on the basis of the haplotype frequencies of 10 SNPs of angiotensin-1 converting enzyme (ACE) genes.
    Genetics 02/2006; 172(1):663-86. · 4.01 Impact Factor
  • Article: Sibship T2 association tests of complex diseases for tightly linked markers.
    Ruzong Fan, Michael Knapp
    ABSTRACT: For population case-control association studies, the false-positive rates can be high due to inappropriate controls, which can occur if there is population admixture or stratification. Moreover, it is not always clear how to choose appropriate controls. Alternatively, the parents or normal sibs can be used as controls of affected sibs. For late-onset complex diseases, parental data are not usually available. One way to study late-onset disorders is to perform sib-pair or sibship analyses. This paper proposes sibship-based Hotelling's T2 test statistics for high-resolution linkage disequilibrium mapping of complex diseases. For a sample of sibships, suppose that each sibship consists of at least one affected sib and at least one normal sib. Assume that genotype data of multiple tightly linked markers/haplotypes are available for each individual in the sample. Paired Hotelling's T2 test statistics are proposed for high-resolution association studies using normal sibs as controls for affected sibs, based on two coding methods: 'haplotype/allele coding' and 'genotype coding'. The paired Hotelling's T2 tests take into account not only the correlation among the markers, but also take the correlation within each sib-pair. The validity of the proposed method is justified by rigorous mathematical and statistical proofs under the large sample theory. The non-centrality parameter approximations of the test statistics are calculated for power and sample size calculations. By carrying out power and simulation studies, it was found that the non-centrality parameter approximations of the test statistics were accurate. By power and type I error analysis, the test statistics based on the 'haplotype/allele coding' method were found to be advantageous in comparison to the test statistics based on the 'genotype coding' method. The test statistics based on multiple markers can have higher power than those based on a single marker. 
    The test statistics can be applied not only for bi-allelic markers, but also for multi-allelic markers. In addition, the test statistics can be applied to analyse the genetic data of multiple markers which contain double heterozygotes, that is, unknown linkage phase data. An SAS macro, Hotel_sibs.sas, is written to implement the method for data analysis.
    Human Genomics 07/2005; 2(2):90-112.
  • Article: Combined linkage and association mapping of quantitative trait loci by multiple markers.
    Jeesun Jung, Ruzong Fan, Lei Jin
    ABSTRACT: Using multiple diallelic markers, variance component models are proposed for high-resolution combined linkage and association mapping of quantitative trait loci (QTL) based on nuclear families. The objective is to build a model that may fully use marker information for fine association mapping of QTL in the presence of prior linkage. The measures of linkage disequilibrium and the genetic effects are incorporated in the mean coefficients and are decomposed into orthogonal additive and dominance effects. The linkage information is modeled in variance-covariance matrices. Hence, the proposed methods model both association and linkage in a unified model. On the basis of marker information, a multipoint interval mapping method is provided to estimate the proportion of allele sharing identical by descent (IBD) and the probability of sharing two alleles IBD at a putative QTL for a sib-pair. To test the association between the trait locus and the markers, both likelihood-ratio tests and F-tests can be constructed on the basis of the proposed models. In addition, analytical formulas of noncentrality parameter approximations of the F-test statistics are provided. Type I error rates of the proposed test statistics are calculated to show their robustness. After comparing with the association between-family and association within-family (AbAw) approach by Abecasis and Fulker et al., it is found that the method proposed in this article is more powerful and advantageous based on simulation study and power calculation. By power and sample size comparison, it is shown that models that use more markers may have higher power than models that use fewer markers. The multiple-marker analysis can be more advantageous and has higher power in fine mapping QTL. As an application, the Genetic Analysis Workshop 12 German asthma data are analyzed using the proposed methods.
    Genetics 07/2005; 170(2):881-98. · 4.01 Impact Factor
  • Article: High resolution T2 association tests of complex diseases based on family data.
    ABSTRACT: This paper proposes family based Hotelling's T(2) tests for high resolution linkage disequilibrium (LD) mapping or association studies of complex diseases. Assume that genotype data of multiple markers or haplotype blocks are available for a sample of nuclear families, in which some offspring are affected. Paired Hotelling's T(2) test statistics are proposed for a high resolution association study using parents as controls for affected offspring, based on two coding methods: haplotype/allele coding and genotype coding. The paired Hotelling's T(2) tests take not only the correlation between the haplotype blocks or markers into account, but also take the correlation within each parent-offspring pair into account. The method extends two sample Hotelling's T(2) test statistics for population case control association studies, which are not valid for family data due to correlation of genetic data among family members. The validity of the proposed method is justified by rigorous mathematical and statistical proof under the large sample theory. The non-centrality parameter approximations of the test statistics are calculated for power and sample size calculations. From power comparison and type I error calculations, it is shown that the test statistic based on haplotype/allele coding is advantageous over the test statistic of genotype coding. Analysis using multiple markers may provide higher power than single marker analysis. If only one marker is utilized the power of the test statistic based on haplotype/allele coding is nearly identical to that of 1-TDT. Moreover, a permutation procedure is provided for data analysis. The method is applied to data from a German asthma family study. The results based on the paired Hotelling's T(2) statistic tests confirm the previous findings. However, the paired Hotelling's T(2) tests produce much smaller P-values than those of the previous study. 
    The permutation tests produce similar results to those of the previous study; moreover, additional marker combinations are shown to be significant by permutation tests. The proposed paired Hotelling's T(2) statistic tests are potentially powerful in mapping complex diseases. A SAS Macro, Hotel_fam.sas, has been written to implement the method for data analysis.
    Annals of Human Genetics 04/2005; 69(Pt 2):187-208. · 2.57 Impact Factor
  • Article: Pedigree linkage disequilibrium mapping of quantitative trait loci.
    ABSTRACT: In this paper, we propose to use pedigrees of any size and any types of relatives in joint high-resolution linkage disequilibrium (LD) and linkage mapping of quantitative trait loci (QTL) by variance component models. Two or multiple markers can be simultaneously used in modeling association with the trait locus, instead of using one marker a time in the analysis. The proposed method can provide a unified result by using two or multiple markers in the modeling. This may avoid the complications of different results obtained from the separate analysis of marker by marker. The models simultaneously incorporate both linkage and LD information. The measures of LD are modeled by mean coefficients, and linkage information is modeled by variance-covariance matrix. Using analytical formulas to calculate the regression coefficients, the genetic effects are shown to be decomposed into additive and dominance components. The noncentrality parameter approximations of test statistics of LD are provided to make power calculations. Power and type I error rates are explored to investigate the merit of the proposed method by both the analytical formulas and simulations. Comparing with the association between-family and association within-family ('AbAw') approach of Fulker and Abecasis et al, it is evident that the method proposed in this article is more powerful. The method is applied to investigate the relation between polymorphisms in the angiotensin 1-converting enzyme (ACE) genes and circulating ACE levels, with a better result than that of the 'AbAw' approach. Moreover, two markers I/D and 4656(CT)3/2 can fully interpret association with the trait locus at a 0.01 significance level, which provides a unique result for the ACE data.
    European Journal of Human Genetics 03/2005; 13(2):216-31. · 4.40 Impact Factor
  • Article: Genome association studies of complex diseases by case-control designs.
    Ruzong Fan, Michael Knapp
    ABSTRACT: One way to perform linkage-disequilibrium (LD) mapping of genetic traits is to use single markers. Since dense marker maps-such as single-nucleotide polymorphism and high-resolution microsatellite maps-are available, it is natural and practical to generalize single-marker LD mapping to high-resolution haplotype or multiple-marker LD mapping. This article investigates high-resolution LD-mapping methods, for complex diseases, based on haplotype maps or microsatellite marker maps. The objective is to explore test statistics that combine information from haplotype blocks or multiple markers. Based on two coding methods, genotype coding and haplotype coding, Hotelling's T2 statistics TG and TH are proposed to test the association between a disease locus and two haplotype blocks or two markers. The validity of the two T2 statistics is proved by theoretical calculations. A statistic TC, an extension of the traditional chi2 method of comparing haplotype frequencies, is introduced by simply adding the chi2 test statistics of the two haplotype blocks together. The merit of the three methods is explored by calculation and comparison of power and of type I errors. In the presence of LD between the two blocks, the type I error of TC is higher than that of TH and TG, since TC ignores the correlation between the two blocks. For each of the three statistics, the power of using two haplotype blocks is higher than that of using only one haplotype block. By power comparison, we notice that TC has higher power than that of TH, and TH has higher power than that of TG. In the absence of LD between the two blocks, the power of TC is similar to that of TH and higher than that of TG. Hence, we advocate use of TH in the data analysis. In the presence of LD between the two blocks, TH takes into account the correlation between the two haplotype blocks and has a lower type I error and higher power than TG. 
    Besides, the feasibility of the methods is shown by sample-size calculation.
    The American Journal of Human Genetics 05/2003; 72(4):850-68. · 10.60 Impact Factor
  • Article: Combined high resolution linkage and association mapping of quantitative trait loci.
    Ruzong Fan, Momiao Xiong
    ABSTRACT: In this paper, we investigate variance component models of both linkage analysis and high resolution linkage disequilibrium (LD) mapping for quantitative trait loci (QTL). The models are based on both family pedigree and population data. We consider likelihoods which utilize flanking marker information, and carry out an analysis of model building and parameter estimations. The likelihoods jointly include recombination fractions, LD coefficients, the average allele substitution effect and allele dominant effect as parameters. Hence, the model simultaneously takes care of the linkage, LD or association and the effects of the putative trait locus. The models clearly demonstrate that linkage analysis and LD mapping are complementary, not exclusive, methods for QTL mapping. By power calculations and comparisons, we show the advantages of the proposed method: (1) population data can provide information for LD mapping, and family pedigree data can provide information for both linkage analysis and LD mapping; (2) using family pedigree data and a sparse marker map, one may investigate the prior suggestive linkage between trait locus and markers to obtain low resolution of the trait loci, because linkage analysis can locate a broad candidate region; (3) with the prior knowledge of suggestive linkage from linkage analysis, both population and family pedigree data can be used simultaneously in high resolution LD mapping based on a dense marker map, since LD mapping can increase the resolution for candidate regions; (4) models of high resolution LD mappings using two flanking markers have higher power than that of models of using only one marker in the analysis; (5) excluding the dominant variance from the analysis when it does exist would lose power; (6) by performing linkage interval mappings, one may get higher power than by using only one marker in the analysis.
    European Journal of Human Genetics 03/2003; 11(2):125-37. · 4.40 Impact Factor
  • Article: Linkage and association studies of QTL for nuclear families by mixed models.
    Ruzong Fan, Momiao Xiong
    ABSTRACT: The transmission disequilibrium test (TDT) has been utilized to test the linkage and association between a genetic trait locus and a marker. Spielman et al. (1993) introduced TDT to test linkage between a qualitative trait and a marker in the presence of association. In the presence of linkage, TDT can be applied to test for association for fine mapping (Martin et al., 1997; Spielman and Ewens, 1996). In recent years, extensive research has been carried out on the TDT between a quantitative trait and a marker locus (Allison, 1997; Fan et al., 2002; George et al., 1999; Rabinowitz, 1997; Xiong et al., 1998; Zhu and Elston, 2000, 2001). The original TDT for both qualitative and quantitative traits requires unrelated offspring of heterozygous parents for analysis, and much research has been carried out to extend it to fit for different settings. For nuclear families with multiple offspring, one approach is to treat each child independently for analysis. Obviously, this may not be a valid method since offspring of one family are related to each other. Another approach is to select one offspring randomly from each family for analysis. However, with this method much information may be lost. Martin et al. (1997, 2000) constructed useful statistical tests to analyse the data for qualitative traits. In this paper, we propose to use mixed models to analyse sample data of nuclear families with multiple offspring for quantitative traits according to the models in Amos (1994). The method uses data of all offspring by taking into account their trait mean and variance-covariance structures, which contain all the effects of major gene locus, polygenic loci and environment. A test statistic based on mixed models is shown to be more powerful than the test statistic proposed by George et al. (1999) under moderate disequilibrium for nuclear families. Moreover, it has higher power than the TDT statistic which is constructed by randomly choosing a single offspring from each nuclear family.
    Biostatistics 02/2003; 4(1):75-95. · 2.14 Impact Factor
  • Article: High-resolution joint linkage disequilibrium and linkage mapping of quantitative trait loci based on sibship data.
    Ruzong Fan, Jeesun Jung
    ABSTRACT: This paper proposes variance component models for high resolution joint linkage disequilibrium (LD) and linkage mapping of quantitative trait loci (QTL) based on sibship data; this can include population data if independent individuals are treated as single sibships. One application of these models is late onset complex disease gene mapping, when parental data are not available. The models simultaneously incorporate both LD and linkage information. The LD information is contained in mean coefficients of sibship data. The linkage information is contained in the variance-covariance matrices of trait values for sibships with at least two siblings. We derive formulas for calculating the probability of sharing two trait alleles identical by descent (IBD) for sibpairs in interval mapping of QTL; this is the coefficient of dominant variance of the trait covariance of sibpairs on major QTL. To investigate the performance of the formulas, we calculate the numerical values via the formulas and get satisfactory approximations. We compare the power and sample sizes for both LD and linkage mapping. By simulation and theoretical analysis, we compare the results with those of Fulker and Abecasis "AbAw" approach. It is well known that the resolution of linkage analysis can be low for complex disease gene mapping. LD mapping, on the other hand, can increase mapping precision and is useful in high resolution mapping. Linkage analysis is less sensitive to population subdivisions and admixtures. The level of LD is sensitive to population stratification which may easily lead to spurious association. Performing a joint analysis of LD and linkage mapping can help to overcome the limits of both approaches. Moreover, the advantages of the two complementary strategies can be utilized maximally. In practice, linkage analysis may be performed using pedigree data to identify suggestive linkage between markers and trait loci based on a sparse marker map. In the presence of linkage, joint LD and linkage mapping can be carried out to do fine gene mapping based on a dense genetic map using both pedigree and population data. Population and pedigree data of any type can be combined to perform a joint analysis of high resolution LD and linkage mapping of QTL by generalizing the method.
    Human Heredity 02/2003; 56(4):166-87. · 1.79 Impact Factor
  • Article: High resolution mapping of quantitative trait loci by linkage disequilibrium analysis.
    Ruzong Fan, Momiao Xiong
    ABSTRACT: Two methods, linkage analysis and linkage disequilibrium (LD) mapping or association study, are usually utilised for mapping quantitative trait loci (QTL). Linkage mapping is appropriate for low resolution mapping to localise trait loci to broad chromosome regions within a few cM (<10 cM), and is based on family data. Linkage disequilibrium mapping, on the other hand, is useful in high resolution or fine mapping, and is based on both population and family data. Using only one marker, one may carry out single-point linkage analysis and linkage disequilibrium mapping. Using two or more markers, it is possible to flank the QTL by multipoint analysis. The development and thus availability of dense marker maps, such as single nucleotide polymorphisms (SNP) in human genome, presents a tremendous opportunity for multipoint fine mapping. In this article, we propose a regression approach of mapping QTL by linkage disequilibrium mapping based on population data. Assuming that two marker loci flank one quantitative trait locus, a two-point linear regression is proposed to analyse population data. We derive analytical formulas of parameter estimations, and non-centrality parameters of appropriate tests of genetic effects and linkage disequilibrium coefficients. The merit of the method is shown by the power calculation and comparison. The two-point regression model can capture much more linkage and linkage disequilibrium information than that derived when only one marker is used. For a complex disease with heritability h² ≥ 0.15, a study with sample size of 250 can provide high power for QTL detection under moderate linkage disequilibria.
    European Journal of Human Genetics 11/2002; 10(10):607-15. · 4.40 Impact Factor

Institutions

  • 2002–2010
    • Texas A&M University
      • Department of Statistics
      College Station, TX, USA
  • 2009
    • University of Texas MD Anderson Cancer Center
      • Department of Epidemiology
      Houston, TX, USA
  • 2008
    • Indiana University-Purdue University Indianapolis
      • Department of Medical and Molecular Genetics
      Indianapolis, IN, USA
  • 2005
    • Rheinische Friedrich-Wilhelms-Universität Bonn
      Bonn, North Rhine-Westphalia, Germany