Effects of Genetic and Environmental Factors on Trait Network Predictions From Quantitative Trait Locus Data

Department of Biology, University of North Carolina, Greensboro, North Carolina 27402-6170, USA.
Genetics (Impact Factor: 5.96). 02/2009; 181(3):1087-99. DOI: 10.1534/genetics.108.092668
Source: PubMed


The use of high-throughput genomic techniques to map gene expression quantitative trait loci has spurred the development of path analysis approaches for predicting functional networks linking genes and natural trait variation. The goal of this study was to test whether potentially confounding factors, including effects of common environment and genes not included in path models, affect predictions of cause-effect relationships among traits generated by QTL path analyses. Structural equation modeling (SEM) was used to test simple QTL-trait networks under different regulatory scenarios involving direct and indirect effects. SEM identified the correct models under simple scenarios, but when common-environment effects were simulated in conjunction with direct QTL effects on traits, they were poorly distinguished from indirect effects, leading to false support for indirect models. Application of SEM to loblolly pine QTL data provided support for biologically plausible a priori hypotheses of QTL mechanisms affecting height and diameter growth. However, some biologically implausible models were also well supported. The results emphasize the need to include any available functional information, including predictions for genetic and environmental correlations, to develop plausible models if biologically useful trait network predictions are to be made.
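
The confounding problem described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' SEM procedure: it uses the partial correlation between a QTL and a second trait, conditioned on the first trait, as a simple stand-in for testing the indirect model (QTL → trait 1 → trait 2), under which that partial correlation should be near zero. The scenario names, effect sizes, variances, and sample size are all hypothetical choices made for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def simulate(scenario, n=20000, seed=1):
    """Simulate a biallelic QTL and two traits (all parameters hypothetical)."""
    rng = random.Random(seed)
    qtl, trait1, trait2 = [], [], []
    for _ in range(n):
        g = rng.choice((0.0, 1.0))  # QTL genotype class
        if scenario == "indirect":
            # QTL affects trait 2 only through trait 1.
            t1 = g + rng.gauss(0, 1)
            t2 = 0.8 * t1 + rng.gauss(0, 1)
        elif scenario == "direct":
            # QTL affects both traits directly; residuals are independent.
            t1 = g + rng.gauss(0, 1)
            t2 = g + rng.gauss(0, 1)
        elif scenario == "direct_common_env":
            # Direct effects on both traits plus a strong shared environment.
            env = rng.gauss(0, 2)
            t1 = g + env + rng.gauss(0, 0.5)
            t2 = g + env + rng.gauss(0, 0.5)
        else:
            raise ValueError(f"unknown scenario: {scenario}")
        qtl.append(g)
        trait1.append(t1)
        trait2.append(t2)
    return qtl, trait1, trait2


def partial_by_scenario():
    """Partial correlation of QTL and trait 2 given trait 1, per scenario."""
    out = {}
    for name in ("indirect", "direct", "direct_common_env"):
        q, t1, t2 = simulate(name)
        out[name] = partial_corr(q, t2, t1)
    return out


if __name__ == "__main__":
    for name, r in partial_by_scenario().items():
        print(f"{name:>18}: partial r(QTL, trait2 | trait1) = {r:+.3f}")
```

With these settings, the purely direct scenario leaves a clear residual association between the QTL and trait 2, while adding a strong common environment shrinks that association toward the near-zero value expected under the indirect model. In a sample of realistic size, the two cases would be hard to tell apart, which is consistent with the false support for indirect models reported above.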

Available from: David L Remington, Oct 08, 2015
    • "Once the fit of all possible path models was evaluated, the best fitted model was required to fit the following four criteria as defined previously [46-48]: (1) Goodness of Fit Test P value ≥0.05 (indicating how likely the hypothesis is, or how well the observed data fit the expectation of the model); (2) 0.9 < Goodness of Fit Index (GoFI) ≤1; (3) Root Mean Squared Error Approximation (RMSEA) ≤0.05; (4) smallest negative Bayesian Information Criterion (BIC). Where multiple models fit to the data, the best fitted model was selected if its BIC was at least two units smaller than the next lowest BIC [48], otherwise none was selected. "
    ABSTRACT: Emerging technologies based on mass spectrometry or nuclear magnetic resonance enable the monitoring of hundreds of small metabolites from tissues or body fluids. Profiling of metabolites can help elucidate causal pathways linking established genetic variants to known disease risk factors such as blood lipid traits. We applied statistical methodology to dissect causal relationships between single nucleotide polymorphisms, metabolite concentrations and serum lipid traits, focusing on 95 genetic loci reproducibly associated with the four main serum lipids (total-, low-density lipoprotein- and high-density lipoprotein- cholesterol and triglycerides). The dataset used included 2,973 individuals from two independent population-based cohorts with data for 151 small molecule metabolites and four main serum lipids. Three statistical approaches, namely conditional analysis, Mendelian Randomization and Structural Equation Modelling, were compared to investigate causal relationship at sets of a single nucleotide polymorphism, a metabolite and a lipid trait associated with one another. A subset of three lipid-associated loci (FADS1, GCKR and LPA) have a statistically significant association with at least one main lipid and one metabolite concentration in our data, defining a total of 38 cross-associated sets of a single nucleotide polymorphism, a metabolite and a lipid trait. Structural Equation Modelling provided sufficient discrimination to indicate that the association of a single nucleotide polymorphism with a lipid trait was mediated through a metabolite at 15 of the 38 sets, and involving variants at the FADS1 and GCKR loci. These data provide a framework for evaluating the causal role of components of the metabolome (or other intermediate factors) in mediating the association between established genetic variants and diseases or traits.
    Genome Medicine 03/2014; 6(3):25. DOI:10.1186/gm542 · 5.34 Impact Factor
    • "Paths were considered significant when: P-val (z-test)< 0.05 for paths between traits and <0.01 for paths between QTL and traits; only significant paths were reported. Although the SEM framework handles false-positive QTL effects well (Remington 2009), we decided to be more conservative with respect to the inclusion of QTL effects. This ensured that our conclusions about selection at specific loci were robust. "
    ABSTRACT: Selection on quantitative trait loci (QTL) may vary among natural environments due to differences in the genetic architecture of traits, environment-specific allelic effects or changes in the direction and magnitude of selection on specific traits. To dissect the environmental differences in selection on life history QTL across climatic regions, we grew a panel of interconnected recombinant inbred lines (RILs) of Arabidopsis thaliana in four field sites across its native European range. For each environment, we mapped QTL for growth, reproductive timing and development. Several QTL were pleiotropic across environments, three colocalizing with known functional polymorphisms in flowering time genes (CRY2, FRI and MAF2-5), but major QTL differed across field sites, showing conditional neutrality. We used structural equation models to trace selection paths from QTL to lifetime fitness in each environment. Only three QTL directly affected fruit number, measuring fitness. Most QTL had an indirect effect on fitness through their effect on bolting time or leaf length. Influence of life history traits on fitness differed dramatically across sites, resulting in different patterns of selection on reproductive timing and underlying QTL. In two oceanic field sites with high prereproductive mortality, QTL alleles contributing to early reproduction resulted in greater fruit production, conferring selective advantage, whereas alleles contributing to later reproduction resulted in larger size and higher fitness in a continental site. This demonstrates how environmental variation leads to change in both QTL effect sizes and direction of selection on traits, justifying the persistence of allelic polymorphism at life history QTL across the species range.
    Molecular Ecology 03/2013; 22(13). DOI:10.1111/mec.12285 · 6.49 Impact Factor
    • "Graphical model network inference can be subject to a large proportion of false positive edges (Li et al., 2010). Environmental and experimental design factors that are not accounted for in the model can further misguide models (Remington, 2009). Assessing and improving the utility of mathematical models in the context of systems biology will continue to be an active area of research. "
    ABSTRACT: Cancer is a major health problem with high mortality rates. In the post-genome era, investigators have access to massive amounts of rapidly accumulating high-throughput data in publicly available databases, some of which are exclusively devoted to housing cancer data. However, data interpretation efforts have not kept pace with data collection, and gained knowledge is not necessarily translating into better diagnoses and treatments. A fundamental problem is to integrate and interpret data to further our understanding in Cancer Systems Biology. Viewing cancer as a network provides insights into the complex mechanisms underlying the disease. Mathematical and statistical models provide an avenue for cancer network modeling. In this article, we review two widely used modeling paradigms: deterministic metabolic models and statistical graphical models. The strength of these approaches lies in their flexibility and predictive power. Once a model has been validated, it can be used to make predictions and generate hypotheses. We describe a number of diverse applications to Cancer Biology, including the system-wide effects of drug treatments, disease prognosis, tumor classification, forecasting treatment outcomes, and survival predictions.
    Frontiers in Physiology 06/2012; 3:227. DOI:10.3389/fphys.2012.00227 · 3.53 Impact Factor
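
The four model selection criteria quoted in the first citing excerpt above (Genome Medicine, 2014) can be sketched as a small filter-and-rank routine. This is a hedged illustration only: the `FitStats` container, the function names, and the example values are hypothetical, the fit statistics are assumed to come from whatever SEM software is in use, and "smallest negative BIC" is read here as "lowest BIC", which is an assumption. The two-unit BIC gap is also assumed to be measured against the next-best model that passes the three threshold criteria.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FitStats:
    """Fit statistics for one candidate path model (hypothetical container)."""
    name: str
    p_value: float  # goodness-of-fit test P value
    gfi: float      # goodness-of-fit index
    rmsea: float    # root mean squared error of approximation
    bic: float      # Bayesian information criterion


def passes_thresholds(m: FitStats) -> bool:
    """Criteria (1) to (3): P >= 0.05, 0.9 < GFI <= 1, RMSEA <= 0.05."""
    return m.p_value >= 0.05 and 0.9 < m.gfi <= 1.0 and m.rmsea <= 0.05


def select_best(models: Sequence[FitStats],
                min_bic_gap: float = 2.0) -> Optional[FitStats]:
    """Return the lowest-BIC passing model, or None if it is not clearly best."""
    candidates = sorted((m for m in models if passes_thresholds(m)),
                        key=lambda m: m.bic)
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    best, runner_up = candidates[0], candidates[1]
    # Criterion (4): require a BIC advantage of at least min_bic_gap units.
    return best if runner_up.bic - best.bic >= min_bic_gap else None


if __name__ == "__main__":
    # Made-up fit statistics, for illustration only.
    fits = [
        FitStats("mediated", p_value=0.40, gfi=0.99, rmsea=0.01, bic=-12.0),
        FitStats("direct", p_value=0.20, gfi=0.97, rmsea=0.03, bic=-9.5),
        FitStats("reverse", p_value=0.01, gfi=0.85, rmsea=0.09, bic=-15.0),
    ]
    chosen = select_best(fits)
    print(chosen.name if chosen else "no model selected")
```

Returning `None` when the BIC gap is too small mirrors the quoted rule that no model is selected unless one is clearly preferred.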