From phenotype to genotype: an association study of longitudinal phenotypic markers to Alzheimer's disease relevant SNPs.

Department of Computer Science and Engineering, University of Texas at Arlington, TX 76019, USA.
Bioinformatics (Impact Factor: 4.62). 09/2012; 28(18):i619-i625. DOI: 10.1093/bioinformatics/bts411
Source: PubMed

ABSTRACT MOTIVATION: Imaging genetic studies typically focus on identifying single-nucleotide polymorphism (SNP) markers associated with imaging phenotypes. Few studies perform regression of SNP values on phenotypic measures for examining how the SNP values change when phenotypic measures are varied. This alternative approach may have a potential to help us discover important imaging genetic associations from a different perspective. In addition, the imaging markers are often measured over time, and this longitudinal profile may provide increased power for differentiating genotype groups. How to identify the longitudinal phenotypic markers associated to disease sensitive SNPs is an important and challenging research topic. RESULTS: Taking into account the temporal structure of the longitudinal imaging data and the interrelatedness among the SNPs, we propose a novel 'task-correlated longitudinal sparse regression' model to study the association between the phenotypic imaging markers and the genotypes encoded by SNPs. In our new association model, we extend the widely used ℓ(2,1)-norm for matrices to tensors to jointly select imaging markers that have common effects across all the regression tasks and time points, and meanwhile impose the trace-norm regularization onto the unfolded coefficient tensor to achieve low rank such that the interrelationship among SNPs can be addressed. The effectiveness of our method is demonstrated by both clearly improved prediction performance in empirical evaluations and a compact set of selected imaging predictors relevant to disease sensitive SNPs. AVAILABILITY: Software is publicly available at: CONTACT: or

  • [Show abstract] [Hide abstract]
    ABSTRACT: The development of advanced medical imaging technologies and high-throughput genomic measurements has enhanced our ability to understand their interplay as well as their relationship with human behavior by integrating these two types of datasets. However, the high dimensionality and heterogeneity of these datasets presents a challenge to conventional statistical methods; there is a high demand for the development of both correlative and integrative analysis approaches. Here, we review our recent work on developing sparse representation based approaches to address this challenge. We show how sparse models are applied to the correlation and integration of imaging and genetic data for biomarker identification. We present examples on how these approaches are used for the detection of risk genes and classification of complex diseases such as schizophrenia. Finally, we discuss future directions on the integration of multiple imaging and genomic datasets including their interactions such as epistasis.
    Journal of Neuroscience Methods 11/2014; 237. DOI:10.1016/j.jneumeth.2014.09.001 · 1.96 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: Most existing genome-wide association analyses are cross-sectional, utilizing only phenotypic data at a single time point, e.g. baseline. On the other hand, longitudinal studies, such as Alzheimer's Disease Neuroimaging Initiative (ADNI), collect phenotypic information at multiple time points. In this article, as a case study, we conducted both longitudinal and cross-sectional analyses of the ADNI data with several brain imaging (not clinical diagnosis) phenotypes, demonstrating the power gains of longitudinal analysis over cross-sectional analysis. Specifically, we scanned genome-wide single nucleotide polymorphisms (SNPs) with 56 brain-wide imaging phenotypes processed by FreeSurfer on 638 subjects. At the genome-wide significance level ([Formula: see text]) or a less stringent level (e.g. [Formula: see text]), longitudinal analysis of the phenotypic data from the baseline to month 48 identified more SNP-phenotype associations than cross-sectional analysis of only the baseline data. In particular, at the genome-wide significance level, both SNP rs429358 in gene APOE and SNP rs2075650 in gene TOMM40 were confirmed to be associated with various imaging phenotypes in multiple regions of interests (ROIs) by both analyses, though longitudinal analysis detected more regional phenotypes associated with the two SNPs and indicated another significant SNP rs439401 in gene APOE. In light of the power advantage of longitudinal analysis, we advocate its use in current and future longitudinal neuroimaging studies.
    PLoS ONE 08/2014; 9(8):e102312. DOI:10.1371/journal.pone.0102312 · 3.53 Impact Factor
  • Source
    [Show abstract] [Hide abstract]
    ABSTRACT: A widely studied problem in systems biology is to predict bacterial phenotype from growth conditions, using mechanistic models such as flux balance analysis (FBA). However, the inverse prediction of growth conditions from phenotype is rarely considered. Here we develop a computational framework to carry out this inverse prediction on a computational model of bacterial metabolism. We use FBA to calculate bacterial phenotypes from growth conditions in E. coli, and then we assess how accurately we can predict the original growth conditions from the phenotypes. Prediction is carried out via regularized multinomial regression. Our analysis provides several important physiological and statistical insights. First, we show that by analyzing metabolic end products we can consistently predict growth conditions. Second, prediction is reliable even in the presence of small amounts of impurities. Third, flux through a relatively small number of reactions per growth source (∼10) is sufficient for accurate prediction. Fourth, combining the predictions from two separate models, one trained only on carbon sources and one only on nitrogen sources, performs better than models trained to perform joint prediction. Finally, that separate predictions perform better than a more sophisticated joint prediction scheme suggests that carbon and nitrogen utilization pathways, despite jointly affecting cellular growth, may be fairly decoupled in terms of their dependence on specific assortments of molecular precursors.
    PLoS ONE 12/2014; 9(12):e114608. DOI:10.1371/journal.pone.0114608 · 3.53 Impact Factor

Full-text (2 Sources)

Available from
May 30, 2014