Vounou M, Nichols TE, Montana G, for the Alzheimer's Disease Neuroimaging Initiative (2010). Discovering genetic associations with high-dimensional neuroimaging phenotypes: a sparse reduced-rank regression approach. NeuroImage 53(3): 1147-1159

Statistics Section, Department of Mathematics, Imperial College London, UK.
NeuroImage (Impact Factor: 6.36). 11/2010; 53(3):1147-59. DOI: 10.1016/j.neuroimage.2010.07.002
Source: PubMed


There is growing interest in performing genome-wide searches for associations between genetic variants and brain imaging phenotypes. While much work has focused on single scalar valued summaries of brain phenotype, accounting for the richness of imaging data requires a brain-wide, genome-wide search. In particular, the standard approach based on mass-univariate linear modelling (MULM) does not account for the structured patterns of correlations present in each domain. In this work, we propose sparse reduced rank regression (sRRR), a strategy for multivariate modelling of high-dimensional imaging responses (measurements taken over regions of interest or individual voxels) and genetic covariates (single nucleotide polymorphisms or copy number variations), which enforces sparsity in the regression coefficients. Such sparsity constraints ensure that the model performs simultaneous genotype and phenotype selection. Using simulation procedures that accurately reflect realistic human genetic variation and imaging correlations, we present detailed evaluations of the sRRR method in comparison with the more traditional MULM approach. In all settings considered, sRRR has better power to detect deleterious genetic variants compared to MULM. Important issues concerning model selection and connections to existing latent variable models are also discussed. This work shows that sRRR offers a promising alternative for detecting brain-wide, genome-wide associations.
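The idea of selecting genotypes and phenotypes simultaneously through sparse regression coefficients can be illustrated with a minimal rank-one sketch. This is not the authors' implementation: it assumes centred data, approximates the covariate covariance by the identity so that each update reduces to soft-thresholding, and expresses each penalty as a fraction of the largest absolute entry. The names `rank1_sparse_rrr`, `frac_a`, and `frac_b` are hypothetical.

```python
import numpy as np


def soft_threshold(v, frac):
    """Soft-threshold v at frac * max|v| (relative threshold is an assumption)."""
    lam = frac * np.max(np.abs(v))
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def unit(v):
    """Scale v to unit Euclidean norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def rank1_sparse_rrr(X, Y, frac_a, frac_b, n_iter=100, tol=1e-8):
    """Illustrative rank-one sparse reduced-rank regression.

    X: (n, p) covariates (e.g. SNPs); Y: (n, q) responses (e.g. ROIs).
    Returns sparse unit vectors a (length p) and b (length q) such that
    Y is approximated, up to scale, by (X @ a) b^T.
    Assumes X^T X is close to a multiple of the identity.
    """
    M = X.T @ Y  # (p, q) cross-product between covariates and responses
    # Start from the leading right singular vector of the cross-product.
    _, _, vt = np.linalg.svd(M, full_matrices=False)
    b = vt[0]
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        a_new = unit(soft_threshold(M @ b, frac_a))        # genotype selection
        b_new = unit(soft_threshold(M.T @ a_new, frac_b))  # phenotype selection
        change = np.linalg.norm(a_new - a) + np.linalg.norm(b_new - b)
        a, b = a_new, b_new
        if change < tol:
            break
    return a, b
```

The nonzero entries of `a` and `b` mark the selected covariates and responses, respectively. In practice the thresholds would be chosen by a model selection procedure rather than fixed by hand.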

  • Source
    • "In contrast to univariate analyses, because an approach like ICA estimates all the variables jointly, by definition the voxels in the 'network' are functioning coherently with one another. This property of ICA methods (and in extension, p-ICA) provides three major benefits. First, it helps with interpretation, as one can accurately assume the region (or genes) in a given component covary together. Secondly, it provides robustness to noise. For example, again to draw on the fMRI example, correlation-based approaches can be 'tricked' by phenomena such as phase randomized noise which can appear to represent real signal (Handwerker et al., 2012). However, in the case of ICA, the assumptions are stronger in that one is identifying patterns and thus the same type of randomized noise will not resemble real signal. This is not to say that ICA-based methods are impervious to noise, but they do tend to be more robust than univariate correlation as they are working with patterns rather than just paired relationships. ICA-based methods are not the only approaches that have this advantage, for example, other multivariate approaches becoming widely used include sparse reduced rank regression (Vounou et al., 2010) and sparse canonical correlation analysis (Lin et al., 2014a,b). And finally, because the statistical testing is done at the level of networks, correction for multiple comparisons is appropriately based on the number of network tested, rather than the number of SNPs or voxels. "
    ABSTRACT: Complex inherited phenotypes, including those for many common medical and psychiatric diseases, are most likely underpinned by multiple genes contributing to interlocking molecular biological processes, along with environmental factors (Owen et al., 2010). Despite this, genotyping strategies for complex, inherited, disease-related phenotypes mostly employ univariate analyses, e.g., genome-wide association. Such procedures most often identify isolated risk-related SNPs or loci, not the underlying biological pathways necessary to help guide the development of novel treatment approaches. This article focuses on the multivariate analysis strategy of parallel (i.e., simultaneous combination of SNP and neuroimage information) independent component analysis (p-ICA), which typically yields large clusters of functionally related SNPs statistically correlated with phenotype components, whose overall molecular biologic relevance is inferred subsequently using annotation software suites. Because this is a novel approach whose details are relatively new to the field, we summarize its underlying principles, address conceptual questions regarding interpretation of the resulting data, and provide practical illustrations of the method.
    Full-text · Article · Oct 2015 · Frontiers in Genetics
  • Source
    • "A Stein's unbiased risk estimation (SURE) based approach is developed for choosing the regularization parameters automatically. The differences between our method and the method in [36] are the following: first, [36] uses a suboptimal algorithm that estimates the RRR components sequentially; second, it does not use wavelet expansion; and third, it uses other simplifying assumptions that are only valid in the application domain where the algorithm was developed in, i.e., [36] develops an algorithm for genomics. "
    ABSTRACT: In this paper, a method called wavelet-based sparse reduced-rank regression (WSRRR) is proposed for hyperspectral image restoration. The method is based on minimizing a sparse regularization problem subject to an orthogonality constraint. A cyclic descent-type algorithm is derived for solving the minimization problem. For selecting the tuning parameters, we propose a method based on Stein's unbiased risk estimation. It is shown that the hyperspectral image can be restored using a few sparse components. The method is evaluated using signal-to-noise ratio and spectral angle distance for a simulated noisy data set and by classification accuracies for a real data set. Two different classifiers, namely, support vector machines and random forest, are used in this paper. The method is compared to other restoration methods, and it is shown that WSRRR outperforms them for the simulated noisy data set. It is also shown in the experiments on a real data set that WSRRR not only effectively removes noise but also maintains more fine features compared to other methods used. WSRRR also gives higher classification accuracies.
    Full-text · Article · Oct 2014 · IEEE Transactions on Geoscience and Remote Sensing
  • Source
    • "We utilized the same simulation design as the work in Hua and Ghosh (2014), where a linear model $y_r = h(x) + \epsilon_r$ with $r = 1, \ldots, q$ was used to associate the phenotypes ($y = (y_1, \ldots, y_q)$) and genotypes ($x = (x_1, \ldots, x_p)$). The structure of the responses $(y_1, \ldots, y_q)$ was modelled using a covariance matrix $\tilde{\Sigma}$ based on the eight ($q = 8$) positive correlated frontal cortex regions using the 358 mild cognitive impairment (MCI) subjects (Figure 2a), since the MRI scans of the MCI samples are relatively more uniform than both the healthy and disease groups (Vounou et al., 2010). Therefore, the multivariate responses were according to all $\epsilon$'s that were generated from multivariate normal $\mathrm{MVN}(0, \tilde{\Sigma})$. "
    ABSTRACT: Testing the independence between two random variables $x$ and $y$ is an important problem in statistics and machine learning, and kernel-based tests of independence have recently become a focus of work on dependence. The advantage of the kernel framework rests on its flexibility in the choice of kernel. The Hilbert-Schmidt Independence Criterion (HSIC) was shown to be equivalent to a class of tests, where the tests are based on different distance-induced kernel pairs. In this work, we propose to select the optimal kernel pair by considering local alternatives, and evaluate the efficiency using the quadratic time estimator of HSIC. The local alternative offers the advantage that the measure of efficiency does not depend on a particular alternative, and only requires knowledge of the asymptotic null distribution of the test. We show in our experiments that the proposed strategy results in higher power than other existing kernel selection approaches.
    Full-text · Article · Sep 2014