Multivariate regression analysis of distance matrices for testing associations between gene expression patterns and related variables. Proc Natl Acad Sci USA

Biomedical Sciences Graduate Program and the Polymorphism Research Laboratory, Department of Psychiatry, Moores UCSD Cancer Center, Center for Human Genetics and Genomics, University of California at San Diego, La Jolla, CA 92093, USA.
Proceedings of the National Academy of Sciences (Impact Factor: 9.81). 01/2007; 103(51):19430-5. DOI: 10.1073/pnas.0609333103
Source: PubMed

ABSTRACT A fundamental step in the analysis of gene expression and other high-dimensional genomic data is the calculation of the similarity or distance between pairs of individual samples in a study. If one has collected N total samples and assayed the expression level of G genes on those samples, then an N x N similarity matrix can be formed that reflects the correlation or similarity of the samples with respect to the expression values over the G genes. This matrix can then be examined for patterns via standard data reduction and cluster analysis techniques. We consider an alternative to conventional data reduction and cluster analyses of similarity matrices that is rooted in traditional linear models. This analysis method allows predictor variables collected on the samples to be related to variation in the pairwise similarity/distance values reflected in the matrix. The proposed multivariate method avoids the need for reducing the dimensions of a similarity matrix, can be used to assess relationships between the genes used to construct the matrix and additional information collected on the samples under study, and can be used to analyze individual genes or groups of genes identified in different ways. The technique can be used with any high-dimensional assay or data type and is ideally suited for testing subsets of genes defined by their participation in a biochemical pathway or other a priori grouping. We showcase the methodology using three published gene expression data sets.
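The analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the test statistic, so the code below assumes one common formulation of distance-based regression: Gower-center the squared distance matrix, project it onto the predictors with the hat matrix, form a pseudo-F ratio, and assess it by permuting samples. The function names (`gower_center`, `pseudo_f`, `mdmr_permutation_test`), the correlation-based distance, and the simulated expression data are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gower_center(dist):
    """Gower-center a distance matrix: G = C (-0.5 * D^2) C, with C = I - 11'/N."""
    n = dist.shape[0]
    a = -0.5 * dist ** 2
    c = np.eye(n) - np.ones((n, n)) / n
    return c @ a @ c

def pseudo_f(g, x):
    """Pseudo-F statistic for design matrix x (first column is the intercept)."""
    n, m = x.shape
    h = x @ np.linalg.pinv(x.T @ x) @ x.T   # hat (projection) matrix
    r = np.eye(n) - h                        # residual projection
    explained = np.trace(h @ g @ h) / (m - 1)
    residual = np.trace(r @ g @ r) / (n - m)
    return explained / residual

def mdmr_permutation_test(dist, predictors, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = np.random.default_rng(seed)
    n = dist.shape[0]
    x = np.column_stack([np.ones(n), predictors])
    g = gower_center(dist)
    observed = pseudo_f(g, x)
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        # Permute rows and columns of G together, i.e. shuffle sample labels.
        if pseudo_f(g[np.ix_(idx, idx)], x) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated example (not real data): N = 20 samples, G = 50 genes,
# with the first 10 genes shifted upward in the second group.
rng = np.random.default_rng(1)
n_per_group, n_genes = 10, 50
group = np.repeat([0.0, 1.0], n_per_group)
expr = rng.normal(size=(2 * n_per_group, n_genes))
expr[group == 1, :10] += 3.0

# N x N sample-by-sample correlation, converted to a distance (one possible choice).
corr = np.corrcoef(expr)
dist = np.sqrt(np.clip(2.0 * (1.0 - corr), 0.0, None))

f_stat, p_value = mdmr_permutation_test(dist, group)
print(f"pseudo-F = {f_stat:.2f}, permutation p = {p_value:.3f}")
```

Restricting the columns of `expr` to a subset of genes before computing `dist` gives the pathway-level or single-gene variant mentioned in the abstract; the rest of the sketch is unchanged.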

    • "The similarity in species composition among treatments was tested using a multivariate analysis of variance by means of distance matrices, known as Adonis (Legendre and Anderson 1999; Oksanen 2013), using 300 permutations and the Bray-Curtis dissimilarity index as the default (Faith et al. 1987). The Adonis is suitable for small individual sample sizes (Zapala and Schork 2006) and useful for beta diversity studies (Oksanen 2013). Adonis was also used for further paired comparisons between treatments."
    ABSTRACT: The mutualism between ants and trophobiont insects is widely known as an interaction where the trophobionts are usually defended from their natural enemies by the ants. Ants can also remove herbivores, causing changes in the arthropod community structure and altering the stability of ecological communities. However, few studies have been conducted in field conditions to test the effects of mutualism on the associated arthropod community. In this study, we tested the hypothesis that the mutualism between trophobiont insects and ants decreases the abundance and species richness of the associated arthropod community, supporting a more stable community. We also investigated whether the abundance of specific arthropod feeding groups is affected by the mutualism and whether plants with and without mutualism form a mosaic with different species composition, increasing overall species richness. Our assumptions were experimentally examined in a system comprising the host plant Psittacanthus robustus, three trophobionts and two tending ants. We showed that, locally, the abundance and species richness of the whole arthropod community did not decrease when mutualism was present, but the feeding group composed by predators was negatively affected by mutualism. Plants with trophobionts but without ants presented the highest stability. At the landscape level, plants with and without mutualism differed in arthropod species composition, suggesting that there was a mosaic formed by plants with and without mutualism, enhancing the overall species richness (or beta diversity). Overall, our results revealed that the mutualism can alter the structure and stability of the surrounding arthropod community.
    Journal of Insect Conservation 06/2015; DOI:10.1007/s10841-015-9785-2 · 1.79 Impact Factor
    • "A simple fix would be an F-test, but it is sensitive to nonnormality (Box, 1953) of data being tested. To address this issue, the distance-based permutation test was developed for between-group comparisons in neuroimaging studies (Anderson and Legendre, 1999; Reiss et al., 2010; Zapala and Schork, 2006), which fits the purpose of the current study. For these reasons, the distance-based permutation test was used to detect differences between specific grouping criteria of interest (heretofore referred to as "group membership"): either between TD and WRD for a specific ROI, or across the L-OT/F ROIs, as well as initially to confirm that there were no site specific confounds."
    ABSTRACT: With the advent of neuroimaging techniques, especially functional MRI (fMRI), studies have mapped brain regions that are associated with good and poor reading, most centrally a region within the left occipito-temporal/fusiform region (L-OT/F) often referred to as the visual word form area (VWFA). Despite an abundance of fMRI studies of the VWFA, research about its structural connectivity has just started. Provided that the VWFA may be connected to distributed regions in the brain, it remains unclear how this network is engaged in constituting a well-tuned reading circuitry in the brain. Here we used diffusion MRI to study the structural connectivity patterns of the putative VWFA and surrounding areas within the L-OT/F in children with typically developing (TD) reading ability and with word recognition deficits (WRD; sometimes referred to as dyslexia). We found that L-OT/F connectivity varied along a posterior-anterior gradient, with specific structural connectivity patterns related to reading ability in the ROIs centered upon the putative VWFA. Findings suggest that the architecture of the VWFA connectivity is fundamentally different between TD and WRD, with TD showing greater connectivity to linguistic regions than WRD, and WRD showing greater connectivity to visual and parahippocampal regions than TD. Findings thus reveal clear structural abnormalities underlying the functional abnormalities in the VWFA in WRD.
    Brain Research 10/2014; DOI:10.1016/j.brainres.2014.08.050 · 2.83 Impact Factor
    • "Selection of a distance measure is necessary and is typically done a-priori. In preliminary comparisons of seven distance measures, we found that all had a similar number and pattern of MDMR results (mean Dice index = 0.63; Supplementary Tables 1-2), in line with prior simulations (Zapala and Schork, 2006). Larger differences were, however, found with the Euclidean and Mahalanobis distances (Supplementary Fig. 10)."
    ABSTRACT: The identification of phenotypic associations in high-dimensional brain connectivity data represents the next frontier in the neuroimaging connectomics era. Exploration of brain-phenotype relationships remains limited by statistical approaches that are computationally intensive, depend on a priori hypotheses, or require stringent correction for multiple comparisons. Here, we propose a computationally efficient, data-driven technique for connectome-wide association studies (CWAS) that provides a comprehensive voxel-wise survey of brain-behavior relationships across the connectome; the approach identifies voxels whose whole-brain connectivity patterns vary significantly with a phenotypic variable. Using resting state fMRI data, we demonstrate the utility of our analytic framework by identifying significant connectivity-phenotype relationships for full-scale IQ and assessing their overlap with existent neuroimaging findings, as synthesized by openly available automated meta-analysis ( The results appeared to be robust to the removal of nuisance covariates (i.e., mean connectivity, global signal, and motion) and varying brain resolution (i.e., voxelwise results are highly similar to results using 800 parcellations). We show that CWAS findings can be used to guide subsequent seed-based correlation analyses. Finally, we demonstrate the applicability of the approach by examining CWAS for three additional datasets, each encompassing a distinct phenotypic variable: neurotypical development, Attention-Deficit/Hyperactivity Disorder diagnostic status, and L-dopa pharmacological manipulation. For each phenotype, our approach to CWAS identified distinct connectome-wide association profiles, not previously attainable in a single study utilizing traditional univariate approaches. As a computationally efficient, extensible, and scalable method, our CWAS framework can accelerate the discovery of brain-behavior relationships in the connectome.
    NeuroImage 02/2014; 93. DOI:10.1016/j.neuroimage.2014.02.024 · 6.36 Impact Factor