Mapping reliability in multicenter MRI: Voxel-based morphometry and cortical thickness
ABSTRACT Multicenter structural MRI studies can have greater statistical power than single-center studies. However, across-center differences in contrast sensitivity, spatial uniformity, etc., may lead to tissue classification or image registration differences that could reduce or wholly offset the enhanced statistical power of multicenter data. Prior work has validated volumetric multicenter MRI, but robust methods for assessing reliability and power of multisite analyses with voxel-based morphometry (VBM) and cortical thickness measurement (CORT) are not yet available. We developed quantitative methods to investigate the reproducibility of VBM and CORT to detect group differences and estimate heritability when MRI scans from different scanners running different acquisition protocols in a multicenter setup are included. The method produces brain maps displaying information such as lowest detectable effect size (or heritability) and effective number of subjects in the multicenter study. We applied the method to a five-site multicenter calibration study using scanners from four different manufacturers, running different acquisition protocols. The reliability maps showed an overall good comparability between the sites, providing a reasonable gain in sensitivity in most parts of the brain. In large parts of the cerebrum and cortex, scan pooling improved heritability estimates, with "effective-N" values up to the theoretical maximum. For some areas, "optimal-pool" maps indicated that leaving out a site would give better results. The reliability maps also reveal which brain regions are in any case difficult to measure reliably (e.g., around the thalamus). These tools will facilitate the design and analysis of multisite VBM and CORT studies for detecting group differences and estimating heritability.
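The notion of a "lowest detectable effect size" can be illustrated with a minimal sketch. It is not the authors' mapping method: it assumes a plain two-sided, two-group comparison with equal group sizes and a normal approximation, and the function name `min_detectable_d` is hypothetical. It shows only how the smallest detectable standardized difference shrinks as the number of subjects grows, which is the quantity an effective-N value would feed into.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_d(n_per_group: int, alpha: float = 0.05, power: float = 0.8) -> float:
    """Smallest standardized group difference (Cohen's d) detectable with the
    given power in a two-sided two-sample test (normal approximation).

    Assumption: equal group sizes and equal variances; this is a textbook
    approximation, not the procedure described in the abstract.
    """
    if n_per_group < 2:
        raise ValueError("n_per_group must be at least 2")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) * sqrt(2.0 / n_per_group)


if __name__ == "__main__":
    # Illustrative group sizes only; none of these come from the study.
    for n in (20, 50, 100):
        print(f"n per group = {n:3d}: detectable d ~ {min_detectable_d(n):.2f}")
```

Evaluating such a function voxel by voxel, with the per-voxel effective number of subjects in place of `n_per_group`, would be one way to picture a detectable-effect-size map, under the assumptions stated above.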
Source available from: Harald J Hampel
ABSTRACT: White matter (WM) magnetic resonance imaging (MRI) hyperintensities are common in Alzheimer's disease (AD), but their pathophysiological relevance and relationship to genetic factors are unclear. In the present study, we investigated potential apolipoprotein E (APOE)-dependent effects on the extent and cognitive impact of WM hyperintensities in patients with AD. WM hyperintensity volume on fluid-attenuated inversion recovery images of 201 patients with AD (128 carriers and 73 non-carriers of the APOE ε4 risk allele) was determined globally as well as regionally with voxel-based lesion mapping. Clinical, neuropsychological and MRI data were collected from prospective multicenter trials conducted by the German Dementia Competence Network. WM hyperintensity volume was significantly greater in non-carriers of the APOE ε4 allele. Lesion distribution was similar among ε4 carriers and non-carriers. Only ε4 non-carriers showed a correlation between lesion volume and cognitive performance. The current findings indicate an increased prevalence of WM hyperintensities in non-carriers compared with carriers of the APOE ε4 allele among patients with AD. This is consistent with a possibly more pronounced contribution of heterogeneous vascular risk factors to WM damage and cognitive impairment in patients with AD without APOE ε4-mediated risk.
Alzheimer's Research and Therapy 12/2015; 7(1). DOI:10.1186/s13195-015-0111-8 · 3.50 Impact Factor
ABSTRACT: FreeSurfer is a tool to quantify cortical and subcortical brain anatomy automatically and noninvasively. Previous studies have reported reliability and statistical power analyses in relatively small samples or selected only one aspect of brain anatomy. Here, we investigated reliability and statistical power of cortical thickness, surface area, volume, and the volume of subcortical structures in a large sample (N=189) of healthy elderly subjects (64+ years). Reliability (intraclass correlation coefficient) of cortical and subcortical parameters is generally high (cortical: ICCs>0.87, subcortical: ICCs>0.95). Surface-based smoothing increases reliability of cortical thickness maps, while it decreases reliability of cortical surface areas and volumes. Nevertheless, statistical power of all measures benefits from smoothing. When aiming to detect a 10% difference between groups, the number of subjects required to test effects with sufficient power over the entire cortex varies between cortical measures (cortical thickness: N=39, surface area: N=21, volume: N=81; 10 mm smoothing, power=0.8, α=0.05). For subcortical regions this number is between 16 and 76 subjects, depending on the region. We also demonstrate the advantage of within-subject designs over between-subject designs. Furthermore, we publicly provide a tool that allows researchers to perform a priori power analysis and sensitivity analysis to help evaluate previously published studies and to design future studies with sufficient statistical power. Copyright © 2014 Elsevier Inc. All rights reserved.
NeuroImage 12/2014; 108. DOI:10.1016/j.neuroimage.2014.12.035 · 6.13 Impact Factor
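The link between reliability and required sample size in an a priori power analysis can be sketched roughly as follows. This is not the tool the authors provide. It assumes a two-sided, two-group comparison under a normal approximation, and it assumes that imperfect reliability attenuates the observable standardized effect by the square root of the ICC (a classical test theory result). The function name `required_n_per_group` and the coefficient of variation used in the example are hypothetical, chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_n_per_group(pct_diff: float, cv: float, icc: float = 1.0,
                         alpha: float = 0.05, power: float = 0.8) -> int:
    """Subjects per group needed to detect a relative group difference.

    pct_diff: group difference as a fraction of the mean (0.10 for 10%).
    cv:       true between-subject coefficient of variation (SD / mean);
              an assumed input, not a value reported in the abstract.
    icc:      measurement reliability; the observable effect size is taken
              to be the true effect size times sqrt(icc).
    """
    if pct_diff <= 0 or cv <= 0:
        raise ValueError("pct_diff and cv must be positive")
    if not 0 < icc <= 1:
        raise ValueError("icc must lie in (0, 1]")
    d_true = pct_diff / cv            # standardized effect without measurement error
    d_observed = d_true * sqrt(icc)   # attenuated by imperfect reliability
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * (z_sum / d_observed) ** 2)


if __name__ == "__main__":
    # cv = 0.15 is a made-up value for illustration only.
    for icc in (1.0, 0.95, 0.87):
        n = required_n_per_group(pct_diff=0.10, cv=0.15, icc=icc)
        print(f"ICC = {icc:.2f}: about {n} subjects per group")
```

Under these assumptions the required sample size scales with 1/ICC, so lower reliability translates directly into more subjects for the same power. The numbers this sketch produces are not comparable to the N values reported in the abstract, which were derived over the entire cortex with the authors' own procedure.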
ABSTRACT: Automated gray matter segmentation of magnetic resonance imaging data is essential for morphometric analyses of the brain, particularly when large sample sizes are investigated. However, although detection of small structural brain differences may fundamentally depend on the method used, both accuracy and reliability of different automated segmentation algorithms have rarely been compared. Here, performance of the segmentation algorithms provided by SPM8, VBM8, FSL and FreeSurfer was quantified on simulated and real magnetic resonance imaging data. First, accuracy was assessed by comparing segmentations of 20 simulated and 18 real T1 images with corresponding ground truth images. Second, reliability was determined in ten T1 images from the same subject and in ten T1 images of different subjects scanned twice. Third, the impact of preprocessing steps on segmentation accuracy was investigated. VBM8 showed a very high accuracy and a very high reliability. FSL achieved the highest accuracy but demonstrated poor reliability, and FreeSurfer showed the lowest accuracy but high reliability. A universally valid recommendation on how to implement morphometric analyses is not warranted due to the vast number of scanning and analysis parameters. However, our analysis suggests that researchers can optimize their individual processing procedures with respect to final segmentation quality and exemplifies adequate performance criteria.
PLoS ONE 09/2012; 7(9):e45081. DOI:10.1371/journal.pone.0045081 · 3.53 Impact Factor