Medical Imaging with Deep Learning – Under Review 2022 Short Paper – MIDL 2022 submission
Reference-less SSIM Regression for Detection and
Quantification of Motion Artefacts in Brain MRIs
Alessandro Sciarra1 alessandro.sciarra@med.ovgu.de
Soumick Chatterjee1 soumick.chatterjee@ovgu.de
Max Dünnwald1 max.duennwald@med.ovgu.de
Giuseppe Placidi2 giuseppe.placidi@univaq.it
Andreas Nürnberger1 andreas.nuernberger@ovgu.de
Oliver Speck1 oliver.speck@ovgu.de
Steffen Oeltze-Jafra1 steffen.oeltze-jafra@med.ovgu.de
1 Otto von Guericke University Magdeburg, Germany
2 University of L'Aquila, Italy
Editors: Under Review for MIDL 2022
Abstract
Motion artefacts in magnetic resonance images can critically affect diagnosis, and quantification of the image degradation they cause is therefore required. Usually, image quality assessment is carried out by experts such as radiographers, radiologists and researchers. However, subjective evaluation requires time and depends strongly on the experience of the rater. In this work, an automated image quality assessment based on structural similarity index regression through ResNet models is presented. The results show that the trained models are able to regress the SSIM values with a high level of accuracy. When the predicted SSIM values were grouped into 10 classes and compared against the ground-truth motion classes, the best weighted accuracy of 89 ± 2% was observed with the RN-18 model trained with contrast augmentation.
Keywords: Motion artefacts, MRI, ResNet, Image quality assessment
1. Introduction
Image quality assessment (IQA) is a critical step to evaluate whether the quality of MR images can guarantee diagnostic reliability (Khosravy et al., 2019). Moreover, it is an important step for large clinical studies, as they typically require high-quality data. The evaluation process often requires time and is subjectively dependent upon the observer (Ma et al., 2020). The structural similarity index measure (SSIM) is a popular way of evaluating image quality objectively, but it requires reference images. If the images are artificially corrupted, the original non-corrupted images can serve as references to calculate the SSIM values; this is not possible during real-life acquisitions. This research proposes an automated IQA method to detect the presence of motion artefacts and quantify the level of corruption by regressing the SSIM values directly from the corrupted images using convolutional neural networks, without using any reference image.
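For context, the regression target is the classical reference-based SSIM, which is computable during training only because the corruption is synthetic and the clean slice still exists. A minimal sketch follows; the additive-noise stand-in for motion artefacts and the simplified single-window (global) SSIM are illustrative assumptions, as practical implementations average SSIM over local Gaussian windows:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    # Simplified single-window SSIM; stabilising constants follow
    # Wang et al.'s original formulation (C1, C2).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

rng = np.random.default_rng(0)
clean = rng.random((256, 256))                             # stand-in for a motion-free slice
corrupted = clean + 0.1 * rng.standard_normal((256, 256))  # noise as a stand-in for motion

print(global_ssim(clean, clean))      # identical images give exactly 1.0
print(global_ssim(clean, corrupted))  # degradation lowers the score below 1.0
```

At inference time no reference is available, which is precisely why a CNN is trained to predict this value from the corrupted image alone.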
2. Methodology
The proposed IQA method uses ResNet models (He et al., 2016) of different depths: ResNet-18 (RN-18) and ResNet-101 (RN-101). Given a 3D input volume during training, one random slice (2D image) is selected from one of the possible orientations: axial, sagittal, and coronal. To make the model more robust against changes in image contrast in clinical scenarios, four different random contrast augmentation (CA) techniques were employed during training: gamma manipulation, logarithmic manipulation, sigmoid manipulation, and adaptive histogram adjustment. Then, artificial motion corruption was applied to these images using two different methods: the RandomMotion functionality of TorchIO (Pérez-García et al., 2021) and a more physically realistic in-house line-wise motion corruption algorithm. The SSIM values between these artificially corrupted images and the corresponding non-corrupted images were calculated and used as the ground-truth values to train the models. The mean squared error (MSE) between these values and the models' predictions was used as the loss during training and optimised using the Adam optimiser, with a learning rate of 1e-3 and a batch size of 100 for 2000 epochs. A total of 300 MRI volumes with different acquisition devices and parameters were used in this research, split 200-50-50 for training, validation, and testing, respectively. T1, T2, PD, and FLAIR images acquired at three different sites using different devices were included (114 volumes at 3T, 93 volumes at 7T, 25 volumes with different 1.5T and 3T scanners), while the remaining 68 volumes were taken from the T1-, T2-, and PD-weighted MRIs of the publicly available IXI dataset. All trainings and evaluations were performed by combining all these different contrasts and other variations, to make the model generally applicable in clinical situations. All images were intensity-normalised by dividing by their maximum values and interpolated or padded to a 2D matrix size of 256×256.

A. Sciarra and S. Chatterjee contributed equally.
© 2022 A. Sciarra, S. Chatterjee, M. Dünnwald, G. Placidi, A. Nürnberger, O. Speck & S. Oeltze-Jafra.
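The normalisation and padding step described above can be sketched as follows; the helper name is hypothetical, and centre-cropping stands in for the interpolation branch, which is omitted for brevity:

```python
import numpy as np

def preprocess_slice(slice2d, size=256):
    # Intensity-normalise by the maximum value, as described above;
    # the epsilon guards against all-zero slices.
    img = slice2d.astype(np.float32) / max(float(slice2d.max()), 1e-8)
    # Centre-pad (or centre-crop, standing in for interpolation)
    # to a fixed size x size 2D matrix.
    out = np.zeros((size, size), dtype=np.float32)
    h, w = img.shape
    ch, cw = min(h, size), min(w, size)
    top, left = (size - ch) // 2, (size - cw) // 2
    ih, iw = (h - ch) // 2, (w - cw) // 2
    out[top:top + ch, left:left + cw] = img[ih:ih + ch, iw:iw + cw]
    return out

print(preprocess_slice(np.ones((200, 180))).shape)  # (256, 256)
```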
3. Evaluation
In order to evaluate the performance of these models, a scatter plot of the SSIM values has been used, specifically predicted against ground-truth values, as shown in Figure 1(ii). Both models trained with contrast augmentation show less dispersion. The term dispersion refers to the distance from the ideal linear function y = x with unit slope, where y is the predicted SSIM value and x the ground-truth value. The regression performance of the models was also evaluated using residual SSIM values (the difference between ground-truth and predicted SSIMs); the best-performing model, RN-18 with CA, achieved -0.0009 ± 0.0139. The predicted SSIM value can be considered a measure of the distortion or corruption level of the image. However, when applying this approach to a real clinical case, it is difficult to compare it with a subjective assessment. To get around this problem, the regression task was simplified into a classification by sub-dividing the SSIM range [0-1] into 10 classes: class-1: [0.00-0.10], class-2: [0.11-0.20], and so on. The SSIM values predicted by the models, as well as the ground-truth SSIMs, were converted into these 10 classes, referred to as the predicted classes and true classes, respectively; the results are shown in Figure 2. The best weighted accuracy was achieved by RN-18 with CA, 89 ± 2%, followed by RN-101 with CA (88 ± 2%), RN-18 without CA (87 ± 2%) and RN-101 without CA (86 ± 2%).
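The class conversion and scoring can be sketched as below; the exact bin edges and the support-weighted per-class recall used for "weighted accuracy" are plausible readings of the description above, not definitions taken from the paper:

```python
import numpy as np

def ssim_to_class(s):
    # Map SSIM in [0, 1] to classes 1..10: class 1 covers [0.0, 0.1],
    # class 2 covers (0.1, 0.2], and so on (bin edges are an assumption).
    return min(int(np.ceil(s * 10)), 10) if s > 0 else 1

def weighted_accuracy(true_cls, pred_cls):
    # Support-weighted mean of per-class recall; with these weights
    # it coincides with plain accuracy over the represented classes.
    true_cls, pred_cls = np.asarray(true_cls), np.asarray(pred_cls)
    classes, counts = np.unique(true_cls, return_counts=True)
    recalls = [np.mean(pred_cls[true_cls == c] == c) for c in classes]
    return float(np.sum(counts / counts.sum() * np.array(recalls)))

true_ssim = [0.05, 0.34, 0.72, 0.98]   # hypothetical ground-truth SSIMs
pred_ssim = [0.08, 0.29, 0.75, 0.95]   # hypothetical model predictions
true_cls = [ssim_to_class(s) for s in true_ssim]
pred_cls = [ssim_to_class(s) for s in pred_ssim]
print(true_cls, pred_cls)                     # [1, 4, 8, 10] [1, 3, 8, 10]
print(weighted_accuracy(true_cls, pred_cls))  # 0.75
```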
4. Conclusion
This research presents an SSIM-regression-based IQA technique using ResNet models, coupled with contrast augmentations to make them robust against changes in image contrast in clinical scenarios. The method managed to predict the SSIM values from artificially motion-corrupted images, without the ground-truth (motion-free) images, with high accuracy (residual SSIMs as low as -0.0009 ± 0.0139). Moreover, the motion classes obtained from the predicted SSIMs were very close to the true ones and achieved a weighted accuracy of 89 ± 2%. Considering the complexity of quantifying the image degradation level due to motion artefacts, and additionally the variability of contrast type, resolution, etc., the results obtained are promising. Further evaluations, including subjective evaluation, will be performed on clinical data to judge the method's clinical applicability and robustness against changes in real-world scenarios.

Figure 1: (i): Samples of artificially corrupted images. The left column shows the original images; the right column shows the images corrupted using (a) TorchIO and (b) the in-house algorithm. (ii): (bottom-left) dispersion plot of predicted vs. ground-truth SSIM values; (top-left) histogram of the ground-truth SSIM values; (bottom-right) histograms of the predicted SSIM values.

Figure 2: Confusion matrices for the classification task. From left to right: RN-18 without contrast augmentation (CA), RN-18 with CA, RN-101 without CA, RN-101 with CA.
Acknowledgments
This work was supported by the state of Saxony-Anhalt ('I 88') and the ESF (ZS/2016/08/80646).
References
Kaiming He et al. Deep residual learning for image recognition. In IEEE CVPR, 2016.
Mahdi Khosravy et al. Image quality assessment: A review to full reference indexes. Recent
trends in communication, computing, and electronics, 2019.
Jeffrey J Ma et al. Diagnostic image quality assessment and classification in medical imaging:
Opportunities and challenges. In IEEE ISBI, 2020.
Fernando Pérez-García et al. TorchIO: a Python library for efficient loading, preprocessing, augmentation and patch-based sampling of medical images in deep learning. Computer Methods and Programs in Biomedicine, 208:106236, 2021.