DDoS-UNet: Incorporating temporal information using Dynamic Dual-channel UNet for enhancing super-resolution of dynamic MRI

Medical Imaging with Deep Learning – Under Review 2022. Short Paper – MIDL 2022 submission
Soumick Chatterjee (1), soumick.chatterjee@ovgu.de
Chompunuch Sarasaen (1), chompunuch.sarasaen@ovgu.de
Georg Rose (1), georg.rose@ovgu.de
Andreas Nürnberger (1), andreas.nuernberger@ovgu.de
Oliver Speck (1), oliver.speck@ovgu.de
(1) Otto von Guericke University Magdeburg, Germany
Editors: Under Review for MIDL 2022
Abstract
Dynamic MRI is an essential tool for interventions to visualise movements or changes in the
target organ. However, MRI acquisition with high temporal resolution suffers from limited
spatial resolution, which is known as the spatio-temporal trade-off. Several approaches,
including deep learning based super-resolution approaches, have been proposed to mitigate
this trade-off. Nevertheless, such approaches typically super-resolve each time-point
separately, treating them as individual volumes. This research addresses the problem
by creating a deep learning model that attempts to learn both spatial and temporal
relationships. The performance was tested with 3D dynamic data that was undersampled to
different in-plane levels. The proposed network achieved an average SSIM value of
0.951±0.017 while reconstructing the lowest resolution data (i.e. only 4% of the k-space
acquired), resulting in a theoretical acceleration factor of 25.
Keywords: Dynamic MRI, Super-Resolution, Dual-channel Training, Deep Learning
1. Introduction
Interventional MRIs, such as MR-guided liver biopsy, show excellent contrast between the
target organ or structure and adjacent soft tissue while visualising the changes in internal
organs during an examination. Such applications use dynamic MRI, which is obtained
by acquiring the k-space data (in the frequency domain) continuously and reconstructing a
sequence of images over time. However, while achieving high temporal resolution, these
acquisitions suffer from restricted spatial resolution because only a limited part of the
data can be measured (undersampling). Consequently, the resultant images can exhibit
reconstruction artefacts due to the violation of the Nyquist criterion, as well as a loss of
image resolution, which is known as the spatio-temporal trade-off of dynamic MRI.
Super-resolution is one of the techniques employed to mitigate this problem (Fathi et al.,
2020; Sarasaen et al., 2021). However, such single image super-resolution (SISR) techniques
treat each timepoint of the dynamic MRI as an independent image and therefore do not
exploit the inherent temporal properties of the dynamic MRI. This paper extends the previous
work (Sarasaen et al., 2021) into the temporal domain by supplying dual-channel inputs
(a prior image and a low-resolution image) to the deep learning model. The model thereby
learns the temporal relationship between timepoints while also learning the spatial
relationship between low- and high-resolution images to perform SISR, using the proposed
DDoS (Dynamic Dual-channel of Super-resolution) approach.
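The undersampling described in this section can be illustrated with a minimal sketch. It assumes that "a fraction of the centre k-space" means a central rectangular window in the in-plane frequency axes, followed by a zero-filled inverse FFT. The function name `undersample_centre_kspace`, the random test volume, and the choice of the last two axes as in-plane axes are illustrative assumptions and are not taken from the DDoS code base.

```python
import math

import numpy as np


def undersample_centre_kspace(volume, fraction):
    """Keep only the central `fraction` of in-plane k-space and zero the rest.

    `volume` is a real-valued array shaped (slices, height, width); the last
    two axes are treated as the in-plane axes (an assumption of this sketch).
    Returns the zero-filled low-resolution magnitude volume and the fraction
    of k-space samples that was actually kept.
    """
    # Keeping sqrt(fraction) of the samples along each in-plane axis leaves
    # roughly `fraction` of the in-plane k-space.
    keep = math.sqrt(fraction)
    h, w = volume.shape[-2:]
    kh, kw = max(1, round(h * keep)), max(1, round(w * keep))

    # Centred 2D FFT of every slice (low frequencies moved to the middle).
    kspace = np.fft.fftshift(np.fft.fft2(volume, axes=(-2, -1)), axes=(-2, -1))

    # Boolean mask that is True inside the central kh x kw window.
    mask = np.zeros((h, w), dtype=bool)
    top, left = (h - kh) // 2, (w - kw) // 2
    mask[top:top + kh, left:left + kw] = True

    # Zero-filled reconstruction from the retained centre of k-space.
    masked = np.fft.ifftshift(kspace * mask, axes=(-2, -1))
    low_res = np.abs(np.fft.ifft2(masked, axes=(-2, -1)))
    return low_res, float(mask.mean())


rng = np.random.default_rng(0)
volume = rng.random((4, 100, 100))  # synthetic stand-in for an MR volume
for fraction in (0.10, 0.0625, 0.04):
    low_res, kept = undersample_centre_kspace(volume, fraction)
    print(f"kept {kept:.4f} of k-space, theoretical acceleration {1 / fraction:.0f}x")
```

The theoretical acceleration factor is simply the reciprocal of the acquired fraction, so acquiring 4% of the k-space corresponds to a factor of 25.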
S. Chatterjee and C. Sarasaen contributed equally
©2022 S. Chatterjee, C. Sarasaen, G. Rose, A. Nürnberger & O. Speck.
Figure 1: Method overview of the two different phases: training and inference
2. Methodology
DDoS-UNet is a modified version of the dual-channel 3D UNet, which receives the
low-resolution image of the current time-point (LR_TP(n)) and a high-resolution prior image
(HR_p, such as the previous super-resolved time-point HR_TP(n-1)). Because dynamic
abdominal data were lacking, artificial dynamic training data were generated by applying
random elastic deformation to the static abdominal CHAOS benchmark dataset (Kavur et al.,
2021), comprising 80 volumes (40 subjects, in-phase and opposed-phase for each subject).
The dataset was divided into training and validation sets with a ratio of 70:30. For testing
the approach, high-resolution 3D static (breath-hold) and 3D "pseudo"-dynamic
(free-breathing) scans for 25 timepoints of five healthy subjects were acquired using a 3T
MRI. The network was trained and tested with three different levels of undersampling, by
taking 10%, 6.25%, or 4% of the centre k-space. Initially, during inference, the network is
supplied with a patient-specific fully sampled high-resolution (HR) static prior scan on the
first channel and the first timepoint (TP0) of the undersampled low-resolution (LR) dynamic
MRI on the second channel. Given this pair of HR-LR images, DDoS-UNet super-resolves the LR
image to obtain the TP0 of the super-resolved (SR) HR dynamic MRI. This initial phase is
termed here the "Antipasto" phase, as it precedes the main reconstruction phase. The
reconstruction phase starts by supplying this SR-TP0 on the first channel, while the LR-TP1
is supplied on the network's second channel to generate SR-TP1. This process is continued
recursively for all the subsequent timepoints. The approach is shown in Fig. 1, and the code
of this project is publicly available on GitHub: https://github.com/soumickmj/DDoS.
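The recursive inference scheme described above can be sketched as follows. This is only an illustration of the control flow: `recursive_super_resolution` and `toy_model` are hypothetical names, and `toy_model` is a trivial stand-in for the trained network rather than the DDoS-UNet itself.

```python
import numpy as np


def recursive_super_resolution(model, hr_static_prior, lr_timepoints):
    """Super-resolve a dynamic series one timepoint at a time.

    `model` is any callable that maps a 2-channel array (prior, low-res)
    to a super-resolved volume; here it stands in for the trained network.
    """
    # "Antipasto" phase: the first prior is the fully sampled static HR scan.
    prior = hr_static_prior
    sr_timepoints = []
    for lr in lr_timepoints:
        # Channel 0 carries the prior, channel 1 the current LR timepoint.
        two_channel = np.stack([prior, lr], axis=0)
        sr = model(two_channel)
        sr_timepoints.append(sr)
        # The SR output becomes the prior for the next timepoint.
        prior = sr
    return sr_timepoints


def toy_model(two_channel):
    # Hypothetical stand-in for the network: averages the two channels.
    return two_channel.mean(axis=0)


hr_prior = np.ones((2, 8, 8))
lr_series = [np.full((2, 8, 8), float(t)) for t in range(3)]
sr_series = recursive_super_resolution(toy_model, hr_prior, lr_series)
print(len(sr_series), sr_series[0].shape)
```

Because each output is fed back as the next prior, any error made at one timepoint can propagate to later ones, which is a property of the recursive design worth keeping in mind when evaluating long series.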
3. Evaluation
The performance of the DDoS-UNet was compared against two baseline deep learning models:
UNets identical to the DDoS-UNet except for the initial layer (unlike DDoS-UNet, these
UNets received a single input). One of them was trained on the original CHAOS dataset and
the other on the artificial dynamic CHAOS (the same training set as DDoS-UNet). The
quantitative results employing SSIM and PSNR are presented in Table 1, and a qualitative
comparison is shown in Fig. 2. The qualitative results show that the proposed DDoS-UNet
restored finer details better than the baselines, and the quantitative results corroborate
this, while the Mann-Whitney U-test indicated that the improvements were statistically
significant.

Figure 2: An example of reconstructed results from the UNet baselines and DDoS-UNet,
compared against the ground-truth (GT), for low-resolution images from 4% of k-space.
Results and difference images with respect to GT for the two ROIs: (a-d) UNet CHAOS,
(e-h) UNet CHAOS Dynamic, (i-l) DDoS-UNet.

Table 1: The mean and the standard deviation of SSIM and PSNR for each level of undersampling.

| Method | SSIM (10% of k-space) | PSNR (10% of k-space) | SSIM (6.25% of k-space) | PSNR (6.25% of k-space) | SSIM (4% of k-space) | PSNR (4% of k-space) |
|---|---|---|---|---|---|---|
| Trilinear Interpolation | 0.872±0.014 | 28.631±1.364 | 0.821±0.017 | 26.770±1.226 | 0.765±0.022 | 25.248±1.298 |
| Zero-padded | 0.949±0.013 | 36.138±1.753 | 0.910±0.018 | 29.761±1.640 | 0.863±0.021 | 32.520±1.508 |
| UNet (CHAOS) | 0.967±0.006 | 38.359±1.580 | 0.944±0.010 | 35.623±1.552 | 0.916±0.015 | 32.658±1.598 |
| UNet (CHAOS Dynamic) | 0.959±0.012 | 37.376±1.275 | 0.941±0.012 | 35.113±1.566 | 0.914±0.012 | 33.620±1.035 |
| DDoS-UNet | 0.980±0.006 | 41.824±2.070 | 0.967±0.011 | 39.494±2.121 | 0.951±0.017 | 37.557±2.179 |
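The kind of comparison reported in this section can be sketched on synthetic data as follows. This is a hedged illustration, not the evaluation script of the paper: the `psnr` helper, the noise levels, the intensity range of 1.0, and the one-sided alternative are all assumptions made for the example, and the numbers it produces are unrelated to Table 1.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images with the given intensity range."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)


rng = np.random.default_rng(0)
gt = rng.random((25, 16, 16))  # 25 synthetic "timepoints" as ground truth

# Two hypothetical reconstructions: the second has lower noise by construction.
baseline = gt + rng.normal(0.0, 0.05, gt.shape)
proposed = gt + rng.normal(0.0, 0.02, gt.shape)

psnr_baseline = [psnr(g, b) for g, b in zip(gt, baseline)]
psnr_proposed = [psnr(g, p) for g, p in zip(gt, proposed)]

# One-sided Mann-Whitney U-test: are the proposed PSNR values larger?
stat, p_value = mannwhitneyu(psnr_proposed, psnr_baseline, alternative="greater")
print(f"mean PSNR baseline: {np.mean(psnr_baseline):.2f} dB")
print(f"mean PSNR proposed: {np.mean(psnr_proposed):.2f} dB")
print(f"significant at the 0.05 level: {p_value < 0.05}")
```

The Mann-Whitney U-test is rank-based and does not assume normally distributed metric values, which makes it a common choice for comparing per-volume image quality scores.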
4. Conclusion
This research performs 3D volumetric super-resolution of low-resolution dynamic MRIs
by using a subject-specific high-resolution prior planning scan and exploiting the spatio-
temporal relationship present in the dynamic MRI, resulting in 0.951±0.017 SSIM for 4%
of the centre k-space, achieving statistically significant improvements over the baselines.
Given the reconstruction speed of the proposed approach, this can be a candidate for near
real-time dynamic acquisition scenarios, such as interventional MRI.
Acknowledgments
This research was supported by the ESF (project no. ZS/2016/08/80646).
References
Mojtaba F. Fathi et al. Super-resolution and denoising of 4D-Flow MRI using physics-informed
deep neural nets. Computer Methods and Programs in Biomedicine, 197:105729, 2020.
A. Emre Kavur et al. CHAOS Challenge - combined (CT-MR) healthy abdominal organ
segmentation. Medical Image Analysis, 69:101950, 2021.
Chompunuch Sarasaen, Soumick Chatterjee, et al. Fine-tuning deep learning model
parameters for improved super-resolution of dynamic MRI with prior-knowledge. Artificial
Intelligence in Medicine, 121:102196, 2021.