Article

Convolutional Neural Network With Shape Prior Applied to Cardiac MRI Segmentation


Abstract

In this paper, we present a novel convolutional neural network (CNN) architecture to segment images from a series of short-axis cardiac magnetic resonance slices (CMRI). The proposed model is an extension of the U-net that embeds a cardiac shape prior and involves a loss function tailored to the cardiac anatomy. Our system takes as input raw MR images, requires no manual preprocessing or image cropping, and is trained to segment the endocardium and epicardium of the left ventricle, the endocardium of the right ventricle, as well as the center of the left ventricle. With its multi-resolution grid architecture, the network learns both high- and low-level features useful to register the shape prior as well as accurately localize the borders of the cardiac regions. Experimental results obtained on the ACDC-MICCAI 2017 dataset show that our model segments multi-slice CMRI (left and right ventricle contours) in 0.17 seconds with an average Dice coefficient of 0.91 and an average 3D Hausdorff distance of 9.5 mm.
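The paper itself provides no code, but as a rough illustration of how a fuzzy shape prior can be attached to a segmentation objective, the sketch below adds a prior-weighted penalty to a standard cross-entropy term. The names (`shape_prior`, `lambda_prior`) and the specific formulation are placeholders and not the authors' actual loss, which also involves contour terms and a predicted left-ventricle center.

```python
# Illustrative sketch (not the authors' code): cross-entropy segmentation loss
# augmented with a fuzzy shape prior, i.e. a per-pixel class-probability map
# estimated from aligned training masks. All names are hypothetical.
import torch
import torch.nn.functional as F

def shape_prior_loss(logits, target, shape_prior, lambda_prior=0.1):
    """
    logits:      (N, C, H, W) raw network outputs
    target:      (N, H, W)    integer ground-truth labels
    shape_prior: (C, H, W)    probability of each class at each position,
                              assumed registered to the network's output grid
    """
    # Standard per-pixel cross-entropy against the ground truth.
    ce = F.cross_entropy(logits, target)

    # Penalize predicted class probabilities that are unlikely under the prior:
    # the lower the prior probability, the higher the cost of the predicted
    # mass placed at that position.
    probs = torch.softmax(logits, dim=1)                 # (N, C, H, W)
    prior = shape_prior.unsqueeze(0).clamp(1e-6, 1.0)    # broadcast over batch
    prior_penalty = -(probs * torch.log(prior)).mean()

    return ce + lambda_prior * prior_penalty
```

In the paper, the prior is registered to the image via the predicted left-ventricle center before it influences the loss; the term above only shows the general mechanism of down-weighting predictions that the prior considers unlikely.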


... However, that same study found that the main limitation of such methods is the frequent production of segmentations with anatomically impossible shapes. Recent methods have sought to correct these errors by adding some form of shape constraint to the deep learning method [Yuan et al. 2018, Zotti et al. 2019]. However, such approaches still produce results with anatomical errors. ...
... Deep learning methods, in turn, automatically learn from examples a function capable of classifying different regions of the CMR image. Learning is associated with minimizing a loss function, which is usually formulated in terms of the classification error of each individual pixel (e.g., Dice loss [Zotti et al. 2019]). However, although this encourages the production of approximately correct segmentations, the minimization does not necessarily favor anatomically correct segmentations. ...
Conference Paper
Information systems are evolving to process multimedia data. Automatic segmentation of the left ventricle in medical examinations to support diagnosis is a multidisciplinary challenge in cardiology. Several approaches have been proposed, most notably deep learning networks, which have achieved excellent performance but still produce segmentations with anatomical errors. Given this limitation, this work presents a hybrid segmentation method that combines deep learning and deformable models with shape constraints. The combination favors the production of anatomically more consistent segmentations. Results indicate that the method is competitive and generalizes well.
... Khened et al. 56 first extracted a region of interest with Fourier analyses after which a dense-Net analyzed the region. Zotti et al. 57 used a multi-resolution U-Net and additionally incorporated a cardiac shape prior. Painchaud et al. 58 used a U-Net to obtain segmentation predictions of the cardiac structures, which were subsequently converted into anatomically correct ones using an adversarial variational auto-encoder. ...
... In recent years, all three network architectures have shown a good performance for medical image segmentation tasks. 7,39,50,52,53,55,57,58 Other network architectures, such as DeepLabV3 with atrous spatial pyramid pooling 59 or multi-scale CNNs 44,60 could also have been included in our ensemble to increase diversity or to obtain a larger ensemble. Increasing the number of networks in an ensemble might also lead to a better performance and would therefore be an interesting topic for future research. ...
Article
Purpose: Radiologists exhibit wide inter-reader variability in diagnostic performance. This work aimed to compare different feature sets to predict if a radiologist could detect a specific liver metastasis in contrast-enhanced computed tomography (CT) images and to evaluate possible improvements in individualizing models to specific radiologists. Approach: Abdominal CT images from 102 patients, including 124 liver metastases in 51 patients, were reconstructed at five different kernels/doses using projection-domain noise insertion to yield 510 image sets. Ten abdominal radiologists marked suspected metastases in all image sets. Potentially salient features predicting metastasis detection were identified in three ways: (i) logistic regression based on human annotations (semantic), (ii) random forests based on radiologic features (radiomic), and (iii) inductive derivation using convolutional neural networks (CNN). For all three approaches, generalized models were trained using metastases that were detected by at least two radiologists. Conversely, individualized models were trained using each radiologist's markings to predict reader-specific metastases detection. Results: In fivefold cross-validation, both individualized and generalized CNN models achieved higher area under the receiver operating characteristic curves (AUCs) than semantic and radiomic models in predicting reader-specific metastases detection ability (p < 0.001). The individualized CNN, with a mean (SD) AUC of 0.85 (0.04), outperformed the generalized one [AUC = 0.78 (0.06), p = 0.004]. The individualized semantic [AUC = 0.70 (0.05)] and radiomic models [AUC = 0.68 (0.06)] outperformed the respective generalized versions [semantic AUC = 0.66 (0.03), p = 0.009; radiomic AUC = 0.64 (0.06), p = 0.03]. Conclusions: Individualized models slightly outperformed generalized models for all three feature sets. Inductive CNNs were better at predicting metastases detection than semantic or radiomic features. Generalized models have implementation advantages when individualized data are unavailable.
... Khened et al. 56 first extracted a region of interest with Fourier analyses after which a dense-Net analyzed the region. Zotti et al. 57 used a multi-resolution U-Net and additionally incorporated a cardiac shape prior. Painchaud et al. 58 used a U-Net to obtain segmentation predictions of the cardiac structures, which were subsequently converted into anatomically correct ones using an adversarial variational auto-encoder. ...
... In recent years, all three network architectures have shown a good performance for medical image segmentation tasks. 7,39,50,52,53,55,57,58 Other network architectures, such as DeepLabV3 with atrous spatial pyramid pooling 59 or multi-scale CNNs 44,60 could also have been included in our ensemble to increase diversity or to obtain a larger ensemble. Increasing the number of networks in an ensemble might also lead to a better performance and would therefore be an interesting topic for future research. ...
Article
Purpose: Ensembles of convolutional neural networks (CNNs) often outperform a single CNN in medical image segmentation tasks, but inference is computationally more expensive and makes ensembles unattractive for some applications. We compared the performance of differently constructed ensembles with the performance of CNNs derived from these ensembles using knowledge distillation, a technique for reducing the footprint of large models such as ensembles. Approach: We investigated two different types of ensembles, namely, diverse ensembles of networks with three different architectures and two different loss-functions, and uniform ensembles of networks with the same architecture but initialized with different random seeds. For each ensemble, additionally, a single student network was trained to mimic the class probabilities predicted by the teacher model, the ensemble. We evaluated the performance of each network, the ensembles, and the corresponding distilled networks across three different publicly available datasets. These included chest computed tomography scans with four annotated organs of interest, brain magnetic resonance imaging (MRI) with six annotated brain structures, and cardiac cine-MRI with three annotated heart structures. Results: Both uniform and diverse ensembles obtained better results than any of the individual networks in the ensemble. Furthermore, applying knowledge distillation resulted in a single network that was smaller and faster without compromising performance compared with the ensemble it learned from. The distilled networks significantly outperformed the same network trained with reference segmentation instead of knowledge distillation. Conclusion: Knowledge distillation can compress segmentation ensembles of uniform or diverse composition into a single CNN while maintaining the performance of the ensemble.
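For readers unfamiliar with the distillation step described in this abstract, the following sketch shows a common temperature-scaled formulation in which a student segmentation network matches the teacher ensemble's soft class probabilities while still seeing the reference labels; the exact objective and hyperparameters used in the study may differ.

```python
# Hedged sketch of temperature-scaled knowledge distillation for segmentation.
# The weighting and temperature values are illustrative, not from the paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, target,
                      temperature=2.0, alpha=0.5):
    """
    student_logits, teacher_logits: (N, C, H, W)
    target: (N, H, W) integer reference labels
    """
    # Soft targets: KL divergence between softened teacher and student maps.
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=1)
    log_student = F.log_softmax(student_logits / t, dim=1)
    kd = F.kl_div(log_student, soft_teacher, reduction="batchmean") * (t * t)

    # Hard targets: ordinary cross-entropy against the reference segmentation.
    ce = F.cross_entropy(student_logits, target)

    return alpha * kd + (1.0 - alpha) * ce
```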
... Only the top-four methods [19], [43], [44], [45] from the challenge leaderboard are used for comparison. We briefly discuss the architectures of the above-mentioned methods: Zotti et al. [45] employed a multi-resolution grid structure, which can be considered an extension of the U-Net; Khened et al. [43] developed a two-stage U-Net-based network by embedding densely connected blocks in place of the convolutional layers of the standard U-Net; Isensee et al. [44] developed an ensemble of 10 models based on 2D and 3D U-Net architectures; and Painchaud et al. [19] designed an adversarial variational autoencoder for producing better segmentation results. As shown in Table 10, the proposed MMC-Net has achieved better DC than the other four models in segmenting all three structures. ...
Preprint
Automatic segmentation of multi-modal cardiac magnetic resonance imaging (CMRI) scans is challenging due to the varying intensity distribution and unclear boundaries between neighbouring tissues and other organs. Deep convolutional neural networks have shown great potential in medical image segmentation tasks. In this paper, we present a deep convolutional neural network model named Multi-Modal Cardiac Network (MMC-Net) for segmenting three cardiac structures, namely the right ventricle (RV), left ventricle (LV), and left ventricular myocardium (LVM), from multi-modal CMRIs. The proposed MMC-Net is designed using a densely connected backbone enabling feature reuse, an atrous convolution module for fusing multi-scale features, and a pixel-classification module for generating the segmentation result. The model was evaluated on the publicly available MS-CMRSeg-2019 challenge dataset for segmentation of the RV, LV, and LVM from CMRI scans. The segmentation results from extensive experiments demonstrate that MMC-Net achieves better segmentation performance than other state-of-the-art models and existing approaches. Additionally, the generalisation ability of the proposed MMC-Net is validated on the publicly available ACDC dataset without fine-tuning. The results demonstrate that the proposed MMC-Net has a powerful generalisation ability, segmenting the RV, LV, and LVM with high performance.
... In this section, we described several DL-based CMRI segmentation networks. Anatomical prior constraints (e.g., contour [159] and shape [107,156,186]) force the model to produce more accurate segmentation results. For example, Oktay et al. [107] proposed an ACNN model that embeds prior knowledge into CNN-based segmentation through an autoencoder network. ...
... Zotti et al. [156] developed a GridNet-based network that incorporates a cardiac shape prior to aid cine cardiac MRI segmentation. Unlike these models, Painchaud et al. [178] developed a variational autoencoder (VAE) to refine the network's output by correcting anatomically implausible segmentation masks in a post-processing step. ...
Thesis
Deep learning architectures for automatic detection of viable myocardial segments. Accurate myocardial segmentation in LGE-MRI is important for assisting the diagnosis of infarcted patients. Nevertheless, manual delineation of target volumes is time-consuming and subject to intra- and inter-observer variability. This thesis aims at developing efficient deep learning-based methods for automatically segmenting myocardial tissues (healthy myocardium, myocardial infarction, and microvascular obstruction) in LGE-MRI. In this regard, we first proposed a 2.5D SegU-Net model based on a fusion framework (U-Net and SegNet) to learn different feature representations adaptively. We then extended it to new 3D architectures to benefit from additional depth cues. In a second step, we proposed to segment the anatomical structures using an inception residual block and a convolutional block attention module, and the diseased regions using a 3D auto-encoder to refine the myocardial shape. To this end, a prior shape penalty term is added to the 3D U-Net architecture. Finally, we proposed to first segment the left ventricular cavity and the myocardium based on the no-new-U-Net, and then to use a priori inclusion and classification networks to maintain the topological constraints of pathological tissues within the pre-segmented myocardium. We introduced a post-processing decision phase to reduce the uncertainty of the model. The state-of-the-art performance of the proposed methods is validated on the EMIDEC dataset, comprising 100 training images and 50 test images from healthy and infarcted patients. Comprehensive empirical evaluations show that all of our algorithms yield promising results.
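Several of the works cited around here rely on a probabilistic ("fuzzy") shape prior estimated from training data, i.e., a per-voxel class-frequency map computed over aligned label maps. The NumPy sketch below shows the simplest version of that construction; it is an illustration under the assumption of pre-registered masks, not any of the cited implementations.

```python
# Minimal sketch: estimate a fuzzy shape prior from aligned training label maps.
# masks is assumed to be an array of shape (num_cases, H, W) with integer labels.
import numpy as np

def build_fuzzy_shape_prior(masks, num_classes):
    """Return an array of shape (num_classes, H, W) with per-voxel class frequencies."""
    prior = np.zeros((num_classes,) + masks.shape[1:], dtype=np.float64)
    for c in range(num_classes):
        prior[c] = (masks == c).mean(axis=0)   # fraction of cases labelled c here
    return prior

# Example with toy data: 20 aligned 64x64 label maps, 4 classes (background,
# RV, myocardium, LV) -- purely synthetic values for illustration.
rng = np.random.default_rng(0)
toy_masks = rng.integers(0, 4, size=(20, 64, 64))
prior = build_fuzzy_shape_prior(toy_masks, num_classes=4)
assert np.allclose(prior.sum(axis=0), 1.0)     # probabilities sum to 1 per voxel
```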
... The selected atlases are then non-rigidly aligned to the predicted segmentation and labels of the warped atlases are fused to obtain the final segmentation. Zotti et al. (2019) constructed a fuzzy shape prior representing the probability for a voxel to be part of the object of interest. This shape prior is integrated in a segmentation CNN which also predicts the object's center position. ...
... These feature maps are jointly used for separate regression and semantic segmentation branches. Combining segmentation with regression in a similar way in multitask learning problems was used by Vigneault et al. (2018), Zotti et al. (2019) and Yue et al. (2019) to estimate cardiac pose, by Gessert and Schlaefer (2020) and Tilborghs and Maes (2020) to perform direct quantification of LV parameters and by Cao et al. (2018) for simultaneous hippocampus segmentation and clinical score regression from brain MR images. In our approach, we explicitly enforce consistency between predicted shape and predicted semantic segmentation by introducing two new loss functions, one minimizing the distance between the contours of the two representations and the other maximizing their overlap. ...
Preprint
Semantic segmentation using convolutional neural networks (CNNs) is the state-of-the-art for many medical image segmentation tasks including myocardial segmentation in cardiac MR images. However, the predicted segmentation maps obtained from such standard CNNs do not allow direct quantification of regional shape properties such as regional wall thickness. Furthermore, the CNNs lack explicit shape constraints, occasionally resulting in unrealistic segmentations. In this paper, we use a CNN to predict shape parameters of an underlying statistical shape model of the myocardium learned from a training set of images. Additionally, the cardiac pose is predicted, which allows the myocardial contours to be reconstructed. The integrated shape model regularizes the predicted contours and guarantees realistic shapes. We enforce robustness of shape and pose prediction by simultaneously performing pixel-wise semantic segmentation during training and define two loss functions to impose consistency between the two predicted representations: one distance-based loss and one overlap-based loss. We evaluated the proposed method in a 5-fold cross-validation on an in-house clinical dataset with 75 subjects and on the ACDC and LVQuan19 public datasets. We show the benefits of simultaneous semantic segmentation and the two newly defined loss functions for the prediction of shape parameters. Our method achieved a correlation of 99% for left ventricular (LV) area on the three datasets, between 91% and 97% for myocardial area, 98-99% for LV dimensions and between 80% and 92% for regional wall thickness.
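To make the shape-parameter regression described in this abstract concrete, the sketch below reconstructs a contour from predicted PCA coefficients and a similarity pose (scale, rotation, translation), following generic statistical-shape-model conventions. The function and variable names are hypothetical and not taken from the paper.

```python
# Illustrative statistical-shape-model reconstruction: contour points are the
# mean shape plus a linear combination of PCA modes, mapped into image space
# by a predicted similarity transform. Names are hypothetical.
import numpy as np

def reconstruct_contour(mean_shape, modes, coeffs, scale, theta, translation):
    """
    mean_shape:  (P, 2) mean contour points in the model frame
    modes:       (K, P, 2) principal shape variation modes
    coeffs:      (K,) predicted shape coefficients
    scale, theta, translation: predicted pose (isotropic scale, rotation angle,
                               (2,) translation in pixels)
    """
    # Shape in the normalized model frame.
    shape = mean_shape + np.tensordot(coeffs, modes, axes=1)   # (P, 2)

    # Apply the similarity transform to move the contour into the image.
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    return scale * shape @ rot.T + translation

# Toy usage: a 30-point contour, 5 modes, arbitrary predicted values.
rng = np.random.default_rng(0)
pts = reconstruct_contour(rng.normal(size=(30, 2)), rng.normal(size=(5, 30, 2)),
                          rng.normal(size=5), scale=20.0, theta=0.3,
                          translation=np.array([112.0, 96.0]))
```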
... The autoencoder structure with a low-dimensional latent space representation also allows for the straightforward integration with other common image-based tasks while maintaining a good degree of interpretability. Such tasks include the detection of coronary artery disease (27) and hypertrophic cardiomyopathy (28), image segmentation with shape priors (29)(30)(31)(32), multi-task segmentation and regression (33), image-to-image synthesis (33), and survival prediction (34). However, the aforementioned approaches mostly rely on representing cardiac shapes as fixed-size 3D voxelgrids and use standard grid-based deep learning operations. ...
Article
Full-text available
Cardiac anatomy and function vary considerably across the human population with important implications for clinical diagnosis and treatment planning. Consequently, many computer-based approaches have been developed to capture this variability for a wide range of applications, including explainable cardiac disease detection and prediction, dimensionality reduction, cardiac shape analysis, and the generation of virtual heart populations. In this work, we propose a variational mesh autoencoder (mesh VAE) as a novel geometric deep learning approach to model such population-wide variations in cardiac shapes. It embeds multi-scale graph convolutions and mesh pooling layers in a hierarchical VAE framework to enable direct processing of surface mesh representations of the cardiac anatomy in an efficient manner. The proposed mesh VAE achieves low reconstruction errors on a dataset of 3D cardiac meshes from over 1,000 patients with acute myocardial infarction, with mean surface distances between input and reconstructed meshes below the underlying image resolution. We also find that it outperforms a voxelgrid-based deep learning benchmark in terms of both mean surface distance and Hausdorff distance while requiring considerably less memory. Furthermore, we explore the quality and interpretability of the mesh VAE's latent space and showcase its ability to improve the prediction of major adverse cardiac events over a clinical benchmark. Finally, we investigate the method's ability to generate realistic virtual populations of cardiac anatomies and find good alignment between the synthesized and gold standard mesh populations in terms of multiple clinical metrics.
... The approach is divided into two categories: LV localization and contour segmentation, including shape feature detection [33,34]; LV segmentation and function estimation using deep learning [35,36,37,38,39]; automatic identification of blood vessels using a cascading classifier [40]; a diffusion-based unsupervised clustering technique for myocardial motion pattern classification [41]; CNN and U-Net approaches [42]; a multi-input fusion network [43]; cardiac motion measurement using a surface structure feature matching algorithm [44]; and deep learning methods using deformable models, level sets, and thresholding for automatic LV contour segmentation [45,46,47,48]. Another major challenge is the region of interest (ROI) for automatic contour segmentation. ...
Preprint
Full-text available
Automatic segmentation is the process of detecting and extracting information to simplify the representation of the left ventricle (LV) contour in cardiac magnetic resonance images (CMRI). This segmented information helps to reduce the segmentation error between expert and automatically segmented contours; the error represents missing region values, calculated as a percentage after segmenting a cardiac LV contour. This review paper discusses the three major segmentation approaches, namely manual, semi-automatic, and fully automatic, along with the segmentation models, namely image-based models, region-based models, edge-based models, deformable-based models, active shape models (ASM), active contour models (ACM), level-set models (LSM), and variational LSM (VLSM). The review explains in depth the performance of segmentation models using different techniques. Furthermore, the review compares 122 studies on segmentation model approaches, i.e., 16 from 2004 to 2010, 40 from 2011 to 2016, and 63 from 2017 to 2021, plus 3 other related studies, covering LV contour segmentation, cardiac function, area-at-risk (AAR) identification, scar tissue classification, oedema tissue classification, and identification via presence, size, and location. Given the large number of articles on CMR-LV images that have been published, this review conducted a critical analysis and found gaps for researchers in the areas of LV localization, LV contour segmentation, cardiac function, and oedema tissue classification and segmentation. Based on this critical analysis, the paper summarises the research gap and makes useful suggestions for new CMR-LV researchers. Remaining cardiac segmentation challenges are also discussed in each section of the review.
... The approach is divided into two categories: LV localization and contour segmentation, including shape feature detection [33], [34]; LV segmentation and function estimation using deep learning [35]-[39]; automatic identification of blood vessels using a cascading classifier [40]; a diffusion-based unsupervised clustering technique for myocardial motion pattern classification [41]; CNN and U-Net approaches [42]; a multi-input fusion network [43]; cardiac motion measurement using a surface structure feature matching algorithm [44]; and deep learning methods using deformable models, level sets, and thresholding for automatic LV contour segmentation [45]-[48]. Another major challenge is the region of interest (ROI) for automatic contour segmentation. ...
Preprint
A review paper about cardiac MRI image segmentation.
... Khened et al. 33 proposed a densely connected fully convolutional network (FCN) that does not use the skip connections of the U-Net architecture. Zotti et al. 42 used a multi-resolution GridNet architecture, which is an extension of the U-Net architecture. Painchaud et al. 43 proposed an adversarial variational auto-encoder (aVAE) for anatomically plausible segmentation. ...
Article
Purpose: Cardiac ventricle segmentation from cine magnetic resonance imaging (CMRI) is a recognized modality for the noninvasive assessment of cardiovascular pathologies. Deep learning-based algorithms have achieved state-of-the-art performance in CMRI cardiac ventricle segmentation. However, most approaches pay little attention to the bottom layer of the U-Net, where main features are lost due to pixel degradation. To increase performance, it is important to handle the U-Net bottleneck layer properly. Considering this problem, we enhanced the representation of the main features at the bottom layer of the network. Method: We developed a fully automatic pipeline for segmenting the right ventricle (RV), myocardium (MYO), and left ventricle (LV) by incorporating short-axis CMRI sequence images. We propose a dilated residual network (DRN) to capture the features at full resolution in the bottleneck of the U-Net. Thus, it significantly increases spatial and temporal information and maintains localization accuracy. A data-augmentation technique is employed to avoid overfitting and class-imbalance problems. Finally, the output from each expanding path is added pixel-wise to improve the training response. Results: We evaluated our proposed method on the Automated Cardiac Diagnosis Challenge (ACDC). The test set consists of 50 patient records. The overall Dice similarity coefficients (DSC) we achieved are 0.924±0.03, 0.907±0.01, and 0.949±0.05 for RV, MYO, and LV, respectively. Similarly, we obtained Hausdorff distance (HD) scores of 10.09±0.01 mm, 7.25±0.05 mm, and 6.86±0.02 mm for RV, MYO, and LV, respectively. The results show superior performance, outperforming state-of-the-art methods in terms of accuracy and reaching expert-level segmentation. Consequently, the overall DSC and HD results improved by 1.0% and 1.5%, respectively. Conclusion: We designed a dilated residual U-Net (DRN) for cardiac ventricle segmentation using short-axis CMRI. Our method has the advantage of restoring and capturing spatial and temporal information by expanding the receptive field without degrading the main image features in the U-Net bottleneck. Our method is highly accurate and fast, taking 0.28 seconds on average to process a 2D MR image. The network was also designed to work on predictions of individual MR images to segment the ventricular region, for which our model outperforms many state-of-the-art methods.
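The core idea above is to replace the U-Net bottleneck with dilated residual convolutions so the receptive field grows without further downsampling. The block below is a generic PyTorch sketch of that idea; channel counts, dilation rates, and normalization choices are illustrative rather than the paper's.

```python
# Generic dilated residual block of the kind often placed at a U-Net bottleneck:
# stacked dilated convolutions enlarge the receptive field at full feature
# resolution, with a residual connection around them. Hyperparameters are
# illustrative, not the paper's.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        layers = []
        for d in dilations:
            layers += [
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)    # residual connection keeps gradients healthy

# Example: bottleneck features of shape (batch, 256, 16, 16).
block = DilatedResidualBlock(256)
out = block(torch.randn(2, 256, 16, 16))   # same spatial size as the input
```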
... In recent years, with the rapid development of convolutional neural networks (CNNs) [9]-[16], numerous deep models have been successfully applied to the medical field [17]-[24]. Methods represented by U-Net [25] and its variants have become the de facto standard in the field of medical image segmentation, greatly improving performance by bridging the encoder and decoder stages. ...
Preprint
Full-text available
The hippocampus plays a vital role in the diagnosis and treatment of many neurological disorders. In recent years, deep learning technology has made great progress in the field of medical image segmentation, and the performance of related tasks has been constantly refreshed. In this paper, we focus on the hippocampus segmentation task and propose a novel hierarchical feedback chain network. The feedback chain structure unit learns deeper and wider feature representations of each encoder layer through hierarchical feature aggregation feedback chains, and achieves feature selection and feedback through the feature handover attention module. Then, we embed a global pyramid attention unit between the feature encoder and the decoder to further refine the encoder features, including a pair-wise pyramid attention module for achieving adjacent attention interaction and a global context modeling module for capturing long-range knowledge. The proposed approach achieves state-of-the-art performance on three publicly available datasets, compared with existing hippocampus segmentation approaches.
... Apparently, the use of medical imaging has a huge impact in terms of providing an accurate clinical screening and diagnosis [54][55][56][57][58][59]. One of the subdomains of this application is medical-image segmentation, which uses advanced automated-segmentation algorithms to provide segmentation results that are as similar as possible to the region's original structure [18,[60][61][62]. Deep-learning methods for medical-image applications, either for classification or segmentation purposes, often encounter the following three profound issues: ...
Article
Full-text available
In general, most of the existing convolutional neural network (CNN)-based deep-learning models suffer from spatial-information loss and inadequate feature-representation issues. This is due to their inability to capture multiscale-context information and the exclusion of semantic information throughout the pooling operations. In the early layers of a CNN, the network encodes simple semantic representations, such as edges and corners, while, in the latter part of the CNN, the network encodes more complex semantic features, such as complex geometric shapes. Theoretically, it is better for a CNN to extract features from different levels of semantic representation because tasks such as classification and segmentation work better when both simple and complex feature maps are utilized. Hence, it is also crucial to embed multiscale capability throughout the network so that the various scales of the features can be optimally captured to represent the intended task. Multiscale representation enables the network to fuse low-level and high-level features from a restricted receptive field to enhance the deep-model performance. The main novelty of this review is the comprehensive novel taxonomy of multiscale-deep-learning methods, which includes details of several architectures and their strengths that have been implemented in the existing works. Predominantly, multiscale approaches in deep-learning networks can be classed into two categories: multiscale feature learning and multiscale feature fusion. Multiscale feature learning refers to the method of deriving feature maps by examining kernels over several sizes to collect a larger range of relevant features and predict the input images’ spatial mapping. Multiscale feature fusion uses features with different resolutions to find patterns over short and long distances, without a deep network. Additionally, several examples of the techniques are also discussed according to their applications in satellite imagery, medical imaging, agriculture, and industrial and manufacturing systems.
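As a deliberately small example of the "multiscale feature learning" category surveyed above, the block below runs parallel convolutions with different kernel sizes over the same input and concatenates their outputs; real architectures in the review add many refinements on top of this basic pattern.

```python
# Simplified multiscale feature-learning block: parallel branches with
# different kernel sizes see different receptive fields, and their outputs
# are concatenated. Purely illustrative hyperparameters.
import torch
import torch.nn as nn

class MultiScaleBlock(nn.Module):
    def __init__(self, in_channels, branch_channels=32, kernel_sizes=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, branch_channels, k, padding=k // 2),
                nn.ReLU(inplace=True),
            )
            for k in kernel_sizes
        ])

    def forward(self, x):
        # Each branch keeps the spatial size; fusion is a simple concatenation.
        return torch.cat([branch(x) for branch in self.branches], dim=1)

features = MultiScaleBlock(in_channels=64)(torch.randn(1, 64, 48, 48))
print(features.shape)   # torch.Size([1, 96, 48, 48]) for three 32-channel branches
```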
... Researchers have also incorporated shape priors into deep learning frameworks [11,37,26,27,36]. Instead of using handcrafted features, CNNs are used to extract complex appearance features from images to learn the shape parameters of the statistical shape models [25,1,34]. ...
Preprint
Full-text available
Accurate segmentation and motion estimation of the myocardium have always been important in the clinical field, as they essentially contribute to downstream diagnosis. However, existing methods cannot always guarantee shape integrity in myocardium segmentation. In addition, motion estimation requires point correspondence on the myocardium region across different frames. In this paper, we propose a novel end-to-end deep statistical shape model that focuses on myocardium segmentation while preserving both shape integrity and boundary correspondence. Specifically, myocardium shapes are represented by a fixed number of points, whose variations are extracted by Principal Component Analysis (PCA). A deep neural network is used to predict the transformation parameters (both affine and deformation), which are then used to warp the mean point cloud to the image domain. Furthermore, a differentiable rendering layer is introduced to incorporate mask supervision into the framework to learn more accurate point clouds. In this way, the proposed method is able to consistently produce anatomically reasonable segmentation masks without post-processing. Additionally, the predicted point cloud guarantees boundary correspondence for sequential images, which contributes to downstream tasks such as the motion estimation of the myocardium. We conduct several experiments to demonstrate the effectiveness of the proposed method on several benchmark datasets.
... To alleviate the defects of 2D FCN-based networks, several works have utilised additional contextual information. Zotti et al. [28] used a shape prior to guide the 2D FCN. Zheng et al. [7] and Chen et al. [29] leveraged spatial or temporal context information to improve the consistency of segmentation results. ...
Article
Full-text available
The accurate and robust automatic segmentation of cardiac structures in magnetic resonance imaging (MRI) is significant in calculating cardiac clinical functional indices and diagnosing heart diseases. Most U-Net based methods use pooling, transposed convolution, and skip connection operations to integrate multiscale features for improved segmentation in cardiac MRI. However, this architecture lacks adequate semantic connection between channel and spatial information, and robustness in segmenting objects with significant shape variations. In this paper, a new multiscale feature attentive U-Net for cardiac MRI structural segmentation is proposed. An attention mechanism is adopted after concatenating the multi-level features to aggregate different scale features and determine which features to focus on. Cascaded and parallel dilated convolutions are also employed in the decoder blocks, and skip connections are employed to enhance the ability to sense receptive fields for multiscale context information. Furthermore, a deep supervision approach with a loss function that combines the Dice and cross-entropy losses is introduced to reduce overfitting and ensure better prediction. The proposed method was evaluated on three public cardiac datasets. The experimental results indicate that the method achieved competitive segmentation performance on the three datasets, which verifies the robustness and generalisability of the proposed network. In comparison with conventional U-Net methods, the model leverages an attention mechanism and dilated convolution blocks, which increases the semantic connection between channel and spatial information, and improves the robustness of right ventricle segmentation. Judging by the Dice scores and segmentation results, the multiscale feature attentive U-Net is an effective method for segmenting cardiac MRI structures.
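The combined Dice and cross-entropy loss mentioned above is a common pattern; a minimal sketch of one such combination follows. The weighting and smoothing constants are placeholders, not the values used in the paper.

```python
# Minimal sketch of a combined soft-Dice + cross-entropy segmentation loss.
# Weights and the smoothing constant are illustrative.
import torch
import torch.nn.functional as F

def dice_ce_loss(logits, target, num_classes, dice_weight=0.5, eps=1e-6):
    """
    logits: (N, C, H, W) raw outputs, target: (N, H, W) integer labels
    """
    ce = F.cross_entropy(logits, target)

    probs = torch.softmax(logits, dim=1)
    onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    dims = (0, 2, 3)                                    # sum over batch and space
    intersection = (probs * onehot).sum(dims)
    denom = probs.sum(dims) + onehot.sum(dims)
    dice = (2.0 * intersection + eps) / (denom + eps)   # per-class soft Dice
    dice_loss = 1.0 - dice.mean()

    return dice_weight * dice_loss + (1.0 - dice_weight) * ce
```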
... Zotti et al. [93] extend the well-established U-net architecture [94] through the formulation of a probabilistic framework, which allows the embedding of a cardiac shape prior, in the form of a 3D volume encoding the probability of a voxel to belong to a certain "cardiac class" (LV, RV, or MYO), and the definition of a loss function tailored to the cardiac anatomy. Clough et al. [95] propose a loss function that measures the topological correspondence between predicted segmentations and prior shape knowledge. ...
Article
Full-text available
Since the rise of deep learning (DL) in the mid-2010s, cardiac magnetic resonance (CMR) image segmentation has achieved state-of-the-art performance. Despite achieving inter-observer variability in terms of different accuracy performance measures, visual inspections reveal errors in most segmentation results, indicating a lack of reliability and robustness of DL segmentation models, which can be critical if a model was to be deployed into clinical practice. In this work, we aim to bring attention to reliability and robustness, two unmet needs of cardiac image segmentation methods, which are hampering their translation into practice. To this end, we first study the performance accuracy evolution of CMR segmentation, illustrate the improvements brought by DL algorithms and highlight the symptoms of performance stagnation. Afterwards, we provide formal definitions of reliability and robustness. Based on the two definitions, we identify the factors that limit the reliability and robustness of state-of-the-art deep learning CMR segmentation techniques. Finally, we give an overview of the current set of works that focus on improving the reliability and robustness of CMR segmentation, and we categorize them into two families of methods: quality control methods and model improvement techniques. The first category corresponds to simpler strategies that only aim to flag situations where a model may be incurring poor reliability or robustness. The second one, instead, directly tackles the problem by bringing improvements into different aspects of the CMR segmentation model development process. We aim to bring the attention of more researchers towards these emerging trends regarding the development of reliable and robust CMR segmentation frameworks, which can guarantee the safe use of DL in clinical routines and studies.
... Oktay et al. [31] modified the decoder layers of a U-Net architecture to embed prior information through super-resolution gold standard maps using cardiac cine MRI. Zotti et al. [32] developed a grid-Net-based network to segment heart structures from cardiac cine-MRI. Their model integrates cardiac shape prior information encoding, for each 3D position, the likelihood of belonging to a given class. ...
Article
Full-text available
Accurate segmentation of the myocardial scar may supply relevant advancements in predicting and controlling deadly ventricular arrhythmias in subjects with cardiovascular disease. In this paper, we propose the inclusion and classification of prior information U-Net (ICPIU-Net) architecture to efficiently segment the left ventricle (LV) myocardium, myocardial infarction (MI), and microvascular-obstructed (MVO) tissues from late gadolinium enhancement magnetic resonance (LGE-MR) images. Our approach was developed using two cascaded subnets to first segment the LV cavity and myocardium. Then, we used inclusion and classification constraint networks to improve the resulting segmentation of the diseased regions within the pre-segmented LV myocardium. This network incorporates the inclusion and classification information of the LGE-MRI to maintain topological constraints of pathological areas. In the testing stage, the outputs of each segmentation network, obtained with specific parameters estimated during training, were fused using a majority-voting technique for the final label prediction of each voxel in the LGE-MR image. The proposed method was validated by comparing its results to manual drawings by experts on 50 LGE-MR images. Importantly, compared to various deep learning-based methods participating in the EMIDEC challenge, the results of our approach show a more significant agreement with manual contouring in segmenting myocardial diseases. Keywords: segmentation; ICPIU-Net; myocardium; myocardial infarction (MI); late gadolinium enhancement magnetic resonance (LGE-MR); microvascular obstruction (MVO); deep learning
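The majority-voting fusion used in the testing stage can be sketched in a few lines: each network contributes one label per voxel and the most frequent label wins. This is a generic illustration, not the authors' code.

```python
# Generic per-voxel majority voting over label maps from several networks.
# predictions: array of shape (num_models, ...) with integer class labels.
import numpy as np

def majority_vote(predictions, num_classes):
    # Count, for every voxel, how many models predicted each class...
    counts = np.stack([(predictions == c).sum(axis=0) for c in range(num_classes)])
    # ...and keep the class with the highest count (ties go to the lower label).
    return counts.argmax(axis=0)

# Toy example: three models, a 4x4 image, three classes.
preds = np.array([[[0, 1, 2, 2]] * 4, [[0, 1, 1, 2]] * 4, [[0, 2, 1, 2]] * 4])
print(majority_vote(preds, num_classes=3))
```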
... For multi-scale data sampling, independent component analysis (ICA) (Wang et al. 2019c) is applied over patches of data to form clusters of canonical-form distributions which represent spatio-temporal correlations at coarser scales. This data-sampling parallelization speeds up performance significantly, by 26.8% compared to the standard U-Net model, and achieved a 1.6% increase in Dice score on the ACDC MICCAI 2017 challenge, while also improving significantly over the state-of-the-art GridNet (Zotti et al. 2018) model. Wang et al. (2019a) proposed a 3D FCN model with deep supervision and group dilation (DSD-FCN model) to address various challenges concerning automated MRI prostate segmentation, such as inhomogeneous intensity distribution and varying prostate anatomy, which make manual intervention hard. ...
Article
Full-text available
With the advent of advancements in deep learning approaches, such as deep convolutional neural networks, residual neural networks, and adversarial networks, U-Net architectures are the most widely utilized in biomedical image segmentation to address the automation of identification and detection of target regions or sub-regions. In recent studies, U-Net based approaches have illustrated state-of-the-art performance in different applications for the development of computer-aided diagnosis systems for early diagnosis and treatment of diseases such as brain tumor, lung cancer, Alzheimer's disease, breast cancer, etc., using various modalities. This article contributes by presenting the success of these approaches: it describes the U-Net framework, followed by a comprehensive analysis of the U-Net variants through (1) inter-modality and (2) intra-modality categorization to establish better insights into the associated challenges and solutions. Besides, this article also highlights the contribution of U-Net based frameworks during the ongoing pandemic of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), also known as COVID-19. Finally, the strengths and similarities of these U-Net variants are analysed along with the challenges involved in biomedical image segmentation to uncover promising future research directions in this area.
... Mauger et al. (2019) evaluate the influence of risk factors (cholesterol, diabetes, obesity, etc.) by quantifying the differences in shape variation between a reference group without risk factors and groups presenting them. Bernardino et al. (2021) use a statistical atlas accounting for confounding variables (such as certain demographic data, which can bias the analysis) to identify cardiac remodeling due to endurance exercise. Shape atlases have also proven relevant for providing prior information to deep learning methods, notably for image segmentation (Oktay et al., 2017; Zotti et al., 2019; Chen et al., 2019; Duan et al., 2019). Other works use the learned distribution of an atlas to generate collections of shapes. For example, Attar et al. (2019) predict a shape parameter based on patient information, and Clough et al. (2019a) generate shapes corresponding to patients with a low ejection fraction. ...
Thesis
In clinical routine, imaging methods are used to extract descriptors or indices characterizing cardiac function and to establish a diagnosis. In heart failure, cardiac remodeling very often occurs. Several aspects of morphology and function are affected by this phenomenon; in particular, abnormalities of shape and deformation may appear. Moreover, interactions between these two aspects have been highlighted structurally or linked to certain cardiac pathologies. These interactions are difficult to analyze with simple scalar indices, which generally describe a global behavior. Medical imaging can provide high-dimensional representations of these descriptors, i.e., regional/local information at several instants of the cardiac cycle; however, they are not exploited in clinical routine for lack of time, lack of consensus, and because of the difficulty of their analysis. In this manuscript, we explore approaches for a finer characterization of the partially known link between cardiac shape and deformation via high-dimensional representations of these two aspects. Computational anatomy and manifold learning methods make it possible to exploit these individual high-dimensional representations and to generalize the analysis to a population. Nevertheless, these methods generally consider only one aspect of cardiac function at a time, whereas several interact. Methods that integrate several descriptors usually do not explicitly account for the possible link between them. This work makes three main contributions. First, we propose a strategy to characterize the interactions between cardiac shape and deformation assessed by high-dimensional descriptors and demonstrate its relevance for several right-ventricle pathologies. This strategy is based on a non-linear learning method (multiple manifold alignment) and is used here to characterize a partially known link, whereas it has so far been applied to descriptors lying on the same manifold, for which the link exists naturally. Second, we evaluated the benefit of accounting for the link between descriptors by studying the differences with an approach that considers each descriptor individually and with other approaches that handle several descriptors. Finally, the study of the influence of shape/deformation descriptors and normalization strategies on our approach highlighted a possible bias introduced by these choices and showed that the appropriate choice depends on the target application. This thesis shows the relevance of using manifold alignment to account for the partially known link between cardiac shape and deformation, illustrated through a comparative study of several right-ventricle pathologies. These analyses open the door to exploiting these coherent spaces for more application-oriented challenges such as risk quantification.
... Simantiris et al. (38) used a simple network composed of cascaded modules of dilated convolutions with increasing dilation rate, without using concatenation or operations such as pooling that would reduce the resolution. Zotti et al. (39) introduced a shape prior obtained from the training dataset into the 3D Grid-net and employed a contour loss. Although the above methods showed great improvements in segmentation performance compared to ASM or AAM, they treat each frame independently, which makes the segmentation results of some specific sequences inaccurate or the overall results lack coherence. ...
Article
Full-text available
Coronary artery disease (CAD) is the most common cause of death globally, and its diagnosis is usually based on manual myocardial (MYO) segmentation of MRI sequences. As manual segmentation is tedious, time-consuming, and has low replicability, automatic MYO segmentation using machine learning techniques has been widely explored recently. However, almost all the existing methods treat the input MRI sequences independently, which fails to capture the temporal information between sequences, e.g., the shape and location information of the myocardium along time. In this article, we propose a MYO segmentation framework for sequences of cardiac MR (CMR) images covering the left ventricular (LV) cavity, right ventricular (RV) cavity, and myocardium. Specifically, we propose to combine convolutional neural networks and recurrent neural networks to incorporate temporal information between sequences and ensure temporal consistency. We evaluated our framework on the Automated Cardiac Diagnosis Challenge (ACDC) dataset. The experimental results demonstrate that our framework can improve segmentation accuracy by up to 2% in Dice coefficient.
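To illustrate how convolutional and recurrent components can be combined for temporal consistency as described above, the sketch below implements a basic convolutional GRU cell that carries a hidden feature map from frame to frame. It is a textbook-style example, not the recurrent design used in the paper.

```python
# Basic convolutional GRU cell: a recurrent state shaped like a feature map is
# updated frame by frame, letting the network carry shape/location information
# through the cardiac sequence. This is a generic sketch, not the paper's model.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_channels, hidden_channels, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2
        self.gates = nn.Conv2d(in_channels + hidden_channels,
                               2 * hidden_channels, kernel_size, padding=pad)
        self.candidate = nn.Conv2d(in_channels + hidden_channels,
                                   hidden_channels, kernel_size, padding=pad)
        self.hidden_channels = hidden_channels

    def forward(self, x, h):
        if h is None:
            h = x.new_zeros(x.size(0), self.hidden_channels, x.size(2), x.size(3))
        zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)                     # update and reset gates
        h_tilde = torch.tanh(self.candidate(torch.cat([x, r * h], dim=1)))
        return (1.0 - z) * h + z * h_tilde

# Run the cell over a sequence of per-frame feature maps (T, N, C, H, W).
cell, h = ConvGRUCell(in_channels=16, hidden_channels=16), None
for frame_features in torch.randn(8, 2, 16, 32, 32):
    h = cell(frame_features, h)                       # h keeps temporal context
```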
... In 2018, Bai [16] proposed an automated analysis method that achieved excellent results in segmentation of the LV and RV on short-axis CMR images and the left atrium (LA) and right atrium (RA) on long-axis CMR images. Zotti et al. [17] proposed a convolutional neural network (CNN) model based on the U-Net architecture that uses shape prior knowledge for the segmentation task. ...
Preprint
Full-text available
Cardiovascular diseases (CVDs) remain the principal cause of death and disability worldwide. Cardiac MR images play an important role in diagnosing and treating cardiac ailments. Automatic segmentation of cardiac magnetic resonance imaging (cardiac MRI) is an essential application in clinical practice. In this paper, cardiac MRI segmentation is performed using a convolutional neural network. The ACDC Challenge 2017 dataset is used for training and testing. It consists of data from 100 subjects, including the end-systole and end-diastole phases. The model's performance is measured using the Dice coefficient, achieving a value of 0.90. The results for basal as well as apical slices are encouraging.
... There are also some image segmentation methods based on specific theories, such as level sets [15]- [17] and spectral clustering [18]- [20]. Driven by the popularity of deep neural networks, end-to-end medical image segmentation methods [21]- [23] are becoming increasingly popular. Given that the original data for cardiac MRIs are three-dimensional, 3D CNNs can easily be applied to segmentation. ...
Article
Full-text available
For diagnosing cardiovascular disease, an accurate segmentation method is needed. There are several unresolved issues in the complex field of cardiac magnetic resonance imaging, some of which have been partially addressed by using deep neural networks. To solve the two problems of over-segmentation and under-segmentation of anatomical shapes in the short-axis view from different cardiac magnetic resonance sequences, we propose a novel two-stage framework with a weighted decision map based on convolutional neural networks to segment the myocardium (Myo), left ventricle (LV), and right ventricle (RV) simultaneously. The framework comprises a decision map extractor and a cardiac segmenter. A cascaded U-Net++ is used as the decision map extractor to acquire the decision map that decides the category of each pixel. The cardiac segmenter is a multiscale dual-path feature aggregation network (MDFA-Net) which consists of a densely connected network and an asymmetric encoding and decoding network. The input to the cardiac segmenter is derived from processed original images weighted by the output of the decision map extractor. We conducted experiments on two datasets from the multi-sequence cardiac magnetic resonance segmentation challenge 2019 (MS-CMRSeg 2019) and the myocardial pathology segmentation challenge 2020 (MyoPS 2020). Test results obtained on MyoPS 2020 show that the proposed method achieved average Dice coefficients of 84.70%, 86.00%, and 86.31% in the segmentation of Myo, LV, and RV, respectively.
Chapter
Cardiac magnetic resonance imaging is used to detect cardiovascular diseases in the early stage of diagnosis, preventing cardiac disease from worsening. There are numerous methods to detect cardiac diseases in the early stages, but in this chapter we discuss convolutional neural networks, which are used to predict cardiovascular diseases. Convolutional neural networks are used for image segmentation, particularly for left ventricle volume segmentation in short-axis view (SAX) images. The dataset we used is from Kaggle's Second Annual Data Science Bowl (SADSB), which has cardiac images of 500 patients for training, 200 patients for validation (which can also be used for training), and 440 patients for testing. The metric we used is the continuous ranked probability score (CRPS), used for the evaluation of Kaggle's Second Annual Data Science Bowl challenge. The proposed model achieved a CRPS value of 0.043 on the test set. Keywords: Deep learning; Convolutional neural network; Left ventricle; Cardiac MRI; Medical images
Article
Full-text available
Convolutional neural networks originate from image classification tasks. The pooling operation can expand the receptive field and reduce the amount of computation, but a large amount of pixel information is lost, which is obviously harmful to pixel-level segmentation accuracy. Dilated convolution expands the receptive field and keeps the resolution unchanged, but it increases the amount of data storage and computation. Therefore, dilated convolution can only be applied to a limited number of deep layers in a network. It is common for different samples to have different sizes in medical image datasets. The resize operation is widely used in the field to obtain uniform sizes. However, the resize operation adds, deletes, and modifies a large number of pixels based on the interpolation method. In this way, an image is damaged to a certain extent at the pixel level after resizing, which also significantly affects the performance of the segmentation network. We propose a resolution-consistent network (RCN), which removes all pooling layers and keeps all resolutions consistent to solve the data-loss problem caused by downsampling operations. To solve the problem of the increased data storage and computation caused by dilated convolution and the data damage caused by the resize operation, we propose a nondamage data preprocessing method that includes a coarse segmentation network, a cardiac center point positioning algorithm, and nondamage cropping to avoid any resize operations. We achieve state-of-the-art performance and reach first place with respect to some indicators on the widely used automated cardiac diagnosis challenge (ACDC) dataset. Our average Dice scores are 0.951 (left ventricle), 0.915 (right ventricle), and 0.910 (myocardium) on the test set.
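The "nondamage" preprocessing described above replaces resizing with a fixed-size crop around an estimated cardiac center. A minimal sketch of that cropping step is shown below; the coarse segmentation that supplies the center is assumed to come from a first-stage model and is not reproduced here.

```python
# Sketch of resize-free preprocessing: locate the cardiac center from a coarse
# mask and crop a fixed window around it, padding when the window would leave
# the image. The coarse mask itself is assumed to come from a first-stage model.
import numpy as np

def crop_around_center(image, coarse_mask, size=128):
    ys, xs = np.nonzero(coarse_mask)
    cy = int(round(ys.mean())) if ys.size else image.shape[0] // 2
    cx = int(round(xs.mean())) if xs.size else image.shape[1] // 2

    half = size // 2
    padded = np.pad(image, half, mode="constant")          # avoid out-of-bounds crops
    cy, cx = cy + half, cx + half                           # shift center into padded frame
    return padded[cy - half:cy + half, cx - half:cx + half] # exact size x size window

crop = crop_around_center(np.zeros((256, 216)), np.zeros((256, 216)), size=128)
assert crop.shape == (128, 128)
```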
Article
Full-text available
Background: Cardiac magnetic resonance imaging (MRI) has been widely used in the diagnosis of cardiovascular diseases because of its noninvasive nature and high image quality. The evaluation standard for physiological indices in cardiac diagnosis is essentially the accuracy of segmentation of the left ventricle (LV) and right ventricle (RV) in cardiac MRI. Traditional symmetric single-codec network structures such as U-Net tend to expand the number of channels to make up for lost information, which makes the network cumbersome. Methods: Instead of a single codec, we propose a multiple-codec structure based on the FC-DenseNet (FCD) model and capsule convolution-capsule deconvolution, named the Nested Capsule Dense Network (NCDN). NCDN uses multiple codecs to achieve multi-resolution, which makes it possible to preserve more spatial information and improve the robustness of the model. Results: The proposed model is tested on three datasets: the York University cardiac MRI dataset, the Automated Cardiac Diagnosis Challenge (ACDC-2017) dataset, and a local dataset. The results show that the proposed NCDN outperforms most methods. In particular, we achieved nearly the most advanced accuracy performance in the ACDC-2017 segmentation challenge. This means that our method is a reliable segmentation method, which is conducive to the application of deep learning-based segmentation methods in the field of medical image segmentation.
Article
The hippocampus plays a vital role in the diagnosis and treatment of many neurological disorders. In recent years, deep learning has made great progress in medical image segmentation, and state-of-the-art results on related tasks have been repeatedly improved. In this paper, we focus on the hippocampus segmentation task and propose a novel hierarchical feedback chain network. The feedback chain structure unit learns a deeper and wider feature representation of each encoder layer through hierarchical feature aggregation feedback chains, and achieves feature selection and feedback through the feature handover attention module. We then embed a global pyramid attention unit between the feature encoder and the decoder to further refine the encoder features, comprising a pair-wise pyramid attention module for adjacent attention interaction and a global context modeling module for capturing long-range knowledge. The proposed approach achieves state-of-the-art performance on three publicly available datasets compared with existing hippocampus segmentation approaches. The code and results are available at https://github.com/easymoneysniper183/sematic_seg.
Preprint
Full-text available
In recent years, cardiovascular diseases (CVDs) have become one of the leading causes of mortality globally. CVDs often begin with minor symptoms and progressively worsen; in the early stages, most people experience exhaustion, shortness of breath, ankle swelling, fluid retention, and similar symptoms. Coronary artery disease (CAD), arrhythmia, cardiomyopathy, congenital heart defect (CHD), mitral regurgitation, and angina are the most common CVDs. Clinical methods such as blood tests, electrocardiography (ECG), and medical imaging are the most effective tools for detecting CVDs. Among the imaging modalities, cardiac magnetic resonance imaging (CMR) is increasingly used to diagnose and monitor disease, plan treatment, and predict CVDs. Despite the advantages of CMR data, diagnosis remains challenging for physicians because of the large number of slices, low contrast, and related factors. To address these issues, deep learning (DL) techniques have been applied to CVD diagnosis from CMR data, and much research is currently being conducted in this field. This review provides an overview of studies on CVD detection using CMR images and DL techniques. The introduction examines CVD types, diagnostic methods, and the most important medical imaging techniques. Subsequent sections present investigations into CVD detection using CMR images and the most significant DL methods, discuss the challenges of diagnosing CVDs from CMR data, summarize the results of this review, and outline future work on CVD diagnosis from CMR images with DL techniques. The most important findings of this study are presented in the conclusion.
Article
To accurately and simultaneously segment the myocardium and the left and right ventricles at the end-diastolic (ED) and end-systolic (ES) phases from short-axis cardiac magnetic resonance (CMR) images, with their inherent variability in appearance, shape, and location of the region of interest (ROI), we propose a diversity convolutional network (DCNet) that aims to solve ventricle under- and over-segmentation problems. DCNet is composed of three stages: integration of diversity features, recoding of diversity features, and decoding of integrated features. To enhance the representational capacity and enrich the feature space of a single convolution, we design a diversity convolution block in the first stage. In the second stage, we design a dual-path channel attention mechanism to simultaneously select average and maximum features. In addition, we use a soft Dice loss function to assist the network's training. We conducted experiments on the 2017 Automated Cardiac Diagnosis Challenge (ACDC 2017), 2019 Multi-Sequence Cardiac MR Segmentation Challenge (MS-CMRSeg 2019), and 2020 Myocardial Pathology Segmentation Challenge (MyoPS 2020) datasets. On the ACDC online test platform, the proposed DCNet achieved Dice scores of 95.80%, 91.77%, and 91.57% for the left ventricle, right ventricle, and myocardium segmentation tasks, respectively. Compared with four representative networks, DCNet achieves the best results on balanced steady-state free precession (bSSFP) cine sequences and late gadolinium enhancement (LGE) CMR sequences from the MS-CMRSeg and MyoPS datasets. The proposed method is therefore promising for automatic ventricle segmentation in clinical applications. The code is available at https://github.com/fly1995/DCNet.
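Since several of the works listed here train with a soft Dice loss, a minimal generic formulation is sketched below in PyTorch. It is one common variant, not necessarily the exact loss used by DCNet; the tensor layout (softmax probabilities against one-hot targets) is an assumption.

```python
import torch

def soft_dice_loss(probs, target, eps=1e-6):
    """Generic soft Dice loss: 1 - 2|P∩G| / (|P| + |G|), computed on class
    probabilities so it stays differentiable.

    probs:  (N, C, H, W) softmax probabilities
    target: (N, C, H, W) one-hot ground truth
    """
    dims = (0, 2, 3)                                 # sum over batch and space
    intersection = (probs * target).sum(dims)
    cardinality = probs.sum(dims) + target.sum(dims)
    dice_per_class = (2.0 * intersection + eps) / (cardinality + eps)
    return 1.0 - dice_per_class.mean()
```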
Article
Cardiac MRI segmentation of the right ventricle (RV), which assigns a label to each individual pixel of a medical image, requires sequential datasets for deep learning approaches. For fully automatic methods, the most important role of image segmentation is to simplify the demarcation, classification, and visualisation of regions. Segmenting the complex anatomy of the RV chamber in cardiac MRI is particularly challenging: the trabeculations, signal intensity, and crescent shape of the RV myocardium vary from base to apex and depend strongly on the augmentations applied and the imaging protocol. Capturing the global context of the segmentation is essential for the successful classification of a medical image and for approaching the human annotator margin. Future research directions and approaches can be drawn by analysing the reports of active researchers in the field.
Article
Segmenting breast masses from magnetic resonance imaging (MRI) scans is an important step in the breast cancer diagnostic procedure for physicians and computer-aided diagnosis systems. Sufficient high-quality annotation is essential for establishing an automatic segmentation model, particularly for MRI breast masses with complex backgrounds and various sizes. In this study, we propose a novel approach for training an MRI breast mass segmentation network with partial annotations, reinforced with two weakly supervised constraint losses. Specifically, the following three user-friendly partial annotation methods were designed to reduce annotation cost: single-slice, orthogonal-slice, and interval-slice annotation. Guided by the partial annotations, we first introduce a volume awareness loss that provides an additional constraint for masses of various scales. Moreover, to reduce false-positive predictions, we propose an end-to-end differentiable outlier-suppression loss to suppress noisy activation outside the target during training. We validated our method on 140 patients. The Dice similarity coefficients (DSC) of the three proposed partial annotation methods are 0.674, 0.835, and 0.837, respectively. Quantitative and qualitative evaluations demonstrate that our method achieves competitive performance compared with state-of-the-art methods trained on complete annotations.
Article
Accurate image segmentation plays a crucial role in medical image analysis, yet it faces great challenges caused by various shapes, diverse sizes, and blurry boundaries. To address these difficulties, square kernel-based encoder-decoder architectures have been proposed and widely used, but their performance remains unsatisfactory. To further address these challenges, we present a novel double-branch encoder architecture. Our architecture is inspired by two observations. (1) Since the discrimination of the features learned via square convolutional kernels needs to be further improved, we propose utilizing nonsquare vertical and horizontal convolutional kernels in a double-branch encoder so that the features learned by both branches can be expected to complement each other. (2) Considering that spatial attention can help models to better focus on the target region in a large-sized image, we develop an attention loss to further emphasize the segmentation of small-sized targets. With the above two schemes, we develop a novel double-branch encoder-based segmentation framework for medical image segmentation, namely, Crosslink-Net, and validate its effectiveness on five datasets with experiments. The code is released at https://github.com/Qianyu1226/Crosslink-Net.
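The core idea of the double-branch encoder above, pairing vertical with horizontal convolutional kernels, can be illustrated with a toy PyTorch block. The channel widths, kernel length, and simple concatenation below are assumptions for illustration; the published Crosslink-Net adds further cross-links and an attention loss.

```python
import torch
import torch.nn as nn

class DoubleBranchEncoderBlock(nn.Module):
    """Toy sketch of a double-branch encoder block with non-square kernels
    (illustrative only, not the paper's exact design): one branch uses
    vertical kernels, the other horizontal, and their complementary feature
    maps are concatenated."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.vertical = nn.Conv2d(in_ch, out_ch, kernel_size=(5, 1), padding=(2, 0))
        self.horizontal = nn.Conv2d(in_ch, out_ch, kernel_size=(1, 5), padding=(0, 2))

    def forward(self, x):
        return torch.cat([self.vertical(x), self.horizontal(x)], dim=1)

x = torch.randn(1, 3, 128, 128)
print(DoubleBranchEncoderBlock(3, 16)(x).shape)  # torch.Size([1, 32, 128, 128])
```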
Article
Semantic segmentation using convolutional neural networks (CNNs) is the state of the art for many medical image segmentation tasks, including myocardial segmentation in cardiac MR images. However, the predicted segmentation maps obtained from such standard CNNs do not allow direct quantification of regional shape properties such as regional wall thickness. Furthermore, CNNs lack explicit shape constraints, occasionally resulting in unrealistic segmentations. In this paper, we use a CNN to predict the shape parameters of an underlying statistical shape model of the myocardium learned from a training set of images. Additionally, the cardiac pose is predicted, which allows the myocardial contours to be reconstructed. The integrated shape model regularizes the predicted contours and guarantees realistic shapes. We enforce robustness of shape and pose prediction by simultaneously performing pixel-wise semantic segmentation during training and define two loss functions to impose consistency between the two predicted representations: one distance-based loss and one overlap-based loss. We evaluated the proposed method in a 5-fold cross validation on an in-house clinical dataset with 75 subjects and on the ACDC and LVQuan19 public datasets. We show that the two newly defined loss functions successfully increase the consistency between shape and pose parameters and semantic segmentation, which leads to a significant improvement of the reconstructed myocardial contours. Additionally, these loss functions drastically reduce the occurrence of unrealistic shapes in the semantic segmentation output.
Article
Cardiac magnetic resonance imaging (CMRI) segmentation transforms cardiac MR images into semantic regions defining the left ventricle cavity, right ventricle cavity, and myocardium. CMRI segmentation provides the ventricles' volume, mass, and ejection fraction, playing a significant role in cardiac disease diagnosis. This paper proposes a novel deep supervision scheme in a U-Net-based architecture to enhance segmentation performance. It presents a deeply supervised W-Net, which adds another path in parallel with the decoder path of the U-Net-based architecture. The output of every upsampling layer in the decoder path is combined by pixel-wise addition for feature reuse, and the loss is computed at each feature dimension on the deep supervision layer, which allows gradients to be injected at greater depth into the network and enhances the training of all layers. The proposed W-Net was applied to the single-scanner ACDC dataset and the Multi-Centre, Multi-Vendor & Multi-Disease dataset, making the evaluation more robust with respect to model generalization. In the experiments conducted, W-Net significantly outperforms numerous state-of-the-art methods on the two publicly available CMRI datasets, ranking in the top three for many metrics. It is evident that the proposed W-Net has considerable potential in CMRI segmentation, cardiac assessment, and disease diagnosis.
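Deep supervision of the kind described above can be sketched as a loss that aggregates contributions from several decoder depths. The snippet below is a generic illustration of that principle rather than W-Net's exact wiring; the use of bilinear upsampling, cross-entropy, and equal weighting is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def deep_supervision_loss(decoder_outputs, target,
                          criterion=nn.CrossEntropyLoss()):
    """Minimal deep-supervision sketch: every decoder output is upsampled to
    the label resolution and contributes its own loss term, so gradients
    reach the deeper layers directly.

    decoder_outputs: list of (N, C, h_i, w_i) logits from different depths
    target:          (N, H, W) integer label map
    """
    total = 0.0
    for logits in decoder_outputs:
        logits = F.interpolate(logits, size=target.shape[-2:],
                               mode="bilinear", align_corners=False)
        total = total + criterion(logits, target)
    return total / len(decoder_outputs)
```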
Article
The process of identifying cardiac adipose tissue (CAT) from volumetric magnetic resonance imaging of the heart is tedious, time-consuming, and often dependent on observer interpretation. Many 2-dimensional (2D) convolutional neural networks (CNNs) have been implemented to automate the cardiac segmentation process, but none have attempted to identify CAT. Furthermore, the results from automatic segmentation of other cardiac structures leave room for improvement. This study investigated the viability of a 3-dimensional (3D) CNN in comparison to a similar 2D CNN. Both models used a U-Net architecture to simultaneously classify CAT, left myocardium, left ventricle, and right myocardium. The multi-phase model trained with multiple observers' segmentations reached a whole-volume Dice similarity coefficient (DSC) of 0.925 across all classes and 0.640 for CAT specifically; the corresponding 2D model's DSC across all classes was 0.902 and 0.590 for CAT specifically. This 3D model also achieved a higher level of CAT-specific DSC agreement with a group of observers with a Williams Index score of 0.973 in comparison to the 2D model's score of 0.822.
Article
Full-text available
The interest in machine learning (ML) has grown tremendously in recent years, partly due to the performance leap that occurred with new techniques of deep learning, convolutional neural networks for images, increased computational power, and wider availability of large data sets. Most fields of medicine follow that popular trend and, notably, radiation oncology is at the forefront, with a long tradition of digital images and fully computerized workflows. ML models are driven by data and, in contrast with many statistical or physical models, they can be very large and complex, with countless generic parameters. This inevitably raises two questions, namely the tight dependence between the models and the data sets that feed them, and the interpretability of the models, which scales with their complexity. Any problems in the data used to train a model will later be reflected in its performance. This, together with the low interpretability of ML models, makes their implementation into the clinical workflow particularly difficult. Building tools for risk assessment and quality assurance of ML models must therefore address two main points: interpretability and data-model dependency. After a joint introduction to both radiation oncology and ML, this paper reviews the main risks and current solutions when applying the latter to workflows of the former. Risks associated with data and models, as well as their interaction, are detailed. Next, the core concepts of interpretability, explainability, and data-model dependency are formally defined and illustrated with examples. Afterwards, a broad discussion goes through key applications of ML in radiation oncology workflows as well as vendors' perspectives for the clinical implementation of ML.
Article
Cardiovascular diseases are responsible for millions of deaths every year. In this scenario, non-invasive exams such as cine magnetic resonance imaging (cine-MRI) have favored a better understanding of these pathologies, supporting the early diagnosis and early treatment essential to improving patients' quality of life. Through this exam, specialists can obtain more accurate information about cardiac structures, including the myocardium, the left ventricular cavity, and the right ventricle. In this context, this work presents an automatic method for the segmentation of these cardiac structures in short-axis cine-MRI images. The proposed method uses a cascade approach and is therefore divided into three main steps. The first step extracts a region of interest to reduce the scope of processing. The second applies a proposed fully convolutional network to generate the initial segmentations of the myocardium, left ventricular cavity, and right ventricle. These initial segmentations are passed to the third step, called refinement, in which a mask reconstruction module based on U-Net is used to restore the generated segmentations; specific post-processing techniques are also applied for each structure of interest. The proposed method achieves promising results on the ACDC challenge dataset, both in local tests and in the evaluation performed by the challenge's own platform, where it proves competitive with the best approaches.
Article
Left ventricle segmentation in short-axis cardiac magnetic resonance images is important to diagnose heart disease. However, the repetitive manual segmentation of these images requires considerable human effort and can decrease diagnostic accuracy. In recent years, several fully and semi-automatic approaches have been proposed, mainly using image-based, atlas, graphs, deformable models, and artificial intelligence methods. This paper presents a systematic mapping on the left ventricle segmentation, considering 74 studies published in the last decade. The main contributions of this review are: definition of the main segmentation challenges in these images; proposal of a new schematization, dividing the segmentation process into stages; categorization and analysis of the segmentation methods, including hybrid combinations; and analysis of the evaluation process, metrics, and databases. The performance of the methods in the most used public database is assessed, and the main limitations, weaknesses, and strengths of each method category are presented. Finally, trends, challenges, and research opportunities are discussed. The analysis indicates that methods from all categories can achieve good performance, and hybrid methods combining deep learning and deformable models obtain the best results. Methods still fail in specific slices, segment wrong regions, and produce anatomically impossible segmentations.
Article
Automatic segmentation of cardiac magnetic resonance imaging (MRI) facilitates efficient and accurate volume measurement in clinical applications. However, due to anisotropic resolution, ambiguous borders and complicated shapes, existing methods suffer from degraded accuracy and robustness in cardiac MRI segmentation. In this paper, we propose an enhanced Deformable U-Net (DeU-Net) for 3D cardiac cine MRI segmentation, composed of three modules, namely the Temporal Deformable Aggregation Module (TDAM), the Enhanced Deformable Attention Network (EDAN), and the Probabilistic Noise Correction Module (PNCM). TDAM first takes consecutive cardiac MR slices (including a target slice and its neighboring reference slices) as input, and extracts spatio-temporal information with an offset prediction network to generate fused features of the target slice. The fused features are then fed into EDAN, which exploits several flexible deformable convolutional layers and generates clear borders for every segmentation map. A Multi-Scale Attention Module (MSAM) in EDAN is proposed to capture long-range dependencies between features of different scales. Meanwhile, PNCM treats the fused features as a distribution to quantify uncertainty. Experimental results show that our DeU-Net achieves state-of-the-art performance in terms of the commonly used evaluation metrics on the Extended ACDC dataset and competitive performance on two other datasets, validating the robustness and generalization of DeU-Net.
Article
Multi-sequence cardiac magnetic resonance (CMR) provides essential pathology information (scar and edema) to diagnose myocardial infarction. However, automatic pathology segmentation can be challenging due to the difficulty of effectively exploring the underlying information from the multi-sequence CMR data. This paper aims to tackle the scar and edema segmentation from multi-sequence CMR with a novel auto-weighted supervision framework, where the interactions among different supervised layers are explored under a task-specific objective using reinforcement learning. Furthermore, we design a coarse-to-fine framework to boost the small myocardial pathology region segmentation with shape prior knowledge. The coarse segmentation model identifies the left ventricle myocardial structure as a shape prior, while the fine segmentation model integrates a pixel-wise attention strategy with an auto-weighted supervision model to learn and extract salient pathological structures from the multi-sequence CMR data. Extensive experimental results on a publicly available dataset from Myocardial pathology segmentation combining multi-sequence CMR (MyoPS 2020) demonstrate our method can achieve promising performance compared with other state-of-the-art methods. Our method is promising in advancing the myocardial pathology assessment on multi-sequence CMR data. To motivate the community, we have made our code publicly available via https://github.com/soleilssss/AWSnet/tree/master.
Article
Aortic dissection is a rapidly progressing and critical cardiovascular disease. The automatic segmentation and detection of the related organs and lesions in CT volumes of aortic dissection greatly assist its rapid diagnosis and treatment. However, diagnosing aortic dissection involves multi-organ and lesion segmentation, which is a multi-label segmentation problem and faces many challenges, such as small target scale, variable location of the true and false lumen, and complex judgment. To solve these problems, this paper proposes a deep model (MOLS-Net) to segment and detect aortic dissection from CT volumes quickly and automatically. First, a sequence feature pyramid attention module correlates CT image sequence features at different scales and guides the segmentation of the current image by exploring the correlation between slices. Second, we combine a spatial attention module and a channel attention module in the decoder of the network to strengthen the model's localization of the target area and its use of features. Third, we design a multi-label classifier for the inter-class relationships of multi-label aortic dissection segmentation and realize multi-label segmentation in an end-to-end network. We evaluate MOLS-Net on multiple datasets (a self-made aortic dissection segmentation dataset and the COVID-19 CT segmentation dataset), and the results show that the proposed method is superior to other state-of-the-art methods.
Chapter
View planning for the acquisition of cardiac magnetic resonance imaging (CMR) requires acquaintance with the cardiac anatomy and remains a challenging task in clinical practice. Existing approaches to its automation relied either on an additional volumetric image not typically acquired in clinic routine, or on laborious manual annotations of cardiac structural landmarks. This work presents a clinic-compatible and annotation-free system for automatic CMR view planning. The system mines the spatial relationship (more specifically, it locates and exploits the intersecting lines) between the source and target views, and trains deep networks to regress heatmaps defined by these intersecting lines. As the spatial relationship is self-contained in properly stored data, e.g., in the DICOM format, the need for manual annotation is eliminated. Then, a multi-view planning strategy is proposed to aggregate information from the predicted heatmaps for all the source views of a target view, for a globally optimal prescription. The multi-view aggregation mimics the similar strategy practiced by skilled human prescribers. Experimental results on 181 clinical CMR exams show that our system achieves superior accuracy to existing approaches including conventional atlas-based and newer deep learning based ones, in prescribing four standard CMR views. The mean angle difference and point-to-plane distance evaluated against the ground truth planes are 5.98° and 3.48 mm, respectively.
Article
Full-text available
Recent work has shown that convolutional networks can be substantially deeper, more accurate and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper we embrace this observation and introduce the Dense Convolutional Network (DenseNet), where each layer is directly connected to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections, one between each layer and its subsequent layer (treating the input as layer 0), our network has L(L+1)/2 direct connections. For each layer, the feature maps of all preceding layers are treated as separate inputs whereas its own feature maps are passed on as inputs to all subsequent layers. Our proposed connectivity pattern has several compelling advantages: it alleviates the vanishing gradient problem and strengthens feature propagation; despite the increase in connections, it encourages feature reuse and leads to a substantial reduction of parameters; its models tend to generalize surprisingly well. We evaluate our proposed architecture on five highly competitive object recognition benchmark tasks. The DenseNet obtains significant improvements over the state-of-the-art on all five of them (e.g., yielding 3.74% test error on CIFAR-10, 19.25% on CIFAR-100 and 1.59% on SVHN).
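The connectivity pattern described above is easy to see in code: each layer takes the concatenation of all earlier feature maps as its input. The block below is a deliberately tiny PyTorch sketch of that pattern, with illustrative layer counts and growth rate rather than any published DenseNet configuration.

```python
import torch
import torch.nn as nn

class TinyDenseBlock(nn.Module):
    """Minimal sketch of the DenseNet connectivity pattern: every layer
    receives the concatenation of all preceding feature maps and passes its
    own output on to every later layer. Sizes are illustrative only."""
    def __init__(self, in_ch, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
                nn.Conv2d(ch, growth_rate, kernel_size=3, padding=1)))
            ch += growth_rate  # each new layer sees all earlier feature maps

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

print(TinyDenseBlock(16)(torch.randn(1, 16, 32, 32)).shape)  # [1, 64, 32, 32]
```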
Conference Paper
Full-text available
In cardiac magnetic resonance imaging, fully-automatic segmentation of the heart enables precise structural and functional measurements to be taken, e.g. from short-axis MR images of the left-ventricle. In this work we propose a recurrent fully-convolutional network (RFCN) that learns image representations from the full stack of 2D slices and has the ability to leverage inter-slice spatial dependences through internal memory units. RFCN combines anatomical detection and segmentation into a single architecture that is trained end-to-end thus significantly reducing computational time, simplifying the segmentation pipeline, and potentially enabling real-time applications. We report on an investigation of RFCN using two datasets, including the publicly available MICCAI 2009 Challenge dataset. Comparisons have been carried out between fully convolutional networks and deep restricted Boltzmann machines, including a recurrent version that leverages inter-slice spatial correlation. Our studies suggest that RFCN produces state-of-the-art results and can substantially improve the delineation of contours near the apex of the heart.
Conference Paper
Full-text available
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and models are available at https://github.com/liuzhuang13/DenseNet.
Article
Full-text available
We propose a method for interactive boundary extraction which combines a deep, patch-based representation with an active contour framework. We train a class-specific convolutional neural network which predicts a vector pointing from the respective point on the evolving contour towards the closest point on the boundary of the object of interest. These predictions form a vector field which is then used for evolving the contour by the Sobolev active contour framework proposed by Sundaramoorthi et al. The resulting interactive segmentation method is very efficient in terms of required computational resources and can even be trained on comparatively small graphics cards. We evaluate the potential of the proposed method on both medical and non-medical challenge data sets, such as the STACOM data set and the PASCAL VOC 2012 data set.
Article
Full-text available
Cardiovascular magnetic resonance (CMR) has become a key imaging modality in clinical cardiology practice due to its unique capabilities for non-invasive imaging of the cardiac chambers and great vessels. A wide range of CMR sequences have been developed to assess various aspects of cardiac structure and function, and significant advances have also been made in terms of imaging quality and acquisition times. A lot of research has been dedicated to the development of global and regional quantitative CMR indices that help the distinction between health and pathology. The goal of this review paper is to discuss the structural and functional CMR indices that have been proposed thus far for clinical assessment of the cardiac chambers. We include indices definitions, the requirements for the calculations, exemplar applications in cardiovascular diseases, and the corresponding normal ranges. Furthermore, we review the most recent state-of-the art techniques for the automatic segmentation of the cardiac boundaries, which are necessary for the calculation of the CMR indices. Finally, we provide a detailed discussion of the existing literature and of the future challenges that need to be addressed to enable a more robust and comprehensive assessment of the cardiac chambers in clinical practice.
Article
Full-text available
Real-time 3D Echocardiography (RT3DE) has been proven to be an accurate tool for left ventricular (LV) volume assessment. However, identification of the LV endocardium remains a challenging task, mainly because of the low tissue/blood contrast of the images combined with typical artifacts. Several semi and fully automatic algorithms have been proposed for segmenting the endocardium in RT3DE data in order to extract relevant clinical indices, but a systematic and fair comparison between such methods has so far been impossible due to the lack of a publicly available common database. Here, we introduce a standardized evaluation framework to reliably evaluate and compare the performance of the algorithms developed to segment the LV border in RT3DE. A database consisting of 45 multivendor cardiac ultrasound recordings acquired at different centers with corresponding reference measurements from 3 experts are made available. The algorithms from nine research groups were quantitatively evaluated and compared using the proposed online platform. The results showed that the best methods produce promising results with respect to the experts' measurements for the extraction of clinical indices, and that they offer good segmentation precision in terms of mean distance error in the context of the experts' variability range. The platform remains open for new submissions.
Article
Full-text available
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
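The contracting/expanding architecture with skip connections can be reduced to a very small sketch: the example below keeps a single downsampling level so the concatenation of encoder features into the expanding path is visible at a glance. It is a toy illustration, not the published U-Net, whose depth, channel widths, and unpadded convolutions differ.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Heavily reduced U-Net-style sketch (one contracting level and one
    expanding level only): the skip connection concatenates encoder features
    with the upsampled path before decoding."""
    def __init__(self, in_ch=1, num_classes=2):
        super().__init__()
        self.enc = conv_block(in_ch, 16)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = conv_block(16, 32)
        self.up = nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2)
        self.dec = conv_block(32, 16)          # 16 skip + 16 upsampled channels
        self.head = nn.Conv2d(16, num_classes, 1)

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        d = self.dec(torch.cat([e, self.up(b)], dim=1))
        return self.head(d)

print(TinyUNet()(torch.randn(1, 1, 64, 64)).shape)  # [1, 2, 64, 64]
```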
Article
Full-text available
CMR quantification of LV chamber volumes typically requires manual definition of the basal-most LV slice, which adds processing time and user dependence. This study developed a fully automated LV segmentation method based on the spatiotemporal continuity of the LV (LV-FAST). An iteratively decreasing-threshold region-growing approach was applied first from the midventricle to the apex, until the LV area and shape discontinued, and then from the midventricle to the base, until less than 50% of the myocardial circumference was observable. Region growth was constrained by LV spatiotemporal continuity to improve the robustness of apical and basal segmentations. The LV-FAST method was compared with manual tracing on cardiac cine MRI data of 45 consecutive patients. Of the 45 patients, LV-FAST and manual selection identified the same apical slices at both ED and ES and the same basal slices at both ED and ES in 38, 38, 38, and 41 cases, respectively, and their measurements agreed within mL, mL, and for EDV, ESV, and EF, respectively. LV-FAST allowed the LV volume-time course to be quantified within 3 seconds on a standard desktop computer, which is fast and accurate for processing cine volumetric cardiac MRI data and enables quantification of the LV filling course over the cardiac cycle.
Article
Full-text available
The most time-consuming and limiting step in three-dimensional (3D) cine displacement encoding with stimulated echoes (DENSE) MR image analysis is the demarcation of the left ventricle (LV) from its surrounding anatomical structures. The aim of this study is to implement a semi-automated segmentation algorithm for 3D cine DENSE CMR using a guide point model approach. A 3D mathematical model is fitted to guide points interactively placed along the LV borders at a single time frame. An algorithm is presented to robustly propagate the LV epicardial and endocardial surfaces of the model using the displacement information encoded in the phase images of the DENSE data. The accuracy, precision and efficiency of the algorithm are tested. The model-defined contours show good accuracy when compared to the corresponding manually defined contours, with Dice and Jaccard similarity coefficients above 0.7 and low false-positive and false-negative percentages; accuracy was assessed against intra- and inter-observer spatial overlap variability. The segmentation algorithm offers a 10-fold reduction in the time required to identify LV epicardial and endocardial borders for a single 3D DENSE data set. A semi-automated segmentation method has thus been developed for 3D cine DENSE CMR. The algorithm allows contouring of the first cardiac frame, where blood-myocardium contrast is almost nonexistent, and significantly reduces the time required to segment a 3D DENSE data set.
Article
Full-text available
This document is an update to the 2008 publication of the Society for Cardiovascular Magnetic Resonance (SCMR) Board of Trustees Task Force on Standardized Protocols. Since the time of the original publication, 3 additional task forces (Reporting, Post-Processing, and Congenital Heart Disease) have published documents that should be referred to in conjunction with the present document. The section on general principles and techniques has been expanded as more of the techniques common to CMR have been standardized. There is still a great deal of development in the area of tissue characterization/mapping, so these protocols have been in general left as optional. The authors hope that this document continues to standardize and simplify the patient-based approach to clinical CMR. It will be updated at regular intervals as the field of CMR advances.
Article
Cardiac left ventricle (LV) quantification is among the most clinically important tasks for the identification and diagnosis of cardiac disease. However, it remains a task of great challenge due to the high variability of cardiac structure across subjects and the complexity of the temporal dynamics of cardiac sequences. Full quantification, i.e., simultaneously quantifying all LV indices including two areas (cavity and myocardium), six regional wall thicknesses (RWT), three LV dimensions, and one phase (diastole or systole), is even more challenging since the ambiguous correlations existing among these indices may impinge upon the convergence and generalization of the learning procedure. In this paper, we propose a deep multitask relationship learning network (DMTRL) for full LV quantification. The proposed DMTRL first obtains expressive and robust cardiac representations with a deep convolution neural network (CNN); it then models the temporal dynamics of cardiac sequences effectively with two parallel recurrent neural network (RNN) modules. After that, it estimates the three types of LV indices under a Bayesian framework capable of learning multitask relationships automatically, and estimates the cardiac phase with a softmax classifier. The CNN representation, RNN temporal modeling, Bayesian multitask relationship learning, and softmax classifier establish an effective, integrated network that can be learned in an end-to-end manner. The obtained task covariance matrix captures the correlations existing among these indices, leading to accurate estimation of LV indices and cardiac phase. Experiments on MR sequences of 145 subjects show that DMTRL achieves highly accurate prediction, with average mean absolute errors of 180 mm², 1.39 mm, and 2.51 mm for areas, RWT, and dimensions, respectively, and an error rate of 8.2% for phase classification. This endows our method with great potential in comprehensive clinical assessment of global, regional and dynamic cardiac function.
Article
Accurate segmentation of the heart is an important step towards evaluating cardiac function. In this paper, we present a fully automated framework for segmentation of the left (LV) and right (RV) ventricular cavities and the myocardium (Myo) on short-axis cardiac MR images. We investigate the suitability of various state-of-the-art 2D and 3D convolutional neural network architectures, as well as slight modifications thereof, for this task. Experiments were performed on the ACDC 2017 challenge training dataset comprising cardiac MR images of 100 patients, for which manual reference segmentations were made available for the end-diastolic (ED) and end-systolic (ES) frames. We find that processing the images in a slice-by-slice fashion using 2D networks is beneficial due to the relatively large slice thickness; however, the exact network architecture only plays a minor role. We report mean Dice coefficients of 0.950 (LV), 0.893 (RV), and 0.899 (Myo), respectively, with an average evaluation time of 1.1 seconds per volume on a modern GPU.
Article
Traditional regression methods minimize the sum of errors of samples with various regularization terms such as the ℓ1-norm and ℓ2-norm. For the diagnosis of cardiovascular diseases, the cardiac ejection fraction (EF) is an essential measure. However, existing regularization terms do not consider the output correlation (the correlation between ground-truth volumes and estimated volumes w.r.t. each subject), which is beneficial for estimating the cardiac EF. In this paper, we first propose a sparse regression with two regularization terms, the ℓ1-norm and output correlation (SROC). Then, we propose a one-dimensional solution path algorithm for quickly finding two good regularization parameters in the SROC formulation. The solution path algorithm can effectively handle singularities and infinities in the key matrix. Finally, we conduct experiments on a clinical cardiac image dataset with 100 subjects. The experimental results show that our method produces competitive and strong results for estimating the cardiac EF based on quantitative and qualitative analyses.
Conference Paper
In this paper, we propose an innovative approach to registration based on the deterministic prediction of the transformation parameters from both images instead of the optimization of an energy criterion. The method relies on a fully convolutional network whose architecture consists of contracting layers to detect relevant features and a symmetric expanding path that matches them together and outputs the transformation parametrization. Whereas convolutional networks have seen widespread adoption and have already been applied to many medical imaging problems such as segmentation and classification, their application to registration has so far faced the challenge of defining ground truth data on which to train the algorithm. Here, we present a novel training strategy to build reference deformations which relies on the registration of segmented regions of interest. We apply this methodology to the problem of inter-patient heart registration and show an important improvement over a state-of-the-art optimization-based algorithm. Not only is our method more accurate, it is also faster - registration of two 3D images takes less than 30 ms on a GPU - and more robust to outliers.
Article
Cardiac indices estimation is of great importance for the identification and diagnosis of cardiac disease in clinical routine. However, estimating multiple types of cardiac indices with consistently reliable and high accuracy is still a great challenge due to the high variability of cardiac structures and the complexity of temporal dynamics in cardiac MR sequences. While efforts have been devoted to cardiac volume estimation through feature engineering followed by an independent regression model, these methods suffer from vulnerable feature representations and incompatible regression models. In this paper, we propose a semi-automated method for multitype cardiac indices estimation. After manual labelling of two landmarks for ROI cropping, an integrated deep neural network, Indices-Net, is designed to jointly learn the representation and regression models. It comprises two tightly coupled networks: a deep convolution autoencoder (DCAE) for cardiac image representation, and a multiple-output convolution neural network (CNN) for indices regression. Joint learning of the two networks effectively enhances the expressiveness of the image representation with respect to cardiac indices, and the compatibility between image representation and indices regression, thus leading to accurate and reliable estimation of all the cardiac indices. When applied with five-fold cross validation on MR images of 145 subjects, Indices-Net achieves consistently low estimation error for LV wall thicknesses (1.44 ± 0.71 mm) and areas of cavity and myocardium (204 ± 133 mm²). It outperforms, with significant error reductions, a segmentation method (by 55.1% and 17.4%) and two-phase direct volume-only methods (by 12.7% and 14.6%) for wall thicknesses and areas, respectively. These advantages endow the proposed method with great potential in clinical cardiac function assessment.
Article
Automated left ventricular (LV) segmentation is crucial for efficient quantification of cardiac function and morphology to aid subsequent management of cardiac pathologies. In this paper, we parameterize the complete (all short-axis slices and phases) LV segmentation task in terms of the radial distances between the LV centerpoint and the endo- and epicardial contours in polar space. We then utilize convolutional neural network regression to infer these parameters. Utilizing parameter regression, as opposed to conventional pixel classification, allows the network to inherently reflect domain-specific physical constraints. We have benchmarked our approach primarily against the publicly available left ventricle segmentation challenge (LVSC) dataset, which consists of 100 training and 100 validation cardiac MRI cases representing a heterogeneous mix of cardiac pathologies and imaging parameters across multiple centers. Our approach attained a 0.77 Jaccard index, which is the highest published overall result in comparison to other automated algorithms. To test general applicability, we also evaluated against the Kaggle Second Annual Data Science Bowl, where the evaluation metric was the indirect clinical measure of LV volume rather than direct myocardial contours. Our approach attained a Continuous Ranked Probability Score (CRPS) of 0.0124, which would have ranked tenth in the original challenge. With this we demonstrate the effectiveness of convolutional neural network regression paired with domain-specific features in clinical segmentation.
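The polar parameterization described above implies a simple decoding step from predicted radii back to contour coordinates. The helper below sketches only that geometric step, under the assumption of evenly spaced angles around a given LV centerpoint; it is an illustration, not the paper's code.

```python
import numpy as np

def radii_to_contour(center, radii):
    """Map a contour parameterized by radial distances from the LV centerpoint
    at evenly spaced angles back to Cartesian pixel coordinates.

    center: (cx, cy) LV centerpoint
    radii:  (K,) predicted distances at K evenly spaced angles
    """
    angles = np.linspace(0.0, 2.0 * np.pi, num=len(radii), endpoint=False)
    xs = center[0] + radii * np.cos(angles)
    ys = center[1] + radii * np.sin(angles)
    return np.stack([xs, ys], axis=1)

# Toy usage: a circular endocardial contour of radius 20 px around (64, 64).
contour = radii_to_contour((64.0, 64.0), np.full(36, 20.0))
print(contour.shape)  # (36, 2)
```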
Conference Paper
Cardiac MRI is important for the diagnosis and assessment of various cardiovascular diseases. Automated segmentation of the left ventricular (LV) endocardium at end-diastole (ED) and end-systole (ES) enables automated quantification of various clinical parameters including ejection fraction. Neural networks have been used for general image segmentation, usually via per-pixel categorization e.g. “foreground” and “background”. In this paper we propose that the generally circular LV endocardium can be parameterized and the endocardial contour determined via neural network regression. We designed two convolutional neural networks (CNN), one for localization of the LV, and the other for determining the endocardial radius. We trained the networks against 100 datasets from the Medical Image Computing and Computer Assisted Intervention (MICCAI) 2011 challenge, and tested the networks against 45 datasets from the MICCAI 2009 challenge. The networks achieved 0.88 average Dice metric, 2.30 mm average perpendicular distance, and 97.9% good contours, the latter being the highest published result to date. These results demonstrate that CNN regression is a viable and highly promising method for automated LV endocardial segmentation at ED and ES phases, and is capable of generalizing learning between highly distinct training and testing data sets.
Conference Paper
3D cardiac MR imaging enables accurate analysis of cardiac morphology and physiology. However, due to the requirements of long acquisition times and breath-holds, the clinical routine is still dominated by multi-slice 2D imaging, which hampers the visualization of anatomy and quantitative measurements because relatively thick slices are acquired. As a solution, we propose a novel image super-resolution (SR) approach based on a residual convolutional neural network (CNN) model. It reconstructs high-resolution 3D volumes from 2D image stacks for more accurate image analysis. The proposed model allows the use of multiple input data acquired from different viewing planes for improved performance. Experimental results on 1233 cardiac short- and long-axis MR image stacks show that the CNN model outperforms state-of-the-art SR methods in terms of image quality while being computationally efficient. We also show that image segmentation and motion tracking benefit more from SR-CNN when it is used as an initial upscaling method than from conventional interpolation methods for the subsequent analysis.
Conference Paper
Atlas selection and label fusion are two major challenges in multi-atlas segmentation. In this paper, we propose a novel deep fusion net to better address these challenges. The deep fusion net is a deep architecture that concatenates a feature extraction subnet and a non-local patch-based label fusion (NL-PLF) subnet in a single network. This network is trained end-to-end to automatically learn deep features that achieve optimal performance in an NL-PLF framework. The learned deep features are further utilized to define a similarity measure for atlas selection. Experimental results on cardiac MR images for left ventricular segmentation demonstrate that our approach is effective both in atlas selection and in multi-atlas label fusion, and achieves state-of-the-art performance.
Article
We introduce a new methodology that combines deep learning and level set for the automated segmentation of the left ventricle of the heart from cardiac cine magnetic resonance (MR) data. This combination is relevant for segmentation problems, where the visual object of interest presents large shape and appearance variations, but the annotated training set is small, which is the case for various medical image analysis applications, including the one considered in this paper. In particular, level set methods are based on shape and appearance terms that use small training sets, but present limitations for modelling the visual object variations. Deep learning methods can model such variations using relatively small amounts of annotated training, but they often need to be regularised to produce good generalisation. Therefore, the combination of these methods brings together the advantages of both approaches, producing a methodology that needs small training sets and produces accurate segmentation results. We test our methodology on the MICCAI 2009 left ventricle segmentation challenge database (containing 15 sequences for training, 15 for validation and 15 for testing), where our approach achieves the most accurate results in the semi-automated problem and state-of-the-art results for the fully automated challenge.
Article
Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves improved segmentation of PASCAL VOC (30% relative improvement to 67.2% mean IU on 2012), NYUDv2, SIFT Flow, and PASCAL-Context, while inference takes one tenth of a second for a typical image.
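The skip architecture described above, fusing class scores from a coarse deep layer with scores from a finer shallow layer, can be sketched compactly. The network below is a bare-bones illustration with made-up layer sizes, not the published FCN-8s/16s models derived from AlexNet, VGG, or GoogLeNet.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySkipFCN(nn.Module):
    """Bare-bones illustration of the FCN skip idea: class scores from a
    coarse, deep layer are upsampled and fused with scores from a shallower,
    finer layer, then upsampled to the input resolution."""
    def __init__(self, num_classes=21):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                                    nn.ReLU(inplace=True), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1),
                                    nn.ReLU(inplace=True), nn.MaxPool2d(2))
        self.score_fine = nn.Conv2d(16, num_classes, 1)    # at 1/2 resolution
        self.score_coarse = nn.Conv2d(32, num_classes, 1)  # at 1/4 resolution

    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        fused = self.score_fine(f1) + F.interpolate(
            self.score_coarse(f2), scale_factor=2, mode="bilinear",
            align_corners=False)
        return F.interpolate(fused, scale_factor=2, mode="bilinear",
                             align_corners=False)

print(TinySkipFCN()(torch.randn(1, 3, 64, 64)).shape)  # [1, 21, 64, 64]
```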
Article
Automated cardiac segmentation from magnetic resonance imaging datasets is an essential step in the timely diagnosis and management of cardiac pathologies. We propose to tackle the problem of automated left and right ventricle segmentation through the application of a deep fully convolutional neural network architecture. Our model is efficiently trained end-to-end in a single learning stage from whole-image inputs and ground truths to make inference at every pixel. To our knowledge, this is the first application of a fully convolutional neural network architecture for pixel-wise labeling in cardiac magnetic resonance imaging. Numerical experiments demonstrate that our model is robust to outperform previous fully automated methods across multiple evaluation measures on a range of cardiac datasets. It is equally noteworthy that our model leverages commodity compute resources such as the graphics processing unit to enable fast, state-of-the-art cardiac segmentation at massive scales. The models and code will be released open-source in the near future.