Chapter

Multi-organ Segmentation via Co-training Weight-Averaged Models from Few-Organ Datasets


Abstract

Multi-organ segmentation requires segmenting multiple organs of interest from each image. However, it is generally quite difficult to collect full annotations of all the organs on the same images, as some medical centers might only annotate a portion of the organs due to their own clinical practice. In most scenarios, one might obtain annotations of a single or a few organs from one training set, and obtain annotations of the other organs from another set of training images. Existing approaches mostly train and deploy a single model for each subset of organs, which is memory-intensive and time-inefficient. In this paper, we propose to co-train weight-averaged models for learning a unified multi-organ segmentation network from few-organ datasets. Specifically, we collaboratively train two networks and let the coupled networks teach each other on un-annotated organs. To alleviate the noisy teaching supervision between the networks, weight-averaged models are adopted to produce more reliable soft labels. In addition, a novel region mask is utilized to selectively apply the consistency constraint on the un-annotated organ regions that require collaborative teaching, which further boosts the performance. Extensive experiments on three publicly available single-organ datasets LiTS [1], KiTS [8], Pancreas [12] and manually-constructed single-organ datasets from MOBA [7] show that our method can better utilize the few-organ datasets and achieves superior performance with less inference computational cost.
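The weight-averaged teacher and the region-masked consistency constraint described in the abstract can be sketched in a few lines. This is a minimal NumPy illustration under our own assumptions (function names `ema_update` and `masked_consistency` are hypothetical), not the authors' implementation:

```python
import numpy as np

def ema_update(teacher_w, student_w, alpha=0.99):
    """Weight-averaged (EMA) teacher: theta_t <- alpha*theta_t + (1-alpha)*theta_s."""
    return {k: alpha * teacher_w[k] + (1 - alpha) * student_w[k] for k in teacher_w}

def masked_consistency(student_probs, teacher_probs, region_mask):
    """Mean squared consistency penalty, applied only where region_mask == 1,
    i.e. on the un-annotated organ regions that need collaborative teaching."""
    diff = (student_probs - teacher_probs) ** 2
    masked = diff * region_mask
    denom = max(region_mask.sum(), 1)
    return masked.sum() / denom
```

In a full pipeline each network's teacher would be the EMA of its own weights, and the supervised loss on annotated organs would be added to this masked consistency term.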


... One emerging trend was leveraging available sparse labeled images to perform multi-organ segmentation. Huang et al. (2020) attempted to perform co-training of single-organ datasets (liver, kidney, and pancreas). Fang & Yan (2020) proposed a pyramid-input and pyramid-output network to condense multi-scale features to reduce the semantic gaps. ...
... A tabulated summary of the cited works (venue, authors: key features):

MIA, Zhou et al. (2021a): multimodal registration, unsupervised segmentation, image-guided intervention
MIA, Wang et al. (2021): conjugate fully convolutional network, pairwise segmentation, proxy supervision
MIA, Zhou et al. (2021b): 3D deep learning, self-supervised learning, transfer learning
MICCAI, Shirokikh et al. (2020): loss reweighting, lesion detection
MICCAI, Haghighi et al. (2020): self-supervised learning, transfer learning, 3D model pre-training
MICCAI, Huang et al. (2020): co-training of sparse datasets, multi-organ segmentation
MICCAI, Wang et al. (2019): volumetric attention, 3D segmentation
MICCAI, Tang et al. (2020): edge-enhanced network, cross feature fusion
Nature Methods, Isensee et al. (2020): self-configuring framework, extensive evaluation on 23 challenges
TMI, Cano-Espinosa et al. (2020): biomarker regression and localization
TMI, Fang & Yan (2020): multi-organ segmentation, multi-scale training, partially labeled data
TMI, Haghighi et al. (2021): self-supervised learning, anatomical visual words
TMI, Zhang et al. (2020a): interpretable learning, probability calibration
TMI, Ma et al. (2020): geodesic active contours learning, boundary segmentation
TMI, Yan et al. (2020): training on partially-labeled datasets, lesion detection, multi-dataset learning
TMI, Wang et al. (2020): 2.5D semantic segmentation, attention

Fig. 6: Tumor segmentation results of the ISBI-LiTS 2017 challenge. The reference annotation is marked with a green contour, the prediction with a blue contour. ...
Article
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances and various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that no single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing liver-related segmentation tasks to http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.
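The Dice scores reported by the benchmark compare a predicted binary mask against the reference annotation. A minimal NumPy sketch of the metric (the function name is ours, not from the LiTS evaluation code):

```python
import numpy as np

def dice_score(pred, gt, eps=1e-8):
    """Dice similarity coefficient between two binary masks:
    DSC = 2 * |P intersect G| / (|P| + |G|)."""
    pred, gt = np.asarray(pred).astype(bool), np.asarray(gt).astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)
```

A DSC of 1.0 means perfect overlap; the benchmark's best liver result of 0.963 corresponds to near-complete agreement between prediction and reference.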
... Co-training, a common semi-supervised method that jointly uses labeled and unlabeled data to improve the generalization of the model through collaboration among multiple learners [23], has been broadly applied to image classification [24], target recognition [25] and image segmentation [26]. For medical image analysis, co-training has been adopted for semi-supervised learning [27,28] or consistency learning across different views [25,29]. For cell segmentation, Zhao et al. [30] combined co-training with a divergence loss to enlarge the prediction difference between the two models. ...
... Addressing the challenge caused by imposing additional constraints, instead of directly treating the additional supervision as a gold standard, self-supervised learning (SSL) provides a more effective way to exploit implicit supervision. Recently, we have witnessed advances of SSL in natural image analysis [31][32][33][34][35], which can be categorized into three types: contrastive learning [31,32], clustering [28,33], and consistency learning [35]. Extensive recent studies [36][37][38] in pathology image analysis have demonstrated the effectiveness of SSL techniques originally proposed for natural images (e.g., CPC [39], SimCLR [31], and MoCo [32]). ...
Preprint
Nuclei segmentation is a crucial task for whole slide image analysis in digital pathology. Generally, the segmentation performance of fully-supervised learning depends heavily on the amount and quality of the annotated data. However, it is time-consuming and expensive for professional pathologists to provide accurate pixel-level ground truth, while it is much easier to obtain coarse labels such as point annotations. In this paper, we propose a weakly-supervised learning method for nuclei segmentation that only requires point annotations for training. The proposed method achieves label propagation in a coarse-to-fine manner as follows. First, coarse pixel-level labels are derived from the point annotations based on the Voronoi diagram and the k-means clustering method to avoid overfitting. Second, a co-training strategy with an exponential moving average method is designed to refine the incomplete supervision of the coarse labels. Third, a self-supervised visual representation learning method is tailored for nuclei segmentation of pathology images that transforms the hematoxylin component images into the H&E stained images to gain a better understanding of the relationship between the nuclei and cytoplasm. We comprehensively evaluate the proposed method using two public datasets. Both visual and quantitative results demonstrate the superiority of our method over the state-of-the-art methods, and its competitive performance compared to fully-supervised methods. The source code for the experiments will be released after acceptance.
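The Voronoi step above, deriving coarse labels from point annotations, amounts to assigning each pixel to its nearest annotated point. A brute-force NumPy sketch of that partition (simplified; the paper additionally uses k-means, which is omitted here):

```python
import numpy as np

def voronoi_labels(points, shape):
    """Assign each pixel the index of its nearest annotated point,
    i.e. a discrete Voronoi partition used as a coarse label map."""
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    coords = np.stack([ys, xs], axis=-1).astype(float)   # (H, W, 2)
    pts = np.asarray(points, dtype=float)                # (N, 2)
    d2 = ((coords[:, :, None, :] - pts[None, None]) ** 2).sum(-1)  # (H, W, N)
    return d2.argmin(-1)
```

Real implementations typically compute this with a distance transform rather than the O(H·W·N) broadcast shown here, which is only practical for illustration.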
... Semi-supervised segmentation methods by organ (reference: task, technique):

Liver:
[139] Liver segmentation: semi-supervised adversarial learning model (segmentation network and discriminator network)
[140] Automatic 3D liver location and segmentation: graph cut and CNN
[141] Extracting the liver from CT images: graph cut
[142] Automatic liver segmentation on volumetric CT images: supervoxel-based graph cuts
[143] Automatic liver segmentation from abdominal CT volumes: graph cuts and border marching
[144] Fully automatic liver segmentation in CT images: modified graph cuts
Kidney:
[145] Precise estimation of renal vascular dominant regions: tensor-cut, spatially aware FCN, and Voronoi diagrams
[146] Automatic renal segmentation for MR urography: 3D graph cut and random forests
[147] Segmenting kidneys in 2D US images: graph cuts
Pancreas:
[148] Semi-supervised medical image segmentation and domain adaptation: uncertainty-aware multi-view co-training (UMCT)
[149] Medical image segmentation: semi-supervised task-driven data augmentation
[150] 3D semi-supervised learning: UMCT
[151] Organ segmentation refinement: uncertainty-based graph convolutional networks
Spleen:
[152] Splenomegaly segmentation: co-learning and DCNN
Multi-organ:
[153] Unified multi-organ segmentation: co-training weight-averaged models
[154] Multiple-organ segmentation: graph cuts
[155] Automatic multi-organ segmentation for 3-D radiological images: graph cuts

The shape and location priors introduced by [156] proved effective for gallbladder segmentation with high computational efficiency. [148] proposed an uncertainty-aware multi-view co-training framework that leveraged unlabeled data for better performance. ...
... Lastly, Huo et al. employed graph-cut refinement to obtain the final segmentation from the probability maps produced by multi-atlas segmentation (MAS). [152] designed a co-learning strategy to train a deep network from heterogeneously labeled scans, proposing a deep convolutional neural network (DCNN) that integrates heterogeneous multi-source labeled cohorts for splenomegaly segmentation. Tang et al. introduced a loss function based on the Dice similarity coefficient to adaptively learn multi-organ information from varied resources. ...
Article
Abdominal organ segmentation is the partition of one or more abdominal organs into semantic image segments of pixels with homogeneous features such as color, texture, and intensity. Abdominal conditions are associated with considerable morbidity and mortality, since many patients are initially asymptomatic and symptoms are often recognized late; the abdomen is the third most common site of damage to the human body. Nevertheless, outcomes may improve when the condition of an abdominal organ is detected earlier. Over the years, supervised and semi-supervised machine learning methods have been used to segment abdominal organs in order to detect their condition. Supervised methods perform well when the training data are representative of the target data, but they require large amounts of manually annotated data and suffer from adaptation problems. Semi-supervised methods are fast but perform worse than supervised ones when the assumptions made about the data fail to hold. Current state-of-the-art supervised segmentation methods are largely based on deep learning, owing to its accuracy and success in real-world applications, though it requires large amounts of training data for automatic feature extraction. Among semi-supervised segmentation methods, self-training and graph-based techniques have attracted much research attention. Self-training can be used with any classifier but has no mechanism to rectify mistakes early; graph-based techniques benefit from convexity, scalability, and effectiveness in application but suffer from an out-of-sample problem. In this review paper, a study is carried out on supervised and semi-supervised methods for abdominal organ segmentation. Connections and gaps among current approaches are identified, and prospective future research opportunities are enumerated.
... Weakly supervised learning explores the use of weak annotations such as noisy annotations and sparse annotations [22]. Besides, some approaches also aim at integrating multiple related datasets to learn general knowledge [23], [24]. To address the problem of limited labeled COVID-19 data, in this work we aim to utilize existing non-COVID lung lesion datasets to generalize useful information to the related COVID-19 task, so as to achieve better segmentation performance with limited in-domain training data. ...
Article
The novel Coronavirus disease (COVID-19) is a highly contagious virus and has spread all over the world, posing an extremely serious threat to all countries. Automatic lung infection segmentation from computed tomography (CT) plays an important role in the quantitative analysis of COVID-19. However, the major challenge lies in the inadequacy of annotated COVID-19 datasets. Currently, there are several public non-COVID lung lesion segmentation datasets, providing the potential for generalizing useful information to the related COVID-19 segmentation task. In this paper, we propose a novel relation-driven collaborative learning model to exploit shared knowledge from non-COVID lesions for annotation-efficient COVID-19 CT lung infection segmentation. The model consists of a general encoder to capture general lung lesion features based on multiple non-COVID lesions, and a target encoder to focus on task-specific features based on COVID-19 infections. Features extracted from the two parallel encoders are concatenated for the subsequent decoder part. We develop a collaborative learning scheme to regularize feature-level relation consistency of given input and encourage the model to learn more general and discriminative representation of COVID-19 infections. Extensive experiments demonstrate that trained with limited COVID-19 data, exploiting shared knowledge from non-COVID lesions can further improve state-of-the-art performance with up to 3.0% in dice similarity coefficient and 4.2% in normalized surface dice. In addition, experimental results on large scale 2D dataset with CT slices show that our method significantly outperforms cutting-edge segmentation methods on all evaluation metrics. Our proposed method promotes new insights into annotation-efficient deep learning for COVID-19 infection segmentation and illustrates strong potential for real-world applications in the global fight against COVID-19 in the absence of sufficient high-quality annotations.
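The abstract above does not specify the exact form of its feature-level relation consistency; a plausible sketch, under the assumption that "relation" means pairwise cosine similarity between the features of the two parallel encoders, is:

```python
import numpy as np

def relation_matrix(feats, eps=1e-8):
    """Pairwise cosine-similarity matrix over a batch of feature vectors."""
    f = feats / (np.linalg.norm(feats, axis=1, keepdims=True) + eps)
    return f @ f.T

def relation_consistency(feats_a, feats_b):
    """Penalize disagreement between the relation structures produced by the
    two encoders (mean squared difference of their similarity matrices)."""
    ra, rb = relation_matrix(feats_a), relation_matrix(feats_b)
    return float(((ra - rb) ** 2).mean())
```

The idea is that the general (non-COVID) and target (COVID-19) encoders should induce similar relational structure over the same batch, even if their individual features differ.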
... Concretely, they inserted the conditional information as an intermediate activation between convolutional operation and the activation function. Huang et al. [276] tackled the problem of partially-supervised multi-organ segmentation in the co-training framework, where they collaboratively trained multiple networks, and each network was taught by other networks on un-annotated organs. Yan et al. [277] proposed to develop a universal lesion detection algorithm to detect a comprehensive variety of lesions from multiple datasets with partial labels. ...
Article
Despite the remarkable performance of deep learning methods on various tasks, most cutting-edge models rely heavily on large-scale annotated training examples, which are often unavailable for clinical and health care tasks. The labeling costs for medical images are very high, especially in medical image segmentation, which typically requires intensive pixel/voxel-wise labeling. Therefore, the strong capability of learning and generalizing from limited supervision, including a limited amount of annotations, sparse annotations, and inaccurate annotations, is crucial for the successful application of deep learning models in medical image segmentation. However, due to its intrinsic difficulty, segmentation with limited supervision is challenging, and specific model designs and/or learning strategies are needed. In this paper, we provide a systematic and up-to-date review of these solutions, with summaries and comments on the methodologies. We also highlight several problems in this field and discuss future directions that warrant further investigation.
Preprint
The spleen is one of the most commonly injured solid organs in blunt abdominal trauma. The development of automatic segmentation systems from multi-phase CT for splenic vascular injury can augment severity grading for improving clinical decision support and outcome prediction. However, accurate segmentation of splenic vascular injury is challenging for the following reasons: 1) Splenic vascular injury can be highly variant in shape, texture, size, and overall appearance; and 2) Data acquisition is a complex and expensive procedure that requires intensive efforts from both data scientists and radiologists, which makes large-scale well-annotated datasets hard to acquire in general. In light of these challenges, we hereby design a novel framework for multi-phase splenic vascular injury segmentation, especially with limited data. On the one hand, we propose to leverage external data to mine pseudo splenic masks as the spatial attention, dubbed external attention, for guiding the segmentation of splenic vascular injury. On the other hand, we develop a synthetic phase augmentation module, which builds upon generative adversarial networks, for populating the internal data by fully leveraging the relation between different phases. By jointly enforcing external attention and populating internal data representation during training, our proposed method outperforms other competing methods and substantially improves the popular DeepLab-v3+ baseline by more than 7% in terms of average DSC, which confirms its effectiveness.
Article
Image segmentation is widely used in the medical field, and convolutional neural networks have become more diverse and effective in recent years. However, most networks are currently designed for a single dataset (i.e., a single organ or target); such a network is only suitable for that dataset, and its accuracy varies greatly across datasets (especially small-size image datasets). In response to this problem, a collaborative network can be designed to simultaneously extract the specific and common features of multiple datasets (i.e., multiple organs or targets). Such a network can be used for multi-dataset segmentation and helps balance the segmentation performance across datasets, especially improving the accuracy on small-size image datasets. By exploring adapters built from modified convolution kernels, an adaptive weight update strategy, and a branched network structure, the paper proposes a multi-dataset collaborative image segmentation network, called Md-Unet, which integrates a shared-specific adapter (SSA), an asymmetric similarity loss function with the proposed adaptive weight update strategy, and a dual branch. Experimental results showed that, compared with the baseline 3D U²Net, accuracy improved by 3.7% with the SSA module, by 0.64%-30.63% with several loss functions under the proposed adaptive weight update strategy, and by 17.47% with the dual-branch integrated architecture. Moreover, Md-Unet showed a significant improvement on small-size image datasets compared with single-dataset models.
Article
Automatic multi-organ segmentation in medical images is crucial for many clinical applications. State-of-the-art methods have reported promising results but rely on massive annotated data, which is hard to obtain due to the considerable expertise required. In contrast, single-organ datasets are relatively easy to obtain, and many well-annotated ones are publicly available. To this end, this work raises the partially supervised problem: can we use these single-organ datasets to learn a multi-organ segmentation model? In this paper, we propose the Partial- and Mutual-Prior incorporated framework (PRIMP) to learn a robust multi-organ segmentation model by deriving knowledge from single-organ datasets. Unlike existing methods that largely ignore the organs' anatomical prior knowledge, our PRIMP is designed around two key priors shared across different subjects and datasets: (1) partial-prior, each organ has its own character (e.g., size and shape), and (2) mutual-prior, the relative position between different organs follows a comparatively fixed anatomical structure. Specifically, we incorporate the partial-prior of each organ by learning from single-organ statistics, and inject the mutual-prior of organs by learning from multi-organ statistics. By doing so, the model is encouraged to capture organs' anatomical invariance across different subjects and datasets, guaranteeing the anatomical reasonableness of the predictions, narrowing domain gaps, and capturing spatial information among different slices, thereby improving segmentation performance. Experiments on four publicly available datasets (LiTS, Pancreas, KiTS, BTCV) show that our PRIMP improves performance on both multi-organ and single-organ datasets (17.40% and 3.06% above the baseline model on DSC, respectively) and surpasses the comparative approaches.
Chapter
Due to the inter-observer variation, the ground truth of lesion areas in pathological images is generated by majority-voting of annotations provided by different pathologists. Such a process is extremely laborious, since each pathologist needs to spend hours or even days for pixel-wise annotations. In this paper, we propose a reinforcement learning framework to automatically refine the set of annotations provided by a single pathologist based on several exemplars of ground truth. Particularly, we treat each pixel as an agent with a shared pixel-level action space. The multi-agent model observes several paired single-pathologist annotations and ground truth, and tries to customize the strategy to narrow down the gap between them with episodes of exploring. Furthermore, we integrate a discriminator to the multi-agent framework to evaluate the quality of annotation refinement. A quality reward is yielded by the discriminator to update the policy of agents. Experimental results on the publicly available Gleason 2019 dataset demonstrate the effectiveness of our reinforcement learning framework—the segmentation network trained with our refined single-pathologist annotations achieves a comparable accuracy to the one using majority-voting-based ground truth.
Chapter
Learning from external data is an effective and efficient way of training deep networks, which can substantially alleviate the burden of collecting training data and annotations. It is of great significance in improving the performance of CT image segmentation tasks, where collecting a large amount of voxel-wise annotations is expensive or even impractical. In this paper, we propose a generic selective learning method to maximize the performance gains of harnessing external data in CT image segmentation. The key idea is to learn a weight for each external sample such that 'good' data have large weights and thus contribute more to the training loss, implicitly encouraging the network to mine more valuable knowledge from informative external data while suppressing the memorization of irrelevant patterns from 'useless' or even 'harmful' data. Particularly, we formulate our idea as a constrained non-linear programming problem, solved by an iterative solution that alternates between weight estimation and network updating. Extensive experiments on abdominal multi-organ CT segmentation datasets show the efficacy and performance gains of our method against existing methods. The code is publicly available (Released at https://github.com/YouyiSong/Codes-for-Selective-Learning).
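The per-sample weighting idea above can be sketched as a weighted training loss; this is our own simplified illustration (the constrained non-linear programming solver that estimates the weights is omitted, and the normalization constraint is an assumption):

```python
import numpy as np

def weighted_loss(per_sample_losses, weights):
    """Selective learning sketch: each external sample contributes to the
    training loss in proportion to its learned weight. Weights are
    renormalized to sum to the number of samples, keeping the loss scale
    comparable to an unweighted average."""
    w = np.asarray(weights, dtype=float)
    w = w * len(w) / w.sum()
    return float((w * np.asarray(per_sample_losses, dtype=float)).mean())
```

In the full method, the weights themselves would be re-estimated between network updates so that 'harmful' external samples are driven toward zero weight.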
Article
Accurate segmentation of breast masses is an essential step in computer-aided diagnosis of breast cancer. The scarcity of annotated training data greatly hinders a model's generalization ability, especially for deep learning based methods. However, high-quality image-level annotations are time-consuming and cumbersome in medical image analysis scenarios. In addition, a large amount of weak annotations, which comprise common anatomical features, is under-utilized. To this end, inspired by teacher-student networks, we propose an Anatomy-Aware Weakly-Supervised learning Network (AAWS-Net) for extracting useful information from mammograms with weak annotations for efficient and accurate breast mass segmentation. Specifically, we adopt a weakly-supervised learning strategy in the Teacher to extract anatomical structure from mammograms with weak annotations by reconstructing the original image. Besides, knowledge distillation is used to suggest morphological differences between benign and malignant masses. Moreover, the prior knowledge learned by the Teacher is introduced to the Student in an end-to-end way, which improves the student network's ability to locate and segment masses. Experiments on CBIS-DDSM have shown that our method yields promising performance compared with state-of-the-art alternative models for breast mass segmentation in terms of segmentation accuracy and IoU.
Chapter
Segmentation studies in medical image analysis are always associated with a particular task scenario. However, building datasets to train models to segment multiple types of organs and pathologies is challenging. For example, a dataset annotated for the pancreas and pancreatic tumors will result in a model that cannot segment other organs, such as the liver and spleen, visible in the same abdominal computed tomography image. The lack of well-annotated datasets is one limitation behind the absence of universal segmentation models. Federated learning (FL) is ideally suited for addressing this issue in a real-world context. In this work, we show that each medical center can use training data for distinct tasks to collaboratively build more generalizable segmentation models for multiple segmentation tasks, without the requirement to centralize datasets in one place. The main challenge of this research is the heterogeneity of training data from various institutions and segmentation tasks. In this paper, we propose a multi-task segmentation framework using FL to learn segmentation models from several independent datasets with different annotations of organs or tumors. We include experiments on four publicly available single-task datasets, including MSD liver (w/ tumor), MSD spleen, MSD pancreas (w/ tumor), and KITS19. Experimental results on an external validation set highlight the advantages of employing FL in multi-task organ and tumor segmentation. Keywords: Federated learning; Segmentation; Partial labels
Chapter
Coherent and systematic analysis for finding complex patterns in structured and unstructured cancer data has seen a rich and diverse implementation of distinct techniques in the recent past. The delicate and life-threatening nature of cancer has created a great need for, and wide interest in, optimized techniques that yield commendable results for the prediction of cancer subtypes. As a result, several data analysis techniques have driven a revolution toward the best outcomes, among which several have shown remarkable results. In this chapter, the focus is placed directly on such techniques as implemented and adapted for cancer data analysis. The chapter gives an in-depth review of each of the proposed architectures, which have been precisely screened, and develops a concrete sense of each taxonomy by examining it under various evaluation metrics. Furthermore, the chapter offers several future scopes and recommendations from the authors' perspective to inspire those interested in pushing this field into further sub-strata.
Article
Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network better suited to low-memory or fast-execution requirements. In this paper, we present a deep mutual learning (DML) strategy where, rather than one-way transfer between a static pre-defined teacher and a student, an ensemble of students learns collaboratively and the students teach each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on the CIFAR-100 recognition and Market-1501 person re-identification benchmarks. Surprisingly, it is revealed that no powerful prior teacher network is necessary: mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.
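The mutual-teaching objective in DML augments each student's supervised loss with a KL-divergence term toward its peer's predictions. A minimal NumPy sketch for two students (function names are ours; cross-entropy terms are passed in precomputed):

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log((p + eps) / (q + eps))))

def mutual_learning_losses(probs1, probs2, ce1, ce2):
    """DML for two students: each adds a KL term pulling its predictive
    distribution toward its peer's, on top of its own supervised loss."""
    loss1 = ce1 + kl_div(probs2, probs1)   # student 1 mimics student 2
    loss2 = ce2 + kl_div(probs1, probs2)   # student 2 mimics student 1
    return loss1, loss2
```

Note the asymmetry of KL: each student treats the other's (detached) prediction as the target distribution, so the two mimicry terms differ in general.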
Conference Paper
Segmenting the pancreas from abdominal CT scans is an important prerequisite for pancreatic cancer diagnosis and precise treatment planning. However, automated pancreas segmentation faces challenges posed by shape and size variance, low contrast with regard to adjacent tissues and, in particular, the negligibly small proportion it occupies in the whole abdominal volume. Current coarse-to-fine frameworks, whether using tri-planar schemes or stacking 2D pre-segmentations as priors to 3D networks, have limitations in effectively capturing 3D information. While iterative updates of the region of interest (ROI) in the refinement stage alleviate accumulated errors caused by coarse segmentation, they introduce extra computational burden. In this paper, we harness 2D networks and 3D features to improve segmentation accuracy and efficiency. Firstly, in the 3D coarse segmentation network, a new bias-dice loss function is defined to increase ROI recall rates, improving efficiency by avoiding iterative ROI refinements. Secondly, for full utilization of 3D information, a dimension adaptation module (DAM) is introduced to bridge 2D networks and 3D information. Finally, a fusion decision module and a parallel training strategy are proposed to fuse multi-source feature cues extracted from three sub-networks to make final predictions. The proposed method is evaluated on the NIH dataset and outperforms the state-of-the-art methods in comparison, with a mean Dice-Sørensen coefficient (DSC) of 85.22% and an average runtime of 0.4 min per instance.
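The abstract does not give the exact form of the bias-dice loss; one plausible reading, sketched below under that assumption, is a Tversky-style variant that up-weights false negatives so the coarse stage favors high ROI recall:

```python
import numpy as np

def bias_dice_loss(pred, gt, beta=2.0, eps=1e-8):
    """Recall-biased Dice-style (Tversky-like) loss: false negatives are
    weighted by beta > 1, so missing foreground voxels is penalized more
    than over-segmenting. This is an assumed formulation, not the paper's."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    tp = (pred * gt).sum()
    fp = (pred * (1.0 - gt)).sum()
    fn = ((1.0 - pred) * gt).sum()
    return 1.0 - (tp + eps) / (tp + fp + beta * fn + eps)
```

With beta = 1 this reduces to a standard soft-Dice-like loss; larger beta trades precision for the recall that the coarse ROI stage needs.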
Chapter
Convolutional neural networks (CNNs) have achieved great successes in many computer vision problems. Unlike existing works that design CNN architectures to improve performance on a single task in a single domain and do not generalize, we present IBN-Net, a novel convolutional architecture which remarkably enhances a CNN's modeling ability on one domain (e.g. Cityscapes) as well as its generalization capacity on another domain (e.g. GTA5) without finetuning. IBN-Net carefully integrates Instance Normalization (IN) and Batch Normalization (BN) as building blocks, and can be wrapped into many advanced deep networks to improve their performance. This work has three key contributions. (1) By delving into IN and BN, we disclose that IN learns features that are invariant to appearance changes, such as colors, styles, and virtuality/reality, while BN is essential for preserving content-related information. (2) IBN-Net can be applied to many advanced deep architectures, such as DenseNet, ResNet, ResNeXt, and SENet, and consistently improves their performance without increasing computational cost. (3) When applying the trained networks to new domains, e.g. from GTA5 to Cityscapes, IBN-Net achieves improvements comparable to domain adaptation methods, even without using data from the target domain. With IBN-Net, we won the 1st place on the WAD 2018 Challenge Drivable Area track, with an mIoU of 86.18%.
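The distinction between IN and BN comes down to which axes the normalization statistics are pooled over. A minimal NumPy sketch for NCHW feature maps (the learnable scale/shift parameters are omitted for brevity):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # BN: normalize each channel using statistics pooled over the
    # batch and spatial dims (N, H, W), preserving per-image content
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    # IN: normalize each sample's channel independently (H, W only),
    # removing per-image appearance statistics such as color and style
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```

After `instance_norm`, every (sample, channel) slice has zero mean, which is why per-image style information is discarded; `batch_norm` only zero-centers each channel across the whole batch.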
Article
Automatic segmentation of abdominal anatomy on computed tomography (CT) images can support diagnosis, treatment planning and treatment delivery workflows. Segmentation methods using statistical models and multi-atlas label fusion (MALF) require inter-subject image registrations, which are challenging for abdominal images, but alternative methods without registration have not yet achieved higher accuracy for most abdominal organs. We present a registration-free deep-learning-based segmentation algorithm for eight organs that are relevant for navigation in endoscopic pancreatic and biliary procedures, including the pancreas, the GI tract (esophagus, stomach, duodenum) and surrounding organs (liver, spleen, left kidney, gallbladder). We directly compared the segmentation accuracy of the proposed method to existing deep learning and MALF methods in a cross-validation on a multi-centre data set with 90 subjects. The proposed method yielded significantly higher Dice scores for all organs and lower mean absolute distances for most organs, including Dice scores of 0.78 vs. 0.71, 0.74 and 0.74 for the pancreas, 0.90 vs. 0.85, 0.87 and 0.83 for the stomach, and 0.76 vs. 0.68, 0.69 and 0.66 for the esophagus. We conclude that deep-learning-based segmentation represents a registration-free method for multi-organ abdominal CT segmentation whose accuracy can surpass current methods, potentially supporting image-guided navigation in gastrointestinal endoscopy procedures.
Article
Spatial pyramid pooling modules and encoder-decoder structures are used in deep neural networks for the semantic segmentation task. The former encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter capture sharper object boundaries by gradually recovering the spatial information. In this work, we propose to combine the advantages of both approaches. Specifically, our proposed model, DeepLabv3+, extends DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results, especially along object boundaries. We further explore the Xception model and apply the depthwise separable convolution to both the Atrous Spatial Pyramid Pooling and decoder modules, resulting in a faster and stronger encoder-decoder network. We demonstrate the effectiveness of the proposed model on the PASCAL VOC 2012 semantic image segmentation dataset and achieve a performance of 89% on the test set without any post-processing. Our paper is accompanied by a publicly available reference implementation of the proposed models in TensorFlow.
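The efficiency gain from the depthwise separable convolutions mentioned above is easy to quantify: a separable convolution factors a full k×k convolution into a per-channel spatial filter followed by a 1×1 pointwise channel mixer. A small parameter-count sketch (illustrative, not from the paper; bias terms ignored):

```python
def standard_conv_params(c_in, c_out, k):
    # a full k×k convolution mixes channels and space jointly
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k):
    # depthwise k×k filter per input channel, then 1×1 pointwise mixing
    return c_in * k * k + c_in * c_out

# e.g. a 3×3 layer mapping 256 -> 256 channels:
#   standard:  256*256*9         = 589,824 weights
#   separable: 256*9 + 256*256   =  67,840 weights
```

For typical channel counts the separable form needs roughly k² times fewer weights, which is why swapping it into the ASPP and decoder modules speeds up the network.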
Article
Segmentation of key brain tissues from 3D medical images is of great significance for brain disease diagnosis, progression assessment and monitoring of neurologic conditions. While manual segmentation is time-consuming, laborious, and subjective, automated segmentation is quite challenging due to the complicated anatomical environment of the brain and the large variations of brain tissues. We propose a novel voxelwise residual network (VoxResNet) with a set of effective training schemes to cope with this challenging problem. The main merit of residual learning is that it can alleviate the degradation problem when training a deep network, so that the performance gains achieved by increasing the network depth can be fully leveraged. With this technique, our VoxResNet is built with 25 layers, and hence can generate more representative features to deal with the large variations of brain tissues than its rivals using hand-crafted features or shallower networks. In order to effectively train such a deep network with limited training data for brain segmentation, we seamlessly integrate multi-modality and multi-level contextual information into our network, so that the complementary information of different modalities can be harnessed and features of different scales can be exploited. Furthermore, an auto-context version of the VoxResNet is proposed by combining the low-level image appearance features, implicit shape information, and high-level context together for further improving the segmentation performance. Extensive experiments on the well-known benchmark (i.e., MRBrainS) of brain segmentation from 3D magnetic resonance (MR) images corroborated the efficacy of the proposed VoxResNet. Our method achieved the first place in the challenge out of 37 competitors, including several state-of-the-art brain segmentation methods. Our method is inherently general and can be readily applied as a powerful tool to many brain-related studies, where accurate segmentation of brain structures is critical.
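The residual learning idea at the heart of VoxResNet is that each block adds a learned residual F(x) to an identity shortcut, so the layers only need to model the correction rather than the full mapping. A toy sketch with hypothetical weight matrices (a minimal illustration, not the paper's 3D block):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    # F(x) = W2 · relu(W1 · x); the block outputs x + F(x).
    # When F learns nothing (zero weights), the block is the identity,
    # which is what eases optimization of very deep networks.
    return x + w2 @ relu(w1 @ x)
```

Stacking such blocks lets depth grow (25 layers in VoxResNet) without the degradation seen in plain deep stacks.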
Article
The recently proposed temporal ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, temporal ensembling becomes unwieldy when using large datasets. To overcome this problem, we propose a method that averages model weights instead of label predictions. As an additional benefit, the method improves test accuracy and enables training with fewer labels than earlier methods. We report state-of-the-art results on semi-supervised SVHN, reducing the error rate from 5.12% to 4.41% with 500 labels, and achieving 5.39% error rate with 250 labels. By using extra unlabeled data, we reduce the error rate to 2.76% on 500-label SVHN.
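The weight-averaging update described above is a one-line operation per parameter: the teacher tracks an exponential moving average of the student's weights after each training step. A minimal sketch (the list-of-arrays parameter representation and the name `ema_update` are our simplifications):

```python
import numpy as np

def ema_update(teacher_weights, student_weights, alpha=0.999):
    """Mean-teacher update: theta'_t = alpha * theta'_{t-1} + (1 - alpha) * theta_t.
    Called once per training step, so the teacher changes smoothly
    rather than once per epoch as in temporal ensembling."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]
```

With `alpha` close to 1 the teacher averages over many recent student states, yielding the more stable targets the paper relies on.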
Conference Paper
There is large consent that successful training of deep networks requires many thousands of annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to exploit the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC), we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast: segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
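Because the original U-Net uses unpadded 3x3 convolutions, the output segmentation map is smaller than the input tile; tracing the spatial size through the contracting and expanding paths reproduces the paper's 572x572 input / 388x388 output tiling. A sketch under the assumption of the original depth-4 architecture:

```python
def unet_output_size(size, depth=4):
    """Trace the spatial size through a U-Net built from unpadded
    3x3 convolutions: each conv shrinks the map by 2 pixels,
    each 2x2 max-pool halves it, each up-convolution doubles it."""
    def double_conv(s):
        return s - 4            # two unpadded 3x3 convs, -2 each
    for _ in range(depth):      # contracting path
        size = double_conv(size)
        size //= 2              # 2x2 max-pool
    size = double_conv(size)    # bottleneck
    for _ in range(depth):      # expanding path
        size *= 2               # 2x2 up-convolution
        size = double_conv(size)  # convs after crop-and-concat
    return size

# For the paper's input tiles: unet_output_size(572) -> 388
```

This shrinkage is why the paper tiles large images with overlapping input patches and mirrors the border context.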
  • A. L. Simpson et al. A large annotated medical image dataset for the development and evaluation of segmentation algorithms.
  • X. Pan, P. Luo, J. Shi, X. Tang. Two at once: enhancing learning and generalization capacities via IBN-Net.
  • Y. Ge, D. Chen, H. Li. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification.
  • N. Heller et al. The KiTS19 challenge data: 300 kidney tumor cases with clinical context, CT semantic segmentations, and surgical outcomes.