Chapter

Multi-organ Segmentation via Co-training Weight-Averaged Models from Few-Organ Datasets


Abstract

Multi-organ segmentation requires segmenting multiple organs of interest from each image. However, it is generally quite difficult to collect full annotations of all the organs on the same images, as some medical centers might only annotate a portion of the organs due to their own clinical practice. In most scenarios, one might obtain annotations of a single or a few organs from one training set, and obtain annotations of the other organs from another set of training images. Existing approaches mostly train and deploy a single model for each subset of organs, which is memory-intensive and also time-inefficient. In this paper, we propose to co-train weight-averaged models for learning a unified multi-organ segmentation network from few-organ datasets. Specifically, we collaboratively train two networks and let the coupled networks teach each other on un-annotated organs. To alleviate the noisy teaching supervision between the networks, weight-averaged models are adopted to produce more reliable soft labels. In addition, a novel region mask is utilized to selectively apply the consistency constraint on the un-annotated organ regions that require collaborative teaching, which further boosts the performance. Extensive experiments on three publicly available single-organ datasets, LiTS [1], KiTS [8] and Pancreas [12], and manually-constructed single-organ datasets from MOBA [7] show that our method can better utilize the few-organ datasets and achieves superior performance with less inference computational cost.
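The co-training scheme described in the abstract can be summarized in a short, hedged sketch: two student networks are trained jointly, each paired with a weight-averaged (EMA) teacher, and on the organ regions left un-annotated by a given dataset, each student is pulled toward the other teacher's soft predictions under a region mask. The PyTorch-style code below is an illustrative reading of that description, not the authors' implementation; the function names, the MSE consistency term, and the EMA decay value are assumptions.

```python
import torch
import torch.nn.functional as F

def ema_update(teacher, student, alpha=0.99):
    """Weight-averaged (EMA) teacher: theta_t <- alpha*theta_t + (1-alpha)*theta_s."""
    with torch.no_grad():
        for t_param, s_param in zip(teacher.parameters(), student.parameters()):
            t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)

def co_training_step(student_a, student_b, teacher_a, teacher_b,
                     image, partial_label, annotated_mask, optimizer):
    """One training step on an image whose labels cover only some organs.

    annotated_mask: 1 on voxels of organs annotated in this dataset,
                    0 on the un-annotated region that needs collaborative teaching.
    optimizer is assumed to hold the parameters of both students.
    """
    logits_a = student_a(image)
    logits_b = student_b(image)

    # Supervised loss, restricted to the annotated organs.
    sup_a = F.cross_entropy(logits_a, partial_label, reduction="none")
    sup_b = F.cross_entropy(logits_b, partial_label, reduction="none")
    sup_loss = ((sup_a + sup_b) * annotated_mask).sum() / annotated_mask.sum().clamp(min=1)

    # Weight-averaged teachers provide soft labels for the *other* student,
    # applied only on the un-annotated region (the "region mask").
    with torch.no_grad():
        soft_a = F.softmax(teacher_a(image), dim=1)
        soft_b = F.softmax(teacher_b(image), dim=1)
    region = (1.0 - annotated_mask).unsqueeze(1)
    cons_a = F.mse_loss(F.softmax(logits_a, dim=1) * region, soft_b * region)
    cons_b = F.mse_loss(F.softmax(logits_b, dim=1) * region, soft_a * region)

    loss = sup_loss + cons_a + cons_b
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Each teacher tracks its own student as a running weight average.
    ema_update(teacher_a, student_a)
    ema_update(teacher_b, student_b)
    return loss.item()
```

In practice the relative weighting of the supervised and consistency terms, and the exact form of the consistency measure, would follow the paper's own ablations rather than the fixed choices shown here.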

... As a common semi-supervised learning method that jointly uses labeled and unlabeled data to improve model generalization through collaboration among multiple learners [23], co-training has been broadly applied to image classification [24], target recognition [25] and image segmentation [26]. For medical image analysis, co-training has been adopted for semi-supervised learning [27,28] or consistency learning in different views [25,29]. For cell segmentation, Zhao et al. [30] combined co-training with a divergence loss to enlarge the prediction difference between the two models. ...
... To address the challenge caused by imposing additional constraints, instead of directly treating the additional supervision as a gold standard, self-supervised learning (SSL) provides a more effective way to exploit the implicit supervision. Recently, we have witnessed the advances of SSL in natural image analysis [31][32][33][34][35], which can be categorized into three types: contrastive learning [31,32], clustering [28,33], and consistency learning [35]. Extensive recent studies [36][37][38] in pathology image analysis have demonstrated the effectiveness of SSL techniques, which were originally proposed for natural images (e.g., CPC [39], SimCLR [31], and MoCo [32]). ...
Preprint
Full-text available
Nuclei segmentation is a crucial task for whole slide image analysis in digital pathology. Generally, the segmentation performance of fully-supervised learning heavily depends on the amount and quality of the annotated data. However, it is time-consuming and expensive for professional pathologists to provide accurate pixel-level ground truth, while it is much easier to get coarse labels such as point annotations. In this paper, we propose a weakly-supervised learning method for nuclei segmentation that only requires point annotations for training. The proposed method achieves label propagation in a coarse-to-fine manner as follows. First, coarse pixel-level labels are derived from the point annotations based on the Voronoi diagram and the k-means clustering method to avoid overfitting. Second, a co-training strategy with an exponential moving average method is designed to refine the incomplete supervision of the coarse labels. Third, a self-supervised visual representation learning method is tailored for nuclei segmentation of pathology images that transforms the hematoxylin component images into the H&E stained images to gain better understanding of the relationship between the nuclei and cytoplasm. We comprehensively evaluate the proposed method using two public datasets. Both visual and quantitative results demonstrate the superiority of our method to the state-of-the-art methods, and its competitive performance compared to the fully-supervised methods. The source codes for implementing the experiments will be released after acceptance.
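The coarse-label construction in the first step (Voronoi diagram plus k-means) can be pictured with a small sketch. The snippet below, using SciPy and scikit-learn, shows one plausible way to derive a Voronoi partition from nucleus point annotations and a k-means-based coarse foreground mask from intensity; the feature choice, number of clusters, and label encoding are assumptions rather than the paper's exact procedure.

```python
import numpy as np
from scipy.spatial import cKDTree
from sklearn.cluster import KMeans

def coarse_labels_from_points(gray_image, nucleus_points, n_clusters=3):
    """Derive coarse pixel-level labels from point annotations.

    gray_image: (H, W) float array (e.g. a hematoxylin-like channel).
    nucleus_points: (N, 2) integer array of (row, col) nucleus annotations.
    Returns a Voronoi partition (index of the nearest point per pixel) and a
    coarse nuclei mask from k-means clustering of pixel intensities.
    """
    h, w = gray_image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1)

    # Voronoi partition: assign every pixel to its nearest annotated point.
    tree = cKDTree(nucleus_points)
    _, nearest = tree.query(pixels)
    voronoi_partition = nearest.reshape(h, w)

    # k-means on intensity gives a coarse grouping; the cluster that contains
    # most annotated points is treated as the "nuclei" foreground.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    cluster_map = km.fit_predict(gray_image.reshape(-1, 1)).reshape(h, w)
    point_clusters = cluster_map[nucleus_points[:, 0], nucleus_points[:, 1]]
    nuclei_cluster = np.bincount(point_clusters).argmax()
    coarse_mask = (cluster_map == nuclei_cluster).astype(np.uint8)
    return voronoi_partition, coarse_mask
```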
... Semi-supervised method
[139] Liver | Liver segmentation | Semi-supervised adversarial learning model (Segmentation Network and Discriminator Network)
[140] Liver | Automatic 3D liver location and segmentation | Graph cut and CNN
[141] Liver | Extracting the Liver from CT images | Graph Cut
[142] Liver | Automatic liver segmentation on volumetric CT images | Supervoxel-Based Graph Cuts
[143] Liver | Automatic liver segmentation from abdominal CT volumes | Graph Cuts and Border Marching
[144] Liver | Fully automatic liver segmentation in CT images | Modified Graph Cuts
[145] Kidney | Precise estimation of renal vascular dominant regions | Tensor-cut, spatially aware FCN, and Voronoi diagrams
[146] Kidney | Automatic renal segmentation for MR Urography | 3D-GraphCut and Random Forests
[147] Kidney | Segmenting kidneys in 2D US images | Graph Cuts
[148] Pancreas | Semi-supervised medical image segmentation and domain adaptation | Uncertainty-aware multi-view co-training (UMCT)
[149] Pancreas | Medical image segmentation | Semi-supervised task-driven data augmentation method
[150] Pancreas | 3D semi-supervised learning | UMCT
[151] Pancreas | Organ segmentation refinement | Uncertainty-based graph convolutional networks
[152] Spleen | Splenomegaly segmentation | Co-learning and DCNN
[153] Multi-organ | Unified multi-organ segmentation | Co-training weight-averaged models
[154] Multi-organ | Multiple-organ segmentation | Graph Cuts
[155] Multi-organ | Automatic multi-organ segmentation for 3-D radiological images | Graph Cuts
…tion, shape, and location priors by [156] proved to be effective in the segmentation of the gallbladder with high computational efficiency. [148] proposed an uncertainty-aware multi-view co-training framework that leveraged unlabeled data for better performance. ...
... Finally, Huo et al. employed graph-cut refinement to obtain the final segmentation from the probability maps produced by multi-atlas segmentation (MAS). [152] designed a co-learning strategy to train a deep network from heterogeneously labeled scans, proposing a deep convolutional neural network (DCNN) that integrates heterogeneous multi-source labeled cohorts for splenomegaly segmentation. Tang et al. introduced a loss function based on the Dice similarity coefficient to adaptively learn multi-organ information from varied sources. ...
Article
Abdominal organ segmentation is the segregation of one or more abdominal organs into semantic image segments of pixels with homogeneous features such as color, texture, and intensity. Abdominal organ conditions are associated with considerable morbidity and mortality, since many patients have asymptomatic abdominal conditions whose symptoms are recognized late; the abdomen has thus been the third most common cause of damage to the human body. Notwithstanding, outcomes may improve when the condition of an abdominal organ is detected earlier. Over the years, supervised and semi-supervised machine learning methods have been used to segment abdominal organs in order to detect such conditions. Supervised methods perform well when the training data are representative of the target data, but they require large amounts of manually annotated data and have adaptation problems. Semi-supervised methods are fast but perform worse than supervised ones when the assumptions made about the data fail to hold. Current state-of-the-art supervised segmentation methods are largely based on deep learning techniques, owing to their good accuracy and success in real-world applications, although they require large amounts of training data for automatic feature extraction, without which deep learning can hardly be used. Among semi-supervised segmentation methods, self-training and graph-based techniques have attracted much research attention. Self-training can be used with any classifier but has no mechanism to rectify mistakes early. Graph-based techniques thrive on their convexity, scalability, and effectiveness in application but suffer from an out-of-sample problem. In this review paper, a study has been carried out on supervised and semi-supervised methods for abdominal organ segmentation. Connections and gaps among current approaches are identified, and prospective future research opportunities are enumerated.
... As shown in Figure 9, the SenseCare Radiotherapy Contouring system aids radiologists by automatically contouring OARs across the whole body, including the head and neck (100-103), chest (104) and abdomen (105,106), as well as common targets including breast cancer, rectal cancer (107), etc. The system can be seamlessly connected to different kinds of CT machines and Treatment Planning Systems (TPS), automatically run AI calculations in the background, support a 3D interactive view and editing of the delineated structures, and export the results back via the standard RT structure DICOM protocol, in a complete closed-loop workflow. ...
Article
Full-text available
Introduction: Clinical research on smart health has an increasing demand for intelligent and clinic-oriented medical image computing algorithms and platforms that support various applications. However, existing research platforms for medical image informatics have limited support for Artificial Intelligence (AI) algorithms and clinical applications. Methods: To this end, we have developed the SenseCare research platform, which is designed to facilitate translational research on intelligent diagnosis and treatment planning in various clinical scenarios. It has several appealing functions and features such as advanced 3D visualization, concurrent and efficient web-based access, fast data synchronization and high data security, multi-center deployment, support for collaborative research, etc. Results and discussion: SenseCare provides a range of AI toolkits for different tasks, including image segmentation, registration, lesion and landmark detection from various image modalities ranging from radiology to pathology. It also facilitates the data annotation and model training processes, which makes it easier for clinical researchers to develop and deploy customized AI models. In addition, it is clinic-oriented and supports various clinical applications such as diagnosis and surgical planning for lung cancer, liver tumor, coronary artery disease, etc. By simplifying AI-based medical image analysis, SenseCare has the potential to promote clinical research in a wide range of disease diagnosis and treatment applications.
... Zhou et al. (2019a) proposed a prior-aware loss function to incorporate domain-specific information and optimized it with a max-min stochastic gradient optimization algorithm using a labeled dataset and pseudo labels generated from a partially labeled dataset. Huang et al. (2020) used single-organ trained models to obtain pseudo-labels for multiple organs and co-trained the two models with weighted focal and Dice losses. Fang and Yan (2020) proposed a target adaptive loss (TAL) for a segmentation network trained by combining multi-scale features and a pyramid of convolutional features. ...
Article
Full-text available
Abdominal organs play a significant role in regulating various functional systems. Any impairment in their functioning can lead to cancerous diseases. Diagnosing these diseases mainly relies on radiologists' subjective assessment, which varies according to professional ability and clinical experience. Computer-Aided Diagnosis (CAD) systems are designed to assist clinicians in identifying various pathological changes. Hence, automatic pancreas segmentation is a vital input to the CAD system in the diagnosis of cancer at its early stages. Automatic segmentation is achieved through traditional methods like atlas-based and statistical models, and nowadays through artificial intelligence approaches like machine learning and deep learning using various imaging modalities. This study investigates and analyses the various state-of-the-art multi-organ and pancreas segmentation approaches to identify the research gaps and future perspectives for the research community. The objective is achieved by framing the research questions using the PICOC framework and then selecting 140 research articles through a systematic process using the Covidence tool to answer the respective questions. The literature search was conducted on five databases for original studies published from 2003 to 2023. Initially, the literature analysis is presented in terms of publications, together with a comparative analysis of the current study against existing review studies. Then, existing studies are analyzed, focusing on semi-automatic and automatic multi-organ segmentation and pancreas segmentation using various learning methods. Finally, the critical issues, research gaps and future perspectives of segmentation methods based on published evidence are summarized.
... In recent years, numerous methods have been proposed to enhance the quality of these pseudo-labels. Huang et al. [225] proposed a weight-averaging joint training framework that can correct the noise in the pseudo labels to train a more robust model. Zhang et al. [226] proposed a multi-teacher knowledge distillation framework, which utilizes pseudo labels predicted by teacher models trained on partially labeled datasets to train a student model for multi-organ segmentation. ...
Article
Full-text available
Accurate segmentation of multiple organs in the head, neck, chest, and abdomen from medical images is an essential step in computer-aided diagnosis, surgical navigation, and radiation therapy. In the past few years, with a data-driven feature extraction approach and end-to-end training, automatic deep learning-based multi-organ segmentation methods have far outperformed traditional methods and become a new research topic. This review systematically summarizes the latest research in this field. We searched Google Scholar for papers published from January 1, 2016 to December 31, 2023, using keywords “multi-organ segmentation” and “deep learning”, resulting in 327 papers. We followed the PRISMA guidelines for paper selection, and 195 studies were deemed to be within the scope of this review. We summarized the two main aspects involved in multi-organ segmentation: datasets and methods. Regarding datasets, we provided an overview of existing public datasets and conducted an in-depth analysis. Concerning methods, we categorized existing approaches into three major classes: fully supervised, weakly supervised and semi-supervised, based on whether they require complete label information. We summarized the achievements of these methods in terms of segmentation accuracy. In the discussion and conclusion section, we outlined and summarized the current trends in multi-organ segmentation.
... Formulated as a special unsupervised ensemble distillation problem, multiple single-organ models served as teachers from different specialties and collaboratively taught one general student, i.e. the multi-organ segmentation model. Further, the integration of a co-training strategy and weight-averaged models unified multi-organ segmentation from few-organ datasets [176]. Self-distilling a Transformer-based U-Net by simultaneously learning global semantic information and local spatially detailed features was also investigated in [177]. ...
Article
In recent years, the segmentation of anatomical or pathological structures using deep learning has experienced widespread interest in medical image analysis. Remarkably successful performance has been reported in many imaging modalities and for a variety of clinical contexts to support clinicians in computer-assisted diagnosis, therapy or surgical planning purposes. However, despite the increasing number of medical image segmentation challenges, there remains little consensus on which methodology performs best. Therefore, we examine in this paper the numerous developments and breakthroughs brought about since the rise of U-Net-inspired architectures. In particular, we focus on the technical challenges and emerging trends that the community is now focusing on, including conditional generative adversarial and cascaded networks, medical Transformers, contrastive learning, knowledge distillation, active learning, prior knowledge embedding, cross-modality learning, multi-structure analysis, federated learning or semi-supervised and self-supervised paradigms. We also suggest possible avenues to be further investigated in future research efforts.
... Chen et al. trained one network with a shared encoder and separate decoders for each dataset to generate a generalized encoder for transfer learning [2]. However, most approaches are primarily geared towards multi-organ segmentation, as they do not support overlapping target structures like vessels or cancer classes within an organ [6,8,23,12]. So far, no previous method convincingly leverages cross-dataset synergies. ...
Preprint
Full-text available
The medical imaging community generates a wealth of datasets, many of which are openly accessible and annotated for specific diseases and tasks such as multi-organ or lesion segmentation. Current practices continue to limit model training and supervised pre-training to one or a few similar datasets, neglecting the synergistic potential of other available annotated data. We propose MultiTalent, a method that leverages multiple CT datasets with diverse and conflicting class definitions to train a single model for a comprehensive structure segmentation. Our results demonstrate improved segmentation performance compared to previous related approaches, systematically, also compared to single dataset training using state-of-the-art methods, especially for lesion segmentation and other challenging structures. We show that MultiTalent also represents a powerful foundation model that offers a superior pre-training for various segmentation tasks compared to commonly used supervised or unsupervised pre-training baselines. Our findings offer a new direction for the medical imaging community to effectively utilize the wealth of available data for improved segmentation performance. The code and model weights will be published here: [tba]
... One emerging trend was leveraging available sparse labeled images to perform multi-organ segmentation. Huang et al. (2020) attempted to perform co-training of single-organ datasets (liver, kidney, and pancreas). Fang and Yan (2020) proposed a pyramid-input and pyramid-output network to condense multi-scale features to reduce the semantic gaps. ...
Article
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances and various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that not a single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.
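For reference, the Dice score reported throughout the benchmark is the standard overlap measure 2|P∩G|/(|P|+|G|). A minimal per-case implementation is sketched below; this is illustrative only and is not the official LiTS evaluation code.

```python
import numpy as np

def dice_score(pred_mask, gt_mask, eps=1e-7):
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary segmentation masks."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
```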
... Weakly supervised learning explores the use of weak annotations like noisy annotations and sparse annotations [22]. Besides, some approaches also aim at integrating multiple related datasets to learn general knowledge [23], [24]. To address the problem of limited labeled COVID-19 data, in this work we aim at utilizing existing non-COVID lung lesion datasets to generalize useful information to the related COVID-19 task, so as to achieve better segmentation performance with limited in-domain training data. ...
Article
The novel Coronavirus disease (COVID-19) is a highly contagious virus and has spread all over the world, posing an extremely serious threat to all countries. Automatic lung infection segmentation from computed tomography (CT) plays an important role in the quantitative analysis of COVID-19. However, the major challenge lies in the inadequacy of annotated COVID-19 datasets. Currently, there are several public non-COVID lung lesion segmentation datasets, providing the potential for generalizing useful information to the related COVID-19 segmentation task. In this paper, we propose a novel relation-driven collaborative learning model to exploit shared knowledge from non-COVID lesions for annotation-efficient COVID-19 CT lung infection segmentation. The model consists of a general encoder to capture general lung lesion features based on multiple non-COVID lesions, and a target encoder to focus on task-specific features based on COVID-19 infections. Features extracted from the two parallel encoders are concatenated for the subsequent decoder part. We develop a collaborative learning scheme to regularize feature-level relation consistency of a given input and encourage the model to learn more general and discriminative representations of COVID-19 infections. Extensive experiments demonstrate that, trained with limited COVID-19 data, exploiting shared knowledge from non-COVID lesions can further improve state-of-the-art performance by up to 3.0% in Dice similarity coefficient and 4.2% in normalized surface Dice. In addition, experimental results on a large-scale 2D dataset of CT slices show that our method significantly outperforms cutting-edge segmentation methods on all evaluation metrics. Our proposed method promotes new insights into annotation-efficient deep learning for COVID-19 infection segmentation and illustrates strong potential for real-world applications in the global fight against COVID-19 in the absence of sufficient high-quality annotations.
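The "feature-level relation consistency" between the general and target encoders can be illustrated with a small sketch: one common way to express such a constraint is to match the pairwise similarity (relation) matrices computed from the two encoders' features. The code below is a generic illustration under assumed feature shapes, not the paper's exact loss formulation.

```python
import torch
import torch.nn.functional as F

def relation_matrix(features):
    """Pairwise cosine-similarity (relation) matrix over a batch of features."""
    flat = features.flatten(start_dim=1)   # (N, C) or (N, C*D*H*W)
    flat = F.normalize(flat, dim=1)
    return flat @ flat.t()                 # (N, N) sample-to-sample relations

def relation_consistency_loss(general_feats, target_feats):
    """Encourage the two encoders to agree on sample-to-sample relations."""
    return F.mse_loss(relation_matrix(general_feats), relation_matrix(target_feats))
```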
... Concretely, they inserted the conditional information as an intermediate activation between the convolutional operation and the activation function. Huang et al. [276] tackled the problem of partially-supervised multi-organ segmentation in the co-training framework, where they collaboratively trained multiple networks and each network was taught by the other networks on un-annotated organs. Yan et al. [277] proposed to develop a universal lesion detection algorithm to detect a comprehensive variety of lesions from multiple datasets with partial labels. ...
Article
Full-text available
Despite the remarkable performance of deep learning methods on various tasks, most cutting-edge models rely heavily on large-scale annotated training examples, which are often unavailable for clinical and health care tasks. The labeling costs for medical images are very high, especially in medical image segmentation, which typically requires intensive pixel/voxel-wise labeling. Therefore, the strong capability of learning and generalizing from limited supervision, including a limited amount of annotations, sparse annotations, and inaccurate annotations, is crucial for the successful application of deep learning models in medical image segmentation. However, due to its intrinsic difficulty, segmentation with limited supervision is challenging and specific model design and/or learning strategies are needed. In this paper, we provide a systematic and up-to-date review of the solutions above, with summaries and comments about the methodologies. We also highlight several open problems in this field and discuss future directions that warrant further investigation.
Article
Multi-organ segmentation is a fundamental task and existing approaches usually rely on large-scale fully-labeled images for training. However, data privacy and incomplete/partial labels make those approaches struggle in practice. Federated learning is an emerging tool to address data privacy but federated learning with partial labels is under-explored. In this work, we explore generating full supervision by building and aggregating inter-organ dependency based on partial labels and propose a single-encoder-multi-decoder framework named FedIOD. To simulate the annotation process where each organ is labeled by referring to other closely-related organs, a transformer module is introduced and the learned self-attention matrices modeling pairwise inter-organ dependency are used to build pseudo full labels. By using those pseudo-full labels for regularization in each client, the shared encoder is trained to extract rich and complete organ-related features rather than being biased toward certain organs. Then, each decoder in FedIOD projects the shared organ-related features into a specific space trained by the corresponding partial labels. Experimental results based on five widely-used datasets, including LiTS, KiTS, MSD, BCTV, and ACDC, demonstrate the effectiveness of FedIOD, outperforming the state-of-the-art approaches under in-federation evaluation and achieving the second-best performance under out-of-federation evaluation for multi-organ segmentation from partial labels. The source code is publicly available at https://github.com/vagabond-healer/FedIOD .
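The single-encoder-multi-decoder layout described for FedIOD can be sketched minimally: a shared encoder feeds one decoder per client (or per label subset), so the encoder receives gradients from all partial labels while each decoder specializes in its own organs. The sketch below omits the transformer-based inter-organ dependency and pseudo-full-label construction described in the abstract; the class and method names are illustrative assumptions, not the released code.

```python
import torch
import torch.nn as nn

class SingleEncoderMultiDecoder(nn.Module):
    """Shared encoder with one decoder per partially labeled client/organ set."""

    def __init__(self, encoder: nn.Module, decoders: nn.ModuleDict):
        super().__init__()
        self.encoder = encoder      # shared organ-related feature extractor
        self.decoders = decoders    # e.g. {"liver": ..., "kidney": ..., "pancreas": ...}

    def forward(self, x: torch.Tensor, client: str) -> torch.Tensor:
        feats = self.encoder(x)
        # Only the decoder matching this client's partial labels is used for its
        # supervised loss, while the shared encoder is updated by every client.
        return self.decoders[client](feats)
```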
Article
Full-text available
Precise delineation of multiple organs or abnormal regions in the human body from medical images plays an essential role in computer-aided diagnosis, surgical simulation, image-guided interventions, and especially in radiotherapy treatment planning. Thus, it is of great significance to explore automatic segmentation approaches, among which deep learning-based approaches have evolved rapidly and witnessed remarkable progress in multi-organ segmentation. However, obtaining an appropriately sized and fine-grained annotated dataset of multiple organs is extremely hard and expensive. Such scarce annotation limits the development of high-performance multi-organ segmentation models but promotes many annotation-efficient learning paradigms. Among these, studies on transfer learning leveraging external datasets, semi-supervised learning including unannotated datasets and partially-supervised learning integrating partially-labeled datasets have become the dominant ways to break such dilemmas in multi-organ segmentation. We first review fully supervised methods, then present a comprehensive and systematic elaboration of the 3 abovementioned learning paradigms in the context of multi-organ segmentation from both technical and methodological perspectives, and finally summarize their challenges and future trends.
Article
Deep learning models have demonstrated remarkable success in multi-organ segmentation but typically require large-scale datasets with all organs of interest annotated. However, medical image datasets are often low in sample size and only partially labeled, i.e., only a subset of organs are annotated. Therefore, it is crucial to investigate how to learn a unified model on the available partially labeled datasets to leverage their synergistic potential. In this paper, we systematically investigate the partial-label segmentation problem with theoretical and empirical analyses on the prior techniques. We revisit the problem from a perspective of partial label supervision signals and identify two signals derived from ground truth and one from pseudo labels. We propose a novel two-stage framework termed COSST, which effectively and efficiently integrates comprehensive supervision signals with self-training. Concretely, we first train an initial unified model using two ground truth-based signals and then iteratively incorporate the pseudo label signal to the initial model using self-training. To mitigate performance degradation caused by unreliable pseudo labels, we assess the reliability of pseudo labels via outlier detection in latent space and exclude the most unreliable pseudo labels from each self-training iteration. Extensive experiments are conducted on one public and three private partial-label segmentation tasks over 12 CT datasets. Experimental results show that our proposed COSST achieves significant improvement over the baseline method, i.e., individual networks trained on each partially labeled dataset. Compared to the state-of-the-art partial-label segmentation methods, COSST demonstrates consistent superior performance on various segmentation tasks and with different training data sizes.
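The pseudo-label filtering step in COSST, excluding the most unreliable pseudo labels via outlier detection in latent space, can be approximated by a simple distance-to-centroid rule. The sketch below is an assumed stand-in; the paper's actual detector and threshold may differ.

```python
import numpy as np

def filter_unreliable_pseudo_labels(latent_vecs, drop_fraction=0.1):
    """Rank pseudo-labeled samples by distance to their class centroid in latent
    space and drop the most distant fraction as unreliable outliers.

    latent_vecs: (N, D) array of per-sample latent features for one class.
    Returns a boolean keep-mask of length N.
    """
    centroid = latent_vecs.mean(axis=0)
    dists = np.linalg.norm(latent_vecs - centroid, axis=1)
    cutoff = np.quantile(dists, 1.0 - drop_fraction)
    return dists <= cutoff
```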
Chapter
Developing a generalized segmentation model capable of simultaneously delineating multiple organs and diseases is highly desirable. Federated learning (FL) is a key technology enabling the collaborative development of a model without exchanging training data. However, the limited access to fully annotated training data poses a major challenge to training generalizable models. We propose "ConDistFL", a framework to solve this problem by combining FL with knowledge distillation. With an adequately designed conditional probability representation, local models can extract the knowledge of unlabeled organs and tumors from the global model on partially annotated data. We validate our framework on four distinct partially annotated abdominal CT datasets from the MSD and KiTS19 challenges. The experimental results show that the proposed framework significantly outperforms the FedAvg and FedOpt baselines. Moreover, the performance on an external test dataset demonstrates superior generalizability compared to models trained on each dataset separately. Our ablation study suggests that ConDistFL can perform well without frequent aggregation, reducing the communication cost of FL. Our implementation will be available at https://github.com/NVIDIA/NVFlare/tree/main/research/condist-fl.
Chapter
Partially Supervised Multi-Organ Segmentation (PSMOS) has attracted increasing attention. However, faced with the challenges of insufficient labeled data and cross-site data discrepancy, PSMOS remains largely an unsolved problem. In this paper, to take full advantage of the unlabeled data, we propose to incorporate voxel-to-organ affinity in the embedding space into a consistency learning framework, ensuring consistency in both the label space and the latent feature space. Furthermore, to mitigate the cross-site data discrepancy, we propose to propagate the organ-specific feature centers and inter-organ affinity relationships across different sites, calibrating the multi-site feature distribution from a statistical perspective. Extensive experiments manifest that our method generates favorable results compared with other state-of-the-art methods, especially on hard organs with relatively smaller sizes.
Chapter
The medical imaging community generates a wealth of datasets, many of which are openly accessible and annotated for specific diseases and tasks such as multi-organ or lesion segmentation. Current practices continue to limit model training and supervised pre-training to one or a few similar datasets, neglecting the synergistic potential of other available annotated data. We propose MultiTalent, a method that leverages multiple CT datasets with diverse and conflicting class definitions to train a single model for a comprehensive structure segmentation. Our results demonstrate improved segmentation performance compared to previous related approaches, systematically, also compared to single-dataset training using state-of-the-art methods, especially for lesion segmentation and other challenging structures. We show that MultiTalent also represents a powerful foundation model that offers a superior pre-training for various segmentation tasks compared to commonly used supervised or unsupervised pre-training baselines. Our findings offer a new direction for the medical imaging community to effectively utilize the wealth of available data for improved segmentation performance. The code and model weights will be published here: https://github.com/MIC-DKFZ/MultiTalent.
Article
Medical image benchmarks for the segmentation of organs and tumors suffer from the partially labeling issue due to the intensive cost of labor and expertise. Current mainstream approaches follow the practice of one network solving one task. With this pipeline, not only is the performance limited by the typically small dataset of a single task, but the computation cost also increases linearly with the number of tasks. To address this, we propose a Transformer based dynamic on-demand network (TransDoDNet) that learns to segment organs and tumors on multiple partially labeled datasets. Specifically, TransDoDNet has a hybrid backbone that is composed of the convolutional neural network and Transformer. A dynamic head enables the network to accomplish multiple segmentation tasks flexibly. Unlike existing approaches that fix kernels after training, the kernels in the dynamic head are generated adaptively by the Transformer, which employs the self-attention mechanism to model long-range organ-wise dependencies and decodes the organ embedding that can represent each organ. We create a large-scale partially labeled Multi-Organ and Tumor Segmentation benchmark, termed MOTS, and demonstrate the superior performance of our TransDoDNet over other competitors on seven organ and tumor segmentation tasks. This study also provides a general 3D medical image segmentation model, which has been pre-trained on the large-scale MOTS benchmark and has demonstrated advanced performance over current predominant self-supervised learning methods. Code and data are available at https://github.com/jianpengz/DoDNet .
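A dynamic head whose kernels are generated from a task/organ embedding, in the spirit of DoDNet-style models, can be sketched as a controller that emits per-sample 1x1x1 convolution weights applied via a grouped convolution. The layer sizes and controller design below are assumptions for illustration, not TransDoDNet's exact head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicSegHead(nn.Module):
    """1x1x1 segmentation head whose kernel is generated per task embedding."""

    def __init__(self, feat_channels: int, embed_dim: int, out_channels: int = 2):
        super().__init__()
        self.feat_channels = feat_channels
        self.out_channels = out_channels
        # Controller maps a task/organ embedding to conv weights and biases.
        self.controller = nn.Linear(embed_dim, feat_channels * out_channels + out_channels)

    def forward(self, feats: torch.Tensor, task_embedding: torch.Tensor) -> torch.Tensor:
        # feats: (N, C, D, H, W); task_embedding: (N, E)
        n = feats.size(0)
        params = self.controller(task_embedding)
        w, b = params.split([self.feat_channels * self.out_channels, self.out_channels], dim=1)
        w = w.reshape(n * self.out_channels, self.feat_channels, 1, 1, 1)
        b = b.reshape(n * self.out_channels)
        # Grouped conv applies a different generated kernel to each sample.
        out = F.conv3d(feats.reshape(1, n * self.feat_channels, *feats.shape[2:]),
                       w, bias=b, groups=n)
        return out.reshape(n, self.out_channels, *feats.shape[2:])
```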
Article
Nuclei segmentation is a crucial task for whole slide image analysis in digital pathology. Generally, the segmentation performance of fully-supervised learning heavily depends on the amount and quality of the annotated data. However, it is time-consuming and expensive for professional pathologists to provide accurate pixel-level ground truth, while it is much easier to get coarse labels such as point annotations. In this paper, we propose a weakly-supervised learning method for nuclei segmentation that only requires point annotations for training. First, coarse pixel-level labels are derived from the point annotations based on the Voronoi diagram and the k-means clustering method to avoid overfitting. Second, a co-training strategy with an exponential moving average method is designed to refine the incomplete supervision of the coarse labels. Third, a self-supervised visual representation learning method is tailored for nuclei segmentation of pathology images that transforms the hematoxylin component images into the H&E stained images to gain better understanding of the relationship between the nuclei and cytoplasm. We comprehensively evaluate the proposed method using two public datasets. Both visual and quantitative results demonstrate the superiority of our method to the state-of-the-art methods, and its competitive performance compared to the fully-supervised methods. Codes are available at https://github.com/hust-linyi/SC-Net.
Article
Multi-organ segmentation is a critical prerequisite for many clinical applications. Deep learning-based approaches have recently achieved promising results on this task. However, they heavily rely on massive data with multiple organs annotated, which is labor- and expert-intensive and thus difficult to obtain. In contrast, single-organ datasets are easier to acquire, and many well-annotated ones are publicly available. It leads to the partially labeled issue: How to learn a unified multi-organ segmentation model from several single-organ datasets? Pseudo-label-based methods and conditional information-based methods make up the majority of existing solutions, where the former largely depends on the accuracy of pseudo-labels, and the latter has a limited capacity for task-related features. In this paper, we propose the Conditional Dynamic Attention Network (CDANet). Our approach is designed with two key components: (1) a multisource parameter generator, fusing the conditional and multiscale information to better distinguish among different tasks, and (2) a dynamic attention module, promoting more attention to task-related features. We have conducted extensive experiments on seven partially labeled challenging datasets. The results show that our method achieved competitive results compared with the advanced approaches, with an average Dice score of 75.08%. Additionally, the Hausdorff Distance is 26.31, which is a competitive result.
Article
Collecting sufficient high-quality training data for deep neural networks is often expensive or even unaffordable in medical image segmentation tasks. We thus propose to train the network by using external data that can be collected in a cheaper way, e.g., crowd-sourcing. We show that by data discernment, the network is able to mine valuable knowledge from external data, even though the data distribution is very different from that of the original (internal) data. We discern the external data by learning an importance weight for each of them, with the goal to enhance the contribution of informative external data to network updating, while suppressing the data that are ‘useless’ or even ‘harmful’. An iterative algorithm that alternatively estimates the importance weight and updates the network is developed by formulating the data discernment as a constrained nonlinear programming problem. It estimates the importance weight according to the distribution discrepancy between the external data and the internal dataset, and imposes a constraint to drive the network to learn more effectively, compared with the network without using the external data. We evaluate the proposed algorithm on two tasks: abdominal CT image and cervical smear image segmentation, using totally 6 publicly available datasets. The effectiveness of the algorithm is demonstrated by extensive experiments. Source codes are available at: https://github.com/YouyiSong/Data-Discernment.
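The alternation between importance-weight estimation and network updating can be written schematically. The snippet below uses a softmax over per-sample distribution discrepancies as a simple stand-in for the constrained non-linear program described in the abstract; the function names and the temperature parameter are illustrative assumptions, not the released code.

```python
import torch

def estimate_importance_weights(discrepancies: torch.Tensor, temperature: float = 1.0):
    """Give larger weights to external samples whose feature distribution is
    closer to the internal data (smaller discrepancy). Softmax is a simple
    stand-in for the constrained non-linear program used in the paper."""
    return torch.softmax(-discrepancies / temperature, dim=0)

def weighted_external_loss(losses_external: torch.Tensor, importance_weights: torch.Tensor):
    """Weighted sum of per-sample external losses; weights sum to one."""
    return (importance_weights * losses_external).sum()

# Alternating scheme (schematic):
# 1. with the current network, compute per-sample discrepancy to the internal set
# 2. weights = estimate_importance_weights(discrepancies)
# 3. update the network on internal loss + weighted_external_loss(losses, weights)
# repeat until convergence
```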
Article
Accurate bowel segmentation is essential for diagnosis and treatment of bowel cancers. Unfortunately, segmenting the entire bowel in CT images is quite challenging due to unclear boundary, large shape, size, and appearance variations, as well as diverse filling status within the bowel. In this paper, we present a novel two-stage framework, named BowelNet, to handle the challenging task of bowel segmentation in CT images, with two stages of 1) jointly localizing all types of the bowel, and 2) finely segmenting each type of the bowel. Specifically, in the first stage, we learn a unified localization network from both partially- and fully-labeled CT images to robustly detect all types of the bowel. To better capture unclear bowel boundary and learn complex bowel shapes, in the second stage, we propose to jointly learn semantic information (i.e., bowel segmentation mask) and geometric representations (i.e., bowel boundary and bowel skeleton) for fine bowel segmentation in a multi-task learning scheme. Moreover, we further propose to learn a meta segmentation network via pseudo labels to improve segmentation accuracy. By evaluating on a large abdominal CT dataset, our proposed BowelNet method can achieve Dice scores of 0.764, 0.848, 0.835, 0.774, and 0.824 in segmenting the duodenum, jejunum-ileum, colon, sigmoid, and rectum, respectively. These results demonstrate the effectiveness of our proposed BowelNet framework in segmenting the entire bowel from CT images.
Article
Automatic multi-organ segmentation in medical images is crucial for many clinical applications. State-of-the-art methods have reported promising results but rely on massive annotated data. However, such data is hard to obtain due to the need for considerable expertise. In contrast, obtaining a single-organ dataset is relatively easier, and many well-annotated ones are publicly available. To this end, this work raises the partially supervised problem: can we use these single-organ datasets to learn a multi-organ segmentation model? In this paper, we propose the Partial- and Mutual-Prior incorporated framework (PRIMP) to learn a robust multi-organ segmentation model by deriving knowledge from single-organ datasets. Unlike existing methods that largely ignore the organs’ anatomical prior knowledge, our PRIMP is designed with two key priors shared across different subjects and datasets: (1) partial-prior, each organ has its own character (e.g., size and shape) and (2) mutual-prior, the relative position between different organs follows a comparatively fixed anatomical structure. Specifically, we propose to incorporate the partial-prior of each organ by learning from single-organ statistics, and to inject the mutual-prior of organs by learning from multi-organ statistics. By doing so, the model is encouraged to capture organs’ anatomical invariance across different subjects and datasets, thus guaranteeing the anatomical reasonableness of the predictions, narrowing down the problem of domain gaps, and capturing spatial information among different slices, thereby improving organ segmentation performance. Experiments on four publicly available datasets (LiTS, Pancreas, KiTS, BTCV) show that our PRIMP can improve the performance on both the multi-organ and single-organ datasets (17.40% and 3.06% above the baseline model on DSC, respectively) and can surpass the comparative approaches.
Chapter
Segmentation studies in medical image analysis are always associated with a particular task scenario. However, building datasets to train models to segment multiple types of organs and pathologies is challenging. For example, a dataset annotated for the pancreas and pancreatic tumors will result in a model that cannot segment other organs, like the liver and spleen, visible in the same abdominal computed tomography image. The lack of a well-annotated dataset is one limitation resulting in a lack of universal segmentation models. Federated learning (FL) is ideally suited for addressing this issue in the real-world context. In this work, we show that each medical center can use training data for distinct tasks to collaboratively build more generalizable segmentation models for multiple segmentation tasks without the requirement to centralize datasets in one place. The main challenge of this research is the heterogeneity of training data from various institutions and segmentation tasks. In this paper, we propose a multi-task segmentation framework using FL to learn segmentation models using several independent datasets with different annotations of organs or tumors. We include experiments on four publicly available single-task datasets, including MSD liver (w/ tumor), MSD spleen, MSD pancreas (w/ tumor), and KITS19. Experimental results on an external validation set highlight the advantages of employing FL in multi-task organ and tumor segmentation. Keywords: Federated learning, Segmentation, Partial labels
Article
Image segmentation is a fundamental building block of automatic medical applications. It has been greatly improved since the emergence of deep neural networks. However, deep-learning-based models often require a large amount of manual annotations, which has seriously hindered their practical usage. To alleviate this problem, numerous works were proposed by utilizing unlabeled data based on semi-supervised frameworks. Recently, the Mean-Teacher (MT) model has been successfully applied in many scenarios due to its effective learning strategy. Nevertheless, the existing MT model still has certain limitations. Firstly, to gain extra generalization ability through consistency training, various sorts of perturbations are often added to the training data. However, if the variation is too weak, it may cause the Lazy Student Phenomenon, and cause fluctuations in the learning model. On the contrary, large image perturbations may enlarge the performance gap between the teacher and student. In this case, the student may lose its learning momentum, and more seriously, drag down the overall performance of the whole system. In order to address these issues, we introduce a novel semi-supervised medical image segmentation framework, in which a Cross-Mix Teaching paradigm is proposed to provide extra data flexibility, thus effectively avoid Lazy Student Phenomenon. Moreover, a lightweight Transductive Monitor is applied to serve as the bridge that connects the teacher and student for active knowledge distillation. In the light of this cross-network information mixing and transfer mechanism, our method is able to continuously explore the discriminative information contained in unlabeled data. Extensive experiments on challenging medical image data sets demonstrate that our method outperforms current state-of-the-art semi-supervised segmentation methods under severe lack of supervision.
Article
The spleen is one of the most commonly injured solid organs in blunt abdominal trauma. The development of automatic segmentation systems from multi-phase CT for splenic vascular injury can augment severity grading for improving clinical decision support and outcome prediction. However, accurate segmentation of splenic vascular injury is challenging for the following reasons: 1) Splenic vascular injury can be highly variant in shape, texture, size, and overall appearance; and 2) Data acquisition is a complex and expensive procedure that requires intensive efforts from both data scientists and radiologists, which makes large-scale well-annotated datasets hard to acquire in general. In light of these challenges, we hereby design a novel framework for multi-phase splenic vascular injury segmentation, especially with limited data. On the one hand, we propose to leverage external data to mine pseudo splenic masks as the spatial attention, dubbed external attention, for guiding the segmentation of splenic vascular injury. On the other hand, we develop a synthetic phase augmentation module, which builds upon generative adversarial networks, for populating the internal data by fully leveraging the relation between different phases. By jointly enforcing external attention and populating internal data representation during training, our proposed method outperforms other competing methods and substantially improves the popular DeepLab-v3+ baseline by more than 7% in terms of average DSC, which confirms its effectiveness.
Article
Image segmentation is widely used in the medical field. Convolutional neural networks have become more diverse and effective in recent years. However, at present, most networks are designed for a single dataset (i.e., a single organ or target). The designed network is only suitable for a single dataset, and its accuracy varies greatly across datasets (especially small-size image datasets). In response to this problem, a collaborative network can be designed to simultaneously extract the specific and common features of multiple datasets (i.e., multiple organs or targets). The network can be used for multi-dataset segmentation and helps to balance the segmentation performance of different datasets, especially to improve the accuracy on small-size image datasets. By exploring adapters modified by convolution kernels, an adaptive weight update strategy and a branched network structure, the paper proposes a multi-dataset collaborative image segmentation network, called Md-Unet, which integrates a shared-specific adapter (SSA), an asymmetric similarity loss function with the proposed adaptive weight update strategy, and a dual-branch design. Experimental results showed that, compared with the baseline 3D U²Net, accuracy improved by 3.7% when using the SSA, by 0.64%-30.63% when using several loss functions with the proposed adaptive weight update strategy, and by 17.47% when using the dual-branch integrated architecture. Moreover, Md-Unet showed a significant improvement on small-size image datasets compared with single-dataset models.
Chapter
Due to the inter-observer variation, the ground truth of lesion areas in pathological images is generated by majority-voting of annotations provided by different pathologists. Such a process is extremely laborious, since each pathologist needs to spend hours or even days for pixel-wise annotations. In this paper, we propose a reinforcement learning framework to automatically refine the set of annotations provided by a single pathologist based on several exemplars of ground truth. Particularly, we treat each pixel as an agent with a shared pixel-level action space. The multi-agent model observes several paired single-pathologist annotations and ground truth, and tries to customize the strategy to narrow down the gap between them through episodes of exploration. Furthermore, we integrate a discriminator into the multi-agent framework to evaluate the quality of annotation refinement. A quality reward is yielded by the discriminator to update the policy of the agents. Experimental results on the publicly available Gleason 2019 dataset demonstrate the effectiveness of our reinforcement learning framework: the segmentation network trained with our refined single-pathologist annotations achieves a comparable accuracy to the one using majority-voting-based ground truth.
Chapter
Learning from external data is an effective and efficient way of training deep networks, which can substantially alleviate the burden of collecting training data and annotations. It is of great significance in improving the performance of CT image segmentation tasks, where collecting a large amount of voxel-wise annotations is expensive or even impractical. In this paper, we propose a generic selective learning method to maximize the performance gains of harnessing external data in CT image segmentation. The key idea is to learn a weight for each external data sample such that ‘good’ data can have large weights and thus contribute more to the training loss, thereby implicitly encouraging the network to mine more valuable knowledge from informative external data while suppressing memorization of irrelevant patterns from ‘useless’ or even ‘harmful’ data. Particularly, we formulate our idea as a constrained non-linear programming problem, solved by an iterative solution that alternatively conducts weight estimation and network updating. Extensive experiments on abdominal multi-organ CT segmentation datasets show the efficacy and performance gains of our method against existing methods. The code is publicly available (Released at https://github.com/YouyiSong/Codes-for-Selective-Learning).
Article
Full-text available
Accurate segmentation of breast masses is an essential step in computer-aided diagnosis of breast cancer. The scarcity of annotated training data greatly hinders the model’s generalization ability, especially for deep-learning-based methods. However, high-quality image-level annotations are time-consuming and cumbersome in medical image analysis scenarios. In addition, a large amount of weak annotations, which comprise common anatomical features, remains under-utilized. To this end, inspired by teacher-student networks, we propose an Anatomy-Aware Weakly-Supervised learning Network (AAWS-Net) for extracting useful information from mammograms with weak annotations for efficient and accurate breast mass segmentation. Specifically, we adopt a weakly-supervised learning strategy in the Teacher to extract anatomy structure from mammograms with weak annotations by reconstructing the original image. Besides, knowledge distillation is used to suggest morphological differences between benign and malignant masses. Moreover, the prior knowledge learned from the Teacher is introduced to the Student in an end-to-end way, which improves the ability of the student network to locate and segment masses. Experiments on CBIS-DDSM have shown that our method yields promising performance compared with state-of-the-art alternative models for breast mass segmentation in terms of segmentation accuracy and IoU.
Chapter
Coherent and systematic analysis for finding complex patterns in structured and unstructured cancer data has seen a rich and diverse set of techniques in the recent past. The delicate and life-threatening nature of cancer has created a strong need for, and wide interest in, optimized techniques that yield reliable prediction of cancer subtypes. As a result, several data analysis techniques have been proposed, among which many have shown remarkable results. In this chapter, the focus is put directly on such techniques as implemented and adapted for cancer data analysis. The chapter provides an in-depth review of each of the carefully selected architectures and develops a concrete sense of each of these taxonomies by examining them against various evaluation metrics. Furthermore, the chapter offers several future scopes and recommendations from the authors’ perspective to motivate those interested in pushing this field further.
Article
Full-text available
Model distillation is an effective and widely used technique to transfer knowledge from a teacher to a student network. The typical application is to transfer from a powerful large network or ensemble to a small network that is better suited to low-memory or fast execution requirements. In this paper, we present a deep mutual learning (DML) strategy where, rather than one-way transfer between a static pre-defined teacher and a student, an ensemble of students learns collaboratively and teaches each other throughout the training process. Our experiments show that a variety of network architectures benefit from mutual learning and achieve compelling results on CIFAR-100 recognition and Market-1501 person re-identification benchmarks. Surprisingly, it is revealed that no prior powerful teacher network is necessary: mutual learning of a collection of simple student networks works, and moreover outperforms distillation from a more powerful yet static teacher.
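The core of deep mutual learning is that each student minimizes its supervised loss plus a KL term toward the other student's predictive distribution. A minimal two-student PyTorch sketch is given below; detaching the peer's logits is one common way to realize the alternating updates, and the exact weighting in the original formulation may differ.

```python
import torch
import torch.nn.functional as F

def mutual_learning_losses(logits_1, logits_2, targets):
    """Deep mutual learning for two students: each gets cross-entropy plus a KL
    term pulling it toward the other's (detached) predictive distribution."""
    ce_1 = F.cross_entropy(logits_1, targets)
    ce_2 = F.cross_entropy(logits_2, targets)
    kl_1 = F.kl_div(F.log_softmax(logits_1, dim=1),
                    F.softmax(logits_2.detach(), dim=1), reduction="batchmean")
    kl_2 = F.kl_div(F.log_softmax(logits_2, dim=1),
                    F.softmax(logits_1.detach(), dim=1), reduction="batchmean")
    return ce_1 + kl_1, ce_2 + kl_2
```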
Conference Paper
Segmenting the pancreas from abdominal CT scans is an important prerequisite for pancreatic cancer diagnosis and precise treatment planning. However, automated pancreas segmentation faces challenges posed by shape and size variance, low contrast with adjacent tissues and, in particular, the negligibly small proportion of the pancreas within the whole abdominal volume. Current coarse-to-fine frameworks, which either use tri-planar schemes or stack 2D pre-segmentations as priors for 3D networks, are limited in how effectively they capture 3D information. While iterative updates of the region of interest (ROI) in the refinement stage alleviate accumulated errors from coarse segmentation, they introduce extra computational burden. In this paper, we harness 2D networks and 3D features to improve segmentation accuracy and efficiency. First, in the 3D coarse segmentation network, a new bias-dice loss function is defined to increase ROI recall rates, improving efficiency by avoiding iterative ROI refinements. Second, for full utilization of 3D information, a dimension adaptation module (DAM) is introduced to bridge 2D networks and 3D information. Finally, a fusion decision module and a parallel training strategy are proposed to fuse multi-source feature cues extracted from three sub-networks into the final predictions. The proposed method is evaluated on the NIH dataset and outperforms the compared state-of-the-art methods, with a mean Dice-Sørensen coefficient (DSC) of 85.22% and an average of 0.4 min per instance.
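The abstract does not give the exact form of the bias-dice loss; the following is a hypothetical recall-biased Dice, shown only to illustrate the idea of weighting missed ROI voxels more heavily so that the coarse stage keeps recall high:

```python
# Hypothetical recall-biased Dice loss; NOT the paper's exact bias-dice formula.
import torch

def recall_biased_dice_loss(pred, target, beta=2.0, eps=1e-6):
    """Dice-style loss on soft predictions in [0, 1] that penalizes false
    negatives (missed ROI voxels) more than false positives."""
    tp = (pred * target).sum()
    fp = (pred * (1 - target)).sum()
    fn = ((1 - pred) * target).sum()
    dice = (2 * tp + eps) / (2 * tp + fp + beta * fn + eps)
    return 1 - dice
```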
Chapter
Convolutional neural networks (CNNs) have achieved great success in many computer vision problems. Unlike existing works that design CNN architectures to improve performance on a single task in a single domain and do not generalize, we present IBN-Net, a novel convolutional architecture that remarkably enhances a CNN's modeling ability on one domain (e.g. Cityscapes) as well as its generalization capacity on another domain (e.g. GTA5) without finetuning. IBN-Net carefully integrates Instance Normalization (IN) and Batch Normalization (BN) as building blocks and can be wrapped into many advanced deep networks to improve their performance. This work has three key contributions. (1) By delving into IN and BN, we show that IN learns features that are invariant to appearance changes, such as colors, styles, and virtuality/reality, while BN is essential for preserving content-related information. (2) IBN-Net can be applied to many advanced deep architectures, such as DenseNet, ResNet, ResNeXt, and SENet, and consistently improves their performance without increasing computational cost. (3) When applying the trained networks to new domains, e.g. from GTA5 to Cityscapes, IBN-Net achieves improvements comparable to domain adaptation methods, even without using data from the target domain. With IBN-Net, we won 1st place on the WAD 2018 Challenge Drivable Area track with an mIoU of 86.18%.
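A sketch of an IBN building block in the spirit described above, assuming the commonly used variant that routes half of the channels through IN and the rest through BN (the 50/50 split is an assumption for illustration):

```python
# Sketch of an IBN layer: IN for appearance-invariant features, BN for content.
import torch
import torch.nn as nn

class IBN(nn.Module):
    def __init__(self, planes):
        super().__init__()
        self.half = planes // 2
        self.IN = nn.InstanceNorm2d(self.half, affine=True)
        self.BN = nn.BatchNorm2d(planes - self.half)

    def forward(self, x):
        # Split the feature channels, normalize each group differently, re-join.
        a, b = torch.split(x, [self.half, x.size(1) - self.half], dim=1)
        return torch.cat([self.IN(a), self.BN(b)], dim=1)
```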
Article
Automatic segmentation of abdominal anatomy on computed tomography (CT) images can support diagnosis, treatment planning and treatment delivery workflows. Segmentation methods using statistical models and multi-atlas label fusion (MALF) require inter-subject image registrations, which are challenging for abdominal images, but alternative methods without registration have not yet achieved higher accuracy for most abdominal organs. We present a registration-free deep-learning-based segmentation algorithm for eight organs that are relevant for navigation in endoscopic pancreatic and biliary procedures, including the pancreas, the GI tract (esophagus, stomach, duodenum) and surrounding organs (liver, spleen, left kidney, gallbladder). We directly compared the segmentation accuracy of the proposed method to existing deep learning and MALF methods in a cross-validation on a multi-centre data set with 90 subjects. The proposed method yielded significantly higher Dice scores for all organs and lower mean absolute distances for most organs, including Dice scores of 0.78 vs. 0.71, 0.74 and 0.74 for the pancreas, 0.90 vs. 0.85, 0.87 and 0.83 for the stomach and 0.76 vs. 0.68, 0.69 and 0.66 for the esophagus. We conclude that deep-learning-based segmentation represents a registration-free method for multi-organ abdominal CT segmentation whose accuracy can surpass current methods, potentially supporting image-guided navigation in gastrointestinal endoscopy procedures.
Article
Spatial pyramid pooling modules and encoder-decoder structures are used in deep neural networks for semantic segmentation tasks. The former encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter capture sharper object boundaries by gradually recovering the spatial information. In this work, we propose to combine the advantages of both methods. Specifically, our proposed model, DeepLabv3+, extends DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results, especially along object boundaries. We further explore the Xception model and apply depthwise separable convolutions to both the Atrous Spatial Pyramid Pooling and decoder modules, resulting in a faster and stronger encoder-decoder network. We demonstrate the effectiveness of the proposed model on the PASCAL VOC 2012 semantic image segmentation dataset and achieve a performance of 89% on the test set without any post-processing. Our paper is accompanied by a publicly available reference implementation of the proposed models in TensorFlow.
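A sketch of the depthwise separable (optionally atrous) convolution that DeepLabv3+ applies in its ASPP and decoder modules; the channel sizes and dilation rate here are illustrative:

```python
# Sketch of a depthwise separable convolution with optional atrous (dilated) filtering.
import torch.nn as nn

class SeparableConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, dilation=1):
        super().__init__()
        # Depthwise: one 3x3 (possibly atrous) filter per input channel.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=dilation,
                                   dilation=dilation, groups=in_ch, bias=False)
        # Pointwise: 1x1 convolution mixes information across channels.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))
```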
Article
Segmentation of key brain tissues from 3D medical images is of great significance for brain disease diagnosis, progression assessment and monitoring of neurologic conditions. While manual segmentation is time-consuming, laborious, and subjective, automated segmentation is quite challenging due to the complicated anatomical environment of the brain and the large variations of brain tissues. We propose a novel voxelwise residual network (VoxResNet) with a set of effective training schemes to cope with this challenging problem. The main merit of residual learning is that it can alleviate the degradation problem when training a deep network, so that the performance gains achieved by increasing the network depth can be fully leveraged. With this technique, our VoxResNet is built with 25 layers and hence can generate more representative features to deal with the large variations of brain tissues than rival methods that use hand-crafted features or shallower networks. In order to effectively train such a deep network with limited training data for brain segmentation, we seamlessly integrate multi-modality and multi-level contextual information into our network, so that the complementary information of different modalities can be harnessed and features of different scales can be exploited. Furthermore, an auto-context version of VoxResNet is proposed by combining low-level image appearance features, implicit shape information, and high-level context for further improving the segmentation performance. Extensive experiments on the well-known benchmark (i.e., MRBrainS) of brain segmentation from 3D magnetic resonance (MR) images corroborated the efficacy of the proposed VoxResNet. Our method achieved first place in the challenge out of 37 competitors, including several state-of-the-art brain segmentation methods. Our method is inherently general and can be readily applied as a powerful tool to many brain-related studies, where accurate segmentation of brain structures is critical.
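A sketch of a voxelwise residual block of the kind VoxResNet stacks; the pre-activation BN-ReLU-Conv ordering is a common convention assumed here, not necessarily the original layer layout:

```python
# Sketch of a 3D residual block with an identity shortcut.
import torch.nn as nn

class VoxRes(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm3d(channels), nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels), nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        # The identity shortcut lets gradients bypass the convolutions,
        # easing optimization of deep 3D networks.
        return x + self.body(x)
```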
Article
The recently proposed temporal ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, temporal ensembling becomes unwieldy when using large datasets. To overcome this problem, we propose a method that averages model weights instead of label predictions. As an additional benefit, the method improves test accuracy and enables training with fewer labels than earlier methods. We report state-of-the-art results on semi-supervised SVHN, reducing the error rate from 5.12% to 4.41% with 500 labels, and achieving 5.39% error rate with 250 labels. By using extra unlabeled data, we reduce the error rate to 2.76% on 500-label SVHN.
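The weight-averaging step at the core of this method can be written in a few lines; a minimal sketch, assuming an EMA decay of 0.999 (the decay value is illustrative):

```python
# After each optimizer step on the student, the teacher's parameters are
# updated as an exponential moving average (EMA) of the student's.
import torch

@torch.no_grad()
def update_ema(teacher, student, decay=0.999):
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
```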
Conference Paper
It is widely agreed that successful training of deep networks requires many thousands of annotated training samples. In this paper, we present a network and training strategy that relies on strong data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC), we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast: segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.
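A minimal two-level sketch of this contracting/expanding architecture with a skip connection, using illustrative channel widths rather than those of the original network:

```python
# Tiny U-Net-style sketch: encoder for context, decoder for localization,
# and a skip connection joining the two levels.
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, classes=2):
        super().__init__()
        self.enc = double_conv(in_ch, 32)
        self.down = nn.MaxPool2d(2)
        self.bottleneck = double_conv(32, 64)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = double_conv(64, 32)   # 32 (skip) + 32 (upsampled) channels in
        self.head = nn.Conv2d(32, classes, 1)

    def forward(self, x):
        e = self.enc(x)
        b = self.bottleneck(self.down(e))
        # Concatenate the encoder features (skip connection) with the upsampled ones.
        d = self.dec(torch.cat([e, self.up(b)], dim=1))
        return self.head(d)
```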
Simpson, A.L., et al.: A large annotated medical image dataset for the development and evaluation of segmentation algorithms
Pan, X., Luo, P., Shi, J., Tang, X.: Two at once: enhancing learning and generalization capacities via IBN-net
Ge, Y., Chen, D., Li, H.: Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification
Heller, N., et al.: The KiTS19 challenge data: 300 kidney tumor cases with clinical context, CT semantic segmentations, and surgical outcomes