Article

SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

... Medical image segmentation has been the focus of numerous medical imaging challenges, including notable ones such as the Head and Neck Organ-at-Risk CT & MR Segmentation Challenge (HaN-Seg) (Podobnik et al., 2024), the Fast and Low-Resource Semi-Supervised Abdominal Organ Segmentation in CT (FLARE 2022) (FLARE, 2022), the Segmentation of Organs-at-Risk and Gross Tumor Volume of NPC for Radiotherapy Planning (SegRap2023) (Luo et al., 2023), and the KiTS21 Challenge (Heller et al., 2023). Despite their contributions, none addressed the segmentation of the aorta. ...
Preprint
Full-text available
Multi-class segmentation of the aorta in computed tomography angiography (CTA) scans is essential for diagnosing and planning complex endovascular treatments for patients with aortic dissections. However, existing methods reduce aortic segmentation to a binary problem, limiting their ability to measure diameters across different branches and zones. Furthermore, no open-source dataset is currently available to support the development of multi-class aortic segmentation methods. To address this gap, we organized the AortaSeg24 MICCAI Challenge, introducing the first dataset of 100 CTA volumes annotated for 23 clinically relevant aortic branches and zones. This dataset was designed to facilitate both model development and validation. The challenge attracted 121 teams worldwide, with participants leveraging state-of-the-art frameworks such as nnU-Net and exploring novel techniques, including cascaded models, data augmentation strategies, and custom loss functions. We evaluated the submitted algorithms using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD), highlighting the approaches adopted by the top five performing teams. This paper presents the challenge design, dataset details, evaluation metrics, and an in-depth analysis of the top-performing algorithms. The annotated dataset, evaluation code, and implementations of the leading methods are publicly available to support further research. All resources can be accessed at https://aortaseg24.grand-challenge.org.
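The submitted algorithms above are scored with the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD). Below is a minimal Python sketch of both metrics for binary masks using SciPy distance transforms; the tolerance value and the simple boundary extraction are assumptions, not the official challenge evaluation code.

import numpy as np
from scipy import ndimage

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice Similarity Coefficient for two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def surface(mask: np.ndarray) -> np.ndarray:
    """Boundary voxels of a binary mask (mask minus its erosion)."""
    return mask & ~ndimage.binary_erosion(mask)

def nsd(pred: np.ndarray, gt: np.ndarray, spacing, tol_mm: float = 2.0) -> float:
    """Normalized Surface Distance: fraction of boundary voxels of each mask
    lying within tol_mm of the other mask's boundary."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    s_pred, s_gt = surface(pred), surface(gt)
    # Distance (in mm) from every voxel to the nearest boundary voxel.
    d_to_gt = ndimage.distance_transform_edt(~s_gt, sampling=spacing)
    d_to_pred = ndimage.distance_transform_edt(~s_pred, sampling=spacing)
    overlap = (d_to_gt[s_pred] <= tol_mm).sum() + (d_to_pred[s_gt] <= tol_mm).sum()
    total = s_pred.sum() + s_gt.sum()
    return overlap / total if total > 0 else 1.0

# Small synthetic example with anisotropic voxel spacing.
pred = np.zeros((8, 32, 32), dtype=bool); pred[2:6, 8:20, 8:20] = True
gt = np.zeros_like(pred); gt[2:6, 9:21, 9:21] = True
print(dice(pred, gt), nsd(pred, gt, spacing=(3.0, 1.0, 1.0)))

In practice, challenge organizers rely on dedicated evaluation packages, so this sketch only illustrates what the two reported numbers measure.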
... Deep learning (DL), a subset of AI, has shown remarkable success in medical image segmentation, particularly in challenging domains like HNC. Various public challenges, such as the HECKTOR [3] and SegRap [4] challenges, have driven advancements in this field by providing datasets and benchmarks for AI model development. However, no large-scale, publicly available datasets for MRI-guided RT in HNC exist, highlighting the need for community-driven efforts to develop AI tools for clinical translation. ...
Preprint
Radiation therapy (RT) is essential in treating head and neck cancer (HNC), with magnetic resonance imaging (MRI)-guided RT offering superior soft tissue contrast and functional imaging. However, manual tumor segmentation is time-consuming and complex, and therefore remains a challenge. In this study, we present our solution as team TUMOR to the HNTS-MRG24 MICCAI Challenge, which is focused on automated segmentation of primary gross tumor volumes (GTVp) and metastatic lymph node gross tumor volumes (GTVn) in pre-RT and mid-RT MRI images. We utilized the HNTS-MRG2024 dataset, which consists of 150 MRI scans from patients diagnosed with HNC, including original and registered pre-RT and mid-RT T2-weighted images with corresponding segmentation masks for GTVp and GTVn. We employed two state-of-the-art deep learning models, nnUNet and MedNeXt. For Task 1, we pretrained models on pre-RT registered and mid-RT images, followed by fine-tuning on original pre-RT images. For Task 2, we combined registered pre-RT images, registered pre-RT segmentation masks, and mid-RT data as a multi-channel input for training. Our solution for Task 1 achieved 1st place in the final test phase with an aggregated Dice Similarity Coefficient of 0.8254, and our solution for Task 2 ranked 8th with a score of 0.7005. The proposed solution is publicly available in a GitHub repository.
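For Task 2, the abstract describes combining the registered pre-RT image, its registered segmentation mask, and the mid-RT image as a multi-channel network input. Below is a minimal sketch of that stacking with SimpleITK and NumPy; the file names and channel order are illustrative assumptions, not the team's actual pipeline.

import numpy as np
import SimpleITK as sitk

def load(path: str) -> np.ndarray:
    """Read a NIfTI volume as a float32 NumPy array in (z, y, x) order."""
    return sitk.GetArrayFromImage(sitk.ReadImage(path)).astype(np.float32)

# Hypothetical file names for one patient.
mid_rt     = load("patient_001_midRT_T2.nii.gz")
pre_rt_reg = load("patient_001_preRT_T2_registered.nii.gz")
pre_mask   = load("patient_001_preRT_mask_registered.nii.gz")

# Stack into a (channels, z, y, x) tensor used as multi-channel network input.
x = np.stack([mid_rt, pre_rt_reg, pre_mask], axis=0)
print(x.shape)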
... Unlike MedSAM, it supports points, boxes, and mask prompts, allowing for refinement (Cheng et al., 2023). [Dataset table excerpt: ... CT, lung lesions, 58; D5 LNQ (Dorent et al., 2024), CT, mediastinal lymph nodes, 513; D6 LiverMets (Simpson et al., 2023), CT, liver metastases, 171; D7 Adrenal ACC (Moawad et al., 2023), CT, adrenal tumors, 53; D8 HCC TACE (Moawad et al., 2021), CT, liver and liver tumors, 65; D9 Pengwin, CT, bone fragments, 100; D10 SegRap (Luo et al., 2023), CT, 45 organs at risk, 30.] SAM-Med 3D incorporates a transformer-based 3D image encoder, 3D prompt encoder, and 3D mask decoder. It was trained from scratch using 22,000 3D images and 143,000 corresponding 3D masks, supports point and mask prompts, and also allows for refinement. ...
Preprint
Full-text available
Current interactive segmentation approaches, inspired by the success of META's Segment Anything model, have achieved notable advancements; however, they come with substantial limitations that hinder their practical application in real clinical scenarios. These include unrealistic human interaction requirements, such as slice-by-slice operations for 2D models on 3D data, a lack of iterative refinement, and insufficient evaluation experiments. These shortcomings prevent accurate assessment of model performance and lead to inconsistent outcomes across studies. IntRaBench overcomes these challenges by offering a comprehensive and reproducible framework for evaluating interactive segmentation methods in realistic, clinically relevant scenarios. It includes diverse datasets, target structures, and segmentation models, and provides a flexible codebase that allows seamless integration of new models and prompting strategies. Additionally, we introduce advanced techniques to minimize clinician interaction, ensuring fair comparisons between 2D and 3D models. By open-sourcing IntRaBench, we invite the research community to integrate their models and prompting techniques, ensuring continuous and transparent evaluation of interactive segmentation models in 3D medical imaging.
... SegRap2023 dataset [24]: The SegRap2023 dataset provides CT images of head and neck organs at risk, together with accurate segmentation annotations. SegRap2023 contains 200 cases, comprising 120 training cases, 20 validation cases, and 60 test cases. ...
Article
Full-text available
Accurate segmentation of organs at risk (OARs) is a crucial step in the precise planning of radiotherapy for head and neck tumors. However, manual segmentation methods using CT images, which are still predominantly applied in clinical settings, are inefficient and expensive. Additionally, existing segmentation methods struggle with small organs and have difficulty managing the complex interdependencies between organs. To address these issues, this study proposed an OAR-UNet segmentation method based on a U-shaped architecture with two key designs. To tackle the challenge of segmenting small organs, a Local Feature Perception Module (LFPM) is developed to enhance the sensitivity of the method to subtle structures. Furthermore, a Cross-shaped Transformer Block (CSTB) with a cross-shaped attention mechanism is introduced to improve the ability of the model to capture and process long-distance dependency information. To accelerate the convergence of the Transformer, we designed a Local Encoding Module (LEM) based on depthwise separable convolutions. In our experimental evaluation, we utilized two publicly available datasets, SegRap2023 and PDDCA, achieving Dice coefficients of 78.22% and 89.42%, respectively. These results demonstrate that our method outperforms both previous classic methods and state-of-the-art (SOTA) methods.
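The Local Encoding Module described above is based on depthwise separable convolutions. Below is a minimal PyTorch sketch of such a block; the exact layer arrangement inside OAR-UNet is an assumption.

import torch
import torch.nn as nn

class DepthwiseSeparableConv2d(nn.Module):
    """Depthwise conv (per-channel spatial filtering) followed by a
    pointwise 1x1 conv that mixes channels."""
    def __init__(self, in_ch: int, out_ch: int, kernel_size: int = 3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.norm = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(self.pointwise(self.depthwise(x))))

x = torch.randn(1, 32, 64, 64)
print(DepthwiseSeparableConv2d(32, 64)(x).shape)  # torch.Size([1, 64, 64, 64])

Compared with a standard convolution, the depthwise/pointwise split uses far fewer parameters, which is why it is commonly chosen to speed up convergence of Transformer-style blocks.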
... IMRT has demonstrated noticeable advancements in enhancing the 5-year locoregional control rate and reducing radiation-associated toxicities among NPC patients [4]. Accurate delineation of the GTV based on Magnetic Resonance Imaging (MRI) is critical in radiation therapy, particularly in IMRT for NPC [5]-[7]. Our focus is on MRI-based automatic GTV segmentation for its detailed soft tissue characterization, and we acknowledge the indispensable role of CT in radiation planning [8], [9], which complements the MRI. ...
Article
Full-text available
Nasopharyngeal carcinoma (NPC) is a prevalent and clinically significant malignancy that predominantly impacts the head and neck area. Precise delineation of the Gross Tumor Volume (GTV) plays a pivotal role in ensuring effective radiotherapy for NPC. Although recent methods have achieved promising results on GTV segmentation, they are still limited by the lack of carefully annotated data and by data from multiple hospitals that are hard to access in clinical practice. Although some unsupervised domain adaptation (UDA) methods have been proposed to alleviate this problem, unconditionally mapping the distribution distorts the underlying structural information, leading to inferior performance. To address this challenge, we devise a novel Source-Free Active Domain Adaptation framework to facilitate domain adaptation for the GTV segmentation task. Specifically, we design a dual reference strategy to select domain-invariant and domain-specific representative samples from a specific target domain for annotation and model fine-tuning without relying on source-domain data. Our approach not only ensures data privacy but also reduces the workload for oncologists, as it only requires annotating a few representative samples from the target domain and does not need access to the source data. We collect a large-scale clinical dataset comprising 1057 NPC patients from five hospitals to validate our approach. Experimental results show that our method outperforms previous active learning (e.g., AADA and MHPL) and UDA (e.g., Tent and CPR) methods, and achieves comparable results to the fully supervised upper bound, even with few annotations, highlighting the significant medical utility of our approach. In addition, since there is no public dataset for multi-center NPC segmentation, we will release our code and dataset for future research (Git).
Article
Full-text available
To develop a deep learning model using transfer learning for automatic detection and segmentation of neck lymph nodes (LNs) in computed tomography (CT) images, the study included 11,013 annotated LNs with a short-axis diameter ≥ 3 mm from 626 head and neck cancer patients across four hospitals. The nnUNet model was used as a baseline, pre-trained on a large-scale head and neck dataset, and then fine-tuned with 4,729 LNs from hospital A for detection and segmentation. Validation was conducted on an internal testing cohort (ITC A) and three external testing cohorts (ETCs B, C, and D), with 1684 and 4600 LNs, respectively. Detection was evaluated via sensitivity, positive predictive value (PPV), and false positive rate per case (FP/vol), while segmentation was assessed using the Dice similarity coefficient (DSC) and Hausdorff distance (HD95). For detection, the sensitivity, PPV, and FP/vol in ITC A were 54.6%, 69.0%, and 3.4, respectively. In ETCs, the sensitivity ranged from 45.7% at 3.9 FP/vol to 63.5% at 5.8 FP/vol. Segmentation achieved a mean DSC of 0.72 in ITC A and 0.72 to 0.74 in ETCs, as well as a mean HD95 of 3.78 mm in ITC A and 2.73 mm to 2.85 mm in ETCs. No significant sensitivity difference was found between contrast-enhanced and unenhanced CT images (p = 0.502) or repeated CT images (p = 0.815) during adaptive radiotherapy. The model’s segmentation accuracy was comparable to that of experienced oncologists. The model shows promise in automatically detecting and segmenting neck LNs in CT images, potentially reducing oncologists’ segmentation workload.
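Detection in this study is reported as sensitivity, positive predictive value (PPV), and false positives per case (FP/vol). Below is a minimal sketch of these three quantities from matched counts; the matching of predicted to reference lymph nodes is assumed to have been done beforehand, and the example counts are arbitrary.

def detection_metrics(tp: int, fp: int, fn: int, n_cases: int):
    """Sensitivity, positive predictive value, and false positives per case."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    fp_per_case = fp / n_cases
    return sensitivity, ppv, fp_per_case

# Arbitrary illustrative counts, not taken from the study.
print(detection_metrics(tp=900, fp=400, fn=750, n_cases=120))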
Preprint
Full-text available
Computed Tomography (CT) is one of the most popular modalities for medical imaging. By far, CT images have contributed to the largest publicly available datasets for volumetric medical segmentation tasks, covering full-body anatomical structures. Large amounts of full-body CT images provide the opportunity to pre-train powerful models, e.g., STU-Net pre-trained in a supervised fashion, to segment numerous anatomical structures. However, it remains unclear under which conditions these pre-trained models can be transferred to various downstream medical segmentation tasks, particularly when segmenting other modalities and diverse targets. To address this problem, a large-scale benchmark for comprehensive evaluation is crucial for finding these conditions. Thus, we collected 87 public datasets varying in modality, target, and sample size to evaluate the transfer ability of full-body CT pre-trained models. We then employed a representative model, STU-Net with multiple model scales, to conduct transfer learning across modalities and targets. Our experimental results show that (1) there may be a bottleneck effect concerning the dataset size in fine-tuning, with more improvement on small- and large-scale datasets than on medium-sized ones; (2) models pre-trained on full-body CT demonstrate effective modality transfer, adapting well to other modalities such as MRI; and (3) pre-training on full-body CT not only supports strong performance in structure detection but also shows efficacy in lesion detection, showcasing adaptability across target tasks. We hope that this large-scale open evaluation of transfer learning can direct future research in volumetric medical image segmentation.
Article
Full-text available
The deep learning (DL)-based prediction of accurate lymph node (LN) clinical target volumes (CTVs) for nasopharyngeal carcinoma (NPC) radiotherapy (RT) remains challenging. One of the main reasons is the variability of contours, despite standardization processes based on expert guidelines, in combination with scarce data sharing in the community. Therefore, we retrospectively generated a 262-subject dataset from four centers to develop DL models for LN CTV delineation. This dataset included 440 computed tomography images from different scanning phases, disease stages and treatment strategies. Three clinical expert boards, each comprising two experts (six experts in total), manually delineated six basic LN CTVs on separate cohorts as the ground truth according to LN involvement and clinical requirements. Several state-of-the-art segmentation algorithms were evaluated on this benchmark, showing promising results for LN CTV segmentation. In conclusion, this work built a multicenter LN CTV segmentation dataset, which may be the first dataset for the development and evaluation of automatic LN CTV delineation, serving as a benchmark for future research.
Article
Full-text available
Purpose For cancer in the head and neck (HaN), radiotherapy (RT) represents an important treatment modality. Segmentation of organs-at-risk (OARs) is the starting point of RT planning; however, existing approaches are focused on either computed tomography (CT) or magnetic resonance (MR) images, while multimodal segmentation has not been thoroughly explored yet. We present a dataset of CT and MR images of the same patients with curated reference HaN OAR segmentations for an objective evaluation of segmentation methods. Acquisition and validation methods The cohort consists of HaN images of 56 patients that underwent both CT and T1-weighted MR imaging for image-guided RT. For each patient, reference segmentations of up to 30 OARs were obtained by experts performing manual pixel-wise image annotation. By maintaining the distribution of patient age, gender, and annotation type, the patients were randomly split into training Set 1 (42 cases or 75%) and test Set 2 (14 cases or 25%). Baseline auto-segmentation results are also provided by training the publicly available deep nnU-Net architecture on Set 1 and evaluating its performance on Set 2. Data format and usage notes The data are publicly available through an open-access repository under the name HaN-Seg: The Head and Neck Organ-at-Risk CT & MR Segmentation Dataset. Images and reference segmentations are stored in the NRRD file format, where the OAR filenames correspond to the nomenclature recommended by the American Association of Physicists in Medicine, and OAR and demographic information is stored in separate comma-separated value files. Potential applications The HaN-Seg: The Head and Neck Organ-at-Risk CT & MR Segmentation Challenge is launched in parallel with the dataset release to promote the development of automated techniques for OAR segmentation in the HaN. Other potential applications include out-of-challenge algorithm development and benchmarking, as well as external validation of the developed algorithms.
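The dataset stores images and reference segmentations as NRRD files with a 75%/25% train/test split. Below is a minimal sketch of reading a case with pynrrd and drawing a random split; the folder layout and file names are assumptions, not the actual HaN-Seg structure, and the official split additionally balances age, gender, and annotation type, which this sketch does not.

import glob
import random
import nrrd  # pip install pynrrd

# Hypothetical layout: one folder per case containing the CT image and OAR masks.
cases = sorted(glob.glob("HaN-Seg/case_*"))
random.seed(0)
random.shuffle(cases)
split = int(0.75 * len(cases))
train_cases, test_cases = cases[:split], cases[split:]

if train_cases:
    image, header = nrrd.read(f"{train_cases[0]}/CT.nrrd")
    print(image.shape, header.get("space directions"))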
Article
Full-text available
During the production of strip steel, defects often appear on the surface of the product, so detecting such defects is key to producing high-quality products. At the same time, surface defects of the steel cause huge economic losses to the high-tech industry. A steel surface defect detection algorithm based on an improved YOLO-V7 is proposed to address the low detection speed and poor detection accuracy of traditional steel surface defect detection methods. First, we use the de-weighted BiFPN structure to make full use of the deep, shallow and original feature information to strengthen feature fusion, reduce the loss of feature information during the convolution process, and improve the detection accuracy. Second, the ECA attention mechanism is incorporated in the backbone to strengthen the important feature channels. Finally, the original bounding box loss function is replaced by the SIoU loss function, in which the penalty term is redefined by taking into account the vector angle between the required regressions. The experimental results show that the improved model proposed in this paper achieves higher performance than other comparison models. In our experiments, the proposed model yields 80.2% mAP on the GC10-DET dataset and 81.9% mAP on the NEU-DET dataset with high detection speed, which is better than other existing models.
Article
Full-text available
In radiotherapy for cancer patients, an indispensable process is to delineate organs-at-risk (OARs) and tumors. However, it is the most time-consuming step, as manual delineation is always required from radiation oncologists. Herein, we propose a lightweight deep learning framework for radiotherapy treatment planning (RTP), named RTP-Net, to promote automatic, rapid, and precise initialization of whole-body OARs and tumors. Briefly, the framework implements a cascaded coarse-to-fine segmentation, with adaptive modules for both small and large organs, and attention mechanisms for organs and boundaries. Our experiments show three merits: 1) extensive evaluation on 67 delineation tasks on a large-scale dataset of 28,581 cases; 2) comparable or superior accuracy with an average Dice of 0.95; 3) near real-time delineation in most tasks with <2 s. This framework could be utilized to accelerate the contouring process in the All-in-One radiotherapy scheme, and thus greatly shorten the turnaround time of patients.
Article
Full-text available
Accurate organ-at-risk (OAR) segmentation is critical to reduce radiotherapy complications. Consensus guidelines recommend delineating over 40 OARs in the head-and-neck (H&N). However, prohibitive labor costs cause most institutions to delineate a substantially smaller subset of OARs, neglecting the dose distributions of other OARs. Here, we present an automated and highly effective stratified OAR segmentation (SOARS) system using deep learning that precisely delineates a comprehensive set of 42 H&N OARs. We train SOARS using 176 patients from an internal institution and independently evaluate it on 1327 external patients across six different institutions. It consistently outperforms other state-of-the-art methods by at least 3–5% in Dice score for each institutional evaluation (up to 36% relative distance error reduction). Crucially, multi-user studies demonstrate that 98% of SOARS predictions need only minor or no revisions to achieve clinical acceptance (reducing workloads by 90%). Moreover, segmentation and dosimetric accuracy are within or smaller than the inter-user variation.
Article
Full-text available
Whole abdominal organ segmentation is important in diagnosing abdominal lesions, radiotherapy, and follow-up. However, it is time-consuming and very expensive for oncologists to delineate all abdominal organs from 3D volumes. Deep learning-based medical image segmentation has shown the potential to reduce manual delineation efforts, but it still requires a large-scale finely annotated dataset for training, and there is a lack of large-scale datasets covering the whole abdomen region with accurate and detailed annotations for whole abdominal organ segmentation. In this work, we establish a new large-scale Whole abdominal ORgan Dataset (WORD) for algorithm research and clinical application development. This dataset contains 150 abdominal CT volumes (30,495 slices). Each volume has 16 organs with fine pixel-level annotations and scribble-based sparse annotations, which may make it the largest dataset with whole abdominal organ annotation. Several state-of-the-art segmentation methods are evaluated on this dataset, and we also invited three experienced oncologists to revise the model predictions to measure the gap between the deep learning methods and oncologists. Afterwards, we investigate inference-efficient learning on WORD, as high-resolution images require large GPU memory and a long inference time at the test stage. We further evaluate scribble-based annotation-efficient learning on this dataset, as pixel-wise manual annotation is time-consuming and expensive. This work provides a new benchmark for the abdominal multi-organ segmentation task, and these experiments can serve as baselines for future research and clinical application development.
Article
Full-text available
This paper presents the post-analysis of the first edition of the HEad and neCK TumOR (HECKTOR) challenge. This challenge was held as a satellite event of the 23rd International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2020, and was the first of its kind focusing on lesion segmentation in combined FDG-PET and CT image modalities. The challenge's task was the automatic segmentation of the Gross Tumor Volume (GTV) of Head and Neck (H&N) oropharyngeal primary tumors in FDG-PET/CT images. To this end, the participants were given a training set of 201 cases from four different centers, and their methods were tested on a held-out set of 53 cases from a fifth center. The methods were ranked according to the Dice Score Coefficient (DSC) averaged across all test cases. An additional inter-observer agreement study was organized to assess the difficulty of the task from a human perspective. 64 teams registered for the challenge, among which 10 provided a paper detailing their approach. The best method obtained an average DSC of 0.7591, showing a large improvement over our proposed baseline method and the inter-observer agreement, associated with DSCs of 0.6610 and 0.61, respectively. The automatic methods proved to successfully leverage the wealth of metabolic and structural properties of combined PET and CT modalities, significantly outperforming the human inter-observer agreement level, semi-automatic thresholding based on PET images, and other single-modality-based methods. This promising performance is one step forward towards large-scale radiomics studies in H&N cancer, obviating the need for error-prone and time-consuming manual delineation of GTVs.
Article
Full-text available
With the extensive application of intensity‐modulated conformal radiotherapy and the deepening of the concept of comprehensive treatment, the therapeutic effect of nasopharyngeal carcinoma and the quality of life of patients have been significantly improved. However, guidelines for radiotherapy of nasopharyngeal carcinoma in China are in short supply. Dozens of experts from the Radiation Oncology Physicians Branch of the Chinese Medical Doctor Association and the Radiation Oncology Branch of the Chinese Medical Association have developed the publication Guidelines for radiotherapy of nasopharyngeal carcinoma in China after discussion. The guidelines include the epidemiology, diagnosis, clinical stage, treatment principle, and treatment of complications of radiotherapy of nasopharyngeal carcinoma. Importantly, the procedure of radiotherapy for nasopharyngeal carcinoma has been developed, which covers the imaging of nasopharyngeal carcinoma, radiotherapy localization, target area delineation, dose limitation, and plan evaluation. The guidelines will help to realize the homogenization of the diagnosis and treatment of nasopharyngeal carcinoma among radiotherapeutic medical staff in different levels of hospitals in China, thereby improving the overall level of diagnosis and treatment of nasopharyngeal carcinoma in China.
Article
Full-text available
Background Over half a million individuals are diagnosed with head and neck cancer each year globally. Radiotherapy is an important curative treatment for this disease, but it requires time-consuming manual delineation of radiosensitive organs at risk. This planning process can delay treatment while also introducing interoperator variability, resulting in downstream radiation dose differences. Although auto-segmentation algorithms offer a potentially time-saving solution, the challenges in defining, quantifying, and achieving expert performance remain. Objective Adopting a deep learning approach, we aim to demonstrate a 3D U-Net architecture that achieves expert-level performance in delineating 21 distinct head and neck organs at risk commonly segmented in clinical practice. Methods The model was trained on a dataset of 663 deidentified computed tomography scans acquired in routine clinical practice, with segmentations taken both from clinical practice and created by experienced radiographers as part of this research, all in accordance with consensus organ-at-risk definitions. Results We demonstrated the model's clinical applicability by assessing its performance on a test set of 21 computed tomography scans from clinical practice, each with 21 organs at risk segmented by 2 independent experts. We also introduced the surface Dice similarity coefficient, a new metric for the comparison of organ delineation, to quantify the deviation between organ-at-risk surface contours rather than volumes, better reflecting the clinical task of correcting errors in automated organ segmentations. The model's generalizability was then demonstrated on 2 distinct open-source datasets, representing centers and countries different from those used for model training. Conclusions Deep learning is an effective and clinically applicable technique for the segmentation of the head and neck anatomy for radiotherapy. With appropriate validation studies and regulatory approvals, this system could improve the efficiency, consistency, and safety of radiotherapy pathways.
Article
Full-text available
Segmentation of organs or lesions from medical images plays an essential role in many clinical applications such as diagnosis and treatment planning. Although Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance for automatic segmentation, they are often limited by a lack of clinically acceptable accuracy and robustness in complex cases. Therefore, interactive segmentation is a practical alternative to these methods. However, traditional interactive segmentation methods require a large amount of user interaction, and recently proposed CNN-based interactive segmentation methods are limited by poor performance on previously unseen objects. To solve these problems, we propose a novel deep learning-based interactive segmentation method that not only has high efficiency, requiring only clicks as user input, but also generalizes well to a range of previously unseen objects. Specifically, we first encode user-provided interior margin points via our proposed exponentialized geodesic distance, which enables a CNN to achieve a good initial segmentation of both previously seen and unseen objects; we then use a novel information fusion method that combines the initial segmentation with only a few additional user clicks to efficiently obtain a refined segmentation. We validated our proposed framework through extensive experiments on 2D and 3D medical image segmentation tasks with a wide range of previously unseen objects that were not present in the training set. Experimental results showed that our proposed framework 1) achieves accurate results with fewer user interactions and less time compared with state-of-the-art interactive frameworks, and 2) generalizes well to previously unseen objects.
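The method encodes user-provided clicks through an exponentialized geodesic distance. The sketch below illustrates the encoding idea only, substituting a Euclidean distance transform for the true geodesic distance (which would additionally follow image gradients); it is not the paper's implementation.

import numpy as np
from scipy import ndimage

def encode_clicks(shape, clicks, spacing=(1.0, 1.0, 1.0)) -> np.ndarray:
    """Turn a list of (z, y, x) click coordinates into a soft guidance map.

    Simplification: Euclidean distance is used here in place of the geodesic
    distance of the original method.
    """
    seeds = np.zeros(shape, dtype=bool)
    for z, y, x in clicks:
        seeds[z, y, x] = True
    # Distance from every voxel to the nearest click, then exponentialized so
    # the map is 1 at clicks and decays smoothly away from them.
    dist = ndimage.distance_transform_edt(~seeds, sampling=spacing)
    return np.exp(-dist)

guidance = encode_clicks((16, 64, 64), clicks=[(8, 32, 32), (8, 40, 20)])
print(guidance.shape, guidance.max())

The resulting guidance map is typically concatenated with the image as an extra input channel to the segmentation CNN.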
Article
Full-text available
Background and purpose Delineating organs at risk (OARs) on computed tomography (CT) images is an essential step in radiation therapy; however, it is notoriously time-consuming and prone to inter-observer variation. Herein, we report a deep learning-based automatic segmentation (AS) algorithm (WBNet) that can accurately and efficiently delineate all major OARs in the entire body directly on CT scans. Materials and methods We collected 755 CT scans of the head and neck, thorax, abdomen, and pelvis and manually delineated 50 OARs on the CT images. The CT images with contours were split into training and test sets consisting of 505 and 250 cases, respectively, to develop and validate WBNet. The volumetric Dice similarity coefficient (DSC) and 95th-percentile Hausdorff distance (95% HD) were calculated to evaluate delineation quality for each OAR. We compared the performance of WBNet with three AS algorithms: one commercial multi-atlas-based automatic segmentation (ABAS) software, and two deep learning-based AS algorithms, namely, AnatomyNet and nnU-Net. We have also evaluated the time saving and dose accuracy of WBNet. Results WBNet achieved average DSCs of 0.84 and 0.81 on in-house and public datasets, respectively, which outperformed ABAS, AnatomyNet, and nnU-Net. WBNet could reduce the delineation time significantly and perform well in treatment planning, with clinically acceptable dose differences compared with those in manual delineation. Conclusion This study shows the feasibility and benefits of using WBNet in clinical practice.
Article
Full-text available
Accurate segmentation of Organs-at-Risk (OARs) from Head and Neck (HAN) Computed Tomography (CT) images with uncertainty information is critical for effective planning of radiation therapy for Nasopharyngeal Carcinoma (NPC) treatment. Despite the state-of-the-art performance achieved by Convolutional Neural Networks (CNNs) for the segmentation task, existing methods do not provide uncertainty estimation of the segmentation results for treatment planning, and their accuracy is still limited by the low contrast of soft tissues in CT, highly imbalanced sizes of OARs, and large inter-slice spacing. To address these problems, we propose a novel framework for accurate OAR segmentation with reliable uncertainty estimation. First, we propose a Segmental Linear Function (SLF) to transform the intensity of CT images, making multiple organs more distinguishable than with existing simple window width/level-based methods. Second, we introduce a novel 2.5D network (named 3D-SepNet) specially designed for dealing with clinical CT scans with anisotropic spacing. Third, we propose a novel hardness-aware loss function that pays attention to hard voxels for accurate segmentation. We also use an ensemble of models trained with different loss functions and intensity transforms to obtain robust results, which also yields segmentation uncertainty without extra effort. Our method won third place in the HAN OAR segmentation task of the StructSeg 2019 challenge, achieving a weighted average Dice of 80.52% and a 95% Hausdorff Distance of 3.043 mm. Experimental results show that 1) our SLF intensity transform helps to improve the accuracy of OAR segmentation from CT images; 2) with only 1/3 of the parameters of 3D UNet, our 3D-SepNet obtains better segmentation results for most OARs; 3) the proposed hard voxel weighting strategy used for training effectively improves the segmentation accuracy; and 4) the segmentation uncertainty obtained by our method has a high correlation with mis-segmentations, which has the potential to assist more informed decisions in clinical practice. Our code is available at https://github.com/HiLab-git/SepNet.
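The Segmental Linear Function (SLF) maps CT intensities through a piecewise linear curve so that several tissue ranges remain distinguishable after normalization. Below is a minimal sketch with np.interp; the breakpoints are illustrative assumptions, not the values used in the paper.

import numpy as np

def segmental_linear_transform(ct_hu: np.ndarray) -> np.ndarray:
    """Piecewise linear intensity transform of CT values in Hounsfield units.

    Illustrative breakpoints: most of the output range is spent on soft
    tissue, with compressed segments for air/fat and bone.
    """
    src = [-1000.0, -200.0, 200.0, 1500.0]   # HU breakpoints (assumed)
    dst = [0.0, 0.2, 0.8, 1.0]               # normalized output values
    return np.interp(ct_hu, src, dst).astype(np.float32)

ct = np.array([-1000.0, -500.0, 0.0, 40.0, 300.0, 2000.0])
print(segmental_linear_transform(ct))

Unlike a single window width/level setting, the piecewise mapping keeps contrast in more than one intensity range at once, which is the motivation given in the abstract.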
Article
Full-text available
Radiotherapy is the main treatment method for nasopharynx cancer. Delineation of the Gross Target Volume (GTV) from medical images is a prerequisite for radiotherapy. As manual delineation is time-consuming and laborious, automatic segmentation of the GTV has the potential to improve the efficiency of this process. This work aims to automatically segment the GTV of nasopharynx cancer from Computed Tomography (CT) images. However, the task is challenged by the small target region, the anisotropic resolution of clinical CT images, and the low contrast between the target region and surrounding soft tissues. To deal with these problems, we propose a 2.5D Convolutional Neural Network (CNN) to handle the different in-plane and through-plane resolutions. We also propose a spatial attention module to enable the network to focus on the small target, and use channel attention to further improve the segmentation performance. Moreover, we use a multi-scale sampling method for training so that the networks can learn features at different scales, which is combined with a multi-model ensemble method to improve the robustness of segmentation results. We also estimate the uncertainty of segmentation results based on our model ensemble, which is of great importance for indicating the reliability of automatic segmentation results for radiotherapy planning. Experiments with the 2019 MICCAI StructSeg dataset showed that (1) our proposed 2.5D network performs better on images with anisotropic resolution than commonly used 3D networks; (2) our attention mechanism makes the network pay more attention to the small GTV region and improves the segmentation accuracy; and (3) the proposed multi-scale model ensemble achieves more robust results and simultaneously provides uncertainty information that can indicate potential mis-segmentations for better clinical decisions.
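The 2.5D network uses a spatial attention module so that the network focuses on the small GTV region. Below is a minimal PyTorch sketch of a simple spatial attention gate; the actual module design in the paper is not reproduced here.

import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Compress channels to a single attention map and reweight the input."""
    def __init__(self, in_ch: int):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.conv(x))   # (B, 1, D, H, W), values in (0, 1)
        return x * attn                       # emphasize voxels near the target

x = torch.randn(2, 16, 8, 64, 64)
print(SpatialAttention(16)(x).shape)  # torch.Size([2, 16, 8, 64, 64])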
Article
Full-text available
Biomedical imaging is a driver of scientific discovery and a core component of medical care and is being stimulated by the field of deep learning. While semantic segmentation algorithms enable image analysis and quantification in many applications, the design of respective specialized solutions is non-trivial and highly dependent on dataset properties and hardware conditions. We developed nnU-Net, a deep learning-based segmentation method that automatically configures itself, including preprocessing, network architecture, training and post-processing for any new task. The key design choices in this process are modeled as a set of fixed parameters, interdependent rules and empirical decisions. Without manual intervention, nnU-Net surpasses most existing approaches, including highly specialized solutions on 23 public datasets used in international biomedical segmentation competitions. We make nnU-Net publicly available as an out-of-the-box tool, rendering state-of-the-art segmentation accessible to a broad audience by requiring neither expert knowledge nor computing resources beyond standard network training.
Article
Full-text available
The number of biomedical image analysis challenges organized per year is steadily increasing. These international competitions have the purpose of benchmarking algorithms on common data sets, typically to identify the best method for a given problem. Recent research, however, revealed that common practice related to challenge reporting does not allow for adequate interpretation and reproducibility of results. To address the discrepancy between the impact of challenges and the quality (control), the Biomedical Image Analysis ChallengeS (BIAS) initiative developed a set of recommendations for the reporting of challenges. The BIAS statement aims to improve the transparency of the reporting of a biomedical image analysis challenge regardless of field of application, image modality or task category assessed. This article describes how the BIAS statement was developed and presents a checklist which authors of biomedical image analysis challenges are encouraged to include when submitting a challenge paper for review. The purpose of the checklist is to standardize and facilitate the review process and raise the interpretability and reproducibility of challenge results by making relevant information explicit.
Article
Full-text available
This study aimed to develop an automated delineation method for the nasopharynx gross tumor volume (GTVnx) of nasopharyngeal carcinoma (NPC) in computed tomography (CT) images for radiotherapy applications. Inspired by the strong feature-extraction ability of ResNet and SENet, we proposed a modified version of the 3D U-Net model with Res-blocks and an SE-block for delineation of the GTVnx. In addition, an automatic pre-processing method was proposed to crop the 3D region of interest (ROI) of the GTVnx. Radiotherapy simulation CT images and the corresponding manually delineated targets of 205 NPC patients diagnosed with stage T1-T4 were used as the training dataset. Automated delineation models were generated based on CT combined with contrast-enhanced CT (CE-CT) and on CT alone, respectively. We compared the automatic delineation results against the contours manually delineated by radiation oncologists with 5-fold cross-validation to evaluate the performance of the proposed model. We also compared with frameworks using a 3D CNN and a 2D DDNN, respectively. In addition, the model generated by one medical group was assessed against the other two separate medical groups. Precision (PR), Sensitivity (SE), Dice Similarity Coefficient (DSC), Average Symmetric Surface Distance (ASSD), and 95% Hausdorff Distance (HD95) were calculated for quantitative evaluation. Experimental results show that the proposed method outperforms other automatic methods on CT images, and that automated delineation models based on CT combined with CE-CT are superior to those based on CT alone. The presented method could be useful and robust for the 3D delineation of the GTVnx for NPC in CT images during radiotherapy planning.
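The model combines a 3D U-Net with Res-blocks and an SE-block. Below is a minimal PyTorch sketch of a squeeze-and-excitation residual block in 3D; channel counts, normalization, and the reduction ratio are illustrative assumptions.

import torch
import torch.nn as nn

class SEResBlock3d(nn.Module):
    """3D residual block with squeeze-and-excitation channel reweighting."""
    def __init__(self, ch: int, reduction: int = 8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(ch, ch, 3, padding=1), nn.InstanceNorm3d(ch), nn.ReLU(inplace=True),
            nn.Conv3d(ch, ch, 3, padding=1), nn.InstanceNorm3d(ch),
        )
        self.se = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),                           # squeeze: global pooling
            nn.Conv3d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv3d(ch // reduction, ch, 1), nn.Sigmoid(),   # excitation weights
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.body(x)
        return self.act(x + y * self.se(y))  # residual connection with channel reweighting

print(SEResBlock3d(32)(torch.randn(1, 32, 16, 32, 32)).shape)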
Article
Full-text available
Radiotherapy is the main treatment strategy for nasopharyngeal carcinoma. A major factor affecting radiotherapy outcome is the accuracy of target delineation. Target delineation is time-consuming, and the results can vary depending on the experience of the oncologist. Using deep learning methods to automate target delineation may increase its efficiency. We used a modified deep learning model called U-Net to automatically segment and delineate tumor targets in patients with nasopharyngeal carcinoma. Patients were randomly divided into a training set (302 patients), validation set (100 patients), and test set (100 patients). The U-Net model was trained using labeled computed tomography images from the training set. The U-Net was able to delineate nasopharyngeal carcinoma tumors with an overall Dice similarity coefficient of 65.86% for lymph nodes and 74.00% for the primary tumor, with respective Hausdorff distances of 32.10 and 12.85 mm. Delineation accuracy decreased with increasing cancer stage. Automatic delineation took approximately 2.6 hours, compared to 3 hours for an entirely manual procedure. Deep learning models can therefore improve the accuracy, consistency, and efficiency of target delineation for the T stage, but additional physician input may be required for lymph nodes.
Article
Full-text available
Radiation therapy is one of the most widely used therapies for cancer treatment. A critical step in radiation therapy planning is to accurately delineate all organs at risk (OARs) to minimize potential adverse effects to healthy surrounding organs. However, manually delineating OARs based on computed tomography images is time-consuming and error-prone. Here, we present a deep learning model to automatically delineate OARs in head and neck, trained on a dataset of 215 computed tomography scans with 28 OARs manually delineated by experienced radiation oncologists. On a hold-out dataset of 100 computed tomography scans, our model achieves an average Dice similarity coefficient of 78.34% across the 28 OARs, significantly outperforming human experts and the previous state-of-the-art method by 10.05% and 5.18%, respectively. Our model takes only a few seconds to delineate an entire scan, compared to over half an hour by human experts. These findings demonstrate the potential for deep learning to improve the quality and reduce the treatment planning time of radiation therapy. To keep radiation therapy from damaging healthy tissue, expert radiologists have to segment CT scans into individual organs. A new deep learning-based method for delineating organs in the area of head and neck performs faster and more accurately than human experts.
Article
Full-text available
Purpose: Previous studies demonstrated that the radiation therapy, image technology, and the application of chemotherapy have developed in the last 2 decades. This study explored the survival trends and treatment failure patterns of patients with nonmetastatic nasopharyngeal carcinoma (NPC) treated with radiation therapy. Furthermore, we evaluated the survival benefit brought by the development of radiation therapy, image technology, and chemotherapy based on a large cohort from 1990 to 2012. Methods and materials: Data from 20,305 patients with nonmetastatic NPC treated between 1990 and 2012 were analyzed. Patients were divided into 4 calendar periods (1990-1996, 1997-2002, 2003-2007, and 2008-2012). Overall survival (OS) was the primary endpoint. Results: Magnetic resonance imaging has replaced computed tomography as the most important imaging technique since 2003. Conventional 2-dimensional radiation therapy, which was the main radiation therapy technique in our institution before 2008, was replaced by intensity modulated radiation therapy later. An increasing number of patients have undergone chemotherapy since 2003. The 5-year OS across the 4 calendar periods increased at each TNM stage with progression-free survival (PFS) and locoregional relapse-free survival (LRFS) showing a similar trend, whereas distant metastasis-free survival showed small differences. Multivariate analyses showed that the application of intensity modulated radiation therapy and magnetic resonance imaging were independent protective factors in OS, PFS, LRFS, and distant metastasis-free survival. Chemotherapy benefited patients in OS, PFS, and LRFS. The main pattern of treatment failure shifted from recurrence to distant metastasis. Conclusions: The development of radiation therapy, image technology, and chemotherapy increased survival rates among patients with NPC because of excellent locoregional control. Distant failure has become the greatest challenge for NPC treatment.
Article
Full-text available
Purpose Automatic segmentation of organs-at-risk (OARs) is a key step in radiation treatment planning to reduce human effort and bias. Deep convolutional neural networks (DCNN) have shown great success in many medical image segmentation applications, but there are still challenges in dealing with large 3D images to obtain optimal results. The purpose of this study is to develop a novel DCNN method for thoracic OAR segmentation using cropped 3D images. Methods To segment the five organs (left and right lungs, heart, esophagus and spinal cord) from thoracic CT scans, preprocessing to unify the voxel spacing and intensity was first performed; a 3D U-Net was then trained on resampled thoracic images to localize each organ; the original images were then cropped to contain only one organ and served as the input to each individual organ segmentation network. The segmentation maps were then merged to obtain the final results. The network structures were optimized for each step, as well as the training and testing strategies. A novel testing augmentation with multiple iterations of image cropping was used. The networks were trained on 36 thoracic CT scans with expert annotations provided by the organizers of the 2017 AAPM Thoracic Auto-segmentation Challenge and tested on the challenge testing dataset as well as a private dataset. Results The proposed method earned second place in the live phase of the challenge and first place in the subsequent ongoing phase using a newly developed testing augmentation approach. It showed superior-than-human performance on average in terms of Dice scores (spinal cord: 0.893 ± 0.044, right lung: 0.972 ± 0.021, left lung: 0.979 ± 0.008, heart: 0.925 ± 0.015, esophagus: 0.726 ± 0.094), mean surface distance (spinal cord: 0.662 ± 0.248 mm, right lung: 0.933 ± 0.574 mm, left lung: 0.586 ± 0.285 mm, heart: 2.297 ± 0.492 mm, esophagus: 2.341 ± 2.380 mm) and 95% Hausdorff distance (spinal cord: 1.893 ± 0.627 mm, right lung: 3.958 ± 2.845 mm, left lung: 2.103 ± 0.938 mm, heart: 6.570 ± 1.501 mm, esophagus: 8.714 ± 10.588 mm). It also achieved good performance on the private dataset and reduced the editing time to 7.5 min per patient following automatic segmentation. Conclusions The proposed DCNN method demonstrated good performance in automatic OAR segmentation from thoracic CT scans. It has the potential for eventual clinical adoption of deep learning in radiation treatment planning due to improved accuracy and reduced cost for OAR segmentation.
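The pipeline first localizes each organ on a resampled volume and then crops the original image around it for a dedicated per-organ network. Below is a minimal sketch of the cropping step from a coarse binary localization mask; the margin and array layout are assumptions.

import numpy as np

def crop_around_mask(image: np.ndarray, coarse_mask: np.ndarray,
                     margin: int = 16) -> np.ndarray:
    """Crop `image` to the bounding box of `coarse_mask`, padded by `margin`
    voxels, as input to the per-organ segmentation network."""
    coords = np.argwhere(coarse_mask > 0)
    lo = np.maximum(coords.min(axis=0) - margin, 0)
    hi = np.minimum(coords.max(axis=0) + margin + 1, image.shape)
    slices = tuple(slice(l, h) for l, h in zip(lo, hi))
    return image[slices]

image = np.random.randn(120, 256, 256).astype(np.float32)
mask = np.zeros_like(image); mask[40:60, 100:140, 90:130] = 1
print(crop_around_mask(image, mask).shape)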
Article
Full-text available
Purpose Accurate and timely organs‐at‐risk (OARs) segmentation is key to efficient and high‐quality radiation therapy planning. The purpose of this work is to develop a deep learning‐based method to automatically segment multiple thoracic OARs on chest computed tomography (CT) for radiotherapy treatment planning. Methods We propose an adversarial training strategy to train deep neural networks for the segmentation of multiple organs on thoracic CT images. The proposed design of adversarial networks, called U‐Net‐generative adversarial network (U‐Net‐GAN), jointly trains a set of U‐Nets as generators and fully convolutional networks (FCNs) as discriminators. Specifically, the generator, composed of U‐Net, produces an image segmentation map of multiple organs by an end‐to‐end mapping learned from CT image to multiorgan‐segmented OARs. The discriminator, structured as an FCN, discriminates between the ground truth and segmented OARs produced by the generator. The generator and discriminator compete against each other in an adversarial learning process to produce the optimal segmentation map of multiple organs. Our segmentation results were compared with manually segmented OARs (ground truth) for quantitative evaluations in geometric difference, as well as dosimetric performance by investigating the dose‐volume histogram in 20 stereotactic body radiation therapy (SBRT) lung plans. Results This segmentation technique was applied to delineate the left and right lungs, spinal cord, esophagus, and heart using 35 patients’ chest CTs. The averaged dice similarity coefficient for the above five OARs are 0.97, 0.97, 0.90, 0.75, and 0.87, respectively. The mean surface distance of the five OARs obtained with proposed method ranges between 0.4 and 1.5 mm on average among all 35 patients. The mean dose differences on the 20 SBRT lung plans are −0.001 to 0.155 Gy for the five OARs. Conclusion We have investigated a novel deep learning‐based approach with a GAN strategy to segment multiple OARs in the thorax using chest CT images and demonstrated its feasibility and reliability. This is a potentially valuable method for improving the efficiency of chest radiotherapy treatment planning.
Article
Full-text available
Purpose Radiation therapy (RT) is a common treatment option for head and neck (HaN) cancer. An important step involved in RT planning is the delineation of organs‐at‐risks (OARs) based on HaN computed tomography (CT). However, manually delineating OARs is time‐consuming as each slice of CT images needs to be individually examined and a typical CT consists of hundreds of slices. Automating OARs segmentation has the benefit of both reducing the time and improving the quality of RT planning. Existing anatomy autosegmentation algorithms use primarily atlas‐based methods, which require sophisticated atlas creation and cannot adequately account for anatomy variations among patients. In this work, we propose an end‐to‐end, atlas‐free three‐dimensional (3D) convolutional deep learning framework for fast and fully automated whole‐volume HaN anatomy segmentation. Methods Our deep learning model, called AnatomyNet, segments OARs from head and neck CT images in an end‐to‐end fashion, receiving whole‐volume HaN CT images as input and generating masks of all OARs of interest in one shot. AnatomyNet is built upon the popular 3D U‐net architecture, but extends it in three important ways: (a) a new encoding scheme to allow autosegmentation on whole‐volume CT images instead of local patches or subsets of slices, (b) incorporating 3D squeeze‐and‐excitation residual blocks in encoding layers for better feature representation, and (c) a new loss function combining Dice scores and focal loss to facilitate the training of the neural model. These features are designed to address two main challenges in deep learning‐based HaN segmentation: (a) segmenting small anatomies (i.e., optic chiasm and optic nerves) occupying only a few slices, and (b) training with inconsistent data annotations with missing ground truth for some anatomical structures. Results We collected 261 HaN CT images to train AnatomyNet and used MICCAI Head and Neck Auto Segmentation Challenge 2015 as a benchmark dataset to evaluate the performance of AnatomyNet. The objective is to segment nine anatomies: brain stem, chiasm, mandible, optic nerve left, optic nerve right, parotid gland left, parotid gland right, submandibular gland left, and submandibular gland right. Compared to previous state‐of‐the‐art results from the MICCAI 2015 competition, AnatomyNet increases Dice similarity coefficient by 3.3% on average. AnatomyNet takes about 0.12 s to fully segment a head and neck CT image of dimension 178 × 302 × 225, significantly faster than previous methods. In addition, the model is able to process whole‐volume CT images and delineate all OARs in one pass, requiring little pre‐ or postprocessing. Conclusion Deep learning models offer a feasible solution to the problem of delineating OARs from CT images. We demonstrate that our proposed model can improve segmentation accuracy and simplify the autosegmentation pipeline. With this method, it is possible to delineate OARs of a head and neck CT within a fraction of a second.
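AnatomyNet trains with a loss that combines Dice scores and focal loss to cope with very small structures. Below is a minimal PyTorch sketch of such a combined loss for a binary target; the weighting and the focal parameter are illustrative assumptions rather than the paper's settings.

import torch
import torch.nn.functional as F

def dice_focal_loss(logits: torch.Tensor, target: torch.Tensor,
                    gamma: float = 2.0, lam: float = 0.5,
                    eps: float = 1e-6) -> torch.Tensor:
    """Combined soft Dice + focal loss for binary segmentation logits."""
    prob = torch.sigmoid(logits)
    # Soft Dice term.
    inter = (prob * target).sum()
    dice = (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    dice_loss = 1 - dice
    # Focal term: down-weight easy voxels via (1 - p_t)^gamma.
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = prob * target + (1 - prob) * (1 - target)
    focal_loss = ((1 - p_t) ** gamma * bce).mean()
    return lam * dice_loss + (1 - lam) * focal_loss

logits = torch.randn(1, 1, 8, 32, 32)
target = (torch.rand_like(logits) > 0.9).float()
print(dice_focal_loss(logits, target))

The Dice term keeps tiny structures such as the optic chiasm from being ignored, while the focal term concentrates the gradient on hard voxels.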
Article
Full-text available
Quantitative extraction of high-dimensional mineable data from medical images is a process known as radiomics. Radiomics is foreseen as an essential prognostic tool for cancer risk assessment and the quantification of intratumoural heterogeneity. In this work, 1615 radiomic features (quantifying tumour image intensity, shape, texture) extracted from pre-treatment FDG-PET and CT images of 300 patients from four different cohorts were analyzed for the risk assessment of locoregional recurrences (LR) and distant metastases (DM) in head-and-neck cancer. Prediction models combining radiomic and clinical variables were constructed via random forests and imbalance-adjustment strategies using two of the four cohorts. Independent validation of the prediction and prognostic performance of the models was carried out on the other two cohorts (LR: AUC = 0.69 and CI = 0.67; DM: AUC = 0.86 and CI = 0.88). Furthermore, the results obtained via Kaplan-Meier analysis demonstrated the potential of radiomics for assessing the risk of specific tumour outcomes using multiple stratification groups. This could have important clinical impact, notably by allowing for a better personalization of chemo-radiation treatments for head-and-neck cancer patients from different risk groups.
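The risk models above combine radiomic and clinical variables via random forests with imbalance-adjustment strategies. Below is a minimal scikit-learn sketch using class weighting as the imbalance adjustment; the synthetic features and the specific adjustment differ from those used in the paper.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical design matrix: rows = patients, columns = radiomic + clinical features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))
y = (rng.random(300) < 0.15).astype(int)  # imbalanced outcome, e.g. distant metastasis

model = RandomForestClassifier(
    n_estimators=500,
    class_weight="balanced",  # simple imbalance adjustment
    random_state=0,
)
print(cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())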
Article
Full-text available
Purpose: Automated delineation of structures and organs is a key step in medical imaging. However, due to the large number and diversity of structures and the large variety of segmentation algorithms, a consensus is lacking as to which automated segmentation method works best for certain applications. Segmentation challenges are a good approach for unbiased evaluation and comparison of segmentation algorithms. Methods: In this work we describe and present the results of the Head and Neck Auto-Segmentation Challenge 2015, a satellite event at the Medical Image Computing and Computer Assisted Interventions (MICCAI) 2015 conference. Six teams participated in a challenge to segment nine structures in the head and neck region of CT images: brainstem, mandible, chiasm, bilateral optic nerves, bilateral parotid glands and bilateral submandibular glands. Results: This paper presents the quantitative results of this challenge using multiple established error metrics and a well-defined ranking system. The strengths and weaknesses of the different auto-segmentation approaches are analyzed and discussed. Conclusions: The Head and Neck Auto-Segmentation Challenge 2015 was a good opportunity to assess the current state of the art in segmentation of organs at risk for radiotherapy treatment. Participating teams had the possibility to compare their approaches to other methods under unbiased and standardized circumstances. The results demonstrate a clear tendency towards more general-purpose and fewer structure-specific segmentation algorithms.
Chapter
The medical imaging community generates a wealth of datasets, many of which are openly accessible and annotated for specific diseases and tasks such as multi-organ or lesion segmentation. Current practices continue to limit model training and supervised pre-training to one or a few similar datasets, neglecting the synergistic potential of other available annotated data. We propose MultiTalent, a method that leverages multiple CT datasets with diverse and conflicting class definitions to train a single model for comprehensive structure segmentation. Our results demonstrate systematically improved segmentation performance compared to previous related approaches, as well as compared to single-dataset training using state-of-the-art methods, especially for lesion segmentation and other challenging structures. We show that MultiTalent also represents a powerful foundation model that offers superior pre-training for various segmentation tasks compared to commonly used supervised or unsupervised pre-training baselines. Our findings offer a new direction for the medical imaging community to effectively utilize the wealth of available data for improved segmentation performance. The code and model weights will be published here: https://github.com/MIC-DKFZ/MultiTalent.
Chapter
The universal model emerges as a promising trend for medical image segmentation, paving the way to building a medical imaging large model (MILM). One popular strategy to build universal models is to encode each task as a one-hot vector and generate dynamic convolutional layers at the end of the decoder to extract the target of interest. Although successful, it ignores the correlations among tasks and is also too late in making the model ‘aware’ of the ongoing task. To address both issues, we propose a prompt-driven Universal Segmentation model (UniSeg) for multi-task medical image segmentation across diverse modalities and domains. We first devise a learnable universal prompt to describe the correlations among all tasks and then convert this prompt and image features into a task-specific prompt, which is fed to the decoder as a part of its input. Thus, we make the model ‘aware’ of the ongoing task early and boost the task-specific training of the whole decoder. Our results indicate that the proposed UniSeg outperforms other universal models and single-task models on 11 upstream tasks. Moreover, UniSeg also beats other pre-trained models on two downstream datasets, providing the community with a high-quality pre-trained model for 3D medical image segmentation. Code and model are available at https://github.com/yeerwen/UniSeg.
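The sketch below illustrates the general prompting idea: a learnable prompt indexed by task is fused with bottleneck image features so the decoder is conditioned on the ongoing task. The shapes, the task-ID indexing, and the 1x1x1 fusion convolution are assumptions for illustration, not UniSeg's actual design.

```python
# Minimal sketch of a prompt-conditioned bottleneck for multi-task segmentation.
import torch
import torch.nn as nn

class PromptedBottleneck(nn.Module):
    def __init__(self, n_tasks: int, channels: int):
        super().__init__()
        self.universal_prompt = nn.Parameter(torch.randn(n_tasks, channels))
        self.fuse = nn.Conv3d(2 * channels, channels, kernel_size=1)

    def forward(self, feats: torch.Tensor, task_id: int) -> torch.Tensor:
        # feats: (B, C, D, H, W) bottleneck features from a shared encoder
        b, c, d, h, w = feats.shape
        prompt = self.universal_prompt[task_id].view(1, c, 1, 1, 1).expand(b, c, d, h, w)
        return self.fuse(torch.cat([feats, prompt], dim=1))  # task-aware features for the decoder

feats = torch.randn(1, 32, 4, 8, 8)
print(PromptedBottleneck(n_tasks=11, channels=32)(feats, task_id=3).shape)
```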
Article
Purpose: To present a deep learning segmentation model that can automatically and robustly segment all major anatomic structures on body CT images. Materials and methods: In this retrospective study, 1204 CT examinations (from 2012, 2016, and 2020) were used to segment 104 anatomic structures (27 organs, 59 bones, 10 muscles, and eight vessels) relevant for use cases such as organ volumetry, disease characterization, and surgical or radiation therapy planning. The CT images were randomly sampled from routine clinical studies and thus represent a real-world dataset (different ages, abnormalities, scanners, body parts, sequences, and sites). The authors trained an nnU-Net segmentation algorithm on this dataset and calculated Dice similarity coefficients to evaluate the model's performance. The trained algorithm was applied to a second dataset of 4004 whole-body CT examinations to investigate age-dependent volume and attenuation changes. Results: The proposed model showed a high Dice score (0.943) on the test set, which included a wide range of clinical data with major abnormalities. The model significantly outperformed another publicly available segmentation model on a separate dataset (Dice score, 0.932 vs 0.871; P < .001). The aging study demonstrated significant correlations between age and volume and mean attenuation for a variety of organ groups (eg, age and aortic volume [rs = 0.64; P < .001]; age and mean attenuation of the autochthonous dorsal musculature [rs = -0.74; P < .001]). Conclusion: The developed model enables robust and accurate segmentation of 104 anatomic structures. The annotated dataset (https://doi.org/10.5281/zenodo.6802613) and toolkit (https://www.github.com/wasserth/TotalSegmentator) are publicly available.
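As an illustration of the aging analysis described above, the sketch below computes a Spearman correlation between age and a segmented organ's volume. The data are synthetic placeholders, not results or code from the study.

```python
# Sketch: Spearman correlation between patient age and an organ's segmented volume.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
ages = rng.uniform(20, 90, size=300)
# Placeholder organ volumes (ml), loosely increasing with age plus noise,
# as reported e.g. for aortic volume.
volumes = 100 + 1.5 * ages + rng.normal(0, 30, size=300)

rho, p = spearmanr(ages, volumes)
print(f"Spearman rho = {rho:.2f}, p = {p:.1e}")
```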
Article
Background and purpose: The problem of obtaining accurate primary gross tumor volume (GTVp) segmentation for nasopharyngeal carcinoma (NPC) on heterogeneous magnetic resonance imaging (MRI) images with deep learning remains unsolved. Herein, we report a new deep-learning method that can accurately delineate GTVp for NPC on multi-center MRI scans. Material and methods: We collected 1057 patients with MRI images from five hospitals and randomly selected 600 patients from three hospitals to constitute a mixed training cohort for model development. The remaining patients were used as internal (n = 259) and external (n = 198) testing cohorts for model evaluation. An augmentation-invariant strategy was proposed to delineate GTVp from multi-center MRI images, which encouraged networks to produce similar predictions for inputs with different augmentations to learn invariant anatomical structure features. The Dice similarity coefficient (DSC), 95% Hausdorff distance (HD95), average surface distance (ASD), and relative absolute volume difference (RAVD) were used to measure segmentation performance. Results: The model-generated predictions had a high overlap ratio with the ground truth. For the internal testing cohorts, the average DSC, HD95, ASD, and RAVD were 0.88, 4.99 mm, 1.03 mm, and 0.13, respectively. For the external testing cohorts, the average DSC, HD95, ASD, and RAVD were 0.88, 3.97 mm, 0.97 mm, and 0.10, respectively. No significant differences were found in DSC, HD95, and ASD for patients with different T categories, MRI slice thicknesses, or in-plane spacings. Moreover, the proposed augmentation-invariant strategy outperformed the widely-used nnUNet, which uses conventional data augmentation approaches. Conclusion: Our proposed method showed highly accurate GTVp segmentation for NPC on multi-center MRI images, suggesting that it has the potential to act as a generalized delineation solution for heterogeneous MRI images.
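The core of an augmentation-invariant strategy is to penalise disagreement between predictions for differently augmented views of the same scan. A minimal sketch of such a consistency term follows; it is a conceptual simplification, not the paper's exact objective.

```python
# Sketch of an augmentation-invariance term added to the usual supervised loss:
# predictions for two augmented views of one MRI volume are pushed towards each other.
import torch
import torch.nn.functional as F

def consistency_loss(prob_view1: torch.Tensor, prob_view2: torch.Tensor) -> torch.Tensor:
    """Mean squared difference between softmax predictions of two augmented views."""
    return F.mse_loss(prob_view1, prob_view2)

# Toy usage with placeholder network outputs for two views of one scan.
p1 = torch.softmax(torch.randn(1, 2, 16, 32, 32), dim=1)
p2 = torch.softmax(torch.randn(1, 2, 16, 32, 32), dim=1)
loss = consistency_loss(p1, p2)  # added to the Dice/CE loss on labelled voxels
print(loss.item())
```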
Article
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances and various lesion-to-background contrast levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that no single algorithm performed best for both liver and liver tumors across the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks to http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.
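Lesion-wise detection metrics of the kind reported above can be computed by labelling connected components in the reference mask and checking whether each lesion is hit by the prediction. The sketch below uses a one-voxel-overlap criterion, which may differ from the challenge's exact matching rule.

```python
# Sketch of lesion-wise recall via connected components.
import numpy as np
from scipy import ndimage

def lesion_recall(pred: np.ndarray, ref: np.ndarray) -> float:
    labels, n_lesions = ndimage.label(ref.astype(bool))
    if n_lesions == 0:
        return 1.0
    detected = sum(1 for i in range(1, n_lesions + 1) if pred[labels == i].any())
    return detected / n_lesions

ref = np.zeros((64, 64, 64), dtype=bool); ref[5:10, 5:10, 5:10] = True; ref[40:45, 40:45, 40:45] = True
pred = np.zeros_like(ref);                pred[6:9, 6:9, 6:9] = True
print(lesion_recall(pred, ref))  # 0.5: one of two reference lesions detected
```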
Article
Domain Adaptation (DA) has recently been of strong interest in the medical imaging community. While a large variety of DA techniques have been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality Domain Adaptation (crossMoDA) challenge was organised in conjunction with the 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021). CrossMoDA is the first large and multi-class benchmark for unsupervised cross-modality Domain Adaptation. The goal of the challenge is to segment two key brain structures involved in the follow-up and treatment planning of vestibular schwannoma (VS): the VS and the cochleas. Currently, the diagnosis and surveillance in patients with VS are commonly performed using contrast-enhanced T1 (ceT1) MR imaging. However, there is growing interest in using non-contrast imaging sequences such as high-resolution T2 (hrT2) imaging. For this reason, we established an unsupervised cross-modality segmentation benchmark. The training dataset provides annotated ceT1 scans (N=105) and unpaired non-annotated hrT2 scans (N=105). The aim was to automatically perform unilateral VS and bilateral cochlea segmentation on hrT2 scans as provided in the testing set (N=137). This problem is particularly challenging given the large intensity distribution gap across the modalities and the small volume of the structures. A total of 55 teams from 16 countries submitted predictions to the validation leaderboard. Among them, 16 teams from 9 different countries submitted their algorithm for the evaluation phase. The level of performance reached by the top-performing teams is strikingly high (best median Dice score - VS: 88.4%; Cochleas: 85.7%) and close to full supervision (median Dice score - VS: 92.5%; Cochleas: 87.7%). All top-performing methods made use of an image-to-image translation approach to transform the source-domain images into pseudo-target-domain images. A segmentation network was then trained using these generated images and the manual annotations provided for the source image.
Article
Although Convolutional Neural Networks (CNNs) have achieved promising performance in many medical image segmentation tasks, they rely on a large set of labeled images for training, which is expensive and time-consuming to acquire. Semi-supervised learning has shown the potential to alleviate this challenge by learning from a large set of unlabeled images and limited labeled samples. In this work, we present a simple yet efficient consistency regularization approach for semi-supervised medical image segmentation, called Uncertainty Rectified Pyramid Consistency (URPC). Inspired by the pyramid feature network, we chose a pyramid-prediction network that obtains a set of segmentation predictions at different scales. For semi-supervised learning, URPC learns from unlabeled data by minimizing the discrepancy between each of the pyramid predictions and their average. We further present multi-scale uncertainty rectification to boost the pyramid consistency regularization, where the rectification seeks to temper the consistency loss at outlier pixels that may have substantially different predictions from the average, potentially due to upsampling errors or a lack of enough labeled data. Experiments on two public datasets and an in-house clinical dataset showed that: 1) URPC can achieve large performance improvements by utilizing unlabeled data, and 2) compared with five existing semi-supervised methods, URPC achieved better or comparable results with a simpler pipeline. Furthermore, we built a semi-supervised medical image segmentation codebase to boost research on this topic: https://github.com/HiLab-git/SSL4MIS.
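A minimal sketch of the pyramid consistency idea follows: multi-scale predictions are upsampled to a common resolution and each is pulled towards their average, with voxels that deviate strongly from the consensus down-weighted as a simplified stand-in for the uncertainty rectification. The weighting function here is an assumption, not the paper's exact formulation.

```python
# Sketch of pyramid consistency on unlabelled data.
import torch
import torch.nn.functional as F

def pyramid_consistency(scale_logits, full_size):
    """scale_logits: list of (B, C, d, h, w) logits from different decoder levels."""
    probs = [F.interpolate(torch.softmax(p, dim=1), size=full_size,
                           mode="trilinear", align_corners=False)
             for p in scale_logits]
    avg = torch.stack(probs).mean(dim=0)
    loss = 0.0
    for p in probs:
        dev = (p - avg).abs().detach()
        weight = torch.exp(-5.0 * dev)          # trust voxels close to the consensus more
        loss = loss + (weight * (p - avg) ** 2).mean()
    return loss / len(probs)

outs = [torch.randn(1, 2, s, s, s) for s in (8, 16, 32)]  # multi-scale decoder outputs
print(pyramid_consistency(outs, full_size=(32, 32, 32)))
```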
Article
Purpose: To validate the accuracy and clinical value of a novel semi-supervised learning (SSL) framework for gross tumor volume (GTV) delineation in nasopharyngeal carcinoma (NPC). Material and methods: 258 patients with MRI datasets were divided into training (n = 180), validation (n = 20), and testing (n = 58) cohorts. Ground truth contours of the nasopharynx gross tumor volume (GTVnx) and node gross tumor volume (GTVnd) were manually delineated by two experienced radiation oncologists. 20% (n = 36) labeled and 80% (n = 144) unlabeled images were used to train the model, producing model-generated contours for patients from the testing cohort. Nine experienced experts were invited to revise the model-generated GTVs in 20 randomly selected patients from the testing cohort. Six junior oncologists were asked to delineate the GTV in 12 randomly selected patients from the testing cohort without and with the assistance of the model, and the revision degrees under these two modes were compared. The Dice similarity coefficient (DSC) was used to quantify the accuracy of the model. Results: The model-generated contours showed high accuracy when compared with the ground truth contours, with average DSC scores of 0.83 and 0.80 for GTVnx and GTVnd, respectively. There was no significant difference in DSC score between T1-2 and T3-4 patients (0.81 vs. 0.83, p = 0.223), or between N1-2 and N3 patients (0.80 vs. 0.79, p = 0.807). The mean revision degree was lower than 10% in 19 (95%) patients for GTVnx, and in 16 (80%) patients for GTVnd. With the assistance of the model, the mean revision degree for GTVnx and GTVnd by junior oncologists was reduced from 25.63% to 7.75% and from 21.38% to 14.44%, respectively. Meanwhile, delineation efficiency improved by over 60%. Conclusion: The proposed SSL-based model showed high accuracy in delineating the GTV of NPC. It was clinically applicable and could assist junior oncologists in improving GTV contouring accuracy and saving contouring time.
Article
Nasopharyngeal carcinoma (NPC) is a malignant tumor whose survivability is greatly improved if early diagnosis and timely treatment are provided. Accurate segmentation of both the primary NPC tumors and metastatic lymph nodes (MLNs) is crucial for patient staging and radiotherapy scheduling. However, existing studies mainly focus on the segmentation of primary tumors, neglecting the recognition of MLNs, and thus fail to provide a comprehensive landscape for tumor identification. There are three main challenges in segmenting primary NPC tumors and MLNs: variable location, variable size, and irregular boundary. To address these challenges, we propose an automatic segmentation network, named NPCNet, to achieve segmentation of primary NPC tumors and MLNs simultaneously. Specifically, we design three modules, including a position enhancement module (PEM), a scale enhancement module (SEM), and a boundary enhancement module (BEM), to address the above challenges. First, the PEM enhances the feature representations of the most suspicious regions. Subsequently, the SEM captures multiscale context information and target context information. Finally, the BEM rectifies the unreliable predictions in the segmentation mask. To validate the design, extensive experiments are conducted on our dataset of 9124 samples collected from 754 patients. Empirical results demonstrate that each module realizes its designed functionality and is complementary to the others. By incorporating the three proposed modules together, our model achieves state-of-the-art performance compared with nine popular models.
Article
Nasopharyngeal carcinoma (NPC) is a malignant tumor in the nasopharyngeal epithelium and is mainly treated by radiotherapy. Accurate delineation of the target tumor can greatly improve radiotherapy effectiveness. However, due to the small size of the NPC imaging volume, the scarcity of labeled samples, the low signal-to-noise ratio in small target areas and the lack of detailed features, automatic gross tumor volume (GTV) delineation remains a great challenge, even when drawing on advances in domain adaptation for high-resolution image processing. In addition, since computed tomography (CT) images offer low soft-tissue resolution, it is difficult to identify small-volume tumors, and segmentation accuracy for this kind of small GTV is very low. In this paper, we propose an automatic segmentation model based on an adversarial network and U-Net for NPC delineation. Specifically, we embed adversarial classification learning into a segmentation network to balance the distribution differences between the small target categories and the large target categories in the samples. To reduce the loss weight of large target categories with many samples, and simultaneously increase the weight of small target categories, we design a new U-Net based on focal loss as a GTV segmentation model, adjusting the effect of different categories on the final loss. This method can effectively solve the feature bias caused by the imbalance of the target volume distribution. Furthermore, we pre-process the images using an algorithm based on distribution histograms to ensure that the same or approximately the same CT value represents the same tissue. In order to evaluate our proposed method, we perform experiments on the open datasets from StructSeg2019 and datasets provided by Sichuan Provincial Cancer Hospital. Comparisons with several typical up-to-date methods demonstrate that our model can significantly enhance detection accuracy and sensitivity for NPC segmentation.
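A focal loss down-weights easy (typically large-class) voxels so that small, hard targets dominate the gradient. The sketch below shows a standard multi-class focal loss with common default gamma/alpha values, which may differ from the exact weighting the authors used.

```python
# Sketch of a multi-class focal loss for voxel-wise segmentation: well-classified voxels
# are down-weighted by (1 - p_t)^gamma so small, hard targets such as the GTV count more.
import torch
import torch.nn.functional as F

def focal_loss(logits, target, gamma: float = 2.0, alpha: float = 0.75):
    """logits: (B, C, D, H, W); target: (B, D, H, W) integer class labels."""
    log_prob = F.log_softmax(logits, dim=1)
    ce = F.nll_loss(log_prob, target, reduction="none")   # per-voxel cross-entropy
    p_t = torch.exp(-ce)                                   # probability of the true class
    return (alpha * (1.0 - p_t) ** gamma * ce).mean()

logits = torch.randn(1, 2, 8, 32, 32)
target = torch.randint(0, 2, (1, 8, 32, 32))
print(focal_loss(logits, target))
```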
Article
Radiotherapy is a treatment where radiation is used to eliminate cancer cells. The delineation of organs-at-risk (OARs) is a vital step in radiotherapy treatment planning to avoid damage to healthy organs. For nasopharyngeal cancer, more than 20 OARs need to be precisely segmented in advance. The challenge of this task lies in the complex anatomical structures, low-contrast organ contours, and the extreme size imbalance between large and small organs. Common segmentation methods that treat all organs equally generally lead to inaccurate small-organ labeling. We propose a novel two-stage deep neural network, FocusNetv2, to solve this challenging problem by automatically locating, ROI-pooling, and segmenting small organs with specifically designed small-organ localization and segmentation sub-networks, while maintaining the accuracy of large-organ segmentation. In addition to our original FocusNet, we employ a novel adversarial shape constraint on small organs to ensure consistency between the estimated small-organ shapes and prior knowledge of organ shapes. Our proposed framework is extensively tested on both a self-collected dataset of 1,164 CT scans and the MICCAI Head and Neck Auto Segmentation Challenge 2015 dataset, and shows superior performance compared with state-of-the-art head and neck OAR segmentation methods.
Chapter
In this paper, we propose an end-to-end deep neural network for solving the problem of imbalanced large- and small-organ segmentation in head and neck (HaN) CT images. To conduct radiotherapy planning for nasopharyngeal cancer, more than 10 organs-at-risk (normal organs) need to be precisely segmented in advance. However, the size ratio between large and small organs in the head can reach several hundred. Directly using such imbalanced organ annotations to train deep neural networks generally leads to inaccurate small-organ label maps. We propose a novel end-to-end deep neural network to solve this challenging problem by automatically locating, ROI-pooling, and segmenting small organs with specifically designed small-organ sub-networks, while maintaining the accuracy of large-organ segmentation. A strong main network with densely connected atrous spatial pyramid pooling and squeeze-and-excitation modules is used for segmenting large organs, whose label maps are directly output. For small organs, their probabilistic locations, instead of label maps, are estimated by the main network. High-resolution and multi-scale feature volumes for each small organ are ROI-pooled according to these locations and fed into the small-organ networks to accurately segment the small organs. Our proposed network is extensively tested on both collected real data and the MICCAI Head and Neck Auto Segmentation Challenge 2015 dataset, and shows superior performance compared with state-of-the-art segmentation methods.
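The locate-then-segment idea can be sketched as follows: a coarse probability map yields a centre estimate, a fixed-size high-resolution patch is cropped around it, and a dedicated sub-network would then segment only that patch. The patch size, the centre-of-mass localiser, and the organ example below are illustrative assumptions, not the paper's implementation.

```python
# Sketch of cropping a high-resolution patch around a small organ's estimated centre.
import torch

def crop_around_center(volume: torch.Tensor, prob_map: torch.Tensor, patch=(32, 64, 64)):
    """volume: (C, D, H, W) image; prob_map: (D, H, W) coarse probability of the small organ."""
    coords = torch.stack(torch.meshgrid(
        *[torch.arange(s, dtype=torch.float32) for s in prob_map.shape], indexing="ij"))
    w = prob_map / prob_map.sum().clamp(min=1e-6)
    center = [(coords[i] * w).sum().round().long().item() for i in range(3)]  # weighted centroid
    slices = []
    for c, p, s in zip(center, patch, prob_map.shape):
        start = int(min(max(c - p // 2, 0), max(s - p, 0)))
        slices.append(slice(start, start + p))
    return volume[:, slices[0], slices[1], slices[2]]

vol = torch.randn(1, 96, 192, 192)
coarse = torch.zeros(96, 192, 192); coarse[40:46, 80:90, 100:110] = 1.0  # e.g. a small organ
patch = crop_around_center(vol, coarse)
print(patch.shape)  # torch.Size([1, 32, 64, 64]) -> input to the small-organ sub-network
```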
Article
Background: Nasopharyngeal carcinoma (NPC) may be cured with radiation therapy. Tumor proximity to critical structures demands accuracy in tumor delineation to avoid toxicities from radiation therapy; however, tumor target contouring for head and neck radiation therapy is labor intensive and highly variable among radiation oncologists. Purpose: To construct and validate an artificial intelligence (AI) contouring tool to automate primary gross tumor volume (GTV) contouring in patients with NPC. Materials and Methods: In this retrospective study, MRI data sets covering the nasopharynx from 1021 patients (median age, 47 years; 751 male, 270 female) with NPC between September 2016 and September 2017 were collected and divided into training, validation, and testing cohorts of 715, 103, and 203 patients, respectively. GTV contours were delineated for 1021 patients and were defined by consensus of two experts. A three-dimensional convolutional neural network was applied to 818 training and validation MRI data sets to construct the AI tool, which was tested in 203 independent MRI data sets. Next, the AI tool was compared against eight qualified radiation oncologists in a multicenter evaluation by using a random sample of 20 test MRI examinations. The Wilcoxon matched-pairs signed rank test was used to compare the Dice similarity coefficient (DSC) before versus after AI assistance. Results: The AI-generated contours demonstrated a high level of accuracy when compared with ground truth contours at testing in 203 patients (DSC, 0.79; 2.0-mm difference in average surface distance). In the multicenter evaluation, AI assistance improved contouring accuracy (five of eight oncologists had a higher median DSC after AI assistance; average median DSC, 0.74 vs 0.78; P < .001), reduced intra- and interobserver variation (by 36.4% and 54.5%, respectively), and reduced contouring time (by 39.4%). Conclusion: The AI contouring tool improved primary gross tumor contouring accuracy for nasopharyngeal carcinoma, which could have a positive impact on tumor control and patient survival.
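The paired comparison reported above corresponds to a Wilcoxon matched-pairs signed rank test on per-case Dice scores contoured without versus with AI assistance. The sketch below uses placeholder numbers, not data from the study.

```python
# Sketch: paired comparison of per-case DSC without vs with AI assistance.
import numpy as np
from scipy.stats import wilcoxon

dsc_manual = np.array([0.71, 0.68, 0.75, 0.73, 0.70, 0.77, 0.72, 0.69, 0.74, 0.76])
dsc_ai_assisted = np.array([0.76, 0.74, 0.79, 0.77, 0.75, 0.80, 0.74, 0.73, 0.78, 0.79])

stat, p = wilcoxon(dsc_manual, dsc_ai_assisted)
print(f"median: {np.median(dsc_manual):.2f} vs {np.median(dsc_ai_assisted):.2f}, p = {p:.3f}")
```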
Article
Advances in technical radiotherapy have resulted in significant sparing of organs at risk (OARs), reducing radiation-related toxicities for patients with cancer of the head and neck (HNC). Accurate delineation of target volumes (TVs) and OARs is critical for maximising tumour control and minimising radiation toxicities. When performed manually, variability in TV and OAR delineation has been shown to have significant dosimetric impacts for patients on treatment. Auto-segmentation (AS) techniques have shown promise in reducing both inter-practitioner variability and the time taken in TV and OAR delineation in HNC. Ultimately, this may reduce treatment planning and clinical waiting times for patients. Adaptation of radiation treatment for biological or anatomical changes during therapy will also require rapid re-planning; indeed, the time taken for manual delineation currently prevents adaptive radiotherapy from being implemented optimally. We are therefore standing on the threshold of a transformation of routine radiotherapy planning via the use of artificial intelligence. In this article, we outline the current state-of-the-art for AS for HNC radiotherapy in order to predict how this will rapidly change with the introduction of artificial intelligence. We specifically focus on delineation accuracy and time saving. We argue that, if such technologies are implemented correctly, AS should result in better standardisation of treatment for patients and significantly reduce the time taken to plan radiotherapy.
Article
Automatic segmentation of abdominal anatomy on computed tomography (CT) images can support diagnosis, treatment planning and treatment delivery workflows. Segmentation methods using statistical models and multi-atlas label fusion (MALF) require inter-subject image registrations, which are challenging for abdominal images, but alternative methods without registration have not yet achieved higher accuracy for most abdominal organs. We present a registration-free deep-learning-based segmentation algorithm for eight organs that are relevant for navigation in endoscopic pancreatic and biliary procedures, including the pancreas, the GI tract (esophagus, stomach, duodenum) and surrounding organs (liver, spleen, left kidney, gallbladder). We directly compared the segmentation accuracy of the proposed method to existing deep learning and MALF methods in a cross-validation on a multi-centre data set with 90 subjects. The proposed method yielded significantly higher Dice scores for all organs and lower mean absolute distances for most organs, including Dice scores of 0.78 vs 0.71, 0.74 and 0.74 for the pancreas, 0.90 vs 0.85, 0.87 and 0.83 for the stomach, and 0.76 vs 0.68, 0.69 and 0.66 for the esophagus. We conclude that deep-learning-based segmentation represents a registration-free method for multi-organ abdominal CT segmentation whose accuracy can surpass current methods, potentially supporting image-guided navigation in gastrointestinal endoscopy procedures.
Article
Purpose: Target delineation in nasopharyngeal carcinoma (NPC) often proves challenging because of the notoriously narrow therapeutic margin. High doses are needed to achieve optimal levels of tumour control, and dosimetric inadequacy remains one of the most important independent factors affecting treatment outcome. Method: A review of the available literature addressing the natural behaviour of NPC and correlation between clinical and pathological aspects of the disease was conducted. Existing international guidelines as well as published protocols specified by clinical trials on contouring of clinical target volumes (CTV) were compared. This information was then summarized into a preliminary draft guideline which was then circulated to international experts in the field for exchange of opinions and subsequent voting on areas with the greatest controversies. Results: Common areas of uncertainty and variation in practices among experts experienced in radiation therapy for NPC were elucidated. Iterative revisions were made based on extensive discussion and final voting on controversial areas by the expert panel, to formulate the recommendations on contouring of CTV based on optimal geometric expansion and anatomical editing for those structures with substantial risk of microscopic infiltration. Conclusion: Through this comprehensive review of available evidence and best practices at major institutions, as well as interactive exchange of vast experience by international experts, this set of consensus guidelines has been developed to provide a practical reference for appropriate contouring to ensure optimal target coverage. However, the final decision on the treatment volumes should be based on full consideration of individual patients' factors and facilities of an individual centre (including the quality of imaging methods and the precision of treatment delivery).
Article
Purpose: To estimate the radiation dose as a result of contrast medium administration in a typical abdominal computed tomographic (CT) examination across a library of contrast material-enhanced computational patient models. Materials and Methods: In part II of this study, first, the technique described in part I was applied to enhance the extended cardiac-torso models with patient-specific iodine-time profiles reflecting the administration of contrast material. Second, the patient models were deployed to assess the patient-specific organ dose as a function of time in a typical abdominal CT examination using Monte Carlo simulation. In this hypothesis-generating study, organ dose refers to the total energy deposited in the unit mass of the tissue inclusive of iodine. Third, a study was performed as a strategy to anticipate the biologically relevant dose (absorbed dose to tissue) in highly perfused organs such as the liver and kidney. The time-varying organ-dose increment values relative to those for unenhanced CT examinations were reported. Results: The results from the patient models subjected to the injection protocol indicated up to a total 53%, 30%, 35%, 54%, 27%, 18%, 17%, and 24% increase in radiation dose delivered to the heart, spleen, liver, kidneys, stomach, colon, small intestine, and pancreas, respectively. The biologically relevant dose increase with respect to the dose at an unenhanced CT examination was in the range of 0%-18% for the liver and 27% for the kidney across 58 patient models. Conclusion: The administration of contrast medium increases the total radiation dose. However, radiation dose, while relevant to be included in estimating the risk associated with contrast-enhanced CT, may still not fully characterize the total biologic effects. Therefore, given the fact that many CT diagnostic decisions would be impossible without the use of iodine, this study suggests the need to consider the effect of iodinated contrast material on the organ doses to patients undergoing CT studies when designing CT protocols.
Article
The American Joint Committee on Cancer (AJCC) staging manual has become the benchmark for classifying patients with cancer, defining prognosis, and determining the best treatment approaches. Many view the primary role of the tumor, lymph node, metastasis (TNM) system as that of a standardized classification system for evaluating cancer at a population level in terms of the extent of disease, both at initial presentation and after surgical treatment, and the overall impact of improvements in cancer treatment. The rapid evolution of knowledge in cancer biology and the discovery and validation of biologic factors that predict cancer outcome and response to treatment with better accuracy have led some cancer experts to question the utility of a TNM-based approach in clinical care at an individualized patient level. In the Eighth Edition of the AJCC Cancer Staging Manual, the goal of including relevant, nonanatomic (including molecular) factors has been foremost, although changes are made only when there is strong evidence for inclusion. The editorial board viewed this iteration as a proactive effort to continue to build the important bridge from a "population-based" to a more "personalized" approach to patient classification, one that forms the conceptual framework and foundation of cancer staging in the era of precision molecular oncology. The AJCC promulgates best staging practices through each new edition in an effort to provide cancer care providers with a powerful, knowledge-based resource for the battle against cancer. In this commentary, the authors highlight the overall organizational and structural changes as well as "what's new" in the Eighth Edition. It is hoped that this information will provide the reader with a better understanding of the rationale behind the aggregate proposed changes and the exciting developments in the upcoming edition.
Conference Paper
It is widely agreed that successful training of deep networks requires many thousands of annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC), we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast: segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.
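A compact sketch of the architecture described above (contracting path, symmetric expanding path, skip connections) is given below. Channel widths and depth are reduced, and same-padding replaces the original valid convolutions, so this is an illustration of the idea rather than the published Caffe model.

```python
# Compact 2D U-Net sketch with two encoder levels and skip connections.
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    def __init__(self, in_ch=1, n_classes=2, base=16):
        super().__init__()
        self.enc1, self.enc2 = double_conv(in_ch, base), double_conv(base, base * 2)
        self.bottom = double_conv(base * 2, base * 4)
        self.pool = nn.MaxPool2d(2)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.dec2 = double_conv(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = double_conv(base * 2, base)
        self.head = nn.Conv2d(base, n_classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                                      # contracting path
        e2 = self.enc2(self.pool(e1))
        b = self.bottom(self.pool(e2))
        d2 = self.dec2(torch.cat([self.up2(b), e2], dim=1))    # expanding path + skip
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)

print(TinyUNet()(torch.randn(1, 1, 128, 128)).shape)  # (1, 2, 128, 128)
```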
Article
Epidemiological trends during the past decade suggest that although incidence of nasopharyngeal carcinoma is gradually declining, even in endemic regions, mortality from the disease has fallen substantially. This finding is probably a result of a combination of lifestyle modification, population screening coupled with better imaging, advances in radiotherapy, and effective systemic agents. In particular, intensity-modulated radiotherapy has driven the improvement in tumour control and reduction in toxic effects in survivors. Clinical use of Epstein-Barr virus (EBV) as a surrogate biomarker in nasopharyngeal carcinoma continues to increase, with quantitative assessment of circulating EBV DNA used for population screening, prognostication, and disease surveillance. Randomised trials are investigating the role of EBV DNA in stratification of patients for treatment intensification and deintensification. Among the exciting developments in nasopharyngeal carcinoma, vascular endothelial growth factor inhibition and novel immunotherapies targeted at immune checkpoint and EBV-specific tumour antigens offer promising alternatives to patients with metastatic disease.