Article · Literature Review

Deep Generative Adversarial Networks: Applications in Musculoskeletal Imaging


Abstract

In recent years, deep learning techniques have been applied in musculoskeletal radiology to increase the diagnostic potential of acquired images. Generative adversarial networks (GANs), deep neural networks that can generate or transform images, have the potential to aid faster imaging by generating images with a high level of realism across multiple contrasts and modalities from existing imaging protocols. This review introduces the key architectures of GANs as well as their technical background and challenges. Key research trends are highlighted, including: (a) reconstruction of high-resolution MRI; (b) image synthesis with different modalities and contrasts; (c) image enhancement that efficiently preserves high-frequency information suitable for human interpretation; (d) pixel-level segmentation with annotation sharing between domains; and (e) applications to different musculoskeletal anatomies. In addition, an overview is provided of the key issues in which clinical applicability is difficult to capture with conventional performance metrics and expert evaluation. Once clinically validated, GANs have the potential to improve musculoskeletal imaging.


... Conventional methods heavily rely on the expertise of radiologists [4], which can be subjective and time-consuming. However, recent advancements in artificial intelligence and deep learning techniques have paved the way for computer-assisted diagnosis systems that can aid radiologists in detecting and classifying fractures with improved accuracy [5]. ...
... However, analyzing medical images to identify bone fractures can be time-consuming and requires the expertise of qualified professionals. To address these challenges, scientists have been investigating ways to reduce diagnosis time and improve decision precision, aiming to assist doctors in their diagnostic processes [11], [12], [5], [13]. Several studies have demonstrated the potential of AI/DL in supporting medical professionals and decision-makers by developing automated tools that enhance the accuracy of physician interpretation [14], [5], [15], [16], [17], [18], [19], [20] and facilitate the creation of effective and cost-efficient treatment plans [15]. ...
... To address these challenges, scientists have been investigating ways to reduce diagnosis time and improve decision precision, aiming to assist doctors in their diagnostic processes [11], [12], [5], [13]. Several studies have demonstrated the potential of AI/DL in supporting medical professionals and decision-makers by developing automated tools that enhance the accuracy of physician interpretation [14], [5], [15], [16], [17], [18], [19], [20] and facilitate the creation of effective and cost-efficient treatment plans [15]. While many studies have focused on accurately detecting musculoskeletal abnormalities using CNN models, our research specifically explores the application of generative adversarial networks (GANs) [21] as a novel approach, as shown in I. ...
... The domain transfer task of GAN is the cross-modality image synthesis process by which images are generated for one modality based on another. The cross-domain method using the domain transfer technique has shown the possibility of obtaining additional clinical information without additional examinations [57]. Eight studies using GAN mainly focused on domain transfer for ophthalmology imaging domains (shown in Table 5). ...
... Second, spatial deformities frequently occur when there are small training sets without spatial alignment. In particular, in domain transfer using conditional GANs, obtaining paired images with structural and spatial alignment is critically challenging and requires additional image registration in a preprocessing step to obtain high-quality medical images [57,63]. Third, unintended changes could occur in image-to-image translation because of the different data distributions of the structural features between the two image domains. ...
... However, these techniques require spatial alignment between the two image domains to obtain high-quality results. Therefore, additional image registration is required before the GAN performs a domain transformation [57]. If the structures in the images are not aligned, the GAN may perform an image-to-image translation with deformed results in synthetic images. ...
Article
Full-text available
Background Recent advances in deep learning techniques have led to improved diagnostic abilities in ophthalmology. A generative adversarial network (GAN), which consists of two competing deep neural networks, a generator and a discriminator, has demonstrated remarkable performance in image synthesis and image-to-image translation. The adoption of GANs for medical imaging is increasing for image generation and translation, but they are not yet familiar to researchers in the field of ophthalmology. In this work, we present a literature review on the application of GANs in ophthalmology image domains to discuss important contributions and to identify potential future research directions. Methods We surveyed studies using GANs published before June 2021 and introduce various applications of GANs in ophthalmology image domains. The search identified 48 peer-reviewed papers for the final review. The type of GAN used in the analysis, the task, the imaging domain, and the outcome were collected to verify the usefulness of the GAN. Results In ophthalmology image domains, GANs can perform segmentation, data augmentation, denoising, domain transfer, super-resolution, post-intervention prediction, and feature extraction. GAN techniques have extended the datasets and modalities available in ophthalmology. GANs have several limitations, such as mode collapse, spatial deformities, unintended changes, and the generation of high-frequency noise and checkerboard-pattern artifacts. Conclusions The use of GANs has benefited various tasks in ophthalmology image domains. Based on our observations, the adoption of GANs in ophthalmology is still at a very early stage of clinical validation compared with deep learning classification techniques, because several problems need to be overcome for practical use. However, proper selection of the GAN technique and statistical modeling of ocular imaging will greatly improve the performance of each image analysis. Finally, this survey should enable researchers to select the appropriate GAN technique to maximize the potential of ophthalmology datasets for deep learning research.
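The two-network setup described in the abstract above can be illustrated with a deliberately tiny, self-contained sketch (a toy 1-D example written for this review, not code from any cited paper): the generator is a linear map G(z) = a·z + b, the discriminator a logistic classifier D(x) = sigmoid(w·x + c), and each takes alternating manual gradient steps on the standard non-saturating GAN objectives.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Real data ~ N(3, 1); generator G(z) = a*z + b maps latent z ~ N(0, 1).
a, b = 1.0, 0.0   # generator parameters
w, c = 0.1, 0.0   # discriminator D(x) = sigmoid(w*x + c)
lr = 0.05

for _ in range(200):
    z = rng.normal(size=64)
    real = rng.normal(3.0, 1.0, size=64)
    fake = a * z + b

    # Discriminator: gradient ascent on E[log D(real)] + E[log(1 - D(fake))]
    p_real, p_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * (np.mean((1.0 - p_real) * real) - np.mean(p_fake * fake))
    c += lr * (np.mean(1.0 - p_real) - np.mean(p_fake))

    # Generator: gradient ascent on the non-saturating objective E[log D(G(z))]
    p_fake = sigmoid(w * (a * z + b) + c)
    a += lr * np.mean((1.0 - p_fake) * w * z)
    b += lr * np.mean((1.0 - p_fake) * w)

print(f"generator offset b after training: {b:.2f}")  # drifts toward the real mean of 3
```

Because the discriminator rewards samples that look like the real distribution, the generator's offset b is pushed from 0 toward the real mean, which is exactly the "two competing networks" dynamic the abstract refers to.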
... The Fréchet Inception Distance (FID), based on feature representations extracted from a pre-trained Inception-V3 network, is commonly used as a quantitative measure for assessing the performance of image generation models. 15 The function for evaluating the image with FID is as follows: ...
... The operator ‖·‖² denotes the squared Euclidean distance, and the operator Tr(·) the trace of a matrix. 15 In our experiments, we initially utilized only FID to assess the quality of generated images by comparing them to real samples. However, we encountered challenges due to variations in FID values among different real samples, resulting from inherent biases in the dataset. ...
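The FID referred to in the snippet above has a closed form once Gaussians are fitted to the two feature sets: FID = ‖μ_r − μ_g‖² + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^(1/2)). A NumPy-only sketch of that formula (an illustration written for this review, not the cited authors' code; the matrix square root is computed via the symmetric eigendecomposition trick):

```python
import numpy as np

def trace_sqrt_product(c1, c2):
    """Tr((c1 c2)^(1/2)) via the symmetric form (c1^(1/2) c2 c1^(1/2))^(1/2)."""
    vals, vecs = np.linalg.eigh(c1)
    root = vecs @ np.diag(np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T
    inner_vals = np.linalg.eigvalsh(root @ c2 @ root)
    return float(np.sum(np.sqrt(np.clip(inner_vals, 0.0, None))))

def fid(feat_real, feat_gen):
    """Frechet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_r, mu_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    c_r = np.cov(feat_real, rowvar=False)
    c_g = np.cov(feat_gen, rowvar=False)
    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.trace(c_r) + np.trace(c_g)
                 - 2.0 * trace_sqrt_product(c_r, c_g))

rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = rng.normal(size=(500, 4)) + 2.0   # mean shifted by 2 in each of 4 dims

print(fid(real, real))      # identical sets -> (numerically) zero
print(fid(real, shifted))   # dominated by the squared mean shift, about 2^2 * 4 = 16
```

Identical feature sets give a distance of (numerically) zero, while a pure mean shift contributes its squared norm, matching the snippet's observation that FID varies with the real samples chosen as reference.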
Preprint
Deep learning-based super-resolution models have the potential to revolutionize biomedical imaging and diagnoses by effectively tackling various challenges associated with early detection, personalized medicine, and clinical automation. However, the requirement of an extensive collection of high-resolution images presents limitations for widespread adoption in clinical practice. In our experiment, we proposed an approach to effectively train the deep learning-based super-resolution models using only one real image by leveraging self-generated high-resolution images. We employed a mixed metric of image screening to automatically select images with a distribution similar to ground truth, creating an incrementally curated training data set that encourages the model to generate improved images over time. After five training iterations, the proposed deep learning-based super-resolution model experienced a 7.5% and 5.49% improvement in structural similarity and peak-signal-to-noise ratio, respectively. Significantly, the model consistently produces visually enhanced results for training, improving its performance while preserving the characteristics of original biomedical images. These findings indicate a potential way to train a deep neural network in a self-revolution manner independent of real-world human data.
... In GAN-based GDLMs, spatial deformity (Supplementary Figure S7J and S7K) occurs in small training sets or spatial misalignment of image pairs 18 . Achieving exact structural and spatial alignment in image-to-image transformation with conditional GANs is crucial, often requiring further registration during preprocessing to ensure high-quality images 37,38 . However, the newly remodeled and migrated neural structure around the fovea makes accurate structural alignment impossible following FTMH surgery. ...
Article
Full-text available
This study aims to propose a generative deep learning model (GDLM) based on a variational autoencoder that predicts macular optical coherence tomography (OCT) images following full-thickness macular hole (FTMH) surgery and evaluate its clinical accuracy. Preoperative and 6-month postoperative swept-source OCT data were collected from 150 patients with successfully closed FTMH using 6 × 6 mm² macular volume scan datasets. Randomly selected and augmented 120,000 training and 5000 validation pairs of OCT images were used to train the GDLM. We assessed the accuracy and F1 score of concordance for neurosensory retinal areas, performed Bland–Altman analysis of foveolar height (FH) and mean foveal thickness (MFT), and predicted postoperative external limiting membrane (ELM) and ellipsoid zone (EZ) restoration accuracy between artificial intelligence (AI)-OCT and ground truth (GT)-OCT images. Accuracy and F1 scores were 94.7% and 0.891, respectively. Average FH (228.2 vs. 233.4 μm, P = 0.587) and MFT (271.4 vs. 273.3 μm, P = 0.819) were similar between AI- and GT-OCT images, within 30.0% differences of 95% limits of agreement. ELM and EZ recovery prediction accuracy was 88.0% and 92.0%, respectively. The proposed GDLM accurately predicted macular OCT images following FTMH surgery, aiding patient and surgeon understanding of postoperative macular features.
... This can result in inaccurate or misleading outputs, which can be particularly problematic in medical imaging, where accuracy and reliability are critical (Shin et al., 2021). Table 1 provides a summary of the pros and cons of using generative AI. ...
Chapter
Medical imaging is a crucial aspect of modern healthcare, as it enables the diagnosis and treatment of various diseases and conditions. However, developing and deploying AI models for medical imaging is challenging, due to the limited availability and quality of data, as well as the high complexity and diversity of imaging modalities and tasks. Generative AI models, such as variational autoencoders (VAEs), generative adversarial networks (GANs), and text-to-image diffusion models, offer a promising solution to these challenges, as they can generate realistic and diverse images from existing data or latent representations. In this chapter, we provide a practical guide on how generative AI is transforming medical imaging, by reviewing the state-of-the-art methods and frameworks, presenting some successful case studies in different domains and modalities, and discussing the future directions and opportunities for research and development.
... Data augmentation generates several slightly modified copies of existing data, by means such as rotation, scaling, and cropping, to reduce overfitting when training models [39]. It is worth mentioning that generative adversarial networks can utilize different contrasts and modalities of existing imaging protocols to generate new synthetic images with high authenticity [40][41][42][43][44], demonstrating great potential for data augmentation. ...
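The classical augmentations named in the snippet above (rotation, cropping, rescaling) can be sketched in a few lines of NumPy (a hypothetical toy implementation for illustration only; real pipelines typically use dedicated augmentation libraries):

```python
import numpy as np

def augment(img, rng):
    """Return one randomly modified copy of a 2-D image array
    (toy versions of the rotation / cropping / rescaling named above)."""
    out = np.rot90(img, k=rng.integers(0, 4))   # random 90-degree rotation
    if rng.random() < 0.5:
        out = out[:, ::-1]                      # random horizontal flip
    # random crop to 3/4 of each side, then nearest-neighbour rescale back
    h, w = out.shape
    ch, cw = (3 * h) // 4, (3 * w) // 4
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    crop = out[y:y + ch, x:x + cw]
    rows = np.arange(h) * ch // h               # nearest-neighbour row indices
    cols = np.arange(w) * cw // w               # nearest-neighbour column indices
    return crop[np.ix_(rows, cols)]

rng = np.random.default_rng(7)
image = np.arange(64, dtype=float).reshape(8, 8)
copies = [augment(image, rng) for _ in range(4)]
print([c.shape for c in copies])  # every copy keeps the original 8x8 shape
```

Each call yields a slightly different copy of the same image at the original resolution, which is exactly the "several slightly modified copies" the snippet describes.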
Article
Full-text available
Kidney diseases result from various causes, which can generally be divided into neoplastic and non-neoplastic diseases. Deep learning based on medical imaging is an established methodology for further data mining and an evolving field of expertise, which provides the possibility for precise management of kidney diseases. Recently, imaging-based deep learning has been widely applied to many clinical scenarios of kidney diseases including organ segmentation, lesion detection, differential diagnosis, surgical planning, and prognosis prediction, which can provide support for disease diagnosis and management. In this review, we will introduce the basic methodology of imaging-based deep learning and its recent clinical applications in neoplastic and non-neoplastic kidney diseases. Additionally, we further discuss its current challenges and future prospects and conclude that achieving data balance, addressing heterogeneity, and managing data size remain challenges for imaging-based deep learning. Meanwhile, the interpretability of algorithms, ethical risks, and barriers of bias assessment are also issues that require consideration in future development. We hope to provide urologists, nephrologists, and radiologists with clear ideas about imaging-based deep learning and reveal its great potential in clinical practice. Critical relevance statement The wide clinical applications of imaging-based deep learning in kidney diseases can help doctors to diagnose, treat, and manage patients with neoplastic or non-neoplastic renal diseases. Key points • Imaging-based deep learning is widely applied to neoplastic and non-neoplastic renal diseases. • Imaging-based deep learning improves the accuracy of the delineation, diagnosis, and evaluation of kidney diseases. • The small dataset, various lesion sizes, and so on are still challenges for deep learning. Graphical Abstract
... 41 Image generation has numerous applications, including data augmentation, improved data privacy, and synthetic data for the study of rare diseases. [42][43][44] Step 2: Data handling. Data collection. ...
Article
Full-text available
The digitization of medical records and expanding electronic health records has created an era of “Big Data” with an abundance of available information ranging from clinical notes to imaging studies. In the field of rheumatology, medical imaging is used to guide both diagnosis and treatment of a wide variety of rheumatic conditions. Although there is an abundance of data to analyze, traditional methods of image analysis are human resource intensive. Fortunately, the growth of artificial intelligence (AI) may be a solution to handle large datasets. In particular, computer vision is a field within AI that analyzes images and extracts information. Computer vision has impressive capabilities and can be applied to rheumatologic conditions, necessitating a need to understand how computer vision works. In this article, we provide an overview of AI in rheumatology and conclude with a five step process to plan and conduct research in the field of computer vision. The five steps include (1) project definition, (2) data handling, (3) model development, (4) performance evaluation, and (5) deployment into clinical care.
... The domain transfer task of GANs is the cross-modality image synthesis process by which images of one modality are generated from another. The cross-domain method using the domain transfer technique has shown the possibility of obtaining additional clinical information without additional examinations [85]. Various studies have shown that GANs can be an excellent choice for overcoming data scarcity and the lack of large annotated datasets in ophthalmology [86]. ...
Conference Paper
Full-text available
The lack of high-quality annotated medical image datasets is a major problem hindering the growth of deep learning applications in medical image analysis. In biomedical image analysis, the applicability of deep learning methods is directly influenced by the amount of accessible image data, because deep learning models require large image datasets to perform well. Generative adversarial networks (GANs) are widely used to overcome these data limitations by generating artificial biomedical images. The generator builds artificial images from the feedback it receives; the discriminator is a model that classifies an image as artificial or real and provides that feedback to the generator. We first present a general study of GAN applications to medical image segmentation, covering the principal GAN-based variants, performance metrics, loss functions, datasets, augmentation methods, and source codes. Second, we offer a comprehensive overview of GAN applications to the segmentation of different human diseases. We conclude with a critical discussion, the limitations of GANs, and suggestions for future directions. We hope this study is helpful and raises awareness of GAN implementations for biomedical image segmentation tasks.
... Then, discriminator error is back-propagated to the generator to help synthesize images closer to the ground truth. 6 For medical imaging, this method is being actively studied for noise reduction and resolution enhancement in MRI and computed tomography. 7,8 However, few studies have explored ways of generating different tissue contrast images based on other pulse sequence images. ...
Article
Full-text available
Purpose: This study proposed a generative adversarial network (GAN) model for T2-weighted image (WI) synthesis from proton density (PD)-WI in a temporomandibular joint (TMJ) magnetic resonance imaging (MRI) protocol. Materials and methods: From January to November 2019, MRI scans for TMJ were reviewed and 308 imaging sets were collected. For training, 277 pairs of PD- and T2-WI sagittal TMJ images were used. Transfer learning of the pix2pix GAN model was utilized to generate T2-WI from PD-WI. Model performance was evaluated with the structural similarity index map (SSIM) and peak signal-to-noise ratio (PSNR) indices for 31 predicted T2-WI (pT2). The disc position was clinically diagnosed as anterior disc displacement with or without reduction, and joint effusion as present or absent. The true T2-WI-based diagnosis was regarded as the gold standard, to which pT2-based diagnoses were compared using Cohen's κ coefficient. Results: The mean SSIM and PSNR values were 0.4781 (±0.0522) and 21.30 (±1.51) dB, respectively. The pT2 protocol showed almost perfect agreement (κ = 0.81) with the gold standard for disc position. The number of discordant cases was higher for normal disc position (17%) than for anterior displacement with reduction (2%) or without reduction (10%). The effusion diagnosis also showed almost perfect agreement (κ = 0.88), with higher concordance for the presence (85%) than for the absence (77%) of effusion. Conclusion: The application of pT2 images in a TMJ MRI protocol may be useful for diagnosis, although the image quality of pT2 was not fully satisfactory. Further research is expected to enhance pT2 quality.
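The PSNR index used in the abstract above has a simple definition: PSNR = 10·log10(MAX²/MSE), with MAX the maximum pixel value. A small NumPy sketch (illustrative only; the image size and noise level are assumptions, not the study's data):

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-shape images."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no noise at all
    return 10.0 * np.log10(max_val ** 2 / mse)

rng = np.random.default_rng(1)
gt = rng.integers(0, 256, size=(64, 64)).astype(float)       # stand-in ground truth
noisy = np.clip(gt + rng.normal(0, 20, size=gt.shape), 0, 255)  # degraded version

print(round(psnr(gt, noisy), 1))  # around 22 dB for sigma = 20 noise
```

Higher PSNR means the synthesized image deviates less from the ground truth, which is why values near 21 dB, as reported above, go hand in hand with image quality that is "not fully satisfactory".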
... As the networks are competitors that continuously trigger mutual improvement, the optimization of both networks leads to generated realistic synthetic data. In our current framework, we apply the deep convolutional GAN [34] as it is a particularly preferred architecture for image synthesis [35]. (d) Image generation: The trained GAN is used to generate any number of plausible anatomical parameter images (semantic segmentation maps). ...
Article
Full-text available
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties with high spatial resolution. However, previous attempts to solve the optical inverse problem with supervised machine learning were hampered by the absence of labeled reference data. While this bottleneck has been tackled by simulating training data, the domain gap between real and simulated images remains an unsolved challenge. We propose a novel approach to PAT image synthesis that involves subdividing the challenge of generating plausible simulations into two disjoint problems: (1) Probabilistic generation of realistic tissue morphology, and (2) pixel-wise assignment of corresponding optical and acoustic properties. The former is achieved with Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data. According to a validation study on a downstream task our approach yields more realistic synthetic images than the traditional model-based approach and could therefore become a fundamental step for deep learning-based quantitative PAT (qPAT).
... Generative adversarial network (GAN)-based super-resolution (SRGAN) techniques have shown remarkable performance compared to conventional algorithms. 42 A three-dimensional SRGAN system able to create thin-interval CT images from thick-interval images was developed that was able to better clarify the boundaries between bone and surrounding tissue. 43 This system can be applied in segmenting high-density structures lying in close proximity to bone; better perceptual quality of bone boundaries on 18F-NaF-PET would simplify the delineation of the bones and high-density structures as separate objects. ...
Article
This review discusses the current state of artificial intelligence (AI) in 18F-NaF-PET/CT imaging and the potential applications to come in diagnosis, prognostication, and improvement of care in patients with bone diseases, with emphasis on the role of AI algorithms in CT bone segmentation, relying on their prevalence in medical imaging and utility in the extraction of spatial information in combined PET/CT studies.
... As the networks are competitors that continuously trigger mutual improvement, the optimization of both networks leads to generated realistic synthetic data. In our current framework, we apply the deep convolutional GAN [15] as it is a particularly preferred architecture for image synthesis [16]. ...
Preprint
Full-text available
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties such as blood oxygenation with high spatial resolution and in an interventional setting. However, decades of research invested in solving the inverse problem of recovering clinically relevant tissue properties from spectral measurements have failed to produce solutions that can quantify tissue parameters robustly in a clinical setting. Previous attempts to address the limitations of model-based approaches with machine learning were hampered by the absence of labeled reference data needed for supervised algorithm training. While this bottleneck has been tackled by simulating training data, the domain gap between real and simulated images remains a huge unsolved challenge. As a first step to address this bottleneck, we propose a novel approach to PAT data simulation, which we refer to as "learning to simulate". Our approach involves subdividing the challenge of generating plausible simulations into two disjoint problems: (1) Probabilistic generation of realistic tissue morphology, represented by semantic segmentation maps and (2) pixel-wise assignment of corresponding optical and acoustic properties. In the present work, we focus on the first challenge. Specifically, we leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries. According to an initial in silico feasibility study our approach is well-suited for contributing to realistic PAT image synthesis and could thus become a fundamental step for deep learning-based quantitative PAT.
Article
Full-text available
Cross-modality data translation has attracted great interest in medical image computing. Deep generative models show performance improvement in addressing related challenges. Nevertheless, as a fundamental challenge in image translation, the problem of zero-shot learning cross-modality image translation with fidelity remains unanswered. To bridge this gap, we propose a novel unsupervised zero-shot learning method called the Mutual Information guided Diffusion Model (MIDiffusion), which learns to translate an unseen source image to the target modality by leveraging the inherent statistical consistency of Mutual Information between different modalities. To overcome the prohibitively high-dimensional Mutual Information calculation, we propose a differentiable local-wise mutual information layer for conditioning the iterative denoising process. The local-wise mutual information layer captures identical cross-modality features in the statistical domain, offering diffusion guidance without relying on direct mappings between the source and target domains. This advantage allows our method to adapt to changing source domains without the need for retraining, making it highly practical when sufficient labeled source domain data is not available. We demonstrate the superior performance of MIDiffusion in zero-shot cross-modality translation tasks through empirical comparisons with other generative models, including adversarial-based and diffusion-based models. Finally, we showcase the real-world application of MIDiffusion in 3D zero-shot learning-based cross-modality image segmentation tasks.
Article
Image data has grown exponentially as systems have increased their ability to collect and store it. Unfortunately, there are limits to human resources both in time and knowledge to fully interpret and manage that data. Computer Vision (CV) has grown in popularity as a discipline for better understanding visual data. CV has become a powerful tool for imaging analytics in orthopaedic surgery, allowing computers to evaluate large volumes of image data with greater nuance than previously possible. Nevertheless, even with the growing number of uses in medicine, literature on the fundamentals of CV and its implementation is mainly oriented toward computer scientists rather than clinicians, rendering CV unapproachable for most orthopaedic surgeons as a tool for clinical practice and research. The purpose of this article is to summarize and review the fundamental concepts of CV application for the orthopaedic surgeon and musculoskeletal researcher.
Article
Full-text available
Resistance of high grade tumors to treatment involves cancer stem cell features, deregulated cell division, acceleration of genomic errors, and emergence of cellular variants that rely upon diverse signaling pathways. This heterogeneous tumor landscape limits the utility of the focal sampling provided by invasive biopsy when designing strategies for targeted therapies. In this roadmap review paper, we propose and develop methods for enabling mapping of cellular and molecular features in vivo to inform and optimize cancer treatment strategies in the brain. This approach leverages (1) the spatial and temporal advantages of in vivo imaging compared with surgical biopsy, (2) the rapid expansion of meaningful anatomical and functional magnetic resonance signals, (3) widespread access to cellular and molecular information enabled by next‐generation sequencing, and (4) the enhanced accuracy and computational efficiency of deep learning techniques. As multiple cellular variants may be present within volumes below the resolution of imaging, we describe a mapping process to decode micro‐ and even nano‐scale properties from the macro‐scale data by simultaneously utilizing complementary multiparametric image signals acquired in routine clinical practice. We outline design protocols for future research efforts that marry revolutionary bioinformation technologies, growing access to increased computational capability, and powerful statistical classification techniques to guide rational treatment selection.
Article
This review summarizes the existing techniques and methods used to generate synthetic contrasts from magnetic resonance imaging data focusing on musculoskeletal magnetic resonance imaging. To that end, the different approaches were categorized into 3 different methodological groups: mathematical image transformation, physics-based, and data-driven approaches. Each group is characterized, followed by examples and a brief overview of their clinical validation, if present. Finally, we will discuss the advantages, disadvantages, and caveats of synthetic contrasts, focusing on the preservation of image information, validation, and aspects of the clinical workflow.
Conference Paper
In our comprehensive experiments and evaluations, we show that it is possible to generate multiple contrasts (even all synthetically) and use synthetically generated images to train an image segmentation engine. We showed promising segmentation results tested on real multi-contrast MRI scans when delineating muscle, fat, bone, and bone marrow, all trained on synthetic images. Based on synthetic image training, our segmentation results were as high as 93.91%, 94.11%, 91.63%, and 95.33% for muscle, fat, bone, and bone marrow delineation, respectively. Results were not significantly different from those obtained when real images were used for segmentation training: 94.68%, 94.67%, 95.91%, and 96.82%, respectively. Clinical relevance: Synthetically generated images could potentially be used in large-scale training of deep networks for segmentation purposes. The small-dataset problem of many clinical imaging applications can potentially be addressed with the proposed algorithm.
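The abstract above does not name the overlap metric behind its percentages; the Dice similarity coefficient is the usual choice for such segmentation scores, and can be sketched as follows (a hypothetical illustration written for this review, not the authors' code):

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks (1 = structure)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    inter = np.logical_and(a, b).sum()
    total = a.sum() + b.sum()
    return 1.0 if total == 0 else 2.0 * inter / total

pred = np.zeros((8, 8), dtype=int)
pred[2:6, 2:6] = 1    # predicted structure: 16 pixels
truth = np.zeros((8, 8), dtype=int)
truth[3:7, 3:7] = 1   # ground truth: 16 pixels, shifted by one

print(dice(pred, truth))  # overlap of 9 pixels -> 2*9/(16+16) = 0.5625
```

A Dice score of 1.0 means perfect overlap, so values in the low-to-mid 90% range, like those reported above, indicate near-complete agreement with the reference delineation.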
Article
In the context of deep learning, this paper proposes a GAN model based on an arbitration mechanism (Arbi-DCGAN). First, the network structure of the proposed improved algorithm is composed of a generator, a discriminator, and an arbitrator. The generator and the discriminator conduct adversarial training according to the training plan, strengthening their abilities to generate images and to distinguish authentic from generated images based on the characteristics learned from the dataset. Second, the arbitrator is composed of the previously trained generator and discriminator together with a metric score computation module; it feeds the measured results of the generator-discriminator adversarial training back into the training plan. Finally, a winning limit is added to the network structure to improve the stability of model training, and the Circle loss function replaces the BCE loss function, making model optimization more flexible and its convergence state clearer. Building on a geographic information system, this paper uses 325 meticulously annotated sample plans to establish a dataset for deep learning and trains the Arbi-DCGAN model to extract land plots of different land types from the plans and to render color texture maps from plane color-block maps, completing the reconstruction task of the garden landscape. In addition, we further evaluate the model's reconstructions of the garden landscape in terms of image quality, correct standardization, and color expression. The trained model has the potential to be applied to land-type analysis and plane rendering in landscape architecture cases, helping designers improve the efficiency of analysis and drawing.
Article
The future of musculoskeletal (MSK) radiology is being built on research developments in the field. Over the past decade, MSK imaging research has been dominated by advancements in molecular imaging biomarkers, artificial intelligence, radiomics, and novel high-resolution equipment. Adequate preparation of trainees and specialists will ensure that current and future leaders will be prepared to embrace and critically appraise technological developments, will be up to date on clinical developments, such as the use of artificial tissues, will define research directions, and will actively participate and lead multidisciplinary research. This review presents an overview of the current MSK research landscape and proposes tangible future goals and strategic directions that will fortify the future of MSK radiology.
Article
Full-text available
Since their introduction in 2014, Generative Adversarial Networks (GANs) have been employed successfully in many areas such as image processing, computer vision, medical imaging, and video, as well as other disciplines. A large number of review papers have been published, focusing on certain application areas and proposed methods. In this paper, we collected the most recent review papers, organized the collected information according to application field, and present the application areas, the GAN architectures that have been applied in each case, and a summary of the open issues in each area.
Article
Recent developments in machine learning (ML) methods demonstrate unparalleled potential for application in the spine. The ability for ML to provide diagnostic faculty, produce novel insights from existing capabilities, and augment or accelerate elements of surgical planning and decision making at levels equivalent or superior to humans will tremendously benefit spine surgeons and patients alike. In this review, we aim to provide a clinically relevant outline of ML-based technology in the contexts of spinal deformity, degeneration, and trauma, as well as an overview of commercial-level and precommercial-level surgical assist systems and decisional support tools. Furthermore, we briefly discuss potential applications of generative networks before highlighting some of the limitations of ML applications. We conclude that ML in spine imaging represents a significant addition to the neurosurgeon's armamentarium-it has the capacity to directly address and manifest clinical needs and improve diagnostic and procedural quality and safety-but is yet subject to challenges that must be addressed before widespread implementation.
Article
Full-text available
Background and Objective Deep learning approaches are common in image processing, but often rely on supervised learning, which requires a large volume of training images, usually accompanied by hand-crafted labels. As labelled data are often not available, it would be desirable to develop methods that allow such data to be compiled automatically. In this study, we used a Generative Adversarial Network (GAN) to generate realistic B-mode musculoskeletal ultrasound images, and tested the suitability of two automated labelling approaches. Methods We used a model including two GANs, each trained to transfer an image from one domain to another. The two inputs were a set of 100 longitudinal images of the gastrocnemius medialis muscle, and a set of 100 synthetic segmented masks that featured two aponeuroses and a random number of ‘fascicles’. The model output a set of synthetic ultrasound images and an automated segmentation of each real input image. This automated segmentation process was one of the two approaches we assessed. The second approach involved synthesising ultrasound images and then feeding these images into an ImageJ/Fiji-based automated algorithm, to determine whether it could detect the aponeuroses and muscle fascicles. Results Histogram distributions were similar between real and synthetic images, but synthetic images displayed less variation between samples and a narrower range. Mean entropy values were statistically similar (real: 6.97, synthetic: 7.03; p = 0.218), but the range was much narrower for synthetic images (6.91–7.11 versus 6.30–7.62). When comparing GAN-derived and manually labelled segmentations, intersection-over-union values (denoting the degree of overlap between aponeurosis labels) varied between 0.0280 and 0.612 (mean ± SD: 0.312 ± 0.159), and pennation angles were higher for the GAN-derived segmentations (25.1° vs. 19.3°; p < 0.001).
For the second segmentation approach, the algorithm generally performed equally well on synthetic and real images, yielding pennation angles within the physiological range (13.8°–20°). Conclusions We used a GAN to generate realistic B-mode ultrasound images, and extracted muscle architectural parameters from these images automatically. This approach could enable the generation of large labelled datasets for image segmentation tasks, and may also be useful for data sharing. Automatic generation and labelling of ultrasound images minimises user input and overcomes several limitations associated with manual analysis.
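The intersection-over-union metric used above to compare GAN-derived and manual segmentations can be sketched in a few lines of plain Python; here masks are represented as sets of pixel coordinates, a simplification of the dense pixel-array masks used in the study:

```python
def iou(mask_a, mask_b):
    """Intersection-over-union of two binary masks, each given as a set of
    (row, col) pixel coordinates; 1.0 means identical masks, 0.0 no overlap."""
    a, b = set(mask_a), set(mask_b)
    union = a | b
    if not union:          # two empty masks overlap trivially
        return 1.0
    return len(a & b) / len(union)
```

An IoU near the study's mean of 0.312 indicates that roughly a third of the combined labelled area is shared between the two segmentations.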
Article
Full-text available
Deep neural networks (DNNs) are efficient solvers for ill-posed problems and have been shown to outperform classical optimization techniques in several computational imaging problems. In supervised mode, DNNs are trained by minimizing a measure of the difference between their actual output and their desired output; the choice of measure, referred to as “loss function,” severely impacts performance and generalization ability. In a recent paper [A. Goy et al., Phys. Rev. Lett. 121(24), 243902 (2018)], we showed that DNNs trained with the negative Pearson correlation coefficient (NPCC) as the loss function are particularly fit for photon-starved phase-retrieval problems, though the reconstructions are manifestly deficient at high spatial frequencies. In this paper, we show that reconstructions by DNNs trained with default feature loss (defined at VGG layer ReLU-22) contain more fine details; however, grid-like artifacts appear and are enhanced as photon counts become very low. Two additional key findings related to these artifacts are presented here. First, the frequency signature of the artifacts depends on the VGG’s inner layer that perceptual loss is defined upon, halving with each MaxPooling2D layer deeper in the VGG. Second, VGG ReLU-12 outperforms all other layers as the defining layer for the perceptual loss.
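The layer dependence of the perceptual loss described above can be illustrated with a minimal numpy sketch. The feature arrays here merely stand in for activations of a chosen VGG layer (the VGG network itself is not reproduced), and the halving of spatial resolution per MaxPooling2D stage, which drives the frequency signature of the artifacts, reduces to simple integer arithmetic:

```python
import numpy as np

def perceptual_loss(feat_real, feat_recon):
    """Mean squared error between feature maps of a fixed network layer.

    feat_real / feat_recon stand in for activations of a chosen VGG layer
    (e.g. ReLU-22); extracting them from an actual VGG is not shown here.
    """
    feat_real = np.asarray(feat_real, dtype=float)
    feat_recon = np.asarray(feat_recon, dtype=float)
    return float(np.mean((feat_real - feat_recon) ** 2))

def pooled_resolution(input_size, n_maxpools):
    """Spatial grid size seen by a layer after n MaxPooling2D stages:
    each pooling halves the grid, so the spatial frequency at which the
    loss 'sees' the image halves with each deeper pooling layer."""
    return input_size // (2 ** n_maxpools)
```

For a 256-pixel input, a loss defined two poolings deep compares 64-pixel feature grids, consistent with the halving frequency signature reported above.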
Article
Full-text available
Background A multitask deep learning model might be useful in large epidemiologic studies wherein detailed structural assessment of osteoarthritis still relies on expert radiologists' readings. The potential of such a model in clinical routine should be investigated. Purpose To develop a multitask deep learning model for grading radiographic hip osteoarthritis features on radiographs and compare its performance to that of attending-level radiologists. Materials and Methods This retrospective study analyzed hip joints seen on weight-bearing anterior-posterior pelvic radiographs from participants in the Osteoarthritis Initiative (OAI). Participants were recruited from February 2004 to May 2006 for baseline measurements, and follow-up was performed 48 months later. Femoral osteophytes (FOs), acetabular osteophytes (AOs), and joint-space narrowing (JSN) were graded as absent, mild, moderate, or severe according to the Osteoarthritis Research Society International atlas. Subchondral sclerosis and subchondral cysts were graded as present or absent. The participants were split at 80% (n = 3494), 10% (n = 437), and 10% (n = 437) by using split-sample validation into training, validation, and testing sets, respectively. The multitask neural network was based on DenseNet-161, a shared convolutional features extractor trained with multitask loss function. Model performance was evaluated in the internal test set from the OAI and in an external test set by using temporal and geographic validation consisting of routine clinical radiographs. Results A total of 4368 participants (mean age, 61.0 years ± 9.2 [standard deviation]; 2538 women) were evaluated (15 364 hip joints on 7738 weight-bearing anterior-posterior pelvic radiographs). 
The accuracy of the model for assessing these five features was 86.7% (1333 of 1538) for FOs, 69.9% (1075 of 1538) for AOs, 81.7% (1257 of 1538) for JSN, 95.8% (1473 of 1538) for subchondral sclerosis, and 97.6% (1501 of 1538) for subchondral cysts in the internal test set, and 82.7% (86 of 104) for FOs, 65.4% (68 of 104) for AOs, 80.8% (84 of 104) for JSN, 88.5% (92 of 104) for subchondral sclerosis, and 91.3% (95 of 104) for subchondral cysts in the external test set. Conclusion A multitask deep learning model is a feasible approach to reliably assess radiographic features of hip osteoarthritis. © RSNA, 2020 Online supplemental material is available for this article.
Article
Full-text available
Deep learning with convolutional neural networks (CNN) is a rapidly advancing subset of artificial intelligence that is ideally suited to solving image-based problems. There are an increasing number of musculoskeletal applications of deep learning, which can be conceptually divided into the categories of lesion detection, classification, segmentation, and non-interpretive tasks. Numerous examples of deep learning achieving expert-level performance in specific tasks in all four categories have been demonstrated in the past few years, although comprehensive interpretation of imaging examinations has not yet been achieved. It is important for the practicing musculoskeletal radiologist to understand the current scope of deep learning as it relates to musculoskeletal radiology. Interest in deep learning from researchers, radiology leadership, and industry continues to increase, and it is likely that these developments will impact the daily practice of musculoskeletal radiology in the near future.
Article
Full-text available
Purpose Automated synthetic computed tomography (sCT) generation based on magnetic resonance imaging (MRI) images would allow for MRI-only based treatment planning in radiation therapy, eliminating the need for CT simulation and simplifying the patient treatment workflow. In this work, the authors propose a novel method for generation of sCT based on dense cycle-consistent generative adversarial networks (cycle GAN), a deep-learning based model that trains two transformation mappings (MRI to CT and CT to MRI) simultaneously. Methods and materials The cycle GAN-based model was developed to generate sCT images in a patch-based framework. Cycle GAN was applied to this problem because it includes an inverse transformation from CT to MRI, which helps constrain the model to learn a one-to-one mapping. Dense block-based networks were used to construct the generators of the cycle GAN. The network weights and variables were optimized via a gradient difference (GD) loss and a novel distance loss metric between sCT and original CT. Results Leave-one-out cross-validation was performed to validate the proposed model. The mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and normalized cross correlation (NCC) indexes were used to quantify the differences between the sCT and original planning CT images. For the proposed method, the mean MAE between sCT and CT was 55.7 Hounsfield units (HU) for 24 brain cancer patients and 50.8 HU for 20 prostate cancer patients. The mean PSNR and NCC were 26.6 dB and 0.963 in the brain cases, and 24.5 dB and 0.929 in the pelvis. Conclusion We developed and validated a novel learning-based approach to generate CT images from routine MRIs based on a dense cycle GAN model to effectively capture the relationship between the CT and MRIs. The proposed method can generate robust, high-quality sCT in minutes, and offers strong potential for supporting near real-time MRI-only treatment planning in the brain and pelvis.
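The MAE (in HU) and PSNR (in dB) figures reported above can be reproduced for any image pair with a short numpy sketch; the `data_range` of 2000 HU used for the PSNR peak is an illustrative assumption, not a value taken from the paper:

```python
import numpy as np

def mae_hu(ct, sct):
    """Mean absolute error in Hounsfield units between reference CT and synthetic CT."""
    return float(np.mean(np.abs(np.asarray(ct) - np.asarray(sct))))

def psnr_db(ct, sct, data_range=2000.0):
    """Peak signal-to-noise ratio in dB; data_range is the assumed HU span."""
    mse = np.mean((np.asarray(ct) - np.asarray(sct)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

Lower MAE and higher PSNR both indicate closer agreement between the synthetic and planning CT.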
Article
Full-text available
Objective: To identify the feasibility of using a deep convolutional neural network (DCNN) for the detection and localization of hip fractures on plain frontal pelvic radiographs (PXRs). Hip fracture is a leading worldwide health problem for the elderly. A missed diagnosis of hip fracture on radiography leads to a dismal prognosis. The application of a DCNN to PXRs can potentially improve the accuracy and efficiency of hip fracture diagnosis. Methods: A DCNN was pretrained using 25,505 limb radiographs between January 2012 and December 2017. It was retrained using 3605 PXRs between August 2008 and December 2016. The accuracy, sensitivity, false-negative rate, and area under the receiver operating characteristic curve (AUC) were evaluated on 100 independent PXRs acquired during 2017. The authors also used the visualization algorithm gradient-weighted class activation mapping (Grad-CAM) to confirm the validity of the model. Results: The algorithm achieved an accuracy of 91%, a sensitivity of 98%, a false-negative rate of 2%, and an AUC of 0.98 for identifying hip fractures. The visualization algorithm showed an accuracy of 95.9% for lesion identification. Conclusions: A DCNN not only detected hip fractures on PXRs with a low false-negative rate but also had high accuracy for localizing fracture lesions. The DCNN might be an efficient and economical model to help clinicians make a diagnosis without interrupting the current clinical pathway. Key points: • Automated detection of hip fractures on frontal pelvic radiographs may facilitate emergent screening and evaluation efforts for primary physicians. • Good visualization of the fracture site by Grad-CAM enables the rapid integration of this tool into the current medical system. • The feasibility and efficiency of utilizing a deep neural network have been confirmed for the screening of hip fractures.
Article
Full-text available
Obtaining expert labels in clinical imaging is difficult since exhaustive annotation is time-consuming. Furthermore, not all possibly relevant markers may be known and sufficiently well described a priori to even guide annotation. While supervised learning yields good results if expert labeled training data is available, the visual variability, and thus the vocabulary of findings, we can detect and exploit, is limited to the annotated lesions. Here, we present fast AnoGAN (f-AnoGAN), a generative adversarial network (GAN) based unsupervised learning approach capable of identifying anomalous images and image segments that can serve as imaging biomarker candidates. We build a generative model of healthy training data, and propose and evaluate a fast mapping technique of new data to the GAN's latent space. The mapping is based on a trained encoder, and anomalies are detected via a combined anomaly score based on the building blocks of the trained model, comprising a discriminator feature residual error and an image reconstruction error. In the experiments on optical coherence tomography data, we compare the proposed method with alternative approaches, and provide comprehensive empirical evidence that f-AnoGAN outperforms alternative approaches and yields high anomaly detection accuracy. In addition, a visual Turing test with two retina experts showed that the generated images are indistinguishable from real normal retinal OCT images. The f-AnoGAN code is available at https://github.com/tSchlegl/f-AnoGAN.
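The combined anomaly score described above (image reconstruction error plus a discriminator feature residual) can be sketched as follows. The encoder, generator, and discriminator themselves are not reproduced; the weighting `kappa` is an assumed parameter for illustration, not a value taken from the paper:

```python
import numpy as np

def anomaly_score(x, x_recon, feat, feat_recon, kappa=1.0):
    """Combined f-AnoGAN-style anomaly score.

    x_recon plays the role of G(E(x)), the image reconstructed from the
    encoder's latent code; feat / feat_recon play the role of discriminator
    features of x and of G(E(x)). kappa balances the two residual terms.
    """
    img_err = float(np.mean((np.asarray(x) - np.asarray(x_recon)) ** 2))
    feat_err = float(np.mean((np.asarray(feat) - np.asarray(feat_recon)) ** 2))
    return img_err + kappa * feat_err
```

A healthy image that the generative model reconstructs well scores near zero; anomalous inputs inflate both residual terms.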
Article
Full-text available
Purpose In multiphase coronary CT angiography (CTA), a series of CT images are taken at different levels of radiation dose during the examination. Although this reduces the total radiation dose, the image quality during the low-dose phases is significantly degraded. Recently, deep neural network approaches based on supervised learning techniques have demonstrated impressive performance improvements over conventional model-based iterative methods for low-dose CT. However, matched low- and routine-dose CT image pairs are difficult to obtain in multiphase CT. To address this problem, we aim at developing a new deep learning framework. Method We propose an unsupervised learning technique that can remove the noise of the CT images in the low-dose phases by learning from the CT images in the routine-dose phases. Although a supervised learning approach is not applicable due to the differences in the underlying heart structure in the two phases, the images are closely related between phases, so we propose a cycle-consistent adversarial denoising network to learn the mapping between the low- and high-dose cardiac phases. Results Experimental results showed that the proposed method effectively reduces the noise in the low-dose CT images while preserving detailed texture and edge information. Moreover, thanks to the cyclic consistency and identity loss, the proposed network does not create any artificial features that are not present in the input images. Visual grading and quality evaluation also confirm that the proposed method provides significant improvement in diagnostic quality. Conclusions The proposed network can learn the image distributions from the routine-dose cardiac phases, which is a big advantage over existing supervised learning networks that need exactly matched low- and routine-dose CT images. Considering the effectiveness and practicability of the proposed method, we believe that it can be applied to many other CT acquisition protocols.
Article
Full-text available
Purpose To describe and evaluate a segmentation method using a joint adversarial and segmentation convolutional neural network to achieve accurate segmentation using unannotated MR image datasets. Theory and Methods A segmentation pipeline was built using a joint adversarial and segmentation network. A convolutional neural network technique called cycle-consistent generative adversarial network (CycleGAN) was applied as the core of the method to perform unpaired image-to-image translation between different MR image datasets. A joint segmentation network was incorporated into the adversarial network to obtain additional functionality for semantic segmentation. The fully automated segmentation method, termed SUSAN, was tested for segmenting bone and cartilage on 2 clinical knee MR image datasets using images and annotated segmentation masks from an online publicly available knee MR image dataset. The segmentation results were compared using quantitative segmentation metrics with the results from a supervised U-Net segmentation method and 2 registration methods. The Wilcoxon signed-rank test was used to evaluate the difference in quantitative metrics between methods. Results The proposed method SUSAN provided high segmentation accuracy, with results comparable to the supervised U-Net segmentation method (most quantitative metrics having P > 0.05) and significantly better than a multiatlas registration method (all quantitative metrics having P < 0.001) and a direct registration method (all quantitative metrics having P < 0.0001) for the clinical knee image datasets. SUSAN also demonstrated applicability for segmenting knee MR images with different tissue contrasts. Conclusion SUSAN performed rapid and accurate tissue segmentation for multiple MR image datasets without the need for sequence-specific segmentation annotation.
The joint adversarial and segmentation network and training strategy have promising potential applications in medical image segmentation.
Article
Full-text available
Background Deep learning is a ground-breaking technology that is revolutionising many research and industrial fields. Generative models are recently gaining interest. Here, we investigate their potential, namely conditional generative adversarial networks, in the field of magnetic resonance imaging (MRI) of the spine, by performing clinically relevant benchmark cases. Methods First, the enhancement of the resolution of T2-weighted (T2W) images (super-resolution) was tested. Then, automated image-to-image translation was tested in the following tasks: (1) from T1-weighted to T2W images of the lumbar spine and (2) vice versa; (3) from T2W to short time inversion-recovery (STIR) images; (4) from T2W to turbo inversion recovery magnitude (TIRM) images; (5) from sagittal standing x-ray projections to T2W images. Clinical and quantitative assessments of the outputs by means of image quality metrics were performed. The training of the models was performed on MRI and x-ray images from 989 patients. Results The performance of the models was generally positive and promising, but with several limitations. The number of disc protrusions or herniations showed good concordance (κ = 0.691) between native and super-resolution images. Moderate-to-excellent concordance was found when translating T2W to STIR and TIRM images (κ ≥ 0.842 regarding disc degeneration), while the agreement was poor when translating x-ray to T2W images. Conclusions Conditional generative adversarial networks are able to generate perceptually convincing synthetic images of the spine in super-resolution and image-to-image translation tasks. Taking into account the limitations of the study, deep learning-based generative methods showed the potential to be an upcoming innovation in musculoskeletal radiology.
Chapter
Full-text available
Automated lesion segmentation from computed tomography (CT) is an important and challenging task in medical image analysis. While many advancements have been made, there is room for continued improvements. One hurdle is that CT images can exhibit high noise and low contrast, particularly in lower dosages. To address this, we focus on a preprocessing method for CT images that uses stacked generative adversarial networks (SGAN) approach. The first GAN reduces the noise in the CT image and the second GAN generates a higher resolution image with enhanced boundaries and high contrast. To make up for the absence of high quality CT images, we detail how to synthesize a large number of low- and high-quality natural images and use transfer learning with progressively larger amounts of CT images. We apply both the classic GrabCut method and the modern holistically nested network (HNN) to lesion segmentation, testing whether SGAN can yield improved lesion segmentation. Experimental results on the DeepLesion dataset demonstrate that the SGAN enhancements alone can push GrabCut performance over HNN trained on original images. We also demonstrate that HNN + SGAN performs best compared against four other enhancement methods, including when using only a single GAN.
Article
Full-text available
Spinal clinicians still rely on laborious workloads to conduct comprehensive assessments of multiple spinal structures in MRIs, in order to detect abnormalities and discover possible pathological factors. The objective of this work is to perform automated segmentation and classification (i.e., normal and abnormal) of intervertebral discs, vertebrae, and neural foramen in MRIs in one shot, a semantic segmentation task that is extremely urgent to assist spinal clinicians in diagnosing neural foraminal stenosis, disc degeneration, and vertebral deformity, as well as discovering possible pathological factors. However, no work has simultaneously achieved the semantic segmentation of intervertebral discs, vertebrae, and neural foramen, due to three unusual challenges: 1) Multiple tasks, i.e., simultaneous semantic segmentation of multiple spinal structures, are more difficult than individual tasks; 2) Multiple targets: on average 21 spinal structures per MRI require automated analysis yet have high variety and variability; 3) Weak spatial correlations and subtle differences between normal and abnormal structures generate dynamic complexity and indeterminacy. In this paper, we propose a Recurrent Generative Adversarial Network called Spine-GAN for resolving the aforementioned challenges. Firstly, Spine-GAN explicitly handles the high variety and variability of complex spinal structures through an atrous convolution (i.e., convolution with holes) autoencoder module that is capable of obtaining semantic task-aware representation and preserving fine-grained structural information. Secondly, Spine-GAN dynamically models the spatial pathological correlations between both normal and abnormal structures thanks to a specially designed long short-term memory module. Thirdly, Spine-GAN obtains reliable performance and efficient generalization by leveraging a discriminative network that is capable of correcting predicted errors and enforcing global-level contiguity.
Extensive experiments on MRIs of 253 patients have demonstrated that Spine-GAN achieves high pixel accuracy of 96.2%, Dice coefficient of 87.1%, Sensitivity of 89.1% and Specificity of 86.0%, which reveals its effectiveness and potential as a clinical tool.
Article
Full-text available
To enable magnetic resonance (MR)-only radiotherapy and facilitate modelling of radiation attenuation in humans, synthetic CT (sCT) images need to be generated. Considering the application of MR-guided radiotherapy and online adaptive replanning, sCT generation should occur within minutes. This work aims at assessing whether an existing deep learning network can rapidly generate sCT images to be used for accurate MR-based dose calculations in the entire pelvis. A study was conducted on data of 91 patients with prostate (59), rectal (18) and cervical (14) cancer who underwent external beam radiotherapy, acquiring both CT and MRI for patient simulation. Dixon reconstructed water, fat and in-phase images obtained from a conventional dual gradient-recalled echo sequence were used to generate sCT images. A conditional generative adversarial network (cGAN) was trained in a paired fashion on 2D transverse slices of 32 prostate cancer patients. The trained network was tested on the remaining patients to generate sCT images. For 30 patients in the test set, dose recalculations of the clinical plan were performed on sCT images. Dose distributions were evaluated by comparing voxel-based dose differences, gamma analysis, and dose-volume histogram (DVH) analysis. The sCT generation required 5.6 s and 21 s for a single patient volume on a GPU and CPU, respectively. On average, sCT images resulted in a higher dose to the target of at most 0.3%. The average gamma pass rates using the 3%/3 mm and 2%/2 mm criteria were above 97% and 91%, respectively, for all volumes of interest considered. All DVH points calculated on sCT differed by less than ±2.5% from the corresponding points on CT. Results suggest that accurate MR-based dose calculation using sCT images generated with a cGAN trained on prostate cancer patients is feasible for the entire pelvis. The sCT generation was sufficiently fast to be integrated into an MR-guided radiotherapy workflow.
Article
Full-text available
Computed tomography (CT) is a popular medical imaging modality that enjoys wide clinical application. At the same time, the x-ray radiation dose associated with CT scanning raises public concern due to its potential risks to patients. Over the past years, major efforts have been dedicated to the development of low-dose CT (LDCT) methods. However, the radiation dose reduction compromises the signal-to-noise ratio (SNR), leading to strong noise and artifacts that degrade CT image quality. In this paper, we propose a novel 3D noise reduction method, called Structurally-sensitive Multi-scale Generative Adversarial Net (SMGAN), to improve LDCT image quality. Specifically, we incorporate three-dimensional (3D) volumetric information to improve the image quality. Also, different loss functions for training denoising models are investigated. Experiments show that the proposed method can effectively preserve structural and textural information in reference to normal-dose CT (NDCT) images, and significantly suppress noise and artifacts. Qualitative visual assessments by three experienced radiologists demonstrate that the proposed method retrieves more information and outperforms competing methods.
Article
Full-text available
In silico trials recently emerged as a disruptive technology, which may reduce the costs related to the development and marketing approval of novel medical technologies, as well as shortening their time-to-market. In these trials, virtual patients are recruited from a large database and their response to the therapy, such as the implantation of a medical device, is simulated by means of numerical models. In this work, we propose the use of generative adversarial networks to produce synthetic radiological images to be used in in silico trials. The generative models produced credible synthetic sagittal X-rays of the lumbar spine based on a simple sketch, and were able to generate sagittal radiological images of the trunk using coronal projections as inputs, and vice versa. Although numerous inaccuracies in the anatomical details may still allow distinguishing synthetic and real images in the majority of cases, the present work showed that generative models are a feasible solution for creating synthetic imaging data to be used in in silico trials of novel medical devices.
Article
Full-text available
Robust localisation and identification of vertebrae is an essential part of automated spine analysis. The contribution of this work to the task is two-fold: (1) Inspired by the human expert, we hypothesise that a sagittal and a coronal reformation of the spine contain sufficient information for labelling the vertebrae. We therefore propose a butterfly-shaped network architecture (termed Btrfly Net) that efficiently combines the information across the reformations. (2) Underpinning the Btrfly Net, we present an energy-based adversarial training regime that encodes the local spine structure as an anatomical prior into the network, thereby enabling it to achieve state-of-the-art performance in all standard metrics on a benchmark dataset of 302 scans without any post-processing during inference.
Article
Full-text available
Purpose: To describe and evaluate a new segmentation method using deep convolutional neural network (CNN), three-dimensional (3D) fully-connected conditional random field (CRF), and 3D simplex deformable modeling to improve the efficiency and accuracy of knee joint tissue segmentation. Methods: A segmentation pipeline was built by combining a semantic segmentation CNN, 3D fully-connected CRF, and 3D simplex deformable modeling. A convolutional encoder-decoder network was designed as the core of the segmentation method to perform high resolution pixel-wise multi-class tissue classification for 12 different joint structures. The 3D fully-connected CRF was applied to regularize contextual relationship among voxels within the same tissue class and between different classes. The 3D simplex deformable modeling refined the output from 3D CRF to preserve the overall shape and maintain a desirable smooth surface for joint structures. The method was evaluated on 3D fast spin-echo (3D-FSE) magnetic resonance (MR) image datasets. Quantitative morphological metrics were used to evaluate the accuracy and robustness of the method in comparison to the ground truth data. Results: The proposed segmentation method provided good performance for segmenting all knee joint structures. There were four tissue types with high mean Dice coefficient above 0.9 including the femur, tibia, muscle, and other non-specified tissues. There were seven tissue types with mean Dice coefficient between 0.8 and 0.9 including the femoral cartilage, tibial cartilage, patella, patellar cartilage, meniscus, quadriceps and patellar tendon, and infrapatellar fat pad. There was one tissue type with mean Dice coefficient between 0.7 and 0.8 for joint effusion and Baker’s cyst. Most musculoskeletal tissues had a mean value of Average Symmetric Surface Distance below 1mm. 
Conclusion: The combined CNN, 3D fully-connected CRF, and 3D deformable modeling approach was well suited for performing rapid and accurate comprehensive tissue segmentation of the knee joint. The deep learning-based segmentation method has promising potential applications in musculoskeletal imaging.
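The Dice coefficient used throughout the evaluation above can be sketched as follows, with masks represented as sets of voxel coordinates for simplicity (the study itself operates on dense segmentation arrays):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) for two binary
    masks given as sets of voxel coordinates; 1.0 means perfect overlap."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:    # two empty masks count as perfect agreement
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))
```

A Dice value above 0.9, as reported for the femur, tibia, and muscle, means the predicted and ground-truth masks share well over 90% of their combined extent.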
Article
Full-text available
Deep learning methods, and in particular convolutional neural networks (CNNs), have led to an enormous breakthrough in a wide range of computer vision tasks, primarily by using large-scale annotated datasets. However, obtaining such datasets in the medical domain remains a challenge. In this paper, we present methods for generating synthetic medical images using recently presented deep learning Generative Adversarial Networks (GANs). Furthermore, we show that generated medical images can be used for synthetic data augmentation, and improve the performance of CNN for medical image classification. Our novel method is demonstrated on a limited dataset of computed tomography (CT) images of 182 liver lesions (53 cysts, 64 metastases and 65 hemangiomas). We first exploit GAN architectures for synthesizing high quality liver lesion ROIs. Then we present a novel scheme for liver lesion classification using CNN. Finally, we train the CNN using classic data augmentation and our synthetic data augmentation and compare performance. In addition, we explore the quality of our synthesized examples using visualization and expert assessment. The classification performance using only classic data augmentation yielded 78.6% sensitivity and 88.4% specificity. By adding the synthetic data augmentation the results increased to 85.7% sensitivity and 92.4% specificity. We believe that this approach to synthetic data augmentation can generalize to other medical classification applications and thus support radiologists' efforts to improve diagnosis.
Article
Full-text available
Objective To evaluate the feasibility of synthetic magnetic resonance imaging (MRI) compared to conventional MRI for the diagnosis of internal derangements of the knee at 3T. Materials and Methods Following Institutional Review Board approval, image sets of conventional and synthetic MRI in 39 patients were included. Two musculoskeletal radiologists compared the image sets and qualitatively analyzed the images. Subjective image quality was assessed using a four-grade scale. Interobserver agreement and intersequence agreement between conventional and synthetic images for cartilage lesions, tears of the cruciate ligament, and tears of the meniscus were independently assessed using Kappa statistics. In patients who underwent arthroscopy (n = 8), the sensitivity, specificity, and accuracy for evaluated internal structures were calculated using arthroscopic findings as the gold standard. Results There was no statistically significant difference in image quality (p = 0.90). Interobserver agreement (κ = 0.649–0.981) and intersequence agreement (κ = 0.794–0.938) were nearly perfect for all evaluated structures. The sensitivity, specificity, and accuracy for detecting cartilage lesions (sensitivity, 63.6% vs. 54.6–63.6%; specificity, 91.9% vs. 91.9%; accuracy, 83.3–85.4% vs. 83.3–85.4%) and tears of the cruciate ligament (sensitivity, specificity, accuracy, 100% vs. 100%) and meniscus (sensitivity, 50.0–62.5% vs. 62.5%; specificity, 100% vs. 87.5–100%; accuracy, 83.3–85.4% vs. 83.3–85.4%) were similar between the two MRI methods. Conclusion Conventional and synthetic MRI showed substantial to almost perfect degrees of agreement for the assessment of internal derangement of knee joints. Synthetic MRI may be feasible in the diagnosis of internal derangements of the knee.
Article
Full-text available
Compressed Sensing Magnetic Resonance Imaging (CS-MRI) enables fast acquisition, which is highly desirable for numerous clinical applications. This can not only reduce the scanning cost and ease patient burden, but also potentially reduce motion artefacts and the effect of contrast washout, thus yielding better image quality. Different from parallel imaging based fast MRI, which utilises multiple coils to simultaneously receive MR signals, CS-MRI breaks the Nyquist-Shannon sampling barrier to reconstruct MRI images with much less required raw data. This paper provides a deep learning based strategy for reconstruction of CS-MRI, and bridges a substantial gap between conventional non-learning methods working only on data from a single image, and prior knowledge from large training datasets. In particular, a novel conditional Generative Adversarial Networks-based model (DAGAN) is proposed to reconstruct CS-MRI. In our DAGAN architecture, we have designed a refinement learning method to stabilise our U-Net based generator, which provides an end-to-end network to reduce aliasing artefacts. To better preserve texture and edges in the reconstruction, we have coupled the adversarial loss with an innovative content loss. In addition, we incorporate frequency domain information to enforce similarity in both the image and frequency domains. We have performed comprehensive comparison studies with both conventional CS-MRI reconstruction methods and newly investigated deep learning approaches. Compared to these methods, our DAGAN method provides superior reconstruction with preserved perceptual image details. Furthermore, each image is reconstructed in about 5 ms, which is suitable for real-time processing.
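The DAGAN content loss combines similarity in the image domain with similarity in the frequency domain. A toy 1-D sketch of that idea (with a naive DFT standing in for an FFT, illustrative weights, and the adversarial term omitted; this is not the authors' implementation) could look like:

```python
import cmath

def dft(signal):
    """Naive 1-D discrete Fourier transform (stand-in for an FFT)."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]

def mse(a, b):
    """Mean squared error; works for real or complex sequences."""
    return sum(abs(x - y) ** 2 for x, y in zip(a, b)) / len(a)

def combined_loss(recon, target, w_img=1.0, w_freq=0.1):
    """Image-domain MSE plus a frequency-domain MSE term, DAGAN-style."""
    return w_img * mse(recon, target) + w_freq * mse(dft(recon), dft(target))

target = [0.0, 1.0, 0.0, -1.0]
recon  = [0.1, 0.9, 0.0, -1.0]
print(combined_loss(recon, target) > combined_loss(target, target))  # True
```

Penalizing disagreement in both domains is what lets the reconstruction suppress aliasing (a frequency-domain phenomenon) while still matching pixel intensities.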
Article
Full-text available
Knee osteoarthritis (OA) is the most common musculoskeletal disorder. OA diagnosis is currently conducted by assessing symptoms and evaluating plain radiographs, but this process suffers from subjectivity. In this study, we present a new transparent computer-aided diagnosis method based on the Deep Siamese Convolutional Neural Network to automatically score knee OA severity according to the Kellgren-Lawrence grading scale. We trained our method using data solely from the Multicenter Osteoarthritis Study and validated it on 3,000 randomly selected subjects (5,960 knees) from the Osteoarthritis Initiative dataset. Our method yielded a quadratic Kappa coefficient of 0.83 and an average multiclass accuracy of 66.71% compared to the annotations given by a committee of clinical experts. Here, we also report a radiological OA diagnosis area under the ROC curve of 0.93. We also present attention maps, given as a class probability distribution, highlighting the radiological features affecting the network decision. This information makes the decision process transparent for the practitioner, which builds better trust toward automatic methods. We believe that our model is useful for clinical decision making and for OA research; therefore, we openly release our training codes and the data set created in this study.
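The quadratic Kappa coefficient used above penalizes ordinal disagreements in proportion to their squared distance, which suits graded scales like Kellgren-Lawrence (0-4). A compact sketch of the computation (illustrative ratings, not study data):

```python
def quadratic_kappa(ratings_a, ratings_b, n_grades=5):
    """Quadratic-weighted Cohen's kappa for ordinal grades 0..n_grades-1."""
    n = len(ratings_a)
    # Observed joint distribution of grade pairs
    obs = [[0.0] * n_grades for _ in range(n_grades)]
    for a, b in zip(ratings_a, ratings_b):
        obs[a][b] += 1.0 / n
    # Marginal distributions (chance agreement)
    pa = [ratings_a.count(g) / n for g in range(n_grades)]
    pb = [ratings_b.count(g) / n for g in range(n_grades)]
    num = den = 0.0
    for i in range(n_grades):
        for j in range(n_grades):
            w = (i - j) ** 2 / (n_grades - 1) ** 2  # quadratic disagreement weight
            num += w * obs[i][j]
            den += w * pa[i] * pb[j]
    return 1.0 - num / den

a = [0, 1, 2, 3, 4, 2, 1, 0]
b = [0, 1, 2, 3, 4, 2, 1, 0]
print(quadratic_kappa(a, b))  # perfect agreement -> 1.0
```

A value of 0.83, as reported, indicates agreement with the expert committee well beyond chance, with most errors confined to adjacent grades.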
Article
A publicly available dataset containing k-space data as well as Digital Imaging and Communications in Medicine image data of knee images for accelerated MR image reconstruction using machine learning is presented.
Conference Paper
Unsupervised domain mapping aims to learn a function G_XY to translate domain X to Y in the absence of paired examples. Finding the optimal G_XY without paired data is an ill-posed problem, so appropriate constraints are required to obtain reasonable solutions. While some prominent constraints such as cycle consistency and distance preservation successfully constrain the solution space, they overlook a special property of images: simple geometric transformations do not change an image's semantic structure. Based on this special property, we develop a geometry-consistent generative adversarial network (GcGAN), which enables one-sided unsupervised domain mapping. GcGAN takes the original image and its counterpart image transformed by a predefined geometric transformation as inputs and generates two images in the new domain coupled with the corresponding geometry-consistency constraint. The geometry-consistency constraint reduces the space of possible solutions while keeping the correct solutions in the search space. Quantitative and qualitative comparisons with the baseline (GAN alone) and state-of-the-art methods including CycleGAN [66] and DistanceGAN [5] demonstrate the effectiveness of our method.
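The geometry-consistency idea is that translation and a fixed geometric transform (e.g. a 90° rotation) should commute. A toy sketch, with a trivial pixel-wise "translator" standing in for a learned generator (both functions are invented for illustration):

```python
def rot90(img):
    """Rotate a 2-D image (list of rows) by 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*img)][::-1]

def translate(img, delta=1):
    """Toy domain 'translator': a pixel-wise intensity shift."""
    return [[p + delta for p in row] for row in img]

img = [[1, 2], [3, 4]]
# Geometry-consistency constraint: translating then rotating should match
# rotating then translating. A GcGAN-style loss penalizes the difference.
print(rot90(translate(img)) == translate(rot90(img)))  # True
```

For a learned generator this equality does not hold automatically; penalizing the mismatch is what constrains the one-sided mapping without needing a second, inverse generator.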
Chapter
Many CT slice images are stored with large slice intervals to reduce storage size in clinical practice. This leads to low resolution perpendicular to the slice images (i.e., along the z-axis), which is insufficient for 3D visualization or image analysis. In this paper, we present a novel architecture based on conditional Generative Adversarial Networks (cGANs) with the goal of generating high resolution images of main body parts including the head, chest, abdomen and legs. However, GANs are known to have difficulty generating a diversity of patterns due to a phenomenon known as mode collapse. To overcome the lack of generated pattern variety, we propose to condition the discriminator on the different body parts. Furthermore, our generator networks are extended to be three-dimensional fully convolutional neural networks, allowing for the generation of high resolution images from arbitrary fields of view. In our verification tests, we show that the proposed method obtains the best scores by PSNR/SSIM metrics and a Visual Turing Test, allowing for accurate reproduction of the principal anatomy in high resolution. We expect that the proposed method will contribute to effective utilization of the vast amounts of thick-slice CT images already stored in hospitals.
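PSNR, one of the two metrics used above, is a simple function of the mean squared error between a reference and a test image. A minimal sketch (toy pixel values, not study data):

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images (flat pixel lists)."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

ref  = [10.0, 200.0, 30.0, 40.0]
test = [12.0, 198.0, 30.0, 41.0]
print(round(psnr(ref, test), 2))  # about 44.6 dB for this toy pair
```

Because PSNR only measures pixel-wise error, GAN papers such as this one typically pair it with SSIM and a Visual Turing Test to capture perceptual realism.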
Article
Purpose The potential of medical image analysis with neural networks is limited by the restricted availability of extensive data sets. The incorporation of synthetic training data is one approach to bypass this shortcoming, as synthetic data offer accurate annotations and unlimited data size. Methods We evaluated eleven CycleGAN models for the synthesis of computed tomography (CT) images based on XCAT body phantoms. The image quality was assessed in terms of anatomical accuracy and realistic noise properties. We performed two studies exploring various network and training configurations as well as a task-based adaption of the corresponding loss function. Results The CycleGAN using the ResNet architecture and three XCAT input slices achieved the best overall performance in the configuration study. In the task-based study, the anatomical accuracy of the generated synthetic CTs remained high (SSIM = 0.64 and FSIM = 0.76). At the same time, the generated noise texture was close to real data, with a noise power spectrum correlation coefficient of NCC = 0.92. Simultaneously, we observed an improvement in annotation accuracy of 65% when using the dedicated loss function. The feasibility of combined training on both real and synthetic data was demonstrated in a blood vessel segmentation task (Dice similarity coefficient DSC = 0.83 ± 0.05). Conclusion CT synthesis using CycleGAN is a feasible approach to generate realistic images from simulated XCAT phantoms. Synthetic CTs generated with a task-based loss function can be used in addition to real data to improve the performance of segmentation networks.
Article
Background: Super-resolution is an emerging method for enhancing MRI resolution; however, its impact on image quality is still unknown. Purpose: To evaluate MRI super-resolution using quantitative and qualitative metrics of cartilage morphometry, osteophyte detection, and global image blurring. Study type: Retrospective. Population: In all, 176 MRI studies of subjects at varying stages of osteoarthritis. Field strength/sequence: Original-resolution 3D double-echo steady-state (DESS) and DESS with 3× thicker slices retrospectively enhanced using super-resolution and tricubic interpolation (TCI) at 3T. Assessment: A quantitative comparison of femoral cartilage morphometry was performed for the original-resolution DESS, the super-resolution, and the TCI scans in 17 subjects. A reader study by three musculoskeletal radiologists assessed cartilage image quality, overall image sharpness, and osteophytes incidence in all three sets of scans. A referenceless blurring metric evaluated blurring in all three image dimensions for the three sets of scans. Statistical tests: Mann-Whitney U-tests compared Dice coefficients (DC) of segmentation accuracy for the DESS, super-resolution, and TCI images, along with the image quality readings and blurring metrics. Sensitivity, specificity, and diagnostic odds ratio (DOR) with 95% confidence intervals compared osteophyte detection for the super-resolution and TCI images, with the original-resolution as a reference. Results: DC for the original-resolution (90.2 ± 1.7%) and super-resolution (89.6 ± 2.0%) were significantly higher (P < 0.001) than TCI (86.3 ± 5.6%). Segmentation overlap of super-resolution with the original-resolution (DC = 97.6 ± 0.7%) was significantly higher (P < 0.0001) than TCI overlap (DC = 95.0 ± 1.1%). Cartilage image quality for sharpness and contrast levels, and the through-plane quantitative blur factor for super-resolution images, was significantly (P < 0.001) better than TCI. 
Super-resolution osteophyte detection sensitivity of 80% (76-82%), specificity of 93% (92-94%), and DOR of 32 (22-46) was significantly higher (P < 0.001) than TCI sensitivity of 73% (69-76%), specificity of 90% (89-91%), and DOR of 17 (13-22). Data conclusion: Super-resolution appears to consistently outperform naïve interpolation and may improve image quality without biasing quantitative biomarkers. Level of evidence: 2 Technical Efficacy: Stage 2 J. Magn. Reson. Imaging 2019.
Article
In this paper, we present a semi-supervised deep learning approach to accurately recover high-resolution (HR) CT images from low-resolution (LR) counterparts. Specifically, with the generative adversarial network (GAN) as the building block, we enforce the cycle-consistency in terms of the Wasserstein distance to establish a nonlinear end-to-end mapping from noisy LR input images to denoised and deblurred HR outputs. We also include joint constraints in the loss function to facilitate structural preservation. In this process, we incorporate deep convolutional neural network (CNN), residual learning, and network-in-network techniques for feature extraction and restoration. In contrast to the current trend of increasing network depth and complexity to boost imaging performance, we apply a parallel 1×1 CNN to compress the output of the hidden layer and optimize the number of layers and the number of filters for each convolutional layer. Quantitative and qualitative evaluative results demonstrate that our proposed model is accurate, efficient and robust for super-resolution (SR) image restoration from noisy LR input images. In particular, we validate our composite SR networks on three large-scale CT datasets, and obtain promising results as compared to other state-of-the-art methods.
Article
Deep learning for MRI detection of sports injuries poses unique challenges. To address these difficulties, this study examines the feasibility and incremental benefit of several customized network architectures in evaluation of complete anterior cruciate ligament (ACL) tears. Two hundred sixty patients, ages 18–40, were identified in a retrospective review of knee MRIs obtained from September 2013 to March 2016. Half of the cases demonstrated a complete ACL tear (624 slices), the other half a normal ACL (3520 slices). Two hundred cases were used for training and validation, and the remaining 60 cases as an independent test set. For each exam with an ACL tear, coronal proton density non-fat suppressed sequence was manually annotated to delineate: (1) a bounding-box around the cruciate ligaments; (2) slices containing the tear. Multiple convolutional neural network (CNN) architectures were implemented including variations in input field-of-view and dimensionality. For single-slice CNN architectures, validation accuracy of a dynamic patch-based sampling algorithm (0.765) outperformed both cropped slice (0.720) and full slice (0.680) strategies. Using the dynamic patch-based sampling algorithm as a baseline, a five-slice CNN input (0.915) outperformed both three-slice (0.865) and single-slice (0.765) inputs. The final highest performing five-slice dynamic patch-based sampling algorithm resulted in independent test set AUC, sensitivity, specificity, PPV, and NPV of 0.971, 0.967, 1.00, 0.938, and 1.00. A customized 3D deep learning architecture based on dynamic patch-based sampling demonstrates high performance in detection of complete ACL tears with over 96% test set accuracy. A cropped field-of-view and 3D inputs are critical for high algorithm performance.
Article
Background Fat‐fraction has been established as a relevant marker for the assessment and diagnosis of neuromuscular diseases. For computing this metric, segmentation of muscle tissue in MR images is a first crucial step. Purpose To tackle the high degree of variability in combination with the high annotation effort for training supervised segmentation models (such as fully convolutional neural networks). Study Type Prospective. Subjects In all, 41 subjects: 20 patients showing fatty infiltration and 21 healthy subjects. Field Strength/Sequence: The T1‐weighted MR‐pulse sequences were acquired on a 1.5T scanner. Assessment To increase performance with limited training data, we propose a domain‐specific technique for simulating fatty infiltrations (i.e., texture augmentation) in nonaffected subjects' MR images in combination with shape augmentation. For simulating the fatty infiltrations, we make use of an architecture comprising several competing networks (generative adversarial networks) that facilitate a realistic artificial conversion between healthy and infiltrated MR images. Finally, we assess the segmentation accuracy (Dice similarity coefficient). Statistical Tests A Wilcoxon signed rank test was performed to assess whether differences in segmentation accuracy are significant. Results The mean Dice similarity coefficients significantly increase from 0.84 to 0.88 (P < 0.01) using data augmentation if training is performed with mixed data, and from 0.59 to 0.87 (P < 0.001) if training is conducted with healthy subjects only. Data Conclusion Domain‐specific data adaptation is highly suitable for facilitating neural network‐based segmentation of thighs with feasible manual effort for creating training data. The results even suggest an approach completely bypassing manual annotations. Level of Evidence: 4 Technical Efficacy: Stage 3
Article
The use of artificial intelligence, and the deep-learning subtype in particular, has been enabled by the use of labeled big data, along with markedly enhanced computing power and cloud storage, across all sectors. In medicine, this is beginning to have an impact at three levels: for clinicians, predominantly via rapid, accurate image interpretation; for health systems, by improving workflow and the potential for reducing medical errors; and for patients, by enabling them to process their own data to promote health. The current limitations, including bias, privacy and security, and lack of transparency, along with the future directions of these applications will be discussed in this article. Over time, marked improvements in accuracy, productivity, and workflow will likely be actualized, but whether that will be used to improve the patient–doctor relationship or facilitate its erosion remains to be seen.
Chapter
Robust localisation and identification of vertebrae is essential for automated spine analysis. The contribution of this work to the task is two-fold: (1) Inspired by the human expert, we hypothesise that a sagittal and coronal reformation of the spine contain sufficient information for labelling the vertebrae. Thereby, we propose a butterfly-shaped network architecture (termed Btrfly Net) that efficiently combines the information across reformations. (2) Underpinning the Btrfly net, we present an energy-based adversarial training regime that encodes local spine structure as an anatomical prior into the network, thereby enabling it to achieve state-of-the-art performance in all standard metrics on a benchmark dataset of 302 scans without any post-processing during inference.
Chapter
This paper discusses how distribution matching losses, such as those used in CycleGAN, can lead to misdiagnosis of medical conditions when used to synthesize medical images. It seems appealing to use these new image synthesis methods for translating images from a source to a target domain because they can produce high quality images, and some do not even require paired data. However, these image translation models work by matching the translation output to the distribution of the target domain. This can cause an issue when the data provided in the target domain has an over- or under-representation of some classes (e.g. healthy or sick). When the output of an algorithm is a transformed image, there are uncertainties as to whether all known and unknown class labels have been preserved or changed. Therefore, we recommend that these translated images should not be used for direct interpretation (e.g. by doctors), because they may lead to misdiagnosis of patients based on image features hallucinated by an algorithm that matches a distribution. However, many recent papers appear to pursue exactly this goal.
Chapter
CT is commonly used in orthopedic procedures. MRI is used along with CT to identify muscle structures and diagnose osteonecrosis due to its superior soft tissue contrast. However, MRI has poor contrast for bone structures. Clearly, it would be helpful if a corresponding CT were available, as bone boundaries are more clearly seen and CT has a standardized (i.e., Hounsfield) unit. Therefore, we aim at MR-to-CT synthesis. While the CycleGAN was successfully applied to unpaired CT and MR images of the head, these images do not have as much variation of intensity pairs as do images in the pelvic region due to the presence of joints and muscles. In this paper, we extended the CycleGAN approach by adding the gradient consistency loss to improve the accuracy at the boundaries. We conducted two experiments. To evaluate image synthesis, we investigated dependency of image synthesis accuracy on (1) the number of training data and (2) incorporation of the gradient consistency loss. To demonstrate the applicability of our method, we also investigated segmentation accuracy on synthesized images.
Article
Undersampled magnetic resonance image (MRI) reconstruction is typically an ill-posed linear inverse task. The time- and resource-intensive computations require tradeoffs between accuracy and speed. In addition, state-of-the-art compressed sensing (CS) analytics are not cognizant of the image diagnostic quality. To address these challenges, we propose a novel CS framework that uses generative adversarial networks (GAN) to model the (low-dimensional) manifold of high-quality MR images. Leveraging a mixture of least-squares (LS) GANs and a pixel-wise ℓ1/ℓ2 cost, a deep residual network with skip connections is trained as the generator that learns to remove the aliasing artifacts by projecting onto the image manifold. The LSGAN learns the texture details, while the ℓ1/ℓ2 cost suppresses high-frequency noise. A discriminator network, which is a multilayer convolutional neural network (CNN), plays the role of a perceptual cost that is then jointly trained based on high-quality MR images to score the quality of retrieved images. In the operational phase, an initial aliased estimate (e.g., simply obtained by zero-filling) is propagated into the trained generator to output the desired reconstruction. This demands a very low computational overhead. Extensive evaluations are performed on a large contrast-enhanced MR dataset of pediatric patients. Images rated by expert radiologists corroborate that GANCS retrieves higher quality images with improved fine texture details compared with conventional Wavelet-based and dictionary-learning-based CS schemes as well as with deep-learning-based schemes using pixel-wise training. In addition, it offers reconstruction times of under a few milliseconds, which are two orders of magnitude faster than current state-of-the-art CS-MRI schemes.
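The least-squares GAN objective used above replaces the usual log-loss with squared distances to target scores, which tends to stabilize training. A minimal sketch of the two LSGAN losses (illustrative scores, not the GANCS implementation, and the pixel-wise ℓ1/ℓ2 term is omitted):

```python
def lsgan_d_loss(real_scores, fake_scores):
    """Least-squares discriminator loss: push real scores to 1, fake to 0."""
    return (sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
            + sum(s ** 2 for s in fake_scores) / len(fake_scores))

def lsgan_g_loss(fake_scores):
    """Least-squares generator loss: push the discriminator's fake scores to 1."""
    return sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)

# A perfect discriminator incurs zero loss; a fully fooled one helps the generator.
print(lsgan_d_loss([1.0, 1.0], [0.0, 0.0]))  # 0.0
print(lsgan_g_loss([1.0, 1.0]))              # 0.0
```

In GANCS this adversarial score is combined with the pixel-wise ℓ1/ℓ2 cost, so the generator is rewarded both for realistic texture and for fidelity to the measured data.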
Article
Purpose: To determine the feasibility of using a deep learning approach to detect cartilage lesions within the knee joint on magnetic resonance (MR) images including cartilage softening, fibrillation, fissuring, focal defects, and diffuse thinning due to cartilage degeneration and acute cartilage injury. Methods: A fully-automated deep learning-based cartilage lesion detection system was developed using segmentation and classification convolutional neural networks (CNNs). Fat-suppressed T2-weighted fast spin-echo image datasets of the knee of 175 patients with knee pain were retrospectively analyzed using the deep learning method. The reference standard for training the CNN classification was the interpretation provided by a fellowship-trained musculoskeletal radiologist of the presence or absence of a cartilage lesion within 17,395 small image patches placed on the articular surfaces of the femur and tibia. Receiver operating characteristic (ROC) analysis and the kappa statistic were used to assess diagnostic performance and intra-observer agreement for detecting cartilage lesions for two individual evaluations performed by the cartilage lesion detection system. Results: The sensitivity and specificity of the cartilage lesion detection system at the optimal threshold using the Youden index were 84.1% and 85.2%, respectively, for evaluation 1 and 80.5% and 87.9%, respectively, for evaluation 2. Areas under the curves (AUCs) were 0.917 and 0.914 for evaluations 1 and 2, respectively, indicating high overall diagnostic accuracy for detecting cartilage lesions. There was good intra-observer agreement between the two individual evaluations, with a kappa value of 0.76. Conclusion: Our study demonstrated the feasibility of using a fully-automated, deep learning-based cartilage lesion detection system to evaluate the articular cartilage of the knee joint with high diagnostic performance and good intra-observer agreement for detecting cartilage degeneration and acute cartilage injury.
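The operating point above is chosen with the Youden index, J = sensitivity + specificity − 1, maximized over ROC thresholds. A short sketch (the operating points below are invented for illustration):

```python
def youden_optimal_threshold(thresholds, sensitivities, specificities):
    """Pick the ROC operating point maximising Youden's J = sens + spec - 1."""
    best = max(range(len(thresholds)),
               key=lambda i: sensitivities[i] + specificities[i] - 1.0)
    return thresholds[best]

# Hypothetical ROC operating points for a lesion-detection score
thresholds    = [0.2, 0.4, 0.6, 0.8]
sensitivities = [0.95, 0.90, 0.84, 0.60]
specificities = [0.50, 0.70, 0.85, 0.95]
print(youden_optimal_threshold(thresholds, sensitivities, specificities))  # 0.6
```

Maximizing J balances sensitivity and specificity equally, which is why the reported operating points (84.1%/85.2% and 80.5%/87.9%) sit near the ROC curve's upper-left corner.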
Article
Purpose To develop a super-resolution technique using convolutional neural networks for generating thin-slice knee MR images from thicker input slices, and compare this method with alternative through-plane interpolation methods. Methods We implemented a 3D convolutional neural network entitled DeepResolve to learn residual-based transformations between high-resolution thin-slice images and lower-resolution thick-slice images at the same center locations. DeepResolve was trained using 124 double echo in steady-state (DESS) data sets with 0.7-mm slice thickness and tested on 17 patients. Ground-truth images were compared with DeepResolve, clinically used tricubic interpolation, and Fourier interpolation methods, along with state-of-the-art single-image sparse-coding super-resolution. Comparisons were performed using structural similarity, peak SNR, and RMS error image quality metrics for a multitude of thin-slice downsampling factors. Two musculoskeletal radiologists ranked the 3 data sets and reviewed the diagnostic quality of the DeepResolve, tricubic interpolation, and ground-truth images for sharpness, contrast, artifacts, SNR, and overall diagnostic quality. Mann-Whitney U tests evaluated differences among the quantitative image metrics, reader scores, and rankings. Cohen's Kappa (κ) evaluated interreader reliability. Results DeepResolve had significantly better structural similarity, peak SNR, and RMS error than tricubic interpolation, Fourier interpolation, and sparse-coding super-resolution for all downsampling factors (p < 0.05, except the 4× and 8× sparse-coding super-resolution downsampling factors). In the reader study, DeepResolve significantly outperformed (p < .01) tricubic interpolation in all image quality categories and overall image ranking. Both readers had substantial scoring agreement (κ = 0.73).
Conclusion DeepResolve was capable of resolving high‐resolution thin‐slice knee MRI from lower‐resolution thicker slices, achieving superior quantitative and qualitative diagnostic performance to both conventionally used and state‐of‐the‐art methods.
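The residual-learning idea behind DeepResolve is that the network predicts only a correction on top of a cheap interpolation, rather than the full thin-slice image. A deliberately simplified 1-D sketch of that decomposition (linear interpolation stands in for tricubic, and a fixed list stands in for the CNN-predicted residual; neither is the authors' implementation):

```python
def linear_upsample(slices, factor):
    """Linearly interpolate between thick-slice values along one axis."""
    out = []
    for a, b in zip(slices, slices[1:]):
        out.extend(a + (b - a) * k / factor for k in range(factor))
    out.append(slices[-1])
    return out

def residual_sr(slices, residual, factor):
    """Residual-learning SR: interpolate first, then add a learned correction."""
    up = linear_upsample(slices, factor)
    return [u + r for u, r in zip(up, residual)]

thick    = [0.0, 2.0, 4.0]               # thick-slice signal along z
residual = [0.0, 0.1, 0.0, -0.1, 0.0]    # stand-in for a CNN-predicted residual
print(residual_sr(thick, residual, factor=2))
```

Learning only the residual simplifies the network's task: the interpolation already carries the low-frequency content, so the model concentrates capacity on the high-frequency detail that interpolation blurs.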
Conference Paper
MR-only radiotherapy treatment planning requires accurate MR-to-CT synthesis. Current deep learning methods for MR-to-CT synthesis depend on pairwise aligned MR and CT training images of the same patient. However, misalignment between paired images could lead to errors in synthesized CT images. To overcome this, we propose to train a generative adversarial network (GAN) with unpaired MR and CT images. A GAN consisting of two synthesis convolutional neural networks (CNNs) and two discriminator CNNs was trained with cycle consistency to transform 2D brain MR image slices into 2D brain CT image slices and vice versa. Brain MR and CT images of 24 patients were analyzed. A quantitative evaluation showed that the model was able to synthesize CT images that closely approximate reference CT images, and was able to outperform a GAN model trained with paired MR and CT images.
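The cycle consistency used above requires that translating MR to CT and back (and vice versa) recovers the original image. A toy 1-D sketch of the loss, with invertible intensity mappings standing in for the two synthesis CNNs (both mappings are invented for illustration):

```python
def l1(a, b):
    """Mean absolute difference between two flat pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(x, y, g_xy, g_yx):
    """Cycle loss for unpaired translation: x -> Y -> x and y -> X -> y
    should both return to the original image."""
    return l1(g_yx(g_xy(x)), x) + l1(g_xy(g_yx(y)), y)

# Toy 'translators': an invertible linear intensity mapping MR <-> CT
g_xy = lambda img: [2.0 * p + 1.0 for p in img]    # MR -> CT (illustrative)
g_yx = lambda img: [(p - 1.0) / 2.0 for p in img]  # CT -> MR (its exact inverse)

mr = [0.25, 0.5, 0.75]
ct = [1.5, 2.0, 2.5]
print(cycle_consistency_loss(mr, ct, g_xy, g_yx))  # 0.0 for exact inverses
```

In the actual GAN this term is added to the two adversarial losses, which is what lets training proceed on unpaired MR and CT images without pixel-aligned ground truth.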