A survey: Generative adversarial networks and their applications in medical
imaging: an overview from a computer science perspective
Mohammad Vand Jalili1, Masoud Asghari2
1-PhD student in Artificial Intelligence, Department of Engineering, Miyaneh Branch, Islamic
Azad University, Miyaneh, Iran
2-Faculty of Engineering, University of Maragheh, P.O. Box 55136-553, Maragheh, Iran
Abstract
The lack of high-quality annotated medical image datasets is a major problem hindering the
growth of machine learning applications in the field of medical image analysis. In
biomedical image analysis, the applicability of deep learning methods is directly influenced by
the wealth of accessible image data, because deep learning models require a large image
dataset to perform well. Generative Adversarial Networks (GANs) are widely used
to overcome data limitations by generating artificial biomedical images. The generator builds
artificial images guided by the feedback it receives: the discriminator is a model that classifies
an image as artificial or real and provides that feedback to the generator. First, a general study is
carried out on the application of GANs to medical image segmentation, covering
several GAN-based models, performance metrics, loss functions, datasets, augmentation
methods, reported performance, and source code. Secondly, this paper offers a comprehensive
overview of GAN applications in the segmentation of different human diseases. We complete
our study with a critical discussion, the limitations of GANs, and proposals for future directions.
We hope this study is helpful and raises awareness of GAN implementations for
biomedical image segmentation tasks.
Keywords: Generative adversarial network, GANs applications, Artificial Intelligence
1. Introduction
Recent years have seen significant advances in computer-aided diagnosis (CAD) in medical
imaging and diagnostic radiology driven by deep learning systems [1, 2]. Various medical
imaging modalities have emerged with the steady improvement of medical technology. Reading
these images manually is exhausting for the human eye, prone to errors, and demands much time
and effort [3]. Medical imaging is important in modern clinics for delivering the information
needed for accurate diagnosis and surgical treatment of different diseases. These images allow a
quantitative and qualitative evaluation of lesions in different organs of the body such as the heart,
brain, lung, chest, and kidney [4]. Diagnosis regularly relies on the radiologist's experience in
studying the image with the naked eye and recognizing the lesion location [3]. Over the past few
years, deep learning (DL) medical systems have generated great interest and have been employed
in all fields of medicine, from drug discovery to medical decisions that crucially alter the treatment
route [5, 6]. Among the deep learning techniques that characterize data augmentation is
generative modeling, which creates fake images from an initial dataset and then uses them to
learn features of the image distribution. A generative adversarial network (GAN) is one example
of a generative model. GANs are formed of two distinct networks that are trained
concurrently: one network is trained to generate samples, while the other learns to distinguish
them from real data. GANs are regarded as a particular class of deep learning; they can learn
representations from data without requiring labeled datasets, building on a competitive
learning mechanism between two neural networks. Academic and industrial fields
have embraced adversarial training as a data-driven augmentation approach due to its
simplicity and usefulness in creating new images. GANs have produced powerful improvements
and sustained major changes in various applications, including image synthesis, style
transfer, semantic image editing, image super-resolution, and image
classification [7]. The key concept underlying GANs is the two-player zero-sum game:
whatever one player wins, the other loses. The two networks are
called the discriminator and the generator. The discriminator is trained
to decide whether a sample is real or artificial. Conversely, the generator
builds fake samples to confuse the discriminator. The discriminator outputs the
probability that a given sample originated from the collection of genuine samples: a real sample
should receive a high probability, while fake samples are revealed by their low
probability. The generator performs best when the discriminator can barely tell real from fake [8].
A generator is a neural network that takes noise as input and creates images; the images
produced by the generator are denoted G(z). A Gaussian noise vector z is sampled from the latent
space. During training, the weights of G and D are updated repeatedly.
The discriminator is a neural network that receives real-world data x as input and outputs
D(x), the probability that x is real [9]. The objective
of the two-player mini-max game is given in Equation (1):
󰇛 󰇜 󰇛󰇜󰇛󰇜 󰇛󰇜󰇟󰇛 󰇛󰇛󰇜󰇜󰇜󰇠


(1)
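As a numerical illustration of Eq. (1), the two expectations can be estimated by Monte-Carlo averaging over mini-batches; the logistic discriminator and linear generator below are hypothetical toy stand-ins, not models from the surveyed papers.

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(x, w=2.0, b=0.0):
    """Toy discriminator: logistic score D(x) in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(w * x + b)))

def generator(z, scale=1.0, shift=0.0):
    """Toy generator: maps latent noise z to samples G(z)."""
    return scale * z + shift

# Mini-batches: real samples x ~ p_data and latent noise z ~ p_z
x = rng.normal(loc=1.0, scale=0.5, size=1000)   # "real" data
z = rng.normal(loc=0.0, scale=1.0, size=1000)   # latent noise

# Monte-Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
v = np.mean(np.log(discriminator(x))) + \
    np.mean(np.log(1.0 - discriminator(generator(z))))
print(round(v, 4))
```

Training ascends this value with respect to D's parameters and descends it with respect to G's, which is the mini-max game the text describes.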
Medical imaging plays a pivotal part in modern healthcare by enabling in vivo examination of
pathology in the human body. In several clinical procedures, comprehensive multi-modal protocols
acquire a diverse collection of images from multiple scanners (e.g., CT, MRI) [10], or various
acquisitions from a single scanner (multi-contrast MRI) [11]. The complementary details of tissue
morphology enable clinicians to diagnose with greater accuracy and confidence. Unfortunately,
many factors, including uncooperative patients and prohibitive scan times, prevent
multi-modality imaging from being used everywhere [12, 13]. Consequently, there has been growing interest in
synthesizing the missing images of multi-modal protocols from the subset of accessible images, bypassing
the costs associated with additional scans [14, 15]. The goal of medical image synthesis is to
predict images of a target modality for a subject given images of a source modality acquired
under its own scanning protocol [16]. This is an ill-posed inverse problem
because medical images are high-dimensional, target-modality information is absent at
inference, and there are nonlinear differences in tissue contrast among modalities [17].
Unsurprisingly, the recent adoption of deep learning methods for resolving this difficult
issue has enabled major performance leaps [18, 19]. In the learning-based approach, network
models learn the joint distribution of source-target images [20]. Earlier studies
utilizing CNNs for this objective reported essential improvements over traditional approaches [21].
Generative adversarial networks (GANs) were then introduced, enforcing an adversarial loss to
improve the capture of detailed tissue structure [22, 23]. Further improvements were attained by
leveraging improved architectural constructs [24, 25] and learning techniques [26, 27]. In spite
of their success, prior learning-based models are fundamentally based on
convolutional architectures that use compact filters to extract local image features [27]. By exploiting
correlations between neighboring image pixels, this inductive bias reduces the number of model
parameters to simplify learning, while restricting expressiveness for contextual features that
mediate long-range spatial dependencies [28].
Medical images exhibit contextual relationships across both healthy and pathological tissues.
For example, bone in the skull or CSF in the ventricles is distributed over spatially distinct
or discrete regions of the brain, leading to dependencies between distant voxels. While
pathological tissues obey weaker anatomical priors, their spatial distribution (e.g.,
location, abundance, shape) can still demonstrate disease-specific patterns. Multiple
sclerosis (MS) and Alzheimer's disease (AD) are examples of widely disseminated brain lesions:
in MS and AD, lesions approach the periventricular and paracortical regions,
and the hippocampus, entorhinal cortex, and isocortex, respectively [28]. Meanwhile, in cancer a small
number of lesions appear as spatially adjacent masses, typically in
the brain and cerebellum in glioma and near the skull in meningioma. Therefore, capturing the distribution of
pathology additionally requires information about the position and shape of lesions relative to
healthy tissue. In principle, synthesis performance can be improved with priors that
capture these relationships. Vision transformers are particularly promising for this purpose because
attention operators that learn contextual features can increase sensitivity to long-range
interactions [29] and focus on vital image areas for enhanced generalization to atypical
anatomy such as lesions [30]. However, adopting vanilla transformers in tasks
with pixel-level outputs is complicated by computational burden and weak localization [31].
Current studies therefore focus on hybrid architectures or computationally efficient
attention operators to adopt transformers in medical imaging tasks [32].
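The attention operator referenced above can be sketched as scaled dot-product attention over patch embeddings: every patch attends to every other patch in a single step, which is how long-range dependencies are modeled. Shapes and inputs here are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of all position pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ V, weights

# 4 image patches embedded in 8 dimensions; self-attention (Q = K = V = tokens)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out, attn = scaled_dot_product_attention(tokens, tokens, tokens)
print(out.shape, attn.shape)  # (4, 8) (4, 4)
```

The (4, 4) weight matrix makes the contrast with convolutions explicit: each row mixes information from all patches regardless of spatial distance, whereas a convolutional filter only sees a local neighborhood.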
2. Related Work
Alzheimer's Disease Neuroimaging Initiative (ADNI) data were released in 2004 under principal
investigator Michael W. Weiner and were financed by a public-private partnership. The central goal of the ADNI
database is to develop a clinical method for diagnosing the life-threatening Alzheimer's
disease at an early stage. The vast majority of the ADNI medical datasets used in
studies are MRI and PET images for disease assessment [33, 34].
Similarly, studies utilized one more dataset, NAMIC Brain Multimodality, for automatic brain
segmentation. This database is freely available and contains structural MRI images [35]. Two
GAN-based networks, named DCGAN and LAPGAN respectively, have been proposed; the
basic conception is to build skin lesion images and segmentation masks utilizing the proposed
GAN architectures. Ding et al. [36] exhibit a way of synthesizing dermatoscopy images to address
data limitation problems. The GAN is utilized for image-to-image translation, taking
prior label maps as source input to generate new dermatoscopy images. In addition,
a feature-matching constraint is suggested to improve the resulting image. Lee et al. encoded the
content description of the lesion delivered by the physician as a feature vector to make the
synthesized lesion meet the wanted characteristics. Wu et al. [38] fed the latent representation
of the lesion into each network layer so that the artificial image could be more
comprehensively constrained in the course of the generation procedure. Kanayama et al. [39]
synthesized gastric lesions within images without lesions. To increase the continuity of the tissue
between the lesion and its nearby context, they set constraints on the tissue at the junction of the
lesion and the context. Lin et al. [40] performed fusion-based augmentation in mammography; in
the same way, they paid attention to the continuity of the joint tissue. Khalifa et al. [41] divided deep
learning image data augmentation into three types: the first was image data
augmentation using GANs, the second was neural style transfer, and the third was
meta-learning, which included neural augmentation, auto-augmentation, and
smart augmentation. A third line of research surveyed image data augmentation
in distinct fields, for example, the medical field, the agricultural field, and other miscellaneous fields.
The prospects of data augmentation are promising: search algorithms that use
data warping and oversampling methods have enormous potential, and the layered structure
of deep neural networks provides many opportunities for data augmentation. Dalmaz et
al. [42] introduced a novel synthesis approach for multimodal imaging based on conditional deep
adversarial networks. In detail, their ResViT model aggregates convolutional operators and vision
transformers to improve the capture of contextual relationships while maintaining
localization performance. A unified implementation was introduced, eliminating the need to
retrain models for different source/target configurations. ResViT provides superior synthesis
accuracy compared with state-of-the-art approaches on multi-contrast brain MRI and multimodal pelvic MRI-
CT datasets. As such, it is a promising candidate for medical image synthesis.
3. Definition of GAN
In comparison with familiar deep neural networks, GANs are a distinct type of deep neural
network in which two networks are trained concurrently. Several studies and review
papers on applications of GANs to medical imaging have been released [43]. GANs use a joint
loss function with min-max optimization as the objective. The generator's goal is to create realistic
data and mislead the discriminator into classifying it as real. In contrast, the discriminator aims to
classify artificial data as artificial and real data as real. Ideally, the training of
GANs should continue until it achieves a Nash equilibrium, at which point the actions of the generative
and discriminative models no longer affect each other's performance [44].
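The equilibrium condition can be checked numerically with the well-known optimal-discriminator identity D*(x) = p_data(x) / (p_data(x) + p_g(x)): when the generator distribution matches the data distribution, the best discriminator can do is output 1/2 everywhere. The Gaussian densities below are illustrative.

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2), used as a stand-in data/generator distribution."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def optimal_discriminator(x, p_data, p_gen):
    """For a fixed generator, the optimal D is D*(x) = p_data(x) / (p_data(x) + p_g(x))."""
    return p_data(x) / (p_data(x) + p_gen(x))

x = np.linspace(-4, 4, 9)
# Generator has matched the data distribution -> D*(x) = 1/2 everywhere (Nash equilibrium)
d_star = optimal_discriminator(x,
                               lambda t: gauss_pdf(t, 0, 1),
                               lambda t: gauss_pdf(t, 0, 1))
print(np.allclose(d_star, 0.5))  # True
```

With mismatched distributions, D* deviates from 1/2 exactly where the data density dominates, which is the signal that drives further generator updates.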
In healthcare, GANs are widely used for many tasks such as biomedical image analysis [45],
electronic health records [46], and drug discovery [47]. Recently, GANs have also been applied
in the area of coronavirus disease (COVID-19), namely the diagnosis of disease based on chest
X-rays [48]. In the field of biomedical imaging, data availability is a barrier to using deep
learning: a deep learning model consists of a deep neural network that requires a large training
dataset for reliable predictive analysis. Therefore, increasing the size of biomedical datasets is a
difficult problem. Another issue in biomedical imaging is class-imbalanced datasets, which
exhibit skewed classes when dealing with multiple disease categories. With class-imbalanced
datasets, deep neural networks are biased toward classes with a large number of images at the
expense of classes with fewer images [49]. Data augmentation is one of the potential solutions to
manage class imbalance, along with data limitation issues [50].
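As a sketch of augmentation-based rebalancing, the minority class can be oversampled with transformed copies until the classes are equal in size; flips and rotations here stand in for any augmentation pipeline (GAN-based or classical), and the toy arrays are not a real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, rng):
    """One randomly transformed copy: horizontal flip and/or 90-degree rotation."""
    if rng.random() < 0.5:
        img = np.fliplr(img)
    return np.rot90(img, k=rng.integers(0, 4))

def oversample_minority(images, labels, rng):
    """Augment the rarer class until both classes have equal counts."""
    labels = np.asarray(labels)
    counts = {c: int((labels == c).sum()) for c in np.unique(labels)}
    minority = min(counts, key=counts.get)
    deficit = max(counts.values()) - counts[minority]
    pool = [img for img, y in zip(images, labels) if y == minority]
    extra = [augment(pool[rng.integers(len(pool))], rng) for _ in range(deficit)]
    return images + extra, list(labels) + [minority] * deficit

# Toy 8x8 "scans": 10 of class 0 (e.g. healthy) vs 2 of class 1 (e.g. diseased)
imgs = [rng.random((8, 8)) for _ in range(12)]
ys = [0] * 10 + [1] * 2
imgs_bal, ys_bal = oversample_minority(imgs, ys, rng)
print(ys_bal.count(0), ys_bal.count(1))  # 10 10
```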
The GAN literature presents several canonical architectures; we outline those most commonly
encountered in the reviewed literature. The basic form is the vanilla GAN [51]. In some cases,
in order to use GANs for medical purposes, it is necessary to create artificial images with
desired properties. Conditional GANs are an extension of vanilla GANs, where both generator
and discriminator are trained using the original dataset together with an additional condition
variable [52]. In order to get excellent image generation performance in several domains,
researchers have modified the generators of conditional GANs with several deep learning
structures. Currently, conditional GANs include countless variants, as the condition variable
can be any variable, including a state variable [53], images of the same or a different
domain [54], masked images, and guided heat-map images [55]. The vanilla GAN structure
consists of two deep learning models: a generator that produces candidate samples according
to the data distribution of the original dataset, and a discriminator that tries to distinguish
generated candidate samples from real samples. The two modules are trained simultaneously on
the original dataset as gradient information is back-propagated to the generator to improve its
real-world image synthesis capabilities and to the discriminator to enhance its real/artificial
discrimination capabilities. Since vanilla GANs were introduced, GANs have gained prominence
for their ability to create realistic artificial images informed by their training datasets [56].
The GAN [57] supports basic image processing functions and is widely used in the field of
medical image enhancement [58]. Researchers have tried several methods specifically to improve
the detector training set, some of which impose specific provisions on training samples.
Introduced as a breakthrough in deep networks, GANs are rapidly gaining the attention of
the research community due to their wide range of medical imaging applications [59].
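The conditional extension described above can be sketched minimally: a common scheme (assumed here; details vary across cited works) concatenates a one-hot condition variable to the latent noise before it enters the generator. The vector sizes are illustrative.

```python
import numpy as np

def conditional_input(z, label, num_classes):
    """Concatenate latent noise with a one-hot condition variable,
    a standard way a cGAN generator receives its condition."""
    one_hot = np.zeros(num_classes)
    one_hot[label] = 1.0
    return np.concatenate([z, one_hot])

rng = np.random.default_rng(0)
z = rng.normal(size=100)                             # latent noise vector
g_in = conditional_input(z, label=2, num_classes=5)  # e.g. 5 disease classes
print(g_in.shape)  # (105,)
```

The discriminator receives the same condition alongside the image, so both players are judged with respect to the desired class, steering synthesis toward images with the requested properties.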
4. GAN applications in medical imaging
4.1. Datasets based on MRI imaging
MRI is a vital non-invasive technique that is widely used as a brain tumor imaging method in
numerous research studies. The MRI imaging technique is safe even for pregnant women and
their babies and involves no ionizing radiation. However, a main disadvantage of MRI is its
sensitivity and the difficulty of evaluating the organs involved in oral tumors. In medical MRI
imaging, a common use of segmentation is to delineate tissues to identify
abnormalities and tumor location [60]. From 2013 to 2018, MICCAI released datasets of brain
tumor MRI scans (BraTS 2013-18), on which various brain tumor classification approaches and
frameworks have been described; this helps to increase accuracy and identify tumors from MRI
images [61]. ISLES (Ischemic Stroke Lesion Segmentation) is another dataset for biomedical
segmentation, and similarly the IXI and NAMIC multimodal datasets [62] are used in studies. Data
from the Alzheimer's Disease Neuroimaging Initiative (ADNI) were published in 2004 under
Michael W. Weiner and were funded by a public-private partnership. The main goal of the ADNI
database is to establish a clinical system for early diagnosis of Alzheimer's disease. The vast
majority of the ADNI datasets used in studies are MRI and PET images for disease assessment [7].
4.2. Datasets based on CT scans
CT scanning is a biomedical imaging method that has a remarkable impact on the diagnostic
evaluation of the human body. Like MRI, CT scanning is widely used in several medical
conditions across a wide range of biomedical applications. A CT scan requires less screening
time and is a better technique than MRI for rare coronary artery disease and vascular evaluation;
however, in people with kidney problems, radiation vulnerability is a pervasive concern. The
ISLES 2018 challenge presents 3D CT ischemic stroke lesion segmentation images that have
been used in studies [63]. Multiple CT scan datasets, MICCAI 2017, ImageCHD, MRBrainS18,
and MM-WHS-2017, have been used in previous work for heart segmentation [64, 65]. Likewise,
the MICCAI Grand Challenge additionally presents the PROMISE12 prostate MR image
segmentation dataset [66]. CT scan images have gained great importance as a 3D imaging
method, and the vast majority of the liver tumor datasets, ISBI LiTS-2017, DeepLesion,
MICCAI-SLiver07, and LIVER100, are 3D [67, 68]. For lung tumor segmentation, the
LIDC-IDRI, SARS-COV-2 CT-Scan, and NSCLC-Radiomics datasets have been tested in
research work [69]. 3D CT scan images are also used in kidney tumor segmentation: GAN-based
models use the KiTS19 Challenge and NIH Pancreas-CT datasets for accurate tumor
segmentation. Also, for backbone, chest, head, neck, and spleen segmentation, the InnerEye
dataset, the 2017 AAPM Thoracic Auto-Segmentation Challenge, H&N CT, and the Decathlon
spleen data are publicly available. These datasets have been extensively tested with GAN
models [70, 71].
4.3. GANs applications in cardiac segmentation
In medical imaging, cardiac segmentation plays an important role in cardiac disease assessment,
clinical monitoring, and treatment planning. CMRI (Cardiac Magnetic Resonance Imaging)
provides descriptions useful for evaluating all conceivable drug and surgical treatments [72].
But there are many challenges in echocardiography, for example, its low spatial resolution,
deformable appearance, and the small number of annotated images available. The authors of
[73] present short-axis biventricular slice segmentation of the heart on MRI with a cCGAN.
Similarly, a cGAN [74] is employed to estimate deformation from CMR frames, with remarkably
accurate and realistic predictions. For automated whole-heart and great-vessel segmentation
utilizing CMR images, a context-aware cGAN is presented in [75].
4.4. GANs applications in liver tumor segmentation
According to a 2017 WHO report, liver cancer is the second most common malignancy and a
leading cause of death worldwide [76]. Semi-supervised methods use unannotated data to
reduce the annotation burden. Additionally, a Bayesian loss function is used to account for prior
probabilities and likelihoods. In extensive experiments, images of both the liver and brain are
combined, doubling the amount of data [77].
4.5. GANs applications in retina disease segmentation
The concept of residual learning has been applied to enhance structures built on FCN models.
In addition, adversarial training enhances segmentation results; retinal map segmentation has
been performed using combined FCN and GAN approaches [78]. A cGAN [79] is prepared for
segmentation tasks, with preprocessing and image augmentation applied. Matched features are
used in MGAN with classification diversity indices [80], as well as a U-Net modified with
multiple shortcut connections and a middle convolutional layer in a stacked connection
block [81]. A GAN model [82] is used to segment narrow retinal vessels; its performance is
higher than the classic U-Net network.
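Segmentation quality in such comparisons is commonly reported with the Dice coefficient, the standard overlap metric for vessel and lesion masks. A minimal NumPy implementation, assuming binary masks:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|P ∩ T| / (|P| + |T|): overlap between predicted and
    ground-truth segmentation masks; 1.0 means perfect agreement."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

pred = np.array([[1, 1, 0], [0, 1, 0]])    # predicted vessel pixels
target = np.array([[1, 0, 0], [0, 1, 1]])  # annotated vessel pixels
print(round(dice_coefficient(pred, target), 3))  # 0.667
```

The small `eps` term keeps the metric defined when both masks are empty, a common edge case in lesion-free slices.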
4.6. GANs applications in skin lesion segmentation
Dermatologists use dermoscopy equipment to observe and magnify skin pigmentation
diseases, but this procedure is time-consuming and requires high skill. The growth of
deep learning models in computer vision systems offers a crucial tool for dermatologists to
discover skin-related cancers more accurately [44]. Conventional data augmentation is unable to
extrapolate beyond the observed data, which leads to data bias and
suboptimal performance of trained models. Various researchers have shown that data
augmentation using GAN strategies can be more profitable than conventional methods [83].
Lately, GAN strategies have been widely utilized to prepare realistic medical images for
data augmentation [84]. Domain transfer matters because a working model is typically
developed and validated on data from the same domain. To create a more generalized
machine learning model, data from different domains may be combined through domain
transfer, i.e., transfer between different imaging modalities. The domain transfer task
of GANs is the cross-modality image synthesis operation, by which images of one modality are
generated as if acquired by another. The cross-domain method using the domain transfer technique
has shown the possibility of obtaining additional clinical details without further examinations [85].
Various studies have shown that GANs could be an excellent choice for overcoming data
scarcity and the lack of large annotated datasets in ophthalmology [86]. Burlina et al. showed that
a deep learning model trained with only artificial retinal images created by PGGAN
performed worse than one trained with genuine retinal images (0.9706 vs. 0.9235 in terms
of the area under the receiver operating characteristic curve) [87]. GANs have been utilized for
data augmentation of OCT images of various retinal diseases in a semi-supervised
learning manner [53]. Furthermore, the image synthesis capability of GANs offers patient privacy,
as artificial images preserve features while becoming unidentifiable; the synthetic
data remain on the manifold of the feature space of the original dataset [88].
5. Limitations of GAN use in medical imaging
Regarding the common limitations of GAN strategies, researchers should be aware of several
failure modes. Mode collapse, a phenomenon in which the generator keeps outputting similar
results, is a well-known GAN problem. To overcome this failure, caused by the model becoming
stuck in a local minimum, different training data or other data augmentation
strategies are needed [89].
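A cheap symptom check for mode collapse is the diversity of a generated batch; the mean pairwise distance below is an illustrative heuristic, not a standard metric from the surveyed papers.

```python
import numpy as np

def batch_diversity(samples):
    """Mean pairwise L2 distance within a batch of generated samples;
    a value near zero suggests the generator is repeating one output."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    dists = [np.linalg.norm(samples[i] - samples[j])
             for i in range(n) for j in range(i + 1, n)]
    return float(np.mean(dists))

rng = np.random.default_rng(0)
healthy = rng.normal(size=(16, 64))                     # diverse generated batch
collapsed = np.tile(rng.normal(size=(1, 64)), (16, 1))  # generator stuck on one mode
print(batch_diversity(collapsed) < batch_diversity(healthy))  # True
```

Monitoring such a statistic during training can flag collapse early, before visual inspection of samples would.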
Numerous GAN models are trained with no assurance of convergence. Spatial deformities
frequently happen when the training images are small and lack spatial alignment.
Specifically, in domain transfer utilizing conditional GANs, paired images with structural and
spatial alignment are critically difficult to obtain and need extra image registration in
preprocessing to access high-quality medical images. Unintended changes can happen in
image-to-image translation as a consequence of differing distributions of structural features
between the two image domains. GANs and their variants are commonly made up of two or more
deep learning modules, for instance, two generators and two discriminators in CycleGAN, and
therefore GAN training tends to be unstable in comparison with a single deep learning module
[53]. The vanishing gradient problem can also occur if the discriminator performs well and the
generator learns too slowly. Therefore, hyperparameter tuning is required, and training may have
to be stopped prematurely to obtain acceptable artificial images. Furthermore, the occurrence of
these problems is unpredictable and depends on the amount of data and the distribution of pixel
intensities. The GAN strategy has shown better performance in radiology and pathology than
other generative deep learning models such as autoencoders, fully convolutional networks (FCN),
and U-Net [89]. FCN and U-Net are well-established deep models for detection and segmentation
tasks in the biomedical imaging domain [90]. GAN frameworks can improve the image
synthesis performance of FCN and U-Net models because they do not require exhaustive
specification of the space of output images [91].
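A hedged sketch of how an adversarial term is typically combined with an FCN/U-Net segmentation objective: the segmenter minimizes a pixel-wise loss against the ground truth plus a weighted term that rewards fooling the discriminator. The loss forms and the weight `lam` are illustrative choices, not the exact formulation of any cited work.

```python
import numpy as np

def bce(p, t, eps=1e-7):
    """Pixel-wise binary cross-entropy between predicted and true masks."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(t * np.log(p) + (1 - t) * np.log(1 - p)))

def segmentation_gan_loss(pred_mask, true_mask, d_score_on_pred, lam=0.1):
    """Total segmenter loss: supervised BCE plus lam * adversarial term.
    d_score_on_pred is the discriminator's probability that the predicted
    mask is real; a low score (easily detected fake) raises the loss."""
    seg_loss = bce(pred_mask, true_mask)
    adv_loss = -np.log(max(d_score_on_pred, 1e-7))  # non-saturating adversarial term
    return seg_loss + lam * adv_loss

pred = np.array([0.9, 0.2, 0.8, 0.1])  # predicted mask probabilities
true = np.array([1.0, 0.0, 1.0, 0.0])  # ground-truth mask
print(round(segmentation_gan_loss(pred, true, d_score_on_pred=0.6), 4))
```

The adversarial term is what pushes the segmenter beyond per-pixel correctness toward masks whose overall shape statistics the discriminator cannot distinguish from real annotations.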
6. Conclusion and Discussions
In this study, the training challenges of GANs, namely mode collapse, lack of convergence, and
instability, have been investigated in relation to the field of biomedical imaging. As discussed
through the classification of applications and solutions, GANs have emerged in the last few years
and show promising results in image processing for various purposes. Today, GANs have become
a vital tool in the field of medical imaging and help to solve various problems, including growing
datasets, transferring images from one domain to another, segmentation of lesions, etc., as
presented in the reviewed literature. GAN research has shown outstanding results in countless
tasks, and GAN structures have been further enhanced to reduce training instability. This study
additionally highlights conceivable research directions to address the fundamental training
challenges of GANs for biomedical imaging. We conclude that all three technical challenges of
training GANs require more research to fill this gap for biomedical image analysis, and we
encourage researchers to propose sophisticated solutions to investigate the underlying training
challenges of GANs in the biomedical imaging field.
References
[1] AlAmir, M. and AlGhamdi, M., 2022. The Role of generative adversarial network in medical
image analysis: An in-depth survey. ACM Computing Surveys, 55(5), pp.1-36.
[2] Ting, D.S., Liu, Y., Burlina, P., Xu, X., Bressler, N.M. and Wong, T.Y., 2018. AI for medical
imaging goes deep. Nature medicine, 24(5), pp.539-540.
[3] Wang, J., Zhu, H., Wang, S.H. and Zhang, Y.D., 2021. A review of deep learning on medical
image analysis. Mobile Networks and Applications, 26, pp.351-380.
[4] AlAmir, M. and AlGhamdi, M., 2022. The Role of generative adversarial network in medical
image analysis: An in-depth survey. ACM Computing Surveys, 55(5), pp.1-36.
[5] Singh, A., Sengupta, S. and Lakshminarayanan, V., 2020. Explainable deep learning models
in medical image analysis. Journal of Imaging, 6(6), p.52.
[6] Ker, J., Wang, L., Rao, J. and Lim, T., 2017. Deep learning applications in medical image
analysis. Ieee Access, 6, pp.9375-9389.
[7] Tokuoka, Y., Suzuki, S. and Sugawara, Y., 2019, November. An inductive transfer learning
approach using cycle-consistent adversarial domain adaptation with application to brain tumor
segmentation. In Proceedings of the 2019 6th international conference on biomedical and
bioinformatics engineering (pp. 44-48).
[8] Alqahtani, H., Kavakli-Thorne, M. and Kumar, G., 2021. Applications of generative
adversarial networks (gans): An updated review. Archives of Computational Methods in
Engineering, 28, pp.525-552.
[9] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville,
A. and Bengio, Y., 2014. Generative adversarial nets. Advances in Neural Information Processing
Systems, 27, pp.2672-2680.
[10] Pichler, B.J., Judenhofer, M.S. and Pfannenberg, C., 2008. Multimodal imaging approaches:
Pet/ct and pet/mri. Molecular Imaging I, pp.109-132.
[11] Moraal, B., Roosendaal, S.D., Pouwels, P.J., Vrenken, H., Van Schijndel, R.A., Meier, D.S.,
Guttmann, C.R., Geurts, J.J. and Barkhof, F., 2009. Multi-contrast, isotropic, single-slab 3D MR
imaging in multiple sclerosis. The Neuroradiology Journal, 22(1_suppl), pp.33-42.
[12] Thukral, B.B., 2015. Problems and preferences in pediatric imaging. Indian Journal of
Radiology and Imaging, 25(04), pp.359-364.
[13] Iglesias, J.E., Konukoglu, E., Zikic, D., Glocker, B., Van Leemput, K. and Fischl, B., 2013.
Is synthesizing MRI contrast useful for inter-modality analysis? In Medical Image Computing and
Computer-Assisted Intervention - MICCAI 2013: 16th International Conference, Nagoya, Japan,
September 22-26, 2013, Proceedings, Part I 16 (pp. 631-638). Springer Berlin Heidelberg.
[14] Huo, Y., Xu, Z., Bao, S., Assad, A., Abramson, R.G. and Landman, B.A., 2018, April.
Adversarial synthesis learning enables segmentation without target modality ground truth. In 2018
IEEE 15th international symposium on biomedical imaging (ISBI 2018) (pp. 1217-1220). IEEE.
[15] Farsiu, S., Robinson, D., Elad, M. and Milanfar, P., 2004. Advances and challenges in super
resolution. International Journal of Imaging Systems and Technology, 14(2), pp.47-57.
[16] Ye, D.H., Zikic, D., Glocker, B., Criminisi, A. and Konukoglu, E., 2013. Modality
propagation: coherent synthesis of subject-specific scans with data-driven regularization. In
Medical Image Computing and Computer-Assisted Intervention--MICCAI 2013: 16th
International Conference, Nagoya, Japan, September 22-26, 2013, Proceedings, Part I 16 (pp. 606-
613). Springer Berlin Heidelberg.
[17] Catana, C., van der Kouwe, A., Benner, T., Michel, C.J., Hamm, M., Fenchel, M., Fischl, B.,
Rosen, B., Schmand, M. and Sorensen, A.G., 2010. Toward implementing an MRI-based PET
attenuation-correction method for neurologic studies on the MR-PET brain prototype. Journal of
nuclear medicine, 51(9), pp.1431-1438.
[18] Jog, A., Carass, A., Roy, S., Pham, D.L. and Prince, J.L., 2017. Random forest regression for
magnetic resonance image synthesis. Medical image analysis, 35, pp.475-488.
[19] Van Nguyen, H., Zhou, K. and Vemulapalli, R., 2015. Cross-domain synthesis of medical
images using efficient location-sensitive deep network. In Medical Image Computing and
Computer-Assisted Intervention--MICCAI 2015: 18th International Conference, Munich,
Germany, October 5-9, 2015, Proceedings, Part I 18 (pp. 677-684). Springer International
Publishing.
[20] Dar, S.U., Yurt, M., Karacan, L., Erdem, A., Erdem, E. and Cukur, T., 2019. Image synthesis
in multi-contrast MRI with conditional generative adversarial networks. IEEE transactions on
medical imaging, 38(10), pp.2375-2388.
[21] Wei, W., Poirion, E., Bodini, B., Durrleman, S., Colliot, O., Stankoff, B. and Ayache, N.,
2019. Fluid-attenuated inversion recovery MRI synthesis from multisequence MRI using three-
dimensional fully convolutional networks for multiple sclerosis. Journal of Medical Imaging, 6(1),
pp.014005-014005.
[22] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville,
A. and Bengio, Y., 2020. Generative adversarial networks. Communications of the ACM, 63(11),
pp.139-144.
[23] Armanious, K., Jiang, C., Fischer, M., Küstner, T., Hepp, T., Nikolaou, K., Gatidis, S. and
Yang, B., 2020. MedGAN: Medical image translation using GANs. Computerized medical
imaging and graphics, 79, p.101684.
[24] Yurt, M., Dar, S.U., Erdem, A., Erdem, E., Oguz, K.K. and Çukur, T., 2021. mustGAN: multi-
stream generative adversarial networks for MR image synthesis. Medical image analysis, 70,
p.101944.
[25] Yang, H., Lu, X., Wang, S.H., Lu, Z., Yao, J., Jiang, Y. and Qian, P., 2021. Synthesizing
multi-contrast MR images via novel 3D conditional Variational auto-encoding GAN. Mobile
Networks and Applications, 26, pp.415-424.
[26] Wang, G., Gong, E., Banerjee, S., Martin, D., Tong, E., Choi, J., Chen, H., Wintermark, M.,
Pauly, J.M. and Zaharchuk, G., 2020. Synthesize high-quality multi-contrast magnetic resonance
imaging from multi-echo acquisition using multi-task deep generative model. IEEE transactions
on medical imaging, 39(10), pp.3089-3099.
[27] Zhu, J.Y., Park, T., Isola, P. and Efros, A.A., 2017. Unpaired image-to-image translation
using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference
on computer vision (pp. 2223-2232).
[28] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T.,
Dehghani, M., Minderer, M., Heigold, G., Gelly, S. and Uszkoreit, J., 2020. An image is worth
16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
[29] Oktay, O., Schlemper, J., Folgoc, L.L., Lee, M., Heinrich, M., Misawa, K., Mori, K.,
McDonagh, S., Hammerla, N.Y., Kainz, B. and Glocker, B., 2018. Attention u-net: Learning where
to look for the pancreas. arXiv preprint arXiv:1804.03999.
[30] Chen, J., Lu, Y., Yu, Q., Luo, X., Adeli, E., Wang, Y., Lu, L., Yuille, A.L. and Zhou, Y.,
2021. Transunet: Transformers make strong encoders for medical image segmentation. arXiv
preprint arXiv:2102.04306.
[31] Luo, Y., Wang, Y., Zu, C., Zhan, B., Wu, X., Zhou, J., Shen, D. and Zhou, L., 2021. 3D
transformer-GAN for high-quality PET reconstruction. In Medical Image Computing and
Computer Assisted Intervention--MICCAI 2021: 24th International Conference, Strasbourg,
France, September 27-October 1, 2021, Proceedings, Part VI 24 (pp. 276-285). Springer
International Publishing.
[32] Korkmaz, Y., Dar, S.U., Yurt, M., Özbey, M. and Cukur, T., 2022. Unsupervised MRI
reconstruction via zero-shot learned adversarial transformers. IEEE Transactions on Medical
Imaging, 41(7), pp.1747-1763.
[33] Yi, X., Walia, E. and Babyn, P., 2019. Generative adversarial network in medical imaging: A
review. Medical image analysis, 58, p.101552.
[34] Oh, K.T., Lee, S., Lee, H., Yun, M. and Yoo, S.K., 2020. Semantic segmentation of white
matter in FDG-PET using generative adversarial network. Journal of digital imaging, 33, pp.816-
825.
[35] Huang, Y., Zheng, F., Cong, R., Huang, W., Scott, M.R. and Shao, L., 2020. MCMT-GAN:
Multi-task coherent modality transferable GAN for 3D brain image synthesis. IEEE Transactions
on Image Processing, 29, pp.8187-8198.
[36] Ding, S., Zheng, J., Liu, Z., Zheng, Y., Chen, Y., Xu, X., Lu, J. and Xie, J., 2021. High-
resolution dermoscopy image synthesis with conditional generative adversarial networks.
Biomedical Signal Processing and Control, 64, p.102224.
[37] Lee, H., Kim, S.T., Lee, J.H. and Ro, Y.M., 2019. Realistic breast mass generation through
BIRADS category. In Medical Image Computing and Computer Assisted Intervention--MICCAI
2019: 22nd International Conference, Shenzhen, China, October 13-17, 2019, Proceedings, Part
VI 22 (pp. 703-711). Springer International Publishing.
[38] Wu, E., Wu, K., Cox, D. and Lotter, W., 2018. Conditional infilling GANs for data
augmentation in mammogram classification. In Image Analysis for Moving Organ, Breast, and
Thoracic Images: Third International Workshop, RAMBO 2018, Fourth International Workshop,
BIA 2018, and First International Workshop, TIA 2018, Held in Conjunction with MICCAI 2018,
Granada, Spain, September 16 and 20, 2018, Proceedings 3 (pp. 98-106). Springer International
Publishing.
[39] Gao, Y., Tang, Z., Zhou, M. and Metaxas, D., 2021, June. Enabling data diversity: efficient
automatic augmentation via regularized adversarial training. In Information Processing in Medical
Imaging: 27th International Conference, IPMI 2021, Virtual Event, June 28-June 30, 2021,
Proceedings (pp. 85-97). Cham: Springer International Publishing.
[40] Lin, C., Tang, R., Lin, D.D., Liu, L., Lu, J., Chen, Y., Gao, D. and Zhou, J., 2019. Breast mass
detection in mammograms via blending adversarial learning. In Simulation and Synthesis in
Medical Imaging: 4th International Workshop, SASHIMI 2019, Held in Conjunction with
MICCAI 2019, Shenzhen, China, October 13, 2019, Proceedings 4 (pp. 52-61). Springer
International Publishing.
[41] Khalifa, N.E., Loey, M. and Mirjalili, S., 2022. A comprehensive survey of recent trends in
deep learning for digital images augmentation. Artificial Intelligence Review, pp.1-27.
[42] Dalmaz, O., Yurt, M. and Çukur, T., 2022. ResViT: residual vision transformers for
multimodal medical image synthesis. IEEE Transactions on Medical Imaging.
[43] Pavan Kumar, M.R. and Jayagopal, P., 2021. Generative adversarial networks: a survey on
applications and challenges. International Journal of Multimedia Information Retrieval, 10(1),
pp.1-24.
[44] Iqbal, A., Sharif, M., Yasmin, M., Raza, M. and Aftab, S., 2022. Generative adversarial
networks and its applications in the biomedical image segmentation: a comprehensive survey.
International Journal of Multimedia Information Retrieval, 11(3), pp.333-368.
[45] Qin, Z., Liu, Z., Zhu, P. and Xue, Y., 2020. A GAN-based image synthesis method for skin
lesion classification. Computer Methods and Programs in Biomedicine, 195, p.105568.
[46] Lee, D., Yu, H., Jiang, X., Rogith, D., Gudala, M., Tejani, M., Zhang, Q. and Xiong, L., 2020.
Generating sequential electronic health records using dual adversarial autoencoder. Journal of the
American Medical Informatics Association, 27(9), pp.1411-1419.
[47] Zhao, L., Wang, J., Pang, L., Liu, Y. and Zhang, J., 2020. GANsDTA: Predicting drug-target
binding affinity using GANs. Frontiers in genetics, 10, p.1243.
[48] Waheed, A., Goyal, M., Gupta, D., Khanna, A., Al-Turjman, F. and Pinheiro, P.R., 2020.
CovidGAN: data augmentation using auxiliary classifier GAN for improved COVID-19 detection.
IEEE Access, 8, pp.91916-91923.
[49] Saini, M. and Susan, S., 2020. Deep transfer with minority data augmentation for imbalanced
breast cancer dataset. Applied Soft Computing, 97, p.106759.
[50] Qasim, A.B., Ezhov, I., Shit, S., Schoppe, O., Paetzold, J.C., Sekuboyina, A., Kofler, F.,
Lipkova, J., Li, H. and Menze, B., 2020, September. Red-GAN: Attacking class imbalance via
conditioned generation. Yet another medical imaging perspective. In Medical Imaging with Deep
Learning (pp. 655-668). PMLR.
[51] Wang, J., Guan, Y., Zheng, C., Peng, R. and Li, X., 2021. A temporal-spectral generative
adversarial network based end-to-end packet loss concealment for wideband speech transmission.
The Journal of the Acoustical Society of America, 150(4), pp.2577-2588.
[52] Mirza, M. and Osindero, S., 2014. Conditional generative adversarial nets. arXiv preprint
arXiv:1411.1784.
[53] Yoo, T.K., Choi, J.Y. and Kim, H.K., 2020. A generative adversarial network approach to
predicting postoperative appearance after orbital decompression surgery for thyroid eye disease.
Computers in biology and medicine, 118, p.103628.
[54] Cheong, H., Devalla, S.K., Pham, T.H., Zhang, L., Tun, T.A., Wang, X., Perera, S.,
Schmetterer, L., Aung, T., Boote, C. and Thiery, A., 2020. DeshadowGAN: a deep learning
approach to remove shadows from optical coherence tomography images. Translational Vision
Science & Technology, 9(2), pp.23-23.
[55] Wang, W., Li, X., Xu, Z., Yu, W., Zhao, J., Ding, D. and Chen, Y., 2022. Learning two-
stream CNN for multi-modal age-related macular degeneration categorization. IEEE Journal of
Biomedical and Health Informatics, 26(8), pp.4111-4122.
[56] Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville,
A. and Bengio, Y., 2020. Generative adversarial networks. Communications of the ACM, 63(11),
pp.139-144.
[57] Goodfellow, I.J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S.,
Courville, A. and Bengio, Y., 2014. Generative adversarial nets. NIPS’14: Proceedings of the 27th
International Conference on Neural Information Processing Systems-Volume 2 (pp. 2672-2680).
[58] de Souza Jr, L.A., Passos, L.A., Mendel, R., Ebigbo, A., Probst, A., Messmann, H., Palm, C.
and Papa, J.P., 2020. Assisting Barrett's esophagus identification using endoscopic data
augmentation based on Generative Adversarial Networks. Computers in Biology and Medicine,
126, p.104029.
[59] Nair, S. and Gohel, J.V., 2020. A review on contemporary hole transport materials for
perovskite solar cells. Nanotechnology for Energy and Environmental Engineering, pp.145-168.
[60] Chang, Q., Qu, H., Zhang, Y., Sabuncu, M., Chen, C., Zhang, T. and Metaxas, D.N., 2020.
Synthetic learning: Learn from distributed asynchronized discriminator gan without sharing
medical image data. In Proceedings of the IEEE/CVF conference on computer vision and pattern
recognition (pp. 13856-13866).
[61] Nema, S., Dudhane, A., Murala, S. and Naidu, S., 2020. RescueNet: An unpaired GAN for
brain tumor segmentation. Biomedical Signal Processing and Control, 55, p.101641.
[62] Huang, Y., Zheng, F., Cong, R., Huang, W., Scott, M.R. and Shao, L., 2020. MCMT-GAN:
Multi-task coherent modality transferable GAN for 3D brain image synthesis. IEEE Transactions
on Image Processing, 29, pp.8187-8198.
[63] Wang, G., Song, T., Dong, Q., Cui, M., Huang, N. and Zhang, S., 2020. Automatic ischemic
stroke lesion segmentation from computed tomography perfusion images by image synthesis and
attention-based deep neural networks. Medical Image Analysis, 65, p.101787.
[64] Dou, Q., Ouyang, C., Chen, C., Chen, H. and Heng, P.A., 2018. Unsupervised cross-modality
domain adaptation of convnets for biomedical image segmentations with adversarial loss. arXiv
preprint arXiv:1804.10916.
[65] Teng, L., Fu, Z., Ma, Q., Yao, Y., Zhang, B., Zhu, K. and Li, P., 2020. Interactive
echocardiography translation using few-shot GAN transfer learning. Computational and
mathematical methods in medicine, 2020.
[66] Wang, W., Wang, G., Wu, X., Ding, X., Cao, X., Wang, L., Zhang, J. and Wang, P., 2021.
Automatic segmentation of prostate magnetic resonance imaging using generative adversarial
networks. Clinical Imaging, 70, pp.1-9.
[67] Sandfort, V., Yan, K., Pickhardt, P.J. and Summers, R.M., 2019. Data augmentation using
generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation
tasks. Scientific reports, 9(1), p.16884.
[68] Xia, K., Yin, H., Qian, P., Jiang, Y. and Wang, S., 2019. Liver semantic segmentation
algorithm based on improved deep adversarial networks in combination of weighted loss function
on abdominal CT images. IEEE Access, 7, pp.96349-96358.
[69] Goel, T., Murugan, R., Mirjalili, S. and Chakrabartty, D.K., 2021. Automatic screening of
COVID-19 using an optimized generative adversarial network. Cognitive computation, pp.1-16.
[70] Dong, X., Lei, Y., Wang, T., Thomas, M., Tang, L., Curran, W.J., Liu, T. and Yang, X., 2019.
Automatic multi-organ segmentation in thorax CT images using U-net-GAN. Medical Physics,
46(5), pp.2157-2168.
[71] Tong, N., Gou, S., Yang, S., Cao, M. and Sheng, K., 2019. Shape-constrained fully
convolutional DenseNet with adversarial training for multi-organ segmentation on head and neck
CT and low-field MR images. Medical Physics, 46(6), pp.2669-2682.
[72] Zhang, H., Cao, X., Xu, L. and Qi, L., 2019, August. Conditional convolution generative
adversarial network for bi-ventricle segmentation in cardiac MR images. In Proceedings of the
third international symposium on image computing and digital medicine (pp. 118-122).
[73] Yan, W., Wang, Y., Gu, S., Huang, L., Yan, F., Xia, L. and Tao, Q., 2019. The domain shift
problem of medical image segmentation and vendor-adaptation by Unet-GAN. In Medical Image
Computing and Computer Assisted Intervention--MICCAI 2019: 22nd International Conference,
Shenzhen, China, October 13-17, 2019, Proceedings, Part II 22 (pp. 623-631). Springer
International Publishing.
[74] Ossenberg-Engels, J. and Grau, V., 2020, January. Conditional generative adversarial
networks for the prediction of cardiac contraction from individual frames. In Statistical Atlases
and Computational Models of the Heart. Multi-Sequence CMR Segmentation, CRT-EPiggy and
LV Full Quantification Challenges: 10th International Workshop, STACOM 2019, Held in
Conjunction with MICCAI 2019, Shenzhen, China, October 13, 2019, Revised Selected Papers
(pp. 109-118). Cham: Springer International Publishing.
[75] Rezaei, M., Yang, H., Harmuth, K. and Meinel, C., 2019, January. Conditional generative
adversarial refinement networks for unbalanced medical image semantic segmentation. In 2019
IEEE winter conference on applications of computer vision (WACV) (pp. 1836-1845). IEEE.
[76] Sia, D., Villanueva, A., Friedman, S.L. and Llovet, J.M., 2017. Liver cancer cell of origin,
molecular class, and effects on patient prognosis. Gastroenterology, 152(4), pp.745-761.
[77] Sun, Y., Yuan, P. and Sun, Y., 2020, August. MM-GAN: 3D MRI data augmentation for
medical image segmentation via generative adversarial networks. In 2020 IEEE International
conference on knowledge graph (ICKG) (pp. 227-234). IEEE.
[78] Shankaranarayana, S.M., Ram, K., Mitra, K. and Sivaprakasam, M., 2017. Joint optic disc
and cup segmentation using fully convolutional and adversarial networks. In Fetal, Infant and
Ophthalmic Medical Image Analysis: International Workshop, FIFI 2017, and 4th International
Workshop, OMIA 2017, Held in Conjunction with MICCAI 2017, Québec City, QC, Canada,
September 14, Proceedings 4 (pp. 168-176). Springer International Publishing.
[79] Bisneto, T.R.V., de Carvalho Filho, A.O. and Magalhães, D.M.V., 2020. Generative
adversarial network and texture features applied to automatic glaucoma detection. Applied Soft
Computing, 90, p.106165.
[80] Park, K.B., Choi, S.H. and Lee, J.Y., 2020. M-GAN: Retinal blood vessel segmentation by
balancing losses through stacked deep fully convolutional networks. IEEE Access, 8, pp.146308-
146322.
[81] Yang, T., Wu, T., Li, L. and Zhu, C., 2020. SUD-GAN: deep convolution generative
adversarial network combined with short connection and dense block for retinal vessel
segmentation. Journal of digital imaging, 33, pp.946-957.
[82] Son, J., Park, S.J. and Jung, K.H., 2019. Towards accurate segmentation of retinal vessels and
the optic disc in fundoscopic images with generative adversarial networks. Journal of digital
imaging, 32(3), pp.499-512.
[83] Sorin, V., Barash, Y., Konen, E. and Klang, E., 2020. Creating artificial images for radiology
applications using generative adversarial networks (GANs): a systematic review. Academic
radiology, 27(8), pp.1175-1185.
[84] You, A., Kim, J.K., Ryu, I.H. and Yoo, T.K., 2022. Application of generative adversarial
networks (GAN) for ophthalmology image domains: a survey. Eye and Vision, 9(1), pp.1-19.
[85] Shin, Y., Yang, J. and Lee, Y.H., 2021. Deep generative adversarial networks: applications
in musculoskeletal imaging. Radiology: Artificial Intelligence, 3(3), p.e200157.
[86] Bellemo, V., Burlina, P., Yong, L., Wong, T.Y. and Ting, D.S.W., 2019. Generative
adversarial networks (GANs) for retinal fundus image synthesis. In Computer Vision--ACCV 2018
Workshops: 14th Asian Conference on Computer Vision, Perth, Australia, December 26, 2018,
Revised Selected Papers 14 (pp. 289-302). Springer International Publishing.
[87] Burlina, P.M., Joshi, N., Pacheco, K.D., Liu, T.A. and Bressler, N.M., 2019. Assessment of
deep generative models for high-resolution synthetic retinal image generation of age-related
macular degeneration. JAMA ophthalmology, 137(3), pp.258-264.
[88] Yoon, J., Drumright, L.N. and Van Der Schaar, M., 2020. Anonymization through data
synthesis using generative adversarial networks (ads-gan). IEEE journal of biomedical and health
informatics, 24(8), pp.2378-2388.
[89] Tschuchnig, M.E., Oostingh, G.J. and Gadermayr, M., 2020. Generative adversarial networks
in digital pathology: a survey on trends and future potential. Patterns, 1(6), p.100089.
[90] Ronneberger, O., Fischer, P. and Brox, T., 2015. U-net: Convolutional networks for
biomedical image segmentation. In Medical Image Computing and Computer-Assisted
Intervention--MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9,
2015, Proceedings, Part III 18 (pp. 234-241). Springer International Publishing.
[91] Lei, B., Xia, Z., Jiang, F., Jiang, X., Ge, Z., Xu, Y., Qin, J., Chen, S., Wang, T. and Wang, S.,
2020. Skin lesion segmentation via generative adversarial networks with dual discriminators.
Medical Image Analysis, 64, p.101716.