
Ultrasound denoising using the pix2pix GAN

Afonso Raposo¹
afonso.raposo@tecnico.ulisboa.pt
António Azeitona¹
antoniorrazeitona@tecnico.ulisboa.pt
Manya Afonso²
manya.afonso@wur.nl
J. Miguel Sanches¹
jmrs@tecnico.ulisboa.pt
¹ Institute for Systems and Robotics (ISR), LARSyS,
Instituto Superior Técnico,
Departamento de Bioengenharia,
Universidade de Lisboa
² Wageningen University and Research,
Wageningen, The Netherlands
Abstract
The use of ultrasound (US) as an imaging technique is essential for the
diagnosis of atherosclerotic cardiovascular disease (ASCVD), which depends
on US images of the carotid artery. However, US images are plagued by a
specific type of noise called Speckle noise, which lowers image quality
dramatically. As an attempt to improve US image quality, the use of a
generative adversarial network (GAN) is explored. The GAN chosen for this
is the pix2pix model and the dataset used for training is composed of
images containing simple geometric shapes of various scales and their
equivalent corrupted with Speckle noise following the Log-Compression
model. The results of this GAN are displayed and a noticeable improvement
can be verified in the image quality.
1 Introduction
The two main predictors used for the diagnosis and assessment of
atherosclerotic cardiovascular disease risk are the carotid intima-media
thickness and the analysis of the carotid arterial plaque, both of which
are obtained through ultrasound (US) imaging [4, 6]. Although there are
some promising studies on fully automatic segmentation techniques, their
performance is still far from ideal because of the high level of Speckle
noise [5]. The current approach to denoising US images is based on
non-linear filters such as anisotropic diffusion filters and adaptive
median filters [2]. These filters tend to preserve the contours of
structures but over-smooth the remaining areas.
Figure 1: Ultrasound image of a liver (left) and corresponding images
resulting from anisotropic diffusion filtering (middle) and adaptive median
filtering (right).
The introduction of Generative Adversarial Networks (GANs) as a means to
generate images presents a new opportunity for developing novel denoising
techniques. The most widespread of these networks is pix2pix [3], which is
trained on pixel-wise paired images so that it receives an image as input
and outputs a version of that image with different characteristics, a task
known as image-to-image translation.
2 Problem Formulation
The pix2pix network requires pairs of images to be trained. In this case,
each pair consists of a US image with Speckle noise and the same image
without Speckle noise.
This work was supported by Portuguese funds through FCT (Fundação para a Ciência
e Tecnologia) through the projects reference UIDP/50009/2020 and through the reference
UID/EEA/50009/2019, LARSyS - FCT Plurianual funding 2020-2023.
Speckle noise follows the Rayleigh distribution:
$$\rho(y_i) = \frac{y_i}{\sigma^2} \exp\!\left(-\frac{y_i^2}{2\sigma^2}\right) \tag{1}$$
where ρ is the p.d.f., y_i is the intensity value of the i-th pixel in the
grayscale ultrasound image and σ is a scale factor dependent on the
scattering amplitude of the particles in the medium [9].
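As a quick numerical illustration of equation 1, the sketch below evaluates the standard Rayleigh density (with its negative exponent) and checks two of its known properties: it integrates to one and its mode sits at σ. The function name `rayleigh_pdf`, the value σ = 2 and the integration grid are illustrative assumptions, not values taken from this work.

```python
import numpy as np

def rayleigh_pdf(y, sigma):
    """Rayleigh p.d.f. of equation 1, evaluated for y >= 0."""
    y = np.asarray(y, dtype=float)
    return (y / sigma**2) * np.exp(-(y**2) / (2.0 * sigma**2))

# Illustrative scale factor (assumption, not a value from the paper).
sigma = 2.0
y = np.linspace(0.0, 40.0, 200_001)
pdf = rayleigh_pdf(y, sigma)

# Trapezoidal rule: the density should integrate to approximately 1.
area = np.sum((pdf[1:] + pdf[:-1]) * np.diff(y)) / 2.0

# The Rayleigh density peaks at y = sigma.
mode = y[np.argmax(pdf)]
```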
B-mode US images undergo logarithmic compression after the data is
acquired, which can be modeled by equation 2:
$$z_{ij} = \alpha \log(y_{ij} + 1) + \beta \tag{2}$$
where i and j give the position of the pixel, z is the pixel value after
compression, y is the pixel value of the radio frequency (RF) image, and α
and β are parameters dependent on the contrast and brightness,
respectively [7].
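Equation 2 translates directly into code. The sketch below is a minimal illustration, assuming the natural logarithm and floating-point pixel values; the function names are hypothetical, and the inverse mapping is simply equation 2 solved for y, included only to show that the compression is invertible when α and β are known.

```python
import numpy as np

def log_compress(y, alpha, beta):
    """Equation 2: z = alpha * log(y + 1) + beta (natural log assumed)."""
    y = np.asarray(y, dtype=float)
    return alpha * np.log(y + 1.0) + beta

def log_decompress(z, alpha, beta):
    """Equation 2 solved for y: y = exp((z - beta) / alpha) - 1."""
    z = np.asarray(z, dtype=float)
    return np.exp((z - beta) / alpha) - 1.0

# Round trip on a few illustrative RF intensities (arbitrary values).
rf = np.array([0.0, 1.0, 10.0, 100.0])
z = log_compress(rf, alpha=20.0, beta=50.0)
rf_back = log_decompress(z, alpha=20.0, beta=50.0)
```

A zero RF intensity maps to β, which is why β acts as a brightness offset, while α scales the contrast.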
These mathematical models make it possible to create synthetic pairs of
images to train the network, which, once trained, accepts US images and
returns denoised versions of them.
3 Methods
The dataset used was composed of synthetic images of several geometric
shapes of varying dimensions, intensities, and number. These images were
generated with the draw.random_shapes function from the skimage library,
using the parameters: shape=(256, 256), allow_overlap=True,
min_shapes=128, max_shapes=256, min_size=10, max_size=50. The intensity of
the images was then inverted so that the background was black, and pixels
with the lowest intensities (<5) were set to an intensity of 5. A total of
2560 images were generated this way, corresponding to the target output of
the training dataset (denoised US images).
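The generation of one clean target image could look roughly like the sketch below. It assumes that the parameter listed above as shape=(256, 256) corresponds to the `image_shape` argument of `skimage.draw.random_shapes`, that a single grayscale channel was used, and that the inversion is `255 - image`; none of these details is stated explicitly, and `make_clean_image` is a hypothetical helper name.

```python
import numpy as np
from skimage.draw import random_shapes

def make_clean_image(floor=5):
    """Generate one synthetic target image (a sketch, not the authors' code)."""
    image, _labels = random_shapes(
        image_shape=(256, 256),  # assumed meaning of shape=(256, 256)
        min_shapes=128,
        max_shapes=256,
        min_size=10,
        max_size=50,
        allow_overlap=True,
        num_channels=1,          # assumption: single grayscale channel
    )
    gray = image[..., 0].astype(np.int32)
    # random_shapes draws on a white (255) background, so inverting
    # the intensities makes the background black.
    inverted = 255 - gray
    # Raise every intensity below the floor (5 in the text) to the floor.
    return np.maximum(inverted, floor).astype(np.uint8)

clean = make_clean_image()
```

Keeping the minimum intensity above zero matters for the next step, where the clean image acts as the Rayleigh scale factor.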
The training input images (noisy US images) were computed from the images
obtained with the method described above. To simulate a US image with
Speckle noise, the Rayleigh distribution (equation 1) was used, taking the
original synthetic image as the pixel-wise value of the scale factor σ.
However, the model for logarithmic compression displayed in figure 2 shows
that, after the noisy data (RF image) is obtained, a few more steps are
needed to reach the final B-mode US image.
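The speckle simulation described above can be sketched with NumPy's Rayleigh sampler, using the clean image as a per-pixel scale. The helper name `add_speckle`, the seed and the constant test image are assumptions made for illustration only.

```python
import numpy as np

def add_speckle(clean, rng):
    """Draw one Rayleigh sample per pixel, with sigma given by the clean image."""
    sigma = np.asarray(clean, dtype=np.float64)
    # Generator.rayleigh broadcasts an array-valued scale over the pixels.
    return rng.rayleigh(scale=sigma)

rng = np.random.default_rng(0)
# Constant image with sigma = 20 everywhere (illustrative value).
clean = np.full((256, 256), 20.0)
rf = add_speckle(clean, rng)

# For a Rayleigh variable the mean is sigma * sqrt(pi / 2).
expected_mean = 20.0 * np.sqrt(np.pi / 2.0)
```

Because the mean of the simulated RF image is proportional to σ, brighter regions of the clean image yield both brighter and noisier regions, which is the multiplicative behaviour expected of speckle.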
Figure 2: Full diagram of the model for the generic processing operations
of an ultrasound system [8].
The first of these steps is an interpolation, which was mimicked by
applying a 2D decimation to the images, reducing them to a quarter of
their size, followed by a linear interpolation that restores their
original dimensions.
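One possible reading of this step is sketched below. It assumes that "a quarter of their size" means keeping every second pixel along each axis (a quarter of the pixel count), that decimation is plain subsampling without a low-pass filter, and that the linear interpolation is applied separably along columns and then rows. All three points are assumptions, and `decimate_and_restore` is a hypothetical name.

```python
import numpy as np

def decimate_and_restore(img, factor=2):
    """Subsample by `factor` per axis, then linearly interpolate back."""
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    small = img[::factor, ::factor]      # retained samples
    r_src = np.arange(0, rows, factor)   # row positions of the samples
    c_src = np.arange(0, cols, factor)   # column positions of the samples

    # Interpolate along the columns for every retained row.
    tmp = np.stack([np.interp(np.arange(cols), c_src, row) for row in small])
    # Then interpolate along the rows for every column.
    out = np.stack(
        [np.interp(np.arange(rows), r_src, col) for col in tmp.T], axis=1
    )
    return out

# Small ramp image used only to exercise the function.
ramp = np.arange(64, dtype=float).reshape(8, 8)
restored = decimate_and_restore(ramp)
```

The retained samples pass through unchanged, while the pixels in between become weighted averages of their neighbours, which spatially correlates the noise much like the interpolation stage of a real scanner.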
The other step is the logarithmic compression expressed by equation 2.
This process depends on two parameters, α and β, that are usually not
provided by US equipment manufacturers and, therefore, are unknown. To
increase the versatility and robustness of the network, these parameters
were randomly selected for each image from the following intervals:
[10, 50] for α and [50, 50] for β.
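The per-image parameter selection can be sketched as follows, assuming uniform sampling (the sampling distribution is not stated). The intervals are copied as printed; since the β interval is [50, 50], this sketch always returns β = 50. The names `ALPHA_RANGE`, `BETA_RANGE` and `compress_with_random_params` are hypothetical.

```python
import numpy as np

# Intervals as printed in the text; the beta interval is degenerate.
ALPHA_RANGE = (10.0, 50.0)
BETA_RANGE = (50.0, 50.0)

def sample_compression_params(rng):
    """Draw (alpha, beta) uniformly from the intervals (assumed distribution)."""
    alpha = rng.uniform(*ALPHA_RANGE)
    beta = rng.uniform(*BETA_RANGE)
    return alpha, beta

def compress_with_random_params(rf, rng):
    """Apply equation 2 to an RF image with freshly sampled parameters."""
    alpha, beta = sample_compression_params(rng)
    z = alpha * np.log(np.asarray(rf, dtype=float) + 1.0) + beta
    return z, alpha, beta

rng = np.random.default_rng(0)
z, alpha, beta = compress_with_random_params(np.zeros((4, 4)), rng)
```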
Proceedings of RECPAD 2021 27th Portuguese Conference on Pattern Recognition
Figure 3: Example of the image pairs resulting from the method used: the
input image (left) and the target (right).
The network was trained for a total of 200 epochs, using 2048 image pairs
for training and 512 image pairs for validation.
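The split sizes above add up to the 2560 generated pairs. A minimal sketch of such a split is given below; whether the pairs were shuffled before splitting is not stated, so the random permutation and the seed are assumptions, as is the helper name `split_indices`.

```python
import numpy as np

N_TOTAL, N_TRAIN, N_VAL = 2560, 2048, 512

def split_indices(rng):
    """Return disjoint index arrays for the training and validation pairs."""
    perm = rng.permutation(N_TOTAL)  # assumption: random assignment
    return perm[:N_TRAIN], perm[N_TRAIN:]

train_idx, val_idx = split_indices(np.random.default_rng(0))
```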
4 Results and Discussion
After the network was trained, some images from the validation set were
fed to it so that the image generated could be compared to the target
output, as shown in figure 4.
Figure 4: Example of a simulated noisy US image from the validation set
(left), the output given by the trained network (middle) and the original
synthetic image (right).
Although some distortion is present, the recovered structural information
appears more than satisfactory. The Structural Similarity Index (SSIM,
higher is better) and the Peak Signal-to-Noise Ratio (PSNR, higher is
better) were calculated as quality assessment metrics [1]; the values are
shown in table 1.
Table 1: Values of peak signal-to-noise ratio and structural similarity
index corresponding to different denoising techniques
Method              PSNR (dB)  SSIM
Anisotropic Filter  11.906     0.461
Adaptive Median     12.053     0.452
Ours                21.085     0.789
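Metrics of this kind can be computed with scikit-image, as in the sketch below. The implementation and settings used to produce table 1 are not stated, so the choice of `skimage.metrics`, the 8-bit data range of 255 and the helper name `evaluate` are assumptions.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(target, output):
    """Return (PSNR in dB, SSIM) of `output` against the clean `target`."""
    psnr = peak_signal_noise_ratio(target, output, data_range=255)
    ssim = structural_similarity(target, output, data_range=255)
    return psnr, ssim

# Toy example: two flat 8-bit images that differ by 10 grey levels,
# so the mean squared error is exactly 100.
target = np.full((64, 64), 100, dtype=np.uint8)
output = np.full((64, 64), 110, dtype=np.uint8)
psnr, ssim = evaluate(target, output)
```

In this toy case the PSNR is 10·log10(255² / 100), roughly 28.13 dB, which gives a sense of scale for the values in table 1.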
The drastic difference in the values is easily explained by comparing the
images (figure 6): the classical filters do not improve the intensity
values of the image in the way that the network does.
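Nonetheless, both the SSIM and the PSNR testify to the potential of the
network when compared to classical filtering methods: relative to the best
classical result, the PSNR is about 9 dB higher (21.085 versus 12.053) and
the SSIM is about 1.7 times higher (0.789 versus 0.461).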
The finished network was also used to denoise real US images of the
carotid artery, resulting in the images shown in figure 5.
In this case, quality assessment measures cannot be computed because,
being a real US image, there is no ground truth image (a completely clean
image) available. Even so, visual comparison with the aforementioned
classical methods is possible (figure 6).
As can be seen, the image resulting from the GAN highlights the more
prominent structures and preserves most contours without retaining noise.
Admittedly, it loses some of the "realness" of a typical US image, since
the model was trained with geometric synthetic images.
Figure 5: Example of a real carotid US image from (left) and the output
given by the trained network (right).
Figure 6: Comparing various ultrasound denoising algorithms to our
method. Panels: original, anisotropic filter, adaptive median filter, our
method.
5 Conclusion
The use of the pix2pix network as a tool to denoise and enhance the image
quality of US images is an appealing prospect and, as shown in this work,
shows promise in this area. In this work, synthetic images were used to
train the network and, although the influence of the geometric shapes is
clear on the resulting carotid US image, the increased sharpness and
preservation of contours hold great value for both physicians and
automatic segmentation algorithms. There is still room for improvement,
specifically in the architecture of the network, adjustments to the
network's loss function, and refinement of the datasets used for training,
making the use of GANs as a denoising tool an enticing avenue for further
study.
References
[1] Li Sze Chow and Raveendran Paramesran. Review of medical image quality
assessment. Biomedical Signal Processing and Control, 27:145–154, May
2016. doi: 10.1016/j.bspc.2016.02.006.
[2] Linwei Fan, Fan Zhang, Hui Fan, and Caiming Zhang. Brief review of image
denoising techniques. Visual Computing for Industry, Biomedicine, and Art,
2(1), July 2019. doi: 10.1186/s42492-019-0016-7.
[3] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros.
Image-to-image translation with conditional adversarial networks, 2018.
[4] Amer M. Johri, Vijay Nambi, Tasneem Z. Naqvi, Steven B. Feinstein,
Esther S.H. Kim, Margaret M. Park, Harald Becher, and Henrik Sillesen.
Recommendations for the assessment of carotid arterial plaque by ultrasound
for the characterization of atherosclerosis and evaluation of cardiovascular
risk: From the American Society of Echocardiography. Journal of the American
Society of Echocardiography, 33(8):917–933, 2020.
doi: 10.1016/j.echo.2020.04.021.
[5] P Krishna Kumar, Tadashi Araki, Jeny Rajan, John R Laird, Andrew
Nicolaides, and Jasjit S. Suri. State-of-the-art review on automated lumen
and adventitial border delineation and its measurements in carotid
ultrasound. Computer Methods and Programs in Biomedicine, 163:155–168,
September 2018. doi: 10.1016/j.cmpb.2018.05.015.
[6] Joseph F. Polak and Daniel H. O'Leary. Carotid intima-media thickness as
surrogate for and predictor of CVD. Global Heart, 11(3):295, September 2016.
doi: 10.1016/j.gheart.2016.08.006.
[7] Jose Seabra and Joao Sanches. Modeling log-compressed ultrasound images
for radio frequency signal recovery. In 2008 30th Annual International
Conference of the IEEE Engineering in Medicine and Biology Society. IEEE,
August 2008. doi: 10.1109/iembs.2008.4649181.
[8] José Carlos Rosa Seabra. Medical Ultrasound B-Mode Modeling,
Despeckling and Tissue Characterization Assessing the Atherosclerotic
Disease. PhD thesis, Instituto Superior Técnico, 2011.
[9] R.F. Wagner, S.W. Smith, J.M. Sandrik, and H. Lopez. Statistics of
speckle in ultrasound B-scans. IEEE Transactions on Sonics and Ultrasonics,
30(3):156–163, May 1983. doi: 10.1109/t-su.1983.31404.