All content in this area was uploaded by Afonso Raposo on Jan 11, 2022
Content may be subject to copyright.
Ultrasound denoising using the pix2pix GAN
Afonso Raposo¹
afonso.raposo@tecnico.ulisboa.pt
António Azeitona¹
antoniorrazeitona@tecnico.ulisboa.pt
Manya Afonso²
manya.afonso@wur.nl
J. Miguel Sanches¹
jmrs@tecnico.ulisboa.pt
¹ Institute for Systems and Robotics (ISR), LARSyS,
Instituto Superior Técnico,
Departamento de Bioengenharia,
Universidade de Lisboa
² Wageningen University and Research,
Wageningen, The Netherlands
Abstract
Ultrasound (US) imaging is essential for the diagnosis of atherosclerotic
cardiovascular disease (ASCVD), which depends on US images of the carotid
artery. However, US images are corrupted by a characteristic type of noise
called speckle, which substantially lowers image quality. As an attempt to
improve US image quality, the use of a generative adversarial network (GAN)
is explored. The GAN chosen is the pix2pix model, and the training dataset
is composed of images containing simple geometric shapes of various scales,
paired with versions of the same images corrupted with speckle noise
following the log-compression model. The results of this GAN are presented
and show a noticeable improvement in image quality.
1 Introduction
The two main predictors used for the diagnosis and risk assessment of
atherosclerotic cardiovascular disease are the carotid intima-media
thickness and the analysis of the carotid arterial plaque, both of which
are obtained by ultrasound (US) imaging [4, 6]. Although there are some
promising studies on fully automatic segmentation techniques, their
performance is still far from ideal because of the high level of speckle
noise [5]. The current approach to denoising US images is based on
non-linear filters such as anisotropic diffusion filters and adaptive
median filters [2]. These filters tend to preserve the contours of
structures but over-smooth the remaining areas.
Figure 1: Ultrasound image of a liver (left) and corresponding images
resulting from anisotropic diffusion filtering (middle) and adaptive median
filtering (right).
The introduction of Generative Adversarial Networks (GANs) as a means of
generating images presents a new opportunity for developing novel denoising
techniques. One of the most widely used of these networks is pix2pix [3],
which is trained on pixel-wise paired images so that it receives an image
as input and outputs a version of that image with different
characteristics, a task known as image-to-image translation.
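For reference, the objective that pix2pix optimizes, as formulated in [3],
combines a conditional adversarial loss with an L1 reconstruction term. The
symbols below are chosen only for this summary and differ from those in [3]
to avoid clashing with the pixel notation of the next section: u is the
input image, v the paired target, n a noise source, G the generator, D the
discriminator, and λ a weighting factor.

$$\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{u,v}\left[\log D(u, v)\right] + \mathbb{E}_{u,n}\left[\log\left(1 - D(u, G(u, n))\right)\right]$$

$$\mathcal{L}_{L1}(G) = \mathbb{E}_{u,v,n}\left[\lVert v - G(u, n) \rVert_1\right]$$

$$G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)$$

In a denoising setting, u would be the noisy image and v the clean one, so
the L1 term pulls the output toward the clean target while the adversarial
term penalizes outputs that the discriminator can tell apart from real
targets.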
2 Problem Formulation
Training the pix2pix network requires pairs of images. In this case, each
pair consists of a US image with speckle noise and the same image without
speckle noise.
Speckle noise follows the Rayleigh distribution:
$$\rho(y_i) = \frac{y_i}{\sigma^2} e^{-\frac{y_i^2}{2\sigma^2}} \tag{1}$$
This work was supported by Portuguese funds through FCT (Fundação para a Ciência
e Tecnologia) through the projects reference UIDP/50009/2020 and through the reference
UID/EEA/50009/2019, LARSyS - FCT Plurianual funding 2020-2023.
In equation 1, ρ is the probability density function (p.d.f.), y_i is the
intensity value of the i-th pixel in the grayscale ultrasound image, and σ
is a scale factor that depends on the scattering amplitude of the particles
in the medium [9].
B-mode US images undergo logarithmic compression after data acquisition,
which can be modeled by equation 2:
$$z_{ij} = \alpha \log(y_{ij} + 1) + \beta \tag{2}$$
where i and j index the position of the pixel, z is the pixel after
compression, y is the pixel of the radio frequency (RF) image, and α and β
are parameters that depend on the contrast and brightness, respectively [7].
These mathematical models make it possible to create synthetic pairs of
images to train the network, which, once trained, accepts US images and
returns denoised versions of them.
3 Methods
The dataset was composed of synthetic images of geometric shapes of varying
dimensions, intensities, and number. These images were generated with the
draw.random_shapes function from the skimage library, using the parameters:
image_shape=(256, 256), allow_overlap=True, min_shapes=128, max_shapes=256,
min_size=10, max_size=50. The intensities were then inverted so that the
background was black, and pixels with intensities below 5 were set to 5.
A total of 2560 images were generated this way, serving as the target
outputs of the dataset (denoised US images).
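The target generation described above can be sketched as follows. This is
a minimal illustration under stated assumptions, not the authors' code: the
helper name make_clean_target is hypothetical, and requesting a single
channel with num_channels=1 is one assumed way of obtaining a grayscale
image, since the text does not say how this was done.

```python
import numpy as np
from skimage.draw import random_shapes


def make_clean_target(floor=5):
    """Return one 256x256 synthetic target with a dark background."""
    # Parameters follow the text; num_channels=1 is an assumption and
    # yields an array of shape (256, 256, 1) on a white background.
    image, _labels = random_shapes(
        image_shape=(256, 256),
        max_shapes=256,
        min_shapes=128,
        min_size=10,
        max_size=50,
        num_channels=1,
        allow_overlap=True,
    )
    gray = image[..., 0].astype(np.float64)
    inverted = 255.0 - gray  # white background (255) becomes black (0)
    # Raise every intensity below `floor` to `floor`, as described above.
    return np.maximum(inverted, floor)


target = make_clean_target()
print(target.shape, target.min(), target.max())
```

Keeping every pixel at 5 or above means that no pixel has a zero scale
parameter when the image is later used as σ in the Rayleigh model.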
The training input images (noisy US images) were computed from the images
obtained with the method described above. To simulate a US image with
speckle noise, each pixel was drawn from the Rayleigh distribution
(equation 1), taking the original synthetic image as the value of the scale
parameter σ. However, the model of the processing chain shown in figure 2
indicates that, after the noisy data (RF image) is obtained, a few more
steps are needed to reach the final B-mode US image.
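A minimal sketch of this sampling step, assuming NumPy's random generator
is an acceptable stand-in for whatever sampler was actually used. The
function name simulate_rf_speckle and the seed argument are hypothetical.

```python
import numpy as np


def simulate_rf_speckle(clean, seed=None):
    """Draw one Rayleigh sample per pixel, using the clean image as sigma."""
    rng = np.random.default_rng(seed)
    # An array-valued scale gives every pixel its own Rayleigh parameter.
    return rng.rayleigh(scale=clean)


# Toy example: a flat region with sigma = 20.
clean = np.full((64, 64), 20.0)
noisy = simulate_rf_speckle(clean, seed=0)
print(noisy.mean(), 20.0 * np.sqrt(np.pi / 2.0))
```

The mean of a Rayleigh variable is σ·sqrt(π/2), so the two printed values
should be close for a flat region. The noise is multiplicative in nature:
brighter regions of the clean image produce proportionally larger
fluctuations.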
Figure 2: Full diagram of the model for the generic processing operations
of an ultrasound system [8].
The first of these steps is an interpolation, which was mimicked by
applying a 2D decimation to the images, reducing them to a quarter of their
size, and then applying a linear interpolation to restore their original
dimensions.
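The decimation and interpolation step might look like the sketch below. It
rests on two assumptions that the text does not settle: "a quarter of their
size" is read as a quarter of the pixel count (a factor of 2 along each
axis), and block averaging is used as a simple stand-in for the low-pass
filtering of a proper decimation. The function name is hypothetical.

```python
import numpy as np


def decimate_and_restore(image, factor=2):
    """Block-average by `factor` per axis, then upsample linearly."""
    h, w = image.shape
    if h % factor or w % factor:
        raise ValueError("image dimensions must be divisible by factor")
    # Decimation: mean over non-overlapping factor x factor blocks.
    small = image.reshape(h // factor, factor, w // factor, factor)
    small = small.mean(axis=(1, 3))
    # Centers of the coarse samples, expressed in fine-grid coordinates.
    ys = (np.arange(h // factor) + 0.5) * factor - 0.5
    xs = (np.arange(w // factor) + 0.5) * factor - 0.5
    # Separable linear interpolation: along x first, then along y.
    rows = np.stack([np.interp(np.arange(w), xs, r) for r in small])
    return np.stack(
        [np.interp(np.arange(h), ys, c) for c in rows.T], axis=1
    )


flat = np.full((8, 8), 7.0)
restored = decimate_and_restore(flat)
print(restored.shape)
```

The result has the original dimensions but less fine detail, which is what
gives the simulated speckle its spatially correlated, blob-like appearance
rather than independent pixel noise.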
The other step is the logarithmic compression expressed by equation 2. This
process depends on two parameters, α and β, which are usually not provided
by US equipment manufacturers and are therefore unknown. To increase the
versatility and robustness of the network, these parameters were randomly
selected for each image from the intervals [10, 50] for α and [−50, 50]
for β.
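Equation 2 with per-image random parameters can be sketched as below. The
text states only the intervals, so drawing α and β from a uniform
distribution is an assumption, and both function names are hypothetical.

```python
import numpy as np


def sample_compression_params(rng):
    """Draw alpha from [10, 50] and beta from [-50, 50] (uniform assumed)."""
    alpha = rng.uniform(10.0, 50.0)
    beta = rng.uniform(-50.0, 50.0)
    return alpha, beta


def log_compress(rf, alpha, beta):
    """Apply z = alpha * log(y + 1) + beta element-wise (equation 2)."""
    return alpha * np.log(rf + 1.0) + beta


rng = np.random.default_rng(0)
alpha, beta = sample_compression_params(rng)
rf = np.array([[0.0, 10.0], [100.0, 1000.0]])
print(log_compress(rf, alpha, beta))
```

Any clipping or rescaling of z to a display range such as [0, 255] is left
out because the text does not describe it.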
Proceedings of RECPAD 2021 27th Portuguese Conference on Pattern Recognition
91
Figure 3: Example of the image pairs resulting from the method used: the
input image (left) and the target (right).
The network was trained for 200 epochs, using 2048 image pairs for training
and 512 image pairs for validation.
4 Results and Discussion
After the network was trained, some images from the validation set were fed
to it so that the generated image could be compared with the target output,
as shown in figure 4.
Figure 4: Example of a simulated noisy US image from the validation set
(left), the output given by the trained network (middle) and the original
synthetic image (right).
Although some distortion is present, most of the structural information is
recovered. To quantify this, the Structural Similarity Index (SSIM) and the
Peak Signal-to-Noise Ratio (PSNR), for both of which higher is better, were
calculated as quality assessment metrics [1]. The values are shown in
table 1.
Table 1: Peak signal-to-noise ratio (PSNR) and structural similarity index
(SSIM) for different denoising techniques.
Method                         PSNR     SSIM
Anisotropic diffusion filter   11.906   0.461
Adaptive median filter         12.053   0.452
Ours                           21.085   0.789
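Metrics of this kind can be computed with the skimage.metrics module, as in
the sketch below. Whether the authors used these functions is not stated,
and data_range=255 is an assumption about the intensity scale. The helper
name score and the toy images are hypothetical, so the printed numbers are
unrelated to table 1.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def score(reference, estimate, data_range=255.0):
    """Return (PSNR, SSIM) of `estimate` against the clean `reference`."""
    psnr = peak_signal_noise_ratio(
        reference, estimate, data_range=data_range
    )
    ssim = structural_similarity(
        reference, estimate, data_range=data_range
    )
    return psnr, ssim


# Toy data: a random reference and a mildly perturbed copy of it.
rng = np.random.default_rng(0)
reference = rng.uniform(0.0, 255.0, size=(64, 64))
estimate = np.clip(reference + rng.normal(0.0, 10.0, size=(64, 64)), 0, 255)
psnr, ssim = score(reference, estimate)
print(psnr, ssim)
```

PSNR depends only on the mean squared error relative to the data range,
whereas SSIM compares local means, variances, and covariances, so the two
metrics can rank methods differently.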
The large difference in the values is easily explained by comparing the
images (figure 6): the classical filters do not restore the intensity
values of the image in the way that the network does.
Both the SSIM and the PSNR attest to the potential of the network compared
with classical filtering methods, with scores roughly 1.7 times those of
the best classical filter on each metric.
The trained network was also used to denoise real US images of the carotid
artery, producing the images shown in figure 5.
In this case, quality metrics cannot be computed because, for a real US
image, no ground truth (a completely clean image) is available. Even so, a
visual comparison with the aforementioned classical methods is possible
(figure 6).
As can be seen, the image produced by the GAN highlights the more prominent
structures and preserves most contours without retaining noise. Admittedly,
it loses some of the realism of a typical US image, since the model was
trained on synthetic geometric images.
5 Conclusion
The use of the pix2pix network as a tool to denoise and enhance the quality
of US images is an appealing prospect and, as shown in this work, a
promising one. Synthetic images were used to train the network and,
although the influence of the geometric shapes is clear in the resulting
carotid US image, the increased sharpness and preservation of contours are
valuable for both physicians and automatic segmentation algorithms. There
is still room for improvement, specifically in the architecture of the
network, the network’s loss function, and the datasets used for training,
which makes the use of GANs as a denoising tool an enticing avenue for
further study.
Figure 5: Example of a real carotid US image (left) and the output given by
the trained network (right).
Figure 6: Comparison of various ultrasound denoising algorithms with our
method. Panels: original, anisotropic filter, adaptive median filter, our
method.
References
[1] Li Sze Chow and Raveendran Paramesran. Review of medical image quality
assessment. Biomedical Signal Processing and Control, 27:145–154, May 2016.
doi: 10.1016/j.bspc.2016.02.006.
[2] Linwei Fan, Fan Zhang, Hui Fan, and Caiming Zhang. Brief review of image
denoising techniques. Visual Computing for Industry, Biomedicine, and Art,
2(1), July 2019. doi: 10.1186/s42492-019-0016-7.
[3] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros.
Image-to-image translation with conditional adversarial networks, 2018.
[4] Amer M. Johri, Vijay Nambi, Tasneem Z. Naqvi, Steven B. Feinstein,
Esther S.H. Kim, Margaret M. Park, Harald Becher, and Henrik Sillesen.
Recommendations for the assessment of carotid arterial plaque by ultrasound
for the characterization of atherosclerosis and evaluation of
cardiovascular risk: From the American Society of Echocardiography. Journal
of the American Society of Echocardiography, 33(8):917–933, 2020.
doi: 10.1016/j.echo.2020.04.021.
[5] P Krishna Kumar, Tadashi Araki, Jeny Rajan, John R Laird, Andrew
Nicolaides, and Jasjit S. Suri. State-of-the-art review on automated lumen
and adventitial border delineation and its measurements in carotid
ultrasound. Computer Methods and Programs in Biomedicine, 163:155–168,
September 2018. doi: 10.1016/j.cmpb.2018.05.015.
[6] Joseph F. Polak and Daniel H. O’Leary. Carotid intima-media thickness as
surrogate for and predictor of CVD. Global Heart, 11(3):295, September 2016.
doi: 10.1016/j.gheart.2016.08.006.
[7] Jose Seabra and Joao Sanches. Modeling log-compressed ultrasound images
for radio frequency signal recovery. In 2008 30th Annual International
Conference of the IEEE Engineering in Medicine and Biology Society. IEEE,
August 2008. doi: 10.1109/iembs.2008.4649181.
[8] José Carlos Rosa Seabra. Medical Ultrasound B-Mode Modeling,
De-speckling and Tissue Characterization Assessing the Atherosclerotic
Disease. PhD thesis, Instituto Superior Técnico, 2011.
[9] R.F. Wagner, S.W. Smith, J.M. Sandrik, and H. Lopez. Statistics of
speckle in ultrasound b-scans. IEEE Transactions on Sonics and Ultrasonics,
30(3):156–163, May 1983. doi: 10.1109/t-su.1983.31404.