Title: Dynamic contrast-enhanced computed tomography diagnosis of primary liver cancers
using transfer learning of pre-trained convolutional neural networks: Is registration of
multi-phasic images necessary?
Authors: Akira Yamada (ORCID: 0000-0002-4199-203X), Kazuki Oyama, Sachie Fujita, Eriko
Yoshizawa, Fumihito Ichinohe, Daisuke Komatsu, Yasunari Fujinaga
Affiliation: Shinshu University School of Medicine, Department of Radiology
Corresponding author: Akira Yamada
Address: 3-1-1 Asahi, Matsumoto, Nagano 390-8621, Japan
E-mail: a_yamada@shinshu-u.ac.jp
Telephone number: +81-263-37-2650
Fax number: +81-263-37-3087
Conflict of interest:
The authors declare that they have no conflict of interest.
Abstract
Purpose: To evaluate the effect of image registration on the diagnostic performance of transfer
learning (TL) using pre-trained convolutional neural networks (CNNs) and three-phasic
dynamic contrast-enhanced computed tomography (DCE-CT) for primary liver cancers.
Methods: We retrospectively evaluated 215 consecutive patients with histologically proven
primary liver cancers, including six early, 58 well-differentiated, 109 moderately differentiated,
and 29 poorly differentiated hepatocellular carcinomas (HCCs), and 13 non-HCC malignant
lesions containing cholangiocellular components. We performed TL using various pre-trained
CNNs and preoperative three-phasic DCE-CT images. Three-phasic DCE-CT images were
manually registered to correct respiratory motion. The registered DCE-CT images were then
assigned to the three color channels of an input image for TL: pre-contrast, early phase, and
delayed phase images for the blue, red, and green channels, respectively. To evaluate the effects
of image registration, the registered input image was intentionally misaligned in the three color
channels by pixel shifts, rotations, and skews of various degrees. The diagnostic
performances (DP) of the pre-trained CNNs after TL in the test set were compared with those of
three general radiologists (GRs) and two experienced abdominal radiologists (ARs). The effects of
misalignment in the input image and the type of pre-trained CNN on the DP were statistically
evaluated.
Results: The mean DPs for histological subtype classification and differentiation in primary
malignant liver tumors on DCE-CT for GRs and ARs were 39.1% and 47.9%, respectively.
The highest mean DPs for CNNs after TL with pixel shifts, rotations, and skew misalignments
were 44.1%, 44.2%, and 43.7%, respectively. Two-way analysis of variance revealed that the
DP is significantly affected by the type of pre-trained CNN (P = 0.0001), but not by
misalignments in input images other than skew deformations.
Conclusion: TL using pre-trained CNNs is robust against misregistration of multi-phasic images
and is comparable to experienced ARs in classifying primary liver cancers using three-phasic
DCE-CT.
Key words:
Primary liver cancer; Dynamic contrast-enhanced computed tomography; Transfer learning;
Convolutional neural network; Registration
Introduction
Diagnostic imaging of primary liver cancers is important, because primary liver
cancers are often treated through imaging diagnosis only, without pathological diagnosis [1].
Furthermore, the therapeutic strategy can differ significantly depending on the pathological
subtype. For example, it has been reported that macro-trabecular and compact types, which are
common in poorly differentiated hepatocellular carcinoma (HCC), exhibit higher rates of
recurrence after transarterial catheter embolization than after hepatectomy or radiofrequency
ablation [2].
A convolutional neural network (CNN) is a machine learning algorithm that has
attracted considerable attention in diagnostic imaging, because it can perform as well as or
better than humans in image classification tasks [3]. The advantage of transfer learning (TL)
using pre-trained CNNs over conventional deep learning algorithms utilizing untrained CNNs is
that TL can achieve high classification performance with a relatively small dataset [3]. The
usefulness of TL with pre-trained CNNs for liver disease has been demonstrated by several
studies [4,5]. However, the effect of misregistration of input images on the diagnostic
performance of CNNs has not been fully investigated. In particular, this is an important issue for
radiologists when multiple images are employed as input images for a CNN, such as in dynamic
contrast-enhanced computed tomography (DCE-CT), because respiratory misregistration
frequently occurs in DCE-CT imaging of the liver. Furthermore, manual registration is laborious
and time-consuming for radiologists.
The purpose of this study was to evaluate the effects of image registration on the
diagnostic performance of TL using a pre-trained CNN and three-phasic DCE-CT for primary
liver cancers.
Materials and Methods
Subjects
We retrospectively evaluated 215 consecutive patients (median age = 70 years; age
range = 34–85 years; male:female = 165:52) with histologically proven primary liver cancers in
a single institute (Shinshu University Hospital, Matsumoto, Japan) from 2005 to 2010,
including six early (eHCC), 58 well-differentiated (wHCC), 109 moderately differentiated
(mHCC), and 29 poorly differentiated (pHCC) HCCs, and 13 non-HCC malignant lesions
containing cholangiocellular components (CCC). Written informed consent was obtained from
all patients when preoperative DCE-CT was performed. Patients who did not undergo
preoperative DCE-CT within 1 month before hepatectomy were excluded from the study.
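The subtype counts above, together with the stratified 70/30 train/test split described later in the transfer learning procedure, can be sketched as follows (a minimal Python illustration; the `stratified_split` helper and the random seed are hypothetical, not from the study):

```python
import random

# Subtype counts from the study cohort (n = 215)
cohort = {"eHCC": 6, "wHCC": 58, "mHCC": 109, "pHCC": 29, "CCC": 13}

def stratified_split(counts, train_frac=0.7, seed=0):
    """Shuffle case indices within each subtype and split them so that
    both sets keep (approximately) the same subtype proportions."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, n in counts.items():
        idx = list(range(n))
        rng.shuffle(idx)
        k = round(n * train_frac)
        train[label], test[label] = idx[:k], idx[k:]
    return train, test

train, test = stratified_split(cohort)
n_train = sum(len(v) for v in train.values())  # close to 70% of the 215 cases
```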
DCE-CT protocol
Three-phasic DCE-CT (a pre-contrast phase and two phases after intravenous
contrast agent injection) was performed at 40 s (early phase) and 130 s (delayed phase) after
injection, using a 64-row CT scanner. The scan parameters were as follows: the scan range
covered the whole abdomen from the upper level of the diaphragm; the tube voltage was 120 kVp; the tube
current was 500 mA; the matrix had 512 × 512 pixels; the field of view was 320 × 320 mm; the
size of collimation was 0.625 mm; and the reconstruction thickness was 2.5 mm. A non-ionic
iodinated contrast agent (Iopamiron 370 mg/mL; Bayer Healthcare, Berlin, Germany) was
administered intravenously through a 22-gauge catheter in the median cubital vein. The total
dose was 100 mL, and the rate of injection was 3 mL/s.
TL using pre-trained CNNs
TL was performed using various pre-trained CNNs (Alexnet, VGG-16, VGG-19,
GoogLeNet, Inception-v3, ResNet-50, and ResNet-101) and preoperative three-phasic DCE-CT
images at the maximal cross-sectional lesion area. For image preparation in TL, the three-phasic
DCE-CT DICOM images were manually registered, to correct for respiratory motion, by an
abdominal radiologist (A.Y.) with 18 years of diagnostic experience. The registered
three-phasic DCE-CT DICOM images were then cropped at the hepatic lesion, and assigned to
the three color channels of an input JPEG image for TL as follows: pre-contrast, early phase,
and delayed phase images for the blue, red, and green channels, respectively. The window level
and width for DICOM images were fixed at 80 and 350 Hounsfield units, respectively. The
image size was transformed according to the utilized pre-trained CNN (227 × 227 pixels for
Alexnet, 299 × 299 pixels for Inception-v3, and 224 × 224 pixels for the other CNNs). To
evaluate the effect of registration, manually registered input images were intentionally
misaligned to various degrees in the three color channels by pixel shifts (0, 1, 2, 4, 8, 16, and 32
pixels), rotations (0, 1, 2, 4, 8, 16, and 32 degrees), and skews (0%, 1%, 2%, 4%, 8%, 16%, and
32%) (Fig. 1). The image with 0 pixel shift, 0 degree rotation, and 0% skew represents the
original registered image. The input images with specific degrees of misalignment were divided
into training (70%) and test (30%) sets, such that the proportion of histological subtypes was the
same in both sets. In the transfer learning procedure, the final three layers of the pre-trained
CNNs, originally developed for the ImageNet dataset (1,000 classes), were replaced by a fully
connected layer, a softmax layer, and a classification output layer (with five classes in this study,
including eHCC, wHCC, mHCC, pHCC, and CCC). To learn faster in the new layers than in the
transferred layers, the initial learning rate was set to a small value (0.0001). Meanwhile, the
learning rate factor for the new fully connected layer was set to a large value (20). The
mini-batch size was set to 10. A classification test was performed on the pre-trained CNNs after
TL with 500 iterations of training, using the five-fold cross-validation method. The mean value
of the obtained results was utilized for a statistical analysis. All the procedures were carried out
using MATLAB software (2018a, MathWorks, Natick, MA, USA).
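As a rough illustration of the input-image construction, the windowing (level 80, width 350 HU), channel assignment (early phase to red, delayed phase to green, pre-contrast to blue), and pixel-shift misalignment can be sketched in Python with NumPy. The study itself used MATLAB; which channels are shifted, and in which direction, is an assumption in this sketch:

```python
import numpy as np

def window_hu(img_hu, level=80.0, width=350.0):
    """Map Hounsfield units to 8-bit using the fixed window (level 80, width 350 HU)."""
    lo, hi = level - width / 2.0, level + width / 2.0
    clipped = np.clip(img_hu, lo, hi)
    return ((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

def make_input_image(pre, early, delayed, shift_px=0):
    """Stack the three phases into one RGB array (early->R, delayed->G, pre->B),
    optionally misaligning two channels by a pixel shift; rotations and skews
    would need an affine warp instead (e.g. scipy.ndimage)."""
    r = np.roll(window_hu(early), shift_px, axis=1)     # early phase -> red, shifted
    g = np.roll(window_hu(delayed), -shift_px, axis=0)  # delayed phase -> green, shifted
    b = window_hu(pre)                                  # pre-contrast -> blue, fixed
    return np.stack([r, g, b], axis=-1)

# Toy 224 x 224 HU arrays standing in for the cropped DCE-CT phases
rng = np.random.default_rng(0)
pre, early, delayed = (rng.uniform(-100.0, 300.0, (224, 224)) for _ in range(3))
img = make_input_image(pre, early, delayed, shift_px=8)
```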
Statistical analysis
The diagnostic performances (DP = [number of correctly classified cases] / [total
number of cases] × 100) of the pre-trained CNNs after TL in the test set were compared with
those of three general radiologists (GRs) and two experienced abdominal radiologists (ARs). The observer
agreement was tested by weighted kappa. The effects of misalignments (pixel shifts, rotations,
and skews) in the input image and the type of pre-trained CNN on DP were statistically
evaluated by two-way analysis of variance (ANOVA) and a multiple comparison test using
Tukey’s honestly significant difference criterion. A probability value of less than 0.05 or
non-overlapping 95% confidence intervals were regarded as statistically significant. All the
procedures were carried out using MATLAB software (2018a, MathWorks, Natick, MA, USA).
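The two-way ANOVA layout can be illustrated on a grid like Table 1 (rows: misalignment level; columns: CNN type). The following is a minimal Python sketch of a two-way ANOVA without replication applied to the reported mean DPs only; the paper's P values were computed on the underlying cross-validation results, so the F statistics here are purely illustrative:

```python
import numpy as np

# Mean diagnostic performances from Table 1 (rows: pixel shift 0..32, cols: 7 CNNs)
dp = np.array([
    [42.7, 42.8, 37.7, 42.5, 40.8, 38.6, 41.9],
    [39.4, 37.8, 37.8, 44.0, 44.9, 33.8, 40.6],
    [44.0, 40.9, 38.5, 44.3, 40.3, 38.8, 39.7],
    [36.3, 39.1, 38.2, 44.3, 38.5, 41.5, 35.7],
    [41.8, 41.8, 47.1, 45.5, 43.4, 36.3, 38.2],
    [43.7, 40.9, 37.8, 47.1, 41.2, 39.7, 33.8],
    [39.1, 41.2, 40.9, 40.9, 43.4, 36.6, 39.1],
])

def two_way_anova_no_rep(x):
    """Two-way ANOVA without replication: partition the total variation into a
    row effect (misalignment level), a column effect (CNN type), and a residual."""
    r, c = x.shape
    grand = x.mean()
    ss_rows = c * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = r * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_err = ss_err / ((r - 1) * (c - 1))
    f_rows = (ss_rows / (r - 1)) / ms_err   # misalignment effect
    f_cols = (ss_cols / (c - 1)) / ms_err   # CNN-type effect
    return f_rows, f_cols

f_shift, f_cnn = two_way_anova_no_rep(dp)
```

Consistent with the reported result, the between-CNN variation dominates the between-shift variation on this grid.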
Results
The mean DPs for the classification of histological subtype and differentiation in
primary malignant liver tumors on DCE-CT for GRs and ARs were 39.1% and 47.9%,
respectively. The mean weighted kappa between observers was 0.92 (range, 0.90–0.95).
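A weighted kappa of this kind can be computed as in the following sketch (pure NumPy; the quadratic weighting scheme and the toy rater labels are assumptions, since the weighting used in the study is not stated):

```python
import numpy as np

def weighted_kappa(a, b, n_classes):
    """Cohen's kappa with quadratic disagreement weights for two raters
    whose labels are integers in 0..n_classes-1."""
    conf = np.zeros((n_classes, n_classes))
    for i, j in zip(a, b):
        conf[i, j] += 1.0                          # observed confusion matrix
    idx = np.arange(n_classes)
    w = (idx[:, None] - idx[None, :]) ** 2         # quadratic penalty for disagreement
    w = w / w.max()
    expected = np.outer(conf.sum(axis=1), conf.sum(axis=0)) / conf.sum()
    return 1.0 - (w * conf).sum() / (w * expected).sum()

# Hypothetical labels from two raters over the five classes (0..4)
rater1 = [0, 1, 2, 2, 3, 4, 1, 2, 3, 0]
rater2 = [0, 1, 2, 3, 3, 4, 1, 2, 4, 0]
kappa = weighted_kappa(rater1, rater2, 5)  # 1.0 = perfect agreement
```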
Two-way ANOVA revealed that, when pixel-shift misalignment was applied, the type of
pre-trained CNN had a significant effect on the DP (P < 0.0001), whereas the degree of
misalignment in input images for TL did not (P = 0.17) (Table 1). The multiple comparison test
revealed that GoogLeNet exhibited the highest mean DP (44.1%) using input images misaligned
by pixel shift. Statistical significance was observed between GoogLeNet and some other
pre-trained CNNs (VGG-16, VGG-19, ResNet-50, and ResNet-101) (Fig. 2).
Significant effects on the DP were observed for the type of pre-trained CNN (P <
0.0001) and degree of misalignment of input images for TL (P = 0.001) when a rotation
misalignment was applied (Table 2). However, the multiple comparison test revealed that there was no
significant difference in the DPs of CNNs between registered and misaligned input images (Fig.
3). GoogLeNet exhibited the highest mean DP (44.2%) using input images misaligned by
rotation. Statistical significance was observed between GoogLeNet and some other pre-trained
CNNs (Alexnet, VGG-16, VGG-19, and ResNet-50) (Fig. 3).
Two-way ANOVA revealed that the type of pre-trained CNN (P < 0.0001) and the
degree of misalignment in input images for TL (P < 0.0001) had a significant effect on the DP
when skew misalignment was applied (Table 3). There was a significant decrease in the DPs of
CNNs when the skew ratios in input images were 4% and 8% (Fig. 4). Inception-v3 and
GoogLeNet exhibited higher mean DPs (43.7% and 43.4%, respectively) even when skew misalignment was
applied. Statistical significance was observed between these two pre-trained CNNs and the
others (Alexnet, VGG-16, VGG-19, ResNet-50, and ResNet-101) (Fig. 4).
Discussion
Our results demonstrate the high diagnostic performance of TL using a pre-trained
CNN, which is comparable to experienced ARs in classifying primary liver cancers using
three-phasic DCE-CT. Our results also clarify that TL using particular pre-trained CNNs
(GoogLeNet and Inception-v3) was robust against misregistration of DCE-CT images, even
though these CNNs were originally trained on RGB images without misalignment between the
color channels [6]. One of the common features shared by GoogLeNet and Inception-v3 is the
inception architecture, which enables efficient parameter reduction and allows for training
high-quality networks on relatively modest-sized training sets [7,8]. This architecture may relate
to the robustness against misregistration and higher diagnostic performance of TL using
multi-phasic DCE-CT images. However, further study is required to confirm this.
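The parameter reduction offered by the inception architecture's 1×1 bottlenecks can be illustrated with a small worked count, using the 5×5 branch of GoogLeNet's first inception module as reported in [7] (192 input channels, 1×1 reduction to 16 channels, then 5×5 convolution to 32 output channels; biases ignored):

```python
def conv_params(c_in, k, c_out):
    """Number of weights in a k x k convolution (biases ignored)."""
    return c_in * k * k * c_out

direct = conv_params(192, 5, 32)                             # single 5x5 convolution
factored = conv_params(192, 1, 16) + conv_params(16, 5, 32)  # 1x1 bottleneck, then 5x5
reduction = direct / factored                                # roughly 10-fold fewer weights
```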
Our findings can accelerate the application of TL using pre-trained CNNs, not only
to dynamic contrast-enhanced studies but also to multi-parametric imaging, such as magnetic
resonance imaging. This is because the approach can be applied more easily in a clinical setting,
without time-consuming registration procedures, and requires smaller training datasets than
conventional deep learning algorithms using untrained CNNs [3].
However, special caution should be exercised when applying this approach to hollow organs,
such as the heart or alimentary tract, which are frequently accompanied by skew-type
deformations, because some CNNs were not robust to skew misregistration.
In conclusion, TL using pre-trained CNNs is robust against misregistrations, and
comparable to experienced ARs in the classification of primary liver cancers using three-phasic
DCE-CT. Therefore, there is no need for the correction of misregistrations for TL using
pre-trained CNNs.
References
1) Torzilli G, Minagawa M, Takayama T, Inoue K, Hui AM, Kubota K, Ohtomo K,
Makuuchi M. (1999) Accurate preoperative evaluation of liver mass lesions without fine-needle
biopsy. Hepatology. 30:889-93.
2) Okabe H, Yoshizumi T, Yamashita YI, Imai K, Hayashi H, Nakagawa S, Itoh S, Harimoto N,
Ikegami T, Uchiyama H, Beppu T, Aishima S, Shirabe K, Baba H, Maehara Y. (2018)
Histological architectural classification determines recurrence pattern and prognosis after
curative hepatectomy in patients with hepatocellular carcinoma. PLoS One. 13:e0203856.
https://doi.org/10.1371/journal.pone.0203856.
3) Yasaka K, Akai H, Kunimatsu A, Kiryu S, Abe O. (2018) Deep learning with convolutional
neural network in radiology. Jpn J Radiol. 36:257-272.
https://doi.org/10.1007/s11604-018-0726-3.
4) Byra M, Styczynski G, Szmigielski C, Kalinowski P, Michałowski Ł, Paluszkiewicz R,
Ziarkiewicz-Wróblewska B, Zieniewicz K, Sobieraj P, Nowicki A. (2018) Transfer learning
with deep convolutional neural network for liver steatosis assessment in ultrasound images. Int J
Comput Assist Radiol Surg. 13:1895-1903. https://doi.org/10.1007/s11548-018-1843-2.
5) Yu Y, Wang J, Ng CW, Ma Y, Mo S, Fong ELS, Xing J, Song Z, Xie Y, Si K, Wee A, Welsch
RE, So PTC, Yu H. (2018) Deep learning enables automated scoring of liver fibrosis stages. Sci
Rep. 8:16016. https://doi.org/10.1038/s41598-018-34300-2.
6) Deng J, Dong W, Socher R, Li L, Li K, Fei-Fei L. (2009) ImageNet: A large-scale hierarchical
image database. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
https://doi.org/10.1109/CVPR.2009.5206848.
7) Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V,
Rabinovich A. (2015) Going deeper with convolutions. IEEE Conference on Computer Vision
and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2015.7298594.
8) Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. (2016) Rethinking the inception
architecture for computer vision. IEEE Conference on Computer Vision and Pattern Recognition
(CVPR). https://doi.org/10.1109/CVPR.2016.308.
Figure Captions
Fig. 1 Illustration of input image preparation from three-phasic DCE-CT for transfer learning
using a pre-trained CNN. Three-phasic DCE-CT images were manually registered to correct
respiratory motion (misaligned value = 0). The registered three-phasic DCE-CT images were
then assigned to the three color channels of an input image for transfer learning as follows:
pre-contrast, early phase, and delayed phase images for the blue, red, and green channels,
respectively. The manually registered input images were intentionally misaligned in the three
color channels by pixel shifts, rotations, and skews with various misaligned values, to generate
misaligned input images for transfer learning. DCE-CT = dynamic contrast-enhanced computed
tomography, CNN = convolutional neural network
Fig. 2 Multiple comparison of the diagnostic performances of CNNs according to the pixel-shift value in
misaligned input images and the type of pre-trained CNN. Circles and bars indicate the mean
values and 95% confidence intervals, respectively. There was no significant difference between
the diagnostic performances of CNNs for registered and misaligned input images. GoogLeNet
exhibited the highest mean diagnostic performance (44.1%) using input images misaligned by
pixel shift. Statistical significance was observed between GoogLeNet and some other
pre-trained CNNs (VGG-16, VGG-19, ResNet-50, and ResNet-101). CNN = convolutional
neural network
Fig. 3 Multiple comparison of the diagnostic performances of CNNs according to the rotation value in
misaligned input images and the type of pre-trained CNN. Circles and bars indicate the mean
values and 95% confidence intervals, respectively. There was no significant difference between
the diagnostic performances of CNNs for registered and misaligned input images. GoogLeNet
exhibited the highest mean diagnostic performance (44.2%) using input images misaligned by
rotation. Statistical significance was observed between GoogLeNet and some other pre-trained
CNNs (Alexnet, VGG-16, VGG-19, and ResNet-50). CNN = convolutional neural network
Fig. 4 Multiple comparison of the diagnostic performances of CNNs according to the skew value in
misaligned input images and the type of pre-trained CNN. Circles and bars indicate the mean
values and 95% confidence intervals, respectively. There was a significant decrease in the
diagnostic performances of CNNs when the skew values in misaligned input images were 4% and
8%. Inception-v3 and GoogLeNet exhibited higher mean diagnostic performances (43.7% and
43.4%) using input images misaligned by skewing. Statistical significance was observed
between these two pre-trained CNNs and the others (Alexnet, VGG-16, VGG-19, ResNet-50,
and ResNet-101). CNN = convolutional neural network
[Figure 1: Three-phasic DCE-CT (pre-contrast, early phase, delayed phase) assigned to the blue, red, and green channels of the registered input image (misaligned value = 0); misaligned input images generated by pixel shift (e.g., 8 or 32 pixels), rotation (e.g., 8 or 32 degrees), or skew (e.g., 8% or 32%).]
[Figure 2: Diagnostic performance [%] plotted against pixel-shift value (0–32 pixels) and against type of pre-trained CNN (Alexnet, VGG-16, VGG-19, GoogLeNet, Inception-v3, ResNet-50, ResNet-101).]
[Figure 3: Diagnostic performance [%] plotted against rotation value (0–32 degrees) and against type of pre-trained CNN.]
[Figure 4: Diagnostic performance [%] plotted against skew value (0–32%) and against type of pre-trained CNN.]
Table 1. Mean diagnostic performances of pre-trained CNNs after transfer learning in
classification of primary malignant liver tumors using three-phasic DCE-CT misaligned by
pixel shifts

Pixel shift [pixel]   Alexnet   VGG-16   VGG-19   GoogLeNet   Inception-v3   ResNet-50   ResNet-101
0                     42.7%     42.8%    37.7%    42.5%       40.8%          38.6%       41.9%
1                     39.4%     37.8%    37.8%    44.0%       44.9%          33.8%       40.6%
2                     44.0%     40.9%    38.5%    44.3%       40.3%          38.8%       39.7%
4                     36.3%     39.1%    38.2%    44.3%       38.5%          41.5%       35.7%
8                     41.8%     41.8%    47.1%    45.5%       43.4%          36.3%       38.2%
16                    43.7%     40.9%    37.8%    47.1%       41.2%          39.7%       33.8%
32                    39.1%     41.2%    40.9%    40.9%       43.4%          36.6%       39.1%

The image with 0 pixel shift represents the original registered image. Only images with the
same pixel shift were used for the training and validation. Diagnostic performances (DP =
[number of correctly classified cases] / [total number of cases] × 100) of various pre-trained
CNNs after transfer learning in the test set are shown. Two-way ANOVA revealed that the type
of pre-trained CNN (P < 0.0001) has a significant effect on the DP, but pixel-shift values in the
input images for TL do not (P = 0.17). CNN = convolutional neural network, ANOVA = analysis
of variance
Table 2. Mean diagnostic performances of pre-trained CNNs after transfer learning in
classification of primary malignant liver tumors using three-phasic DCE-CT misaligned by
rotation

Rotation [degree]   Alexnet   VGG-16   VGG-19   GoogLeNet   Inception-v3   ResNet-50   ResNet-101
0                   42.7%     42.8%    37.7%    42.5%       40.8%          38.6%       41.9%
1                   38.5%     38.8%    39.7%    41.2%       41.8%          37.8%       40.0%
2                   39.7%     42.5%    41.2%    43.1%       43.7%          36.6%       38.2%
4                   39.7%     37.8%    43.7%    47.4%       45.5%          32.3%       39.1%
8                   40.6%     35.1%    34.2%    42.8%       42.5%          32.3%       41.5%
16                  40.6%     39.1%    36.0%    48.3%       47.7%          46.2%       42.8%
32                  40.0%     39.7%    40.3%    44.3%       41.5%          44.3%       44.9%

The image with 0 degree rotation represents the original registered image. Only images with the
same rotation were used for the training and validation. Diagnostic performances (DP =
[number of correctly classified cases] / [total number of cases] × 100) of various pre-trained
CNNs after transfer learning in the test set are shown. Two-way ANOVA revealed that the type
of pre-trained CNN (P < 0.0001) and rotation value in the input image for TL (P = 0.001) have
significant effects on the DP. CNN = convolutional neural network, ANOVA = analysis of
variance
Table 3. Mean diagnostic performances of pre-trained CNNs after transfer learning in
classification of primary malignant liver tumors using three-phasic DCE-CT misaligned by
skewing

Skew [%]   Alexnet   VGG-16   VGG-19   GoogLeNet   Inception-v3   ResNet-50   ResNet-101
0          42.7%     42.8%    37.7%    42.5%       40.8%          38.6%       41.9%
1          37.8%     41.2%    36.9%    43.7%       47.4%          36.9%       40.6%
2          35.4%     39.7%    38.8%    43.4%       45.5%          35.1%       38.5%
4          35.7%     38.2%    30.5%    44.3%       40.3%          35.4%       38.8%
8          39.1%     37.2%    35.7%    40.6%       36.6%          30.5%       39.4%
16         42.8%     40.0%    39.4%    45.2%       48.6%          36.6%       32.0%
32         44.9%     37.5%    37.5%    44.0%       46.5%          41.8%       42.8%

The image with 0% skew represents the original registered image. Only images with the same
skewing were used for the training and validation. Diagnostic performances (DP = [number of
correctly classified cases] / [total number of cases] × 100) of various pre-trained CNNs after
transfer learning in the test set are shown. Two-way ANOVA revealed that the type of
pre-trained CNN (P < 0.0001) and the skew value in an input image for TL (P < 0.0001) have
significant effects on the DP. CNN = convolutional neural network, ANOVA = analysis of
variance
... Within the reviewed DLR studies, the majority of them are targeting a characterization of the tumor, either the classification of FLLs (Focal Liver Lesions) [176,177,178,179], the estimation of the fibrosis stage [180], or the prediction of the histological grade [181] when two of them focused on the response to treatments, either for recurrence after resection [182] or the response after TACE (TransArterial ChemoEmbolization) [183]. ...
... Regarding the multiphasic studies, there is no consensus about the delay between the injection of the contrast agent and the acquisition of the different phases. They tend to prefer triphasic acquisition, with images acquired before the injection of contrast agent, at early arterial phase and a third phase, either portal venous [182,177,179] or a delayed one [176,178]. Peng et al. decided to get rid of the NECT (Non-Enhanced CT ) phase, but still chose a triphasic acquisition (AR, PV, DELAY). ...
... The manual delineation of the ROIs is usually done by one or more experts on the raw image [176,177,180,178,182,181], and only one study decided to perform an automatic segmentation of both the parenchyma and the lesion with the application of a random-walker algorithm, before being checked by experts [179]. When the method is based on a 2D approach, selected images are often those presenting the maximal cross-sectional proportion of the lesions, except one study targeting the estimation of the fibrosis stage, that centered the ROI so it displayed the ventral aspect of the liver [180]. ...
Thesis
To evaluate the status of a liver tumor, we usually perform a biopsy followed by an anatomo-pathological evaluation of the extracted sample. However, the biopsy, due to the small sampling size, does not testify the intra and inter-tumor heterogeneity, thus struggling in assessing precisely the phenotypical characteristics of the patients. Recent progress in medical imaging and data science fields enabled the emergence of a new technique called radiomics, that is partially answering these challenges. In this thesis, we have been focusing on hepatocellular carcinoma, and we built new imaging methods to characterize this widespread pathology. By incorporating temporal information through multiphase images, specialized UNet-like networks have been stacked in a cascaded architecture to provide a semantic segmentation of both the liver and its internal tissue (parenchyma, active & necrotic part of the tumor). To characterize the strong heterogeneity that resides in the tumor, we predict the histological grade on a fine scale (slice-wise), by re-using the features learned from the semantic segmentation network. Our preliminary results enable the production of a fine-detailed map of the tumor that separates well differentiated areas from poorly ones. Even though these results need to be confirmed with a larger cohort, we believe that medical images combined with deep modeling techniques may soon be introduced in a clinical workflow to help diagnose and evaluate the phenotypical characteristics of pathologies such as liver cancer.
... Several studies [26], [27] have established the utility of TL with pre-trained CNNs for liver cancer. Furthermore, authors in [28] used CT scans converted to Jpeg format to classify liver masses (cell masses). In image classification and detection challenges, deep residual networks (ResNet) of 50, 100, 150, or 1000 layers have obtained state-of-the-art results. ...
... We observed marked differences in the enhancement features of vascular characteristics and tumor morphology between HCC and LM. We then developed an ML model using the training cohort and subsequently validated it using an independent cohort of patients with hepatitis and extrahepatic tumors [19]. Terz et al. reported that LI-RADS could result in a reliable non-invasive diagnosis in patients with HCC [20]. ...
Article
Full-text available
Background CEUS LI-RADS (Contrast Enhanced Ultrasound Liver Imaging Reporting and Data System) has good diagnostic efficacy for differentiating hepatic carcinoma (HCC) from solid malignant tumors. However, it can be problematic in patients with both chronic hepatitis B and extrahepatic primary malignancy. We explored the diagnostic performance of LI-RADS criteria and CEUS-based machine learning (ML) models in such patients. Methods Consecutive patients with hepatitis and HCC or liver metastasis (LM) who were included in a multicenter liver cancer database between July 2017 and January 2022 were enrolled in this study. LI-RADS and enhancement features were assessed in a training cohort, and ML models were constructed using gradient boosting, random forest, and generalized linear models. The diagnostic performance of the ML models was compared with LI-RADS in a validation cohort of patients with both chronic hepatitis and extrahepatic malignancy. Results The mild washout time was adjusted to 54 s from 60 s, increasing accuracy from 76.8 to 79.4%. Through feature screening, washout type II, rim enhancement and unclear border were identified as the top three predictor variables. Using LI-RADS to differentiate HCC from LM, the sensitivity, specificity, and AUC were 68.2%, 88.6%, and 0.784, respectively. In comparison, the random forest and generalized linear model both showed significantly higher sensitivity and accuracy than LI-RADS (0.83 vs. 0.784; all P < 0.001). Conclusions Compared with LI-RADS, the random forest and generalized linear model had higher accuracy for differentiating HCC from LM in patients with chronic hepatitis B and extrahepatic malignancy.
... Furthermore, Yamada et al. (Yamada et al., 2019) determined that the diagnostic performance of transfer learning (TL) using a pretrained CNN was robust to the error registration of multiphase HCC images (Cao et al., 2020), and they retrospectively evaluated over 200 consecutive patients with actual primary liver cancer. Their results indicated that the CNN combined with a DCE-CT graphics processing model has good effects for liver cancer prevention diagnosis by observing the diagnostic work of another research team (Yasaka et al., 2018) using DL methods and CNN to differentiate liver masses in dynamic CT scans by building a CNN model with six convolutional, three maximum pooling, and three fully connected layers to achieve a median AUC of 0.84. ...
Article
Full-text available
Hepatocellular carcinoma (HCC) is the most common type of liver cancer with a high morbidity and fatality rate. Traditional diagnostic methods for HCC are primarily based on clinical presentation, imaging features, and histopathology. With the rapid development of artificial intelligence (AI), which is increasingly used in the diagnosis, treatment, and prognosis prediction of HCC, an automated approach to HCC status classification is promising. AI integrates labeled clinical data, trains on new data of the same type, and performs interpretation tasks. Several studies have shown that AI techniques can help clinicians and radiologists be more efficient and reduce the misdiagnosis rate. However, the coverage of AI technologies leads to difficulty in which the type of AI technology is preferred to choose for a given problem and situation. Solving this concern, it can significantly reduce the time required to determine the required healthcare approach and provide more precise and personalized solutions for different problems. In our review of research work, we summarize existing research works, compare and classify the main results of these according to the specified data, information, knowledge, wisdom (DIKW) framework.
... Transfer learning is a technique for adapting a model learned in one domain to another domain. Deep convolutional activation features learned from ImageNet, a large natural-image database, have been successfully transferred to classify the degree of differentiation of HCCs on CT images with little training data [24]. We hypothesized that a DL model for lung cancer would be applicable to segmentation of poorly differentiated HCC, and could then be retrained for well-differentiated HCC, because lung cancer resembles poorly differentiated HCC, and poorly differentiated HCC resembles well-differentiated HCC, more closely than natural images do. ...
Preprint
Full-text available
Aim: The aim of this study was to develop dual segmentation models for poorly and well-differentiated hepatocellular carcinoma (HCC), using two-step transfer learning (TSTL) based on dynamic contrast-enhanced (DCE) computed tomography (CT) images. Methods: From 2013 to 2019, DCE CT images of 128 patients with 80 poorly differentiated and 48 well-differentiated HCCs were selected at our hospital. In the first transfer learning (TL) step, a segmentation model pre-trained with 192 CT images of lung cancer patients was retrained as a poorly differentiated HCC model. In the second TL step, a well-differentiated HCC model was built from the poorly differentiated HCC model. The average 3D Dice's similarity coefficient (3D-DSC) and 95th percentile of the Hausdorff distance (95% HD) were employed to evaluate the segmentation accuracy, based on a nested 4-fold cross-validation test. The DSC denotes the degree of regional similarity between the HCC reference regions and the regions estimated using the proposed models. The Hausdorff distance measures how far two subsets of a metric space are from each other; the 95% HD takes the 95th percentile of these distances instead of the maximum. Results: The average 3D-DSC and 95% HD were 0.849 ± 0.078 and 1.98 ± 0.71 mm, respectively, for poorly differentiated HCC regions, and 0.811 ± 0.089 and 2.01 ± 0.84 mm, respectively, for well-differentiated HCC regions. The average 3D-DSC for both regions was 1.2 times higher than that calculated without the TSTL. Conclusion: The proposed model using TSTL from the lung cancer dataset showed the potential to segment poorly and well-differentiated HCC regions on DCE CT images.
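The two evaluation metrics used in this study, 3D-DSC and 95% HD, can be computed directly from binary masks. A minimal numpy sketch (brute-force nearest-neighbor distances over all foreground voxels, adequate only for small masks; it assumes non-empty masks and isotropic voxel spacing):

```python
import numpy as np

def dice_coefficient(a, b):
    """Dice similarity coefficient between two binary masks (0..1)."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def hd95(a, b, spacing=1.0):
    """95th-percentile Hausdorff distance between two binary masks.

    Brute-force pairwise distances; illustrative only (surface
    extraction and k-d trees would be used for real volumes).
    """
    pa = np.argwhere(a) * spacing
    pb = np.argwhere(b) * spacing
    # directed distances: each point in A to its nearest in B, and vice versa
    d_ab = np.min(np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1), axis=1)
    d_ba = np.min(np.linalg.norm(pb[:, None, :] - pa[None, :, :], axis=-1), axis=1)
    return np.percentile(np.concatenate([d_ab, d_ba]), 95)
```

Both functions accept 2-D or 3-D masks, so the same code evaluates single slices or whole volumes.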
Article
Full-text available
Hepatocellular carcinoma (HCC) is a major cause of cancer-related deaths worldwide. This review explores the recent progress in the application of artificial intelligence (AI) in radiological diagnosis of HCC. The Barcelona Classification of Liver Cancer criteria guides treatment decisions based on tumour characteristics and liver function indicators, but HCC often remains undetected until intermediate or advanced stages, limiting treatment options and patient outcomes. Timely and accurate diagnostic methods are crucial for enabling curative therapies and improving patient outcomes. AI, particularly deep learning and neural network models, has shown promise in the radiological detection of HCC. AI offers several advantages in HCC diagnosis, including reducing diagnostic variability, optimising data analysis and reallocating healthcare resources. By providing objective and consistent analysis of imaging data, AI can overcome the limitations of human interpretation and enhance the accuracy of HCC diagnosis. Furthermore, AI systems can assist healthcare professionals in managing the increasing workload by serving as a reliable diagnostic tool. Integration of AI with information systems enables comprehensive analysis of patient data, facilitating more informed and reliable diagnoses. The advancements in AI-based radiological diagnosis hold significant potential to improve early detection, treatment selection and patient outcomes in HCC. Further research and clinical implementation of AI models in routine practice are necessary to harness the full potential of this technology in HCC management.
Article
This review outlines the current status and challenges of the clinical applications of artificial intelligence in liver imaging using computed tomography or magnetic resonance imaging, based on a topic analysis of PubMed search results using latent Dirichlet allocation (LDA). LDA revealed that "segmentation," "hepatocellular carcinoma and radiomics," "metastasis," "fibrosis," and "reconstruction" were the current main topic keywords. Automatic liver segmentation technology using deep learning is beginning to assume new clinical significance as part of whole-body composition analysis. It has also been applied to the screening of large populations and the acquisition of training data for machine learning models, and has resulted in the development of imaging biomarkers that have a significant impact on important clinical issues, such as the estimation of liver fibrosis and the recurrence and prognosis of malignant tumors. Deep learning reconstruction is expanding as a new clinical application of artificial intelligence and has shown results in reducing contrast and radiation doses. However, much evidence is still missing, such as external validation of machine learning models and evaluation of the diagnostic performance of deep learning reconstruction for specific diseases, suggesting that the clinical application of these technologies is still in development.
Article
The aim of this study was to develop dual segmentation models for poorly and well-differentiated hepatocellular carcinoma (HCC), using two-step transfer learning (TSTL) based on dynamic contrast-enhanced (DCE) computed tomography (CT) images. From 2013 to 2019, DCE-CT images of 128 patients with 80 poorly differentiated and 48 well-differentiated HCCs were selected at our hospital. In the first transfer learning (TL) step, a segmentation model pre-trained with 192 CT images of lung cancer patients was retrained as a poorly differentiated HCC model. In the second TL step, a well-differentiated HCC model was built from the poorly differentiated HCC model. The average three-dimensional Dice's similarity coefficient (3D-DSC) and 95th percentile of the Hausdorff distance (95% HD) were mainly employed to evaluate the segmentation accuracy, based on a nested fourfold cross-validation test. The DSC denotes the degree of regional similarity between the HCC reference regions and the regions estimated using the proposed models. The Hausdorff distance measures how far two subsets of a metric space are from each other; the 95% HD takes the 95th percentile of these distances instead of the maximum. The average 3D-DSC and 95% HD were 0.849 ± 0.078 and 1.98 ± 0.71 mm, respectively, for poorly differentiated HCC regions, and 0.811 ± 0.089 and 2.01 ± 0.84 mm, respectively, for well-differentiated HCC regions. The average 3D-DSC for both regions was 1.2 times higher than that calculated without the TSTL. The proposed model using TSTL from the lung cancer dataset showed the potential to segment poorly and well-differentiated HCC regions on DCE-CT images.
Article
Objectives To develop and validate deep convolutional neural network (DCNN) models for the diagnosis of adrenal adenoma (AA) using CT. Methods This retrospective study enrolled 112 patients who underwent abdominal CT (non-contrast, early, and delayed phases) with 107 adrenal lesions (83 AAs and 24 non-AAs) confirmed pathologically and with eight lesions confirmed by follow-up as metastatic carcinomas. Three patients had adrenal lesions on both sides. We constructed six DCNN models from six types of input images for comparison: non-contrast images only (Model A), delayed-phase images only (Model B), three phasic images merged into a 3-channel image (Model C), relative-washout-rate (RWR) image maps only (Model D), non-contrast images and RWR maps merged into a 2-channel image (Model E), and delayed-phase images and RWR maps merged into a 2-channel image (Model F). These input images were prepared manually by cropping and registration of the CT images. Each DCNN model, with six convolutional layers, was trained with data augmentation and hyperparameter tuning. The optimal threshold values for binary classification were determined from receiver-operating characteristic curve analyses. We adopted the nested cross-validation method, in which the outer 5-fold cross-validation was used to assess the diagnostic performance of the models and the inner 5-fold cross-validation was used to tune their hyperparameters. Results The AUCs with 95% confidence intervals of Models A–F were 0.94 [0.90, 0.98], 0.80 [0.69, 0.89], 0.97 [0.94, 1.00], 0.92 [0.85, 0.97], 0.99 [0.97, 1.00], and 0.94 [0.86, 0.99], respectively. Model E showed a high AUC, greater than 0.95. Conclusion DCNN models may be a useful tool for the diagnosis of AA using CT. Advances in knowledge The current study demonstrates that a deep learning-based approach can differentiate adrenal adenoma from non-adenoma using multiphasic CT.
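The RWR maps used as model inputs above are derived voxel-wise from the phase images. A hedged numpy sketch using the common radiological washout formulas; the paper's exact RWR definition is not given in the abstract, so both formulas here are assumptions:

```python
import numpy as np

def relative_washout_map(enhanced, delayed, eps=1e-6):
    """Per-voxel relative washout, a common radiological definition:
    RWR = (enhanced - delayed) / enhanced * 100.
    eps guards against division by zero in unenhanced voxels."""
    return (enhanced - delayed) / np.maximum(enhanced, eps) * 100.0

def absolute_washout_map(enhanced, delayed, unenhanced, eps=1e-6):
    """Per-voxel absolute washout, for comparison:
    APW = (enhanced - delayed) / (enhanced - unenhanced) * 100."""
    return (enhanced - delayed) / np.maximum(enhanced - unenhanced, eps) * 100.0
```

The resulting map can then be merged with a phase image into a 2-channel input, as done for Models E and F.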
Article
Full-text available
Current liver fibrosis scoring by computer-assisted image analytics is not fully automated as it requires manual preprocessing (segmentation and feature extraction) typically based on domain knowledge in liver pathology. Deep learning-based algorithms can potentially classify these images without the need for preprocessing through learning from a large dataset of images. We investigated the performance of classification models built using a deep learning-based algorithm pre-trained using multiple sources of images to score liver fibrosis and compared them against conventional non-deep learning-based algorithms - artificial neural networks (ANN), multinomial logistic regression (MLR), support vector machines (SVM) and random forests (RF). Automated feature classification and fibrosis scoring were achieved by using a transfer learning-based deep learning network, AlexNet-Convolutional Neural Networks (CNN), with balanced area under receiver operating characteristic (AUROC) values of up to 0.85–0.95 versus ANN (AUROC of up to 0.87–1.00), MLR (AUROC of up to 0.73–1.00), SVM (AUROC of up to 0.69–0.99) and RF (AUROC of up to 0.94–0.99). Results indicate that a deep learning-based algorithm with transfer learning enables the construction of a fully automated and accurate prediction model for scoring liver fibrosis stages that is comparable to other conventional non-deep learning-based algorithms that are not fully automated.
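The AUROC values used throughout this comparison can be computed without any curve plotting, via the rank-sum identity AUROC = P(score of a random positive case > score of a random negative case). A minimal numpy sketch (pairwise comparison, fine for validation sets of this size; the function name is illustrative):

```python
import numpy as np

def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U identity:
    the fraction of positive/negative pairs ranked correctly,
    with ties counted as 0.5."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

The same scalar can be computed for each classifier (CNN, ANN, MLR, SVM, RF) to reproduce a comparison of the kind reported above.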
Article
Full-text available
Aim The clinical impact of pathological classification based on architectural pattern in hepatocellular carcinoma (HCC) remains elusive, despite it being a well-known and common feature. Methods The prognostic impact of the pathological classification was examined using a prospective database. Three hundred and eighty HCC patients who underwent curative hepatectomy as an initial treatment at Kumamoto University were enrolled as a test cohort. The outcome was confirmed with a validation cohort at Kyushu University. Results The macrotrabecular (macro-T) subtype (n = 38) and the compact subtype (n = 43) showed similar biological and prognostic features. Both showed a higher AFP level and worse overall survival than the microtrabecular (micro-T) subtype (n = 266). Multivariate analysis for overall survival revealed that DCP ≥ 40, multiple tumors, and the macro-T/compact subtype were associated with poor overall survival (risk ratio = 2.2, 1.6, and 1.6; p = 0.002, 0.020, and 0.047, respectively). Of note, 32% of macro-T/compact subtype cases showed early recurrence within 1 year, with a substantially low (5%) 5-year overall survival, whereas 16% of micro-T/PG subtype cases did. Twenty-one percent of macro-T/compact subtype cases showed multiple intrahepatic metastases (≥ 4) or distant metastases, which resulted in non-curative treatment, whereas 5% of micro-T/PG subtype cases did. In the validation cohort, the macro-T/compact subtype was an independent predictor of worse overall survival. Conclusion The macro-T/compact subtype is biologically distinct from the micro-T and PG subtypes owing to its aggressive features and poor prognosis after curative treatment. Additional treatment alongside curative hepatectomy for the macro-T/compact subtype should be discussed because of the high possibility of residual systemic cancer cells.
Article
Full-text available
Purpose: Nonalcoholic fatty liver disease is the most common liver abnormality. To date, liver biopsy is the reference standard for direct quantification of liver steatosis in hepatic tissue samples. In this paper we propose a neural network-based approach for nonalcoholic fatty liver disease assessment in ultrasound. Methods: We used the Inception-ResNet-v2 deep convolutional neural network, pre-trained on the ImageNet dataset, to extract high-level features from liver B-mode ultrasound image sequences. The steatosis level of each liver was graded by wedge biopsy. The proposed approach was compared with the hepatorenal index technique and the gray-level co-occurrence matrix algorithm. After feature extraction, we applied the support vector machine algorithm to classify images containing fatty liver. Based on liver biopsy, fatty liver was defined as more than 5% of hepatocytes with steatosis. Next, we used the features and the Lasso regression method to assess the steatosis level. Results: The area under the receiver operating characteristic curve obtained using the proposed approach was 0.977, higher than that obtained with the hepatorenal index method, 0.959, and much higher than that of the gray-level co-occurrence matrix algorithm, 0.893. For regression, the Spearman correlation coefficients between the steatosis level and the proposed approach, the hepatorenal index, and the gray-level co-occurrence matrix algorithm were 0.78, 0.80, and 0.39, respectively. Conclusions: The proposed approach may help sonographers automatically assess the amount of fat in the liver. The presented approach is efficient and, in comparison with other methods, does not require the sonographer to select a region of interest.
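The hepatorenal index used above as a comparison baseline is essentially a ratio of mean echo intensities between a liver region of interest and a renal-cortex region of interest. A minimal sketch (the function name and mask-based ROI handling are illustrative assumptions; in practice the ROIs are placed manually by the sonographer):

```python
import numpy as np

def hepatorenal_index(image, liver_roi, kidney_roi):
    """Ratio of mean liver brightness to mean renal-cortex brightness
    in a B-mode ultrasound image. `liver_roi` and `kidney_roi` are
    boolean masks of the same shape as `image`."""
    return float(image[liver_roi].mean() / image[kidney_roi].mean())
```

Values well above 1 indicate a liver that is brighter (more echogenic) than the renal cortex, a sign of steatosis.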
Conference Paper
Full-text available
The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called "ImageNet", a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.
Article
Deep learning with a convolutional neural network (CNN) has been gaining attention recently for its high performance in image recognition. With this technique, images themselves can be utilized in the learning process, and feature extraction in advance of learning is not required; important features can be learned automatically. Thanks to developments in hardware and software, in addition to deep learning techniques themselves, the application of this technique to radiological images for predicting clinically useful information, such as the detection and evaluation of lesions, is beginning to be investigated. This article illustrates basic technical knowledge regarding deep learning with CNNs, following the actual workflow (collecting data, implementing CNNs, and the training and testing phases). Pitfalls of this technique and how to manage them are also illustrated. We also describe some advanced topics of deep learning, results of recent clinical studies, and future directions for the clinical application of deep learning techniques.
Conference Paper
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks have started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we explore ways to scale up networks that aim to utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single-frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and fewer than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.
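The parameter savings from the factorized convolutions described above are easy to quantify: replacing an n×n convolution with a 1×n followed by an n×1 convolution cuts the kernel parameters by roughly a factor of n/2 when input and output channel counts are equal. A small sketch (keeping c_out intermediate channels between the two factors is a simplifying assumption; the paper tunes the intermediate width per module):

```python
def conv_params(k_h, k_w, c_in, c_out, bias=True):
    """Parameter count of a single 2-D convolution layer."""
    return k_h * k_w * c_in * c_out + (c_out if bias else 0)

def factorized_saving(n, c_in, c_out):
    """Compare an n x n convolution against the factorized
    1 x n followed by n x 1 pair, as in Inception-v3.
    Returns (full_params, factorized_params), biases omitted."""
    full = conv_params(n, n, c_in, c_out, bias=False)
    fact = (conv_params(1, n, c_in, c_out, bias=False)
            + conv_params(n, 1, c_out, c_out, bias=False))
    return full, fact
```

For a 3×3 convolution with 64 input and 64 output channels, the factorized pair uses two-thirds of the parameters of the full kernel.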
Article
Fine-needle biopsy (FNB) is associated with problems, such as tumor seeding, which are probably underestimated. The aim of this study was to validate prospectively the accuracy of our diagnostic work-up without FNB, for defining indications for surgery in a cohort of patients with focal liver lesions (FLLs). Between January 1997 and December 1998, 160 consecutive patients carrying 225 FLLs admitted to our department were evaluated prospectively. Preoperative diagnoses were established by means of clinical histories, serum tumor marker levels, ultrasonography, and spiral computed tomography (CT). Angiography, magnetic resonance imaging (MRI), and Lipiodol-CT were performed when it was considered necessary to plan the surgical strategy. All the patients underwent surgery and results of pathological examinations were obtained for all of them. The preoperative diagnoses of 221 of the 225 lesions (98.2%) were confirmed, and the indications for liver resection in 156 of the 160 patients (97.5%) were correct. The respective accuracy, sensitivity, specificity, and positive and negative predictive values were 99.6%, 100%, 98.9%, 99.3%, and 100% for diagnosis of hepatocellular carcinoma (HCC); 99.1%, 100%, 98.8%, 96.9%, and 100% for metastases; 99.6%, 100%, 99.5%, 91%, and 100% for cholangiocellular carcinomas (CCCs); all 100% for mixed HCC-CCCs; and 98.7%, 57.1%, 100%, 100%, and 98.6% for benign tumors. In view of these results, the fact that the real risks of FNB have yet to be established and the possibility that tumor seeding has a major impact on patient prognosis, the use of FNB should be drastically limited.