
Dynamic contrast-enhanced computed tomography diagnosis of primary liver cancers using transfer learning of pretrained convolutional neural networks: Is registration of multiphasic images necessary?

May 2019
International Journal of Computer Assisted Radiology and Surgery 14(8)


DOI:10.1007/s11548-019-01987-1

Authors:

Akira Yamada

Shinshu University

Eriko Yoshizawa

Shinshu University




Title: Dynamic contrast-enhanced computed tomography diagnosis of primary liver cancers using transfer learning of pre-trained convolutional neural networks: Is registration of multi-phasic images necessary?

Authors: Akira Yamada (ORCID: 0000-0002-4199-203X), Kazuki Oyama, Sachie Fujita, Eriko Yoshizawa, Fumihito Ichinohe, Daisuke Komatsu, Yasunari Fujinaga

Affiliation: Shinshu University School of Medicine, Department of Radiology

Corresponding author: Akira Yamada

Address: 3-1-1 Asahi, Matsumoto, Nagano 390-8621, Japan

E-mail: a_yamada@shinshu-u.ac.jp

Telephone number: +81-263-37-2650

Fax number: +81-263-37-3087

Conflict of interest:

The authors declare that they have no conflict of interest.


Abstract

Purpose: To evaluate the effect of image registration on the diagnostic performance of transfer learning (TL) using pre-trained convolutional neural networks (CNNs) and three-phasic dynamic contrast-enhanced computed tomography (DCE-CT) for primary liver cancers.

Methods: We retrospectively evaluated 215 consecutive patients with histologically proven primary liver cancers, including six early, 58 well-differentiated, 109 moderately differentiated, and 29 poorly differentiated hepatocellular carcinomas (HCCs), and 13 non-HCC malignant lesions containing cholangiocellular components. We performed TL using various pre-trained CNNs and preoperative three-phasic DCE-CT images. Three-phasic DCE-CT images were manually registered to correct respiratory motion. The registered DCE-CT images were then assigned to the three color channels of an input image for TL: pre-contrast, early phase, and delayed phase images for the blue, red, and green channels, respectively. To evaluate the effects of image registration, the registered input image was intentionally misaligned in the three color channels by pixel shifts, rotations, and skews of various degrees. The diagnostic performances (DPs) of the pre-trained CNNs after TL in the test set were compared with those of three general radiologists (GRs) and two experienced abdominal radiologists (ARs). The effects of misalignment in the input image and the type of pre-trained CNN on the DP were statistically evaluated.

Results: The mean DPs for histological subtype classification and differentiation in primary malignant liver tumors on DCE-CT for GRs and ARs were 39.1% and 47.9%, respectively. The highest mean DPs for CNNs after TL with pixel-shift, rotation, and skew misalignments were 44.1%, 44.2%, and 43.7%, respectively. Two-way analysis of variance revealed that the DP is significantly affected by the type of pre-trained CNN (P = 0.0001), but not by misalignments in input images other than skew deformations.

Conclusion: TL using pre-trained CNNs is robust against misregistration of multi-phasic images and comparable to experienced ARs in classifying primary liver cancers using three-phasic DCE-CT.

Key words: Primary liver cancer; Dynamic contrast-enhanced computed tomography; Transfer learning; Convolutional neural network; Registration

Introduction

Diagnostic imaging of primary liver cancers is important because these cancers are often treated on the basis of imaging diagnosis alone, without pathological diagnosis [1]. Furthermore, the therapeutic strategy can differ significantly depending on the pathological subtype. For example, it has been reported that macro-trabecular and compact types, which are common in poorly differentiated hepatocellular carcinoma (HCC), exhibit higher rates of recurrence after transarterial catheter embolization than after hepatectomy or radiofrequency ablation [2].

A convolutional neural network (CNN) is a machine learning algorithm that has attracted considerable attention in diagnostic imaging because it can perform as well as or better than humans in image classification tasks [3]. The advantage of transfer learning (TL) using pre-trained CNNs over conventional deep learning with untrained CNNs is that TL can achieve a high classification performance with a relatively small dataset [3]. The usefulness of TL with pre-trained CNNs for liver disease has been demonstrated by several studies [4,5]. However, the effect of misregistration of input images on the diagnostic performance of CNNs has not been fully investigated. This issue is particularly important for radiologists when multiple images are employed as input images for a CNN, as in dynamic contrast-enhanced computed tomography (DCE-CT), because respiratory misregistration frequently occurs in DCE-CT imaging of the liver. Furthermore, manual registration is laborious and time-consuming for radiologists.

The purpose of this study was to evaluate the effects of image registration on the diagnostic performance of TL using a pre-trained CNN and three-phasic DCE-CT for primary liver cancers.

Materials and Methods

Subjects

We retrospectively evaluated 215 consecutive patients (median age = 70 years; age range = 34–85 years; male:female = 165:52) with histologically proven primary liver cancers at a single institution (Shinshu University Hospital, Matsumoto, Japan) from 2005 to 2010, including six early (eHCC), 58 well-differentiated (wHCC), 109 moderately differentiated (mHCC), and 29 poorly differentiated (pHCC) HCCs, and 13 non-HCC malignant lesions containing cholangiocellular components (CCC). Written informed consent was obtained from all patients when preoperative DCE-CT was performed. Patients who did not undergo preoperative DCE-CT within 1 month before hepatectomy were excluded from the study.

DCE-CT protocol

Three-phasic DCE-CT, comprising a pre-contrast phase and two phases acquired 40 s (early phase) and 130 s (delayed phase) after intravenous contrast agent injection, was performed using a 64-row CT scanner. The scan parameters were as follows: scan range, whole abdomen from the upper level of the diaphragm; tube voltage, 120 kVp; tube current, 500 mA; matrix, 512 × 512 pixels; field of view, 320 × 320 mm; collimation, 0.625 mm; and reconstruction thickness, 2.5 mm. A non-ionic iodinated contrast agent (Iopamiron 370 mg/mL; Bayer Healthcare, Berlin, Germany) was administered intravenously through a 22-gauge catheter in the median cubital vein. The total dose was 100 mL, and the rate of injection was 3 mL/s.

TL using pre-trained CNNs

TL was performed using various pre-trained CNNs (Alexnet, VGG-16, VGG-19, GoogLeNet, Inception-v3, ResNet-50, and ResNet-101) and preoperative three-phasic DCE-CT images at the maximal cross-sectional lesion area. To prepare the input images for TL, three-phasic DCE-CT DICOM images were manually registered to correct respiratory motion by an abdominal radiologist (A.Y.) with 18 years of diagnostic experience. The registered three-phasic DCE-CT DICOM images were then cropped at the hepatic lesion and assigned to the three color channels of an input JPEG image for TL as follows: pre-contrast, early phase, and delayed phase images for the blue, red, and green channels, respectively. The window level and width for DICOM images were fixed at 80 and 350 Hounsfield units, respectively. The image size was transformed according to the pre-trained CNN used (227 × 227 pixels for Alexnet, 299 × 299 pixels for Inception-v3, and 224 × 224 pixels for the other CNNs).

To evaluate the effect of registration, manually registered input images were intentionally misaligned to various degrees in the three color channels by pixel shifts (0, 1, 2, 4, 8, 16, and 32 pixels), rotations (0, 1, 2, 4, 8, 16, and 32 degrees), and skews (0%, 1%, 2%, 4%, 8%, 16%, and 32%) (Fig. 1). The image with 0 pixel shift, 0 degree rotation, and 0% skew represents the original registered image. The input images with each specific degree of misalignment were divided into training (70%) and test (30%) sets, such that the proportion of histological subtypes was the same in both sets.

In the transfer learning procedure, the final three layers of the pre-trained CNNs, originally developed for the ImageNet dataset (1,000 classes), were replaced by a fully connected layer, a softmax layer, and a classification output layer (with five classes in this study: eHCC, wHCC, mHCC, pHCC, and CCC). So that the new layers would learn faster than the transferred layers, the initial learning rate was set to a small value (0.0001), while the learning rate factor for the new fully connected layer was set to a large value (20). The mini-batch size was set to 10. A classification test was performed on the pre-trained CNNs after TL with 500 iterations of training using five-fold cross-validation. The mean value of the obtained results was used for statistical analysis. All the procedures were carried out using MATLAB software (2018a, MathWorks, Natick, MA, USA).
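The input-image preparation described above can be sketched in a few lines. This is an illustrative sketch in plain Python (the study itself used MATLAB), with nested lists standing in for image arrays. The function names, the nearest-neighbour resampling, the zero fill for pixels moved in from outside the image, the reading of "skew" as a horizontal shear expressed as a percentage, and the choice to displace only the red channel are all assumptions made for illustration; the text does not specify these details.

```python
import math

def window_hu(hu, level=80.0, width=350.0):
    """Map a Hounsfield value to 0-255 for the stated window (level 80, width 350)."""
    low = level - width / 2.0
    scaled = (hu - low) / width * 255.0
    return int(round(min(255.0, max(0.0, scaled))))

def warp(channel, a, b, c, d, tx=0.0, ty=0.0, fill=0):
    """Nearest-neighbour inverse warp about the image centre.

    Output pixel (x, y) takes the input value found at
    (a*dx + b*dy + tx, c*dx + d*dy + ty), with dx, dy measured from the centre.
    """
    h, w = len(channel), len(channel[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = round(a * dx + b * dy + tx + cx)
            sy = round(c * dx + d * dy + ty + cy)
            inside = 0 <= sx < w and 0 <= sy < h
            row.append(channel[sy][sx] if inside else fill)
        out.append(row)
    return out

def shift(channel, pixels):
    """Move the channel content to the right by a whole number of pixels."""
    return warp(channel, 1, 0, 0, 1, tx=-pixels)

def rotate(channel, degrees):
    """Rotate the channel about its centre."""
    t = math.radians(degrees)
    return warp(channel, math.cos(t), math.sin(t), -math.sin(t), math.cos(t))

def skew(channel, percent):
    """Horizontal shear; reading 'skew %' as shear factor * 100 is an assumption."""
    return warp(channel, 1, percent / 100.0, 0, 1)

def compose_rgb(pre, early, delayed, misalign=None):
    """Stack windowed phases as (R, G, B) = (early, delayed, pre-contrast)."""
    def to8(img):
        return [[window_hu(v) for v in row] for row in img]
    red, green, blue = to8(early), to8(delayed), to8(pre)
    if misalign is not None:
        red = misalign(red)  # assumption: only the red channel is displaced
    return [list(zip(r, g, b)) for r, g, b in zip(red, green, blue)]
```

For example, `compose_rgb(pre, early, delayed, misalign=lambda ch: shift(ch, 8))` would yield an input image whose early-phase channel is displaced by 8 pixels relative to the other two phases.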

Statistical analysis

The diagnostic performances (DP = [number of correctly classified cases] / [total number of cases] × 100) of the pre-trained CNNs after TL in the test set were compared with those of three general radiologists (GRs) and two experienced abdominal radiologists (ARs). The observer agreement was tested by weighted kappa. The effects of misalignments (pixel shifts, rotations, and skews) in the input image and the type of pre-trained CNN on DP were statistically evaluated by two-way analysis of variance (ANOVA) and a multiple comparison test using Tukey's honest significant difference criterion. A probability value of less than 0.05 or no overlap between 95% confidence intervals was regarded as statistically significant. All the procedures were carried out using MATLAB software (2018a, MathWorks, Natick, MA, USA).
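As a rough illustration of the two quantities defined above, the following Python sketch computes the DP from predicted and true labels and the F statistics of a two-way ANOVA. It assumes one mean DP per combination of misalignment level (rows) and CNN type (columns), that is, a design without replication and without an interaction term; the exact ANOVA configuration used in MATLAB is not stated in the text, and the function names are hypothetical. Converting an F statistic to a P value requires the F distribution (for example `scipy.stats.f.sf`), which is left out to keep the sketch dependency-free, and Tukey's test is not shown.

```python
def diagnostic_performance(predicted, truth):
    """DP = correctly classified cases / total cases x 100."""
    correct = sum(p == t for p, t in zip(predicted, truth))
    return 100.0 * correct / len(truth)

def two_way_anova(table):
    """Two-way ANOVA without replication for a rows x columns table.

    Returns (F_rows, F_cols, df_rows, df_cols, df_error).
    """
    r, c = len(table), len(table[0])
    grand = sum(map(sum, table)) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(row[j] for row in table) / r for j in range(c)]
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_error = ss_total - ss_rows - ss_cols  # residual after both main effects
    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    f_rows = (ss_rows / df_rows) / ms_error
    f_cols = (ss_cols / df_cols) / ms_error
    return f_rows, f_cols, df_rows, df_cols, df_error
```

With seven misalignment levels and seven CNNs, this design gives 6, 6, and 36 degrees of freedom for rows, columns, and error, respectively.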

Results

The mean DPs for the classification of histological subtype and differentiation in primary malignant liver tumors on DCE-CT for GRs and ARs were 39.1% and 47.9%, respectively. The mean weighted kappa between observers was 0.92 (range 0.90–0.95).

When pixel-shift misalignment was applied, two-way ANOVA revealed that the type of pre-trained CNN had a significant effect on the DP (P < 0.0001), whereas the degree of misalignment in the input images for TL did not (P = 0.17) (Table 1). A multiple comparison revealed that GoogLeNet exhibited the highest mean DP (44.1%) using input images misaligned by pixel shift. Significant differences were observed between GoogLeNet and four other pre-trained CNNs (VGG-16, VGG-19, ResNet-50, and ResNet-101) (Fig. 2).

When rotation misalignment was applied, significant effects on the DP were observed for both the type of pre-trained CNN (P < 0.0001) and the degree of misalignment of the input images for TL (P = 0.001) (Table 2). However, a multiple comparison revealed no significant difference in the DPs of CNNs between registered and misaligned input images (Fig. 3). GoogLeNet exhibited the highest mean DP (44.2%) using input images misaligned by rotation. Significant differences were observed between GoogLeNet and four other pre-trained CNNs (Alexnet, VGG-16, VGG-19, and ResNet-50) (Fig. 3).

When skew misalignment was applied, two-way ANOVA revealed that both the type of pre-trained CNN (P < 0.0001) and the degree of misalignment in the input images for TL (P < 0.0001) had a significant effect on the DP (Table 3). There was a significant decrease in the DPs of CNNs when the skew ratios in the input images were 4% and 8% (Fig. 4). Inception-v3 and GoogLeNet exhibited higher mean DPs (43.7% and 43.4%, respectively) even when skew misalignment was applied. Significant differences were observed between these two pre-trained CNNs and the others (Alexnet, VGG-16, VGG-19, ResNet-50, and ResNet-101) (Fig. 4).

Discussion

Our results demonstrate that TL using a pre-trained CNN achieves a diagnostic performance comparable to that of experienced ARs in classifying primary liver cancers using three-phasic DCE-CT. Our results also show that TL using particular pre-trained CNNs (GoogLeNet and Inception-v3) was robust against misregistration of DCE-CT images, even though the pre-trained CNNs were trained using RGB images without misalignment in the color channels [6]. A feature common to GoogLeNet and Inception-v3 is the inception architecture, which enables efficient parameter reduction and allows high-quality networks to be trained on relatively modest-sized training sets [7,8]. This architecture may be related to the robustness against misregistration and the higher diagnostic performance of TL using multi-phasic DCE-CT images. However, further study is required to confirm this.

Our findings can accelerate the application of TL using pre-trained CNNs, not only in dynamic contrast-enhanced studies but also in multi-parametric imaging, such as magnetic resonance imaging. This is because the approach can be applied more easily in a clinical setting, without time-consuming registration procedures and with smaller training datasets than conventional deep learning algorithms using untrained CNNs [3]. However, special caution should be exercised when applying this approach to hollow organs, such as the heart or alimentary tract, which are frequently accompanied by skew-type deformations, because some CNNs were not robust to skew misregistration.

In conclusion, TL using pre-trained CNNs is robust against misregistrations and comparable to experienced ARs in the classification of primary liver cancers using three-phasic DCE-CT. Therefore, there is no need for the correction of misregistrations for TL using pre-trained CNNs.

References

1) Torzilli G, Minagawa M, Takayama T, Inoue K, Hui AM, Kubota K, Ohtomo K, Makuuchi M. (1999) Accurate preoperative evaluation of liver mass lesions without fine-needle biopsy. Hepatology. 30:889-93.

2) Okabe H, Yoshizumi T, Yamashita YI, Imai K, Hayashi H, Nakagawa S, Itoh S, Harimoto N, Ikegami T, Uchiyama H, Beppu T, Aishima S, Shirabe K, Baba H, Maehara Y. (2018) Histological architectural classification determines recurrence pattern and prognosis after curative hepatectomy in patients with hepatocellular carcinoma. PLoS One. 13:e0203856. https://doi.org/10.1371/journal.pone.0203856.

3) Yasaka K, Akai H, Kunimatsu A, Kiryu S, Abe O. (2018) Deep learning with convolutional neural network in radiology. Jpn J Radiol. 36:257-272. https://doi.org/10.1007/s11604-018-0726-3.

4) Byra M, Styczynski G, Szmigielski C, Kalinowski P, Michałowski Ł, Paluszkiewicz R, Ziarkiewicz-Wróblewska B, Zieniewicz K, Sobieraj P, Nowicki A. (2018) Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images. Int J Comput Assist Radiol Surg. 13:1895-1903. https://doi.org/10.1007/s11548-018-1843-2.

5) Yu Y, Wang J, Ng CW, Ma Y, Mo S, Fong ELS, Xing J, Song Z, Xie Y, Si K, Wee A, Welsch RE, So PTC, Yu H. (2018) Deep learning enables automated scoring of liver fibrosis stages. Sci Rep. 8:16016. https://doi.org/10.1038/s41598-018-34300-2.

6) Deng J, Dong W, Socher R, Li L, Li K, Fei-Fei L. (2009) ImageNet: A large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2009.5206848.

7) Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. (2015) Going deeper with convolutions. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2015.7298594.

8) Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. (2016) Rethinking the inception architecture for computer vision. IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2016.308.

Figure Captions

Fig. 1 Illustration of input image preparation from three-phasic DCE-CT for transfer learning using a pre-trained CNN. Three-phasic DCE-CT images were manually registered to correct respiratory motion (misaligned value = 0). The registered three-phasic DCE-CT images were then assigned to the three color channels of an input image for transfer learning as follows: pre-contrast, early phase, and delayed phase images for the blue, red, and green channels, respectively. The manually registered input images were intentionally misaligned in the three color channels by pixel shifts, rotations, and skews with various misaligned values, to generate misaligned input images for transfer learning. DCE-CT = dynamic contrast-enhanced computed tomography, CNN = convolutional neural network

Fig. 2 Multiple comparison of diagnostic performances of CNNs according to pixel-shift values in misaligned input images and the type of pre-trained CNN. Circles and bars indicate the mean values and 95% confidence intervals, respectively. There was no significant difference between the diagnostic performances of CNNs for registered and misaligned input images. GoogLeNet exhibited the highest mean diagnostic performance (44.1%) using input images misaligned by pixel shift. Significant differences were observed between GoogLeNet and four other pre-trained CNNs (VGG-16, VGG-19, ResNet-50, and ResNet-101). CNN = convolutional neural network

Fig. 3 Multiple comparison of diagnostic performances of CNNs according to the rotation value in misaligned input images and the type of pre-trained CNN. Circles and bars indicate the mean values and 95% confidence intervals, respectively. There was no significant difference between the diagnostic performances of CNNs for registered and misaligned input images. GoogLeNet exhibited the highest mean diagnostic performance (44.2%) using input images misaligned by rotation. Significant differences were observed between GoogLeNet and four other pre-trained CNNs (Alexnet, VGG-16, VGG-19, and ResNet-50). CNN = convolutional neural network

Fig. 4 Multiple comparison of diagnostic performances of CNNs according to skew values in misaligned input images and the type of pre-trained CNN. Circles and bars indicate the mean values and 95% confidence intervals, respectively. There was a significant decrease in the diagnostic performances of CNNs when skew values in misaligned input images were 4% and 8%. Inception-v3 and GoogLeNet exhibited higher mean diagnostic performances (43.7% and 43.4%, respectively) using input images misaligned by skewing. Significant differences were observed between these two pre-trained CNNs and the others (Alexnet, VGG-16, VGG-19, ResNet-50, and ResNet-101). CNN = convolutional neural network

[Figure 1. Panel labels: DCE-CT (pre-contrast, early phase, delayed phase); registered input image (misaligned value = 0) with blue, green, and red channels; misaligned input images (various misaligned values) for pixel shift (8 and 32 pixels), rotation (8 and 32 degrees), and skew (8% and 32%).]

[Figure 2. Two panels showing diagnostic performance [%] (axis range 34 to 46) by type of pre-trained CNN (Alexnet, VGG-16, VGG-19, GoogLeNet, Inception-v3, ResNet-50, ResNet-101) and by pixel-shift value [pixel].]

[Figure 3. Two panels showing diagnostic performance [%] (axis range 34 to 46) by rotation value [degree] and by type of pre-trained CNN (Alexnet, VGG-16, VGG-19, GoogLeNet, Inception-v3, ResNet-50, ResNet-101).]

[Figure 4. Two panels showing diagnostic performance [%] (axis range 34 to 46) by skew value [%] and by type of pre-trained CNN (Alexnet, VGG-16, VGG-19, GoogLeNet, Inception-v3, ResNet-50, ResNet-101).]

Table 1. Mean diagnostic performances of pre-trained CNNs after transfer learning in classification of primary liver malignant tumors using three-phasic DCE-CT misaligned by pixel shifts

Rows: pixel shift [pixel]. Columns: type of pre-trained CNN (Alexnet, VGG-16, VGG-19, GoogLeNet, Inception-v3, ResNet-50, ResNet-101).

Registered image (0 pixel shift): 42.7%, 42.8%, 37.7%, 42.5%, 40.8%, 38.6%, 41.9%

Misaligned images (entries in source order; the row labels and three entries are missing from the source, so cell positions are not assigned): 39.4%, 37.8%, 44.0%, 44.9%, 33.8%, 40.6%, 44.0%, 40.9%, 38.5%, 44.3%, 40.3%, 38.8%, 39.7%, 36.3%, 39.1%, 38.2%, 44.3%, 38.5%, 41.5%, 35.7%, 41.8%, 47.1%, 45.5%, 43.4%, 36.3%, 38.2%, 43.7%, 40.9%, 37.8%, 47.1%, 41.2%, 39.7%, 33.8%, 39.1%, 41.2%, 40.9%, 43.4%, 36.6%, 39.1%

The image with 0 pixel shift represents the original registered image. Only images with the same pixel shift were used for the training and validation. Diagnostic performances (DP = [number of correctly classified cases] / [total number of cases] × 100) of various pre-trained CNNs after transfer learning in the test set are shown. Two-way ANOVA revealed that the type of pre-trained CNN (P < 0.0001) has a significant effect on the DP, but pixel-shift values in the input images for TL do not (P = 0.17). CNN = convolutional neural network, ANOVA = analysis of variance

Table 2. Mean diagnostic performances of pre-trained CNNs after transfer learning in classification of primary liver malignant tumors using three-phasic DCE-CT misaligned by rotation

| Rotation [degree] | Alexnet | VGG-16 | VGG-19 | GoogLeNet | Inception-v3 | ResNet-50 | ResNet-101 |
|---|---|---|---|---|---|---|---|
| 0 | 42.7% | 42.8% | 37.7% | 42.5% | 40.8% | 38.6% | 41.9% |
| 1 | 38.5% | 38.8% | 39.7% | 41.2% | 41.8% | 37.8% | 40.0% |
| 2 | 39.7% | 42.5% | 41.2% | 43.1% | 43.7% | 36.6% | 38.2% |
| 4 | 39.7% | 37.8% | 43.7% | 47.4% | 45.5% | 32.3% | 39.1% |
| 8 | 40.6% | 35.1% | 34.2% | 42.8% | 42.5% | 32.3% | 41.5% |
| 16 | 40.6% | 39.1% | 36.0% | 48.3% | 47.7% | 46.2% | 42.8% |
| 32 | 40.0% | 39.7% | 40.3% | 44.3% | 41.5% | 44.3% | 44.9% |

The image with 0 degree rotation represents the original registered image. Only images with the same rotation were used for the training and validation. Diagnostic performances (DP = [number of correctly classified cases] / [total number of cases] × 100) of various pre-trained CNNs after transfer learning in the test set are shown. Two-way ANOVA revealed that the type of pre-trained CNN (P < 0.0001) and rotation value in the input image for TL (P = 0.001) have significant effects on the DP. CNN = convolutional neural network, ANOVA = analysis of variance
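The per-network mean DP quoted in the Results for rotation misalignment can be recomputed from the Table 2 entries. The short Python sketch below reads the 49 values row by row and assumes that the rows correspond to rotations of 0, 1, 2, 4, 8, 16, and 32 degrees in that order (the order listed in the Methods) and that the columns follow the CNN order given in the table header. The variable and function names are illustrative only.

```python
CNNS = ["Alexnet", "VGG-16", "VGG-19", "GoogLeNet",
        "Inception-v3", "ResNet-50", "ResNet-101"]
ROTATIONS = [0, 1, 2, 4, 8, 16, 32]  # assumed row order, in degrees

# Table 2 entries (DP in %), one row per rotation value.
TABLE2 = [
    [42.7, 42.8, 37.7, 42.5, 40.8, 38.6, 41.9],
    [38.5, 38.8, 39.7, 41.2, 41.8, 37.8, 40.0],
    [39.7, 42.5, 41.2, 43.1, 43.7, 36.6, 38.2],
    [39.7, 37.8, 43.7, 47.4, 45.5, 32.3, 39.1],
    [40.6, 35.1, 34.2, 42.8, 42.5, 32.3, 41.5],
    [40.6, 39.1, 36.0, 48.3, 47.7, 46.2, 42.8],
    [40.0, 39.7, 40.3, 44.3, 41.5, 44.3, 44.9],
]

def column_means(table):
    """Mean of each column, i.e. mean DP per CNN across all rotation values."""
    n_rows = len(table)
    return [sum(row[j] for row in table) / n_rows for j in range(len(table[0]))]

means = dict(zip(CNNS, column_means(TABLE2)))
best = max(means, key=means.get)

for name in CNNS:
    print(f"{name:>12}: {means[name]:.1f}%")
print("highest mean DP:", best)
```

Under these assumptions the GoogLeNet column averages 44.2%, which agrees with the highest mean DP reported for rotation misalignment.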

Table 3. Mean diagnostic performances of pre-trained CNNs after transfer learning in classification of primary liver malignant tumors using three-phasic DCE-CT misaligned by skewing

| Skew [%] | Alexnet | VGG-16 | VGG-19 | GoogLeNet | Inception-v3 | ResNet-50 | ResNet-101 |
|---|---|---|---|---|---|---|---|
| 0 | 42.7% | 42.8% | 37.7% | 42.5% | 40.8% | 38.6% | 41.9% |
| 1 | 37.8% | 41.2% | 36.9% | 43.7% | 47.4% | 36.9% | 40.6% |
| 2 | 35.4% | 39.7% | 38.8% | 43.4% | 45.5% | 35.1% | 38.5% |
| 4 | 35.7% | 38.2% | 30.5% | 44.3% | 40.3% | 35.4% | 38.8% |
| 8 | 39.1% | 37.2% | 35.7% | 40.6% | 36.6% | 30.5% | 39.4% |
| 16 | 42.8% | 40.0% | 39.4% | 45.2% | 48.6% | 36.6% | 32.0% |

Skew 32% (six of the seven entries survive in the source, listed in source order): 44.9%, 37.5%, 44.0%, 46.5%, 41.8%, 42.8%

The image with 0% skew represents the original registered image. Only images with the same skewing were used for the training and validation. Diagnostic performances (DP = [number of correctly classified cases] / [total number of cases] × 100) of various pre-trained CNNs after transfer learning in the test set are shown. Two-way ANOVA revealed that the type of pre-trained CNN (P < 0.0001) and the skew value in an input image for TL (P < 0.0001) have significant effects on the DP. CNN = convolutional neural network, ANOVA = analysis of variance


When Liver Disease Diagnosis Encounters Deep Learning: Analysis, Challenges, and Prospects

Article

Mar 2023

Dual segmentation models for poorly and well-differentiated hepatocellular carcinoma using two-step transfer deep learning on dynamic contrast-enhanced CT images

Article

Dec 2022

The aim of this study was to develop dual segmentation models for poorly and well-differentiated hepatocellular carcinoma (HCC), using two-step transfer learning (TSTL) based on dynamic contrast-enhanced (DCE) computed tomography (CT) images. From 2013 to 2019, DCE-CT images of 128 patients with 80 poorly differentiated and 48 well-differentiated HCCs were selected at our hospital. In the first transfer learning (TL) step, a pre-trained segmentation model with 192 CT images of lung cancer patients was retrained as a poorly differentiated HCC model. In the second TL step, a well-differentiated HCC model was built from a poorly differentiated HCC model. The average three-dimensional Dice’s similarity coefficient (3D-DSC) and 95th-percentile of the Hausdorff distance (95% HD) were mainly employed to evaluate the segmentation accuracy, based on a nested fourfold cross-validation test. The DSC denotes the degree of regional similarity between the HCC reference regions and the regions estimated using the proposed models. The 95% HD is defined as the 95th-percentile of the maximum measures of how far two subsets of a metric space are from each other. The average 3D-DSC and 95% HD were 0.849 ± 0.078 and 1.98 ± 0.71 mm, respectively, for poorly differentiated HCC regions, and 0.811 ± 0.089 and 2.01 ± 0.84 mm, respectively, for well-differentiated HCC regions. The average 3D-DSC for both regions was 1.2 times superior to that calculated without the TSTL. The proposed model using TSTL from the lung cancer dataset showed the potential to segment poorly and well-differentiated HCC regions on DCE-CT images.

A deep learning-based approach for the diagnosis of adrenal adenoma: A new trial using CT

Article

May 2022

Objectives To develop and validate deep convolutional neural network (DCNN) models for the diagnosis of adrenal adenoma (AA) using CT. Methods This retrospective study enrolled 112 patients who underwent abdominal CT (non-contrast, early, and delayed phases) with 107 adrenal lesions (83 AAs and 24 non-AAs) confirmed pathologically and with eight lesions confirmed by follow-up as metastatic carcinomas. Three patients had adrenal lesions on both sides. We constructed 6 DCNN models from 6 types of input images for comparison: non-contrast images only (Model A), delayed phase images only (Model B), three phasic images merged into a 3-channel image (Model C), relative-washout-rate (RWR) image maps only (Model D), non-contrast and RWR maps merged into a 2-channel image (Model E), and delayed phase and RWR maps merged into a 2-channel image (Model F). These input images were prepared manually with cropping and registration of CT images. Each DCNN model with six convolutional layers was trained with data augmentation and hyper-parameter tuning. The optimal threshold values for binary classification were determined from the receiver-operating characteristic curve analyses. We adopted the nested cross-validation method, in which the outer 5-fold cross-validation was used to assess the diagnostic performance of the models and the inner 5-fold cross-validation was used to tune hyperparameters of the models. Results The AUCs with 95% confidence intervals of Models A–F were 0.94 [0.90, 0.98], 0.80 [0.69, 0.89], 0.97 [0.94, 1.00], 0.92 [0.85, 0.97], 0.99 [0.97, 1.00] and 0.94 [0.86, 0.99], respectively. Model E showed a high AUC, greater than 0.95. Conclusion DCNN models may be a useful tool for the diagnosis of AA using CT. Advances in knowledge The current study demonstrates that a deep learning-based approach could differentiate adrenal adenoma from non-adenoma using multiphasic CT.
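The abstract does not define the RWR map. The sketch below assumes the commonly used relative percentage washout, 100 × (early − delayed) / early, computed pixel by pixel on registered images, and then merges it with a non-contrast image into a 2-channel input in the spirit of Model E. The function names and the nested-list image representation are hypothetical.

```python
def relative_washout_map(early, delayed):
    """Pixelwise relative washout in percent: 100 * (early - delayed) / early.

    Assumes both images are registered and have the same shape.
    Pixels with zero early-phase value are set to 0.0 to avoid division by zero.
    """
    return [
        [100.0 * (e - d) / e if e != 0 else 0.0 for e, d in zip(e_row, d_row)]
        for e_row, d_row in zip(early, delayed)
    ]


def merge_channels(*images):
    """Stack same-sized single-channel images into one multi-channel image.

    Each output pixel is a tuple holding one value per input image.
    """
    return [[tuple(px) for px in zip(*rows)] for rows in zip(*images)]


non_contrast = [[10, 20]]
early_phase = [[100, 80]]
delayed_phase = [[60, 80]]

rwr = relative_washout_map(early_phase, delayed_phase)
two_channel = merge_channels(non_contrast, rwr)
```

Because the washout value is computed from the same pixel position in two phases, any residual misalignment between phases propagates directly into the map, which is why the cropping and registration step mentioned in the abstract matters for this kind of input.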

Deep learning enables automated scoring of liver fibrosis stages

Article

Full-text available

Oct 2018

Current liver fibrosis scoring by computer-assisted image analytics is not fully automated as it requires manual preprocessing (segmentation and feature extraction) typically based on domain knowledge in liver pathology. Deep learning-based algorithms can potentially classify these images without the need for preprocessing through learning from a large dataset of images. We investigated the performance of classification models built using a deep learning-based algorithm pre-trained using multiple sources of images to score liver fibrosis and compared them against conventional non-deep learning-based algorithms - artificial neural networks (ANN), multinomial logistic regression (MLR), support vector machines (SVM) and random forests (RF). Automated feature classification and fibrosis scoring were achieved by using a transfer learning-based deep learning network, AlexNet-Convolutional Neural Networks (CNN), with balanced area under receiver operating characteristic (AUROC) values of up to 0.85–0.95 versus ANN (AUROC of up to 0.87–1.00), MLR (AUROC of up to 0.73–1.00), SVM (AUROC of up to 0.69–0.99) and RF (AUROC of up to 0.94–0.99). Results indicate that a deep learning-based algorithm with transfer learning enables the construction of a fully automated and accurate prediction model for scoring liver fibrosis stages that is comparable to other conventional non-deep learning-based algorithms that are not fully automated.

Histological architectural classification determines recurrence pattern and prognosis after curative hepatectomy in patients with hepatocellular carcinoma

Article

Full-text available

Sep 2018
PLOS ONE

Aim The clinical impact of pathological classification based on architectural pattern in hepatocellular carcinoma (HCC) remains elusive even though the pattern is a well-known and common feature. Methods The prognostic impact of pathological classification was examined using a prospective database. Three hundred and eighty HCC patients who underwent curative hepatectomy as an initial treatment at Kumamoto University were enrolled as a test cohort. The outcome was confirmed with a validation cohort at Kyushu University. Results Macrotrabecular (macro-T) subtype (n = 38) and compact subtype (n = 43) showed similar biological and prognostic features. Both showed higher AFP levels and worse overall survival than the microtrabecular (micro-T) subtype (n = 266). Multivariate analysis for overall survival revealed that DCP ≥ 40, multiple tumors, and macro-T/compact subtype were associated with poor overall survival (risk ratio = 2.2, 1.6 and 1.6; p = 0.002, 0.020, and 0.047, respectively). Of note, 32% of macro-T/compact subtype cases showed early recurrence within 1 year, which was associated with substantially low (5%) 5-year overall survival, whereas 16% of micro-T/PG subtype cases did. Twenty-one percent of macro-T/compact subtype cases showed multiple intrahepatic metastases (≥ 4) or distant metastases, which resulted in non-curative treatment, whereas 5% of micro-T/PG subtype cases did. In the validation cohort, macro-T/compact subtype was an independent predictor of worse overall survival. Conclusion Macro-T/compact subtype is biologically distinct from micro-T and PG subtypes because of its aggressive features and poor prognosis after curative treatment. Additional treatment with curative hepatectomy for the macro-T/compact subtype should be discussed because of the high possibility of systemic residual cancer cells.

Transfer learning with deep convolutional neural network for liver steatosis assessment in ultrasound images

Article

Full-text available

Aug 2018

Purpose: Nonalcoholic fatty liver disease is the most common liver abnormality. To date, liver biopsy is the reference standard for direct liver steatosis quantification in hepatic tissue samples. In this paper we propose a neural network-based approach for nonalcoholic fatty liver disease assessment in ultrasound. Methods: We used the Inception-ResNet-v2 deep convolutional neural network pre-trained on the ImageNet dataset to extract high-level features in liver B-mode ultrasound image sequences. The steatosis level of each liver was graded by wedge biopsy. The proposed approach was compared with the hepatorenal index technique and the gray-level co-occurrence matrix algorithm. After the feature extraction, we applied the support vector machine algorithm to classify images containing fatty liver. Based on liver biopsy, the fatty liver was defined to have more than 5% of hepatocytes with steatosis. Next, we used the features and the Lasso regression method to assess the steatosis level. Results: The area under the receiver operating characteristics curve obtained using the proposed approach was equal to 0.977, being higher than the one obtained with the hepatorenal index method, 0.959, and much higher than in the case of the gray-level co-occurrence matrix algorithm, 0.893. For regression, the Spearman correlation coefficients between the steatosis level and the proposed approach, the hepatorenal index and the gray-level co-occurrence matrix algorithm were equal to 0.78, 0.80 and 0.39, respectively. Conclusions: The proposed approach may help sonographers automatically assess the amount of fat in the liver. The presented approach is efficient and, in comparison with other methods, does not require the sonographers to select the region of interest.
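The area under the ROC curve used to compare the three methods can be computed directly from classifier scores. The sketch below uses the rank-based (Mann-Whitney) formulation, in which the AUC is the fraction of positive/negative pairs that the score orders correctly, with ties counted as half. The function name and the toy labels and scores are illustrative assumptions and are not data from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example: 1 = fatty liver on biopsy, 0 = no fatty liver (made-up values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_from_scores(toy_labels, toy_scores)
```

This pairwise form is quadratic in the number of cases, which is acceptable for cohorts of the size typical in such studies; a sort-based rank computation gives the same value for larger datasets.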

Going deeper with convolutions

Conference Paper

Full-text available

Jun 2015

ImageNet: a Large-Scale Hierarchical Image Database

Conference Paper

Full-text available

Jun 2009
IEEE Comput Soc Conf Comput Vis Pattern Recogn

The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called "ImageNet", a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.

Deep learning with convolutional neural network in radiology

Article

Mar 2018

Deep learning with a convolutional neural network (CNN) has recently been gaining attention for its high performance in image recognition. With this technique, images themselves can be used in the learning process, and feature extraction in advance of the learning process is not required. Important features can be learned automatically. Thanks to the development of hardware and software in addition to deep learning techniques, the application of this technique to radiological images for predicting clinically useful information, such as the detection and evaluation of lesions, is beginning to be investigated. This article illustrates basic technical knowledge regarding deep learning with CNNs along the actual course (collecting data, implementing CNNs, and training and testing phases). Pitfalls regarding this technique and how to manage them are also illustrated. We also describe some advanced topics of deep learning, results of recent clinical studies, and the future directions of clinical application of deep learning techniques.

Rethinking the Inception Architecture for Computer Vision

Conference Paper

Jun 2016

Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we are exploring ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1 and 5.6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.

Accurate preoperative evaluation of liver mass lesions without fine‐needle biopsy

Article

Oct 1999
HEPATOLOGY

Fine-needle biopsy (FNB) is associated with problems, such as tumor seeding, which are probably underestimated. The aim of this study was to validate prospectively the accuracy of our diagnostic work-up without FNB, for defining indications for surgery in a cohort of patients with focal liver lesions (FLLs). Between January 1997 and December 1998, 160 consecutive patients carrying 225 FLLs admitted to our department were evaluated prospectively. Preoperative diagnoses were established by means of clinical histories, serum tumor marker levels, ultrasonography, and spiral computed tomography (CT). Angiography, magnetic resonance imaging (MRI), and Lipiodol-CT were performed when it was considered necessary to plan the surgical strategy. All the patients underwent surgery and results of pathological examinations were obtained for all of them. The preoperative diagnoses of 221 of the 225 lesions (98.2%) were confirmed, and the indications for liver resection in 156 of the 160 patients (97.5%) were correct. The respective accuracy, sensitivity, specificity, and positive and negative predictive values were 99.6%, 100%, 98.9%, 99.3%, and 100% for diagnosis of hepatocellular carcinoma (HCC); 99.1%, 100%, 98.8%, 96.9%, and 100% for metastases; 99.6%, 100%, 99.5%, 91%, and 100% for cholangiocellular carcinomas (CCCs); all 100% for mixed HCC-CCCs; and 98.7%, 57.1%, 100%, 100%, and 98.6% for benign tumors. In view of these results, the fact that the real risks of FNB have yet to be established and the possibility that tumor seeding has a major impact on patient prognosis, the use of FNB should be drastically limited.
