Feasibility of a deep learning-based diagnostic
platform to evaluate lower urinary tract disorders
in men using simple uroflowmetry
Seokhwan Bang1,*,† , Sokhib Tukhtaev2,* , Kwang Jin Ko1, Deok Hyun Han1, Minki Baek1,
Hwang Gyun Jeon1, Baek Hwan Cho2, Kyu-Sung Lee1
1Department of Urology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, 2Medical AI Research Center, Samsung Medical Center,
Sungkyunkwan University School of Medicine, Seoul, Korea
Purpose: To diagnose lower urinary tract symptoms (LUTS) in a noninvasive manner, we created a prediction model for bladder
outlet obstruction (BOO) and detrusor underactivity (DUA) using simple uroflowmetry. In this study, we used deep learning to ana-
lyze simple uroflowmetry.
Materials and Methods: We performed a retrospective review of 4,835 male patients aged ≥40 years who underwent a urody-
namic study at a single center. We excluded patients with a disease or a history of surgery that could affect LUTS. A total of 1,792
patients were included in the study. We extracted a simple uroflowmetry graph automatically using the ABBYY Flexicapture® image capture program (ABBYY, Moscow, Russia). We applied a convolutional neural network (CNN), a deep learning method, to predict DUA and BOO. The 5-fold cross-validation average of the area under the receiver operating characteristic (AUROC) curve was chosen as the evaluation metric; for binary classification, this metric provides a richer measure of performance than accuracy alone. Additionally, we provide the corresponding average precision-recall (PR) curves.
Results: Among the 1,792 patients, 482 (26.90%) had BOO, and 893 (49.83%) had DUA. The average AUROC scores of DUA
and BOO, which were measured using 5-fold cross-validation, were 73.30% (mean average precision [mAP]=0.70) and 72.23%
(mAP=0.45), respectively.
Conclusions: Our study suggests that it is possible to differentiate DUA from non-DUA and BOO from non-BOO using a simple uro-
flowmetry graph with a fine-tuned VGG16, which is a well-known CNN model.
Keywords: Artificial intelligence; Bladder outlet obstruction; Detrusor underactivity; Lower urinary tract symptoms
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0) which permits unrestricted
non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Original Article - Lower Urinary Tract Dysfunction
Received: 9 November, 2021 • Revised: 23 January, 2022 • Accepted: 24 February, 2022 • Published online: 25 March, 2022
Corresponding Author: Kyu-Sung Lee https://orcid.org/0000-0003-0891-2488
Department of Urology, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-ro, Gangnam-gu, Seoul 06351, Korea
TEL: +82-2-3410-3554, FAX: +82-2-3410-3027, E-mail: ksleedr@skku.edu
Baek Hwan Cho https://orcid.org/0000-0001-7722-5660
Medical AI Research Center, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81 Irwon-ro, Gangnam-gu, Seoul 06351, Korea
TEL: +82-2-3410-0885, FAX: +82-2-3410-0878, E-mail: baekhwan.cho@samsung.com
*These authors contributed equally to this study and should be considered co-first authors.
†Current affiliation: Department of Urology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea.
ⓒ The Korean Urological Association www.icurology.org
Investig Clin Urol 2022;63:301-308.
https://doi.org/10.4111/icu.20210434
pISSN 2466-0493 • eISSN 2466-054X
INTRODUCTION
Lower urinary tract symptoms (LUTS) are a common condition with multifactorial causes. The most common cause of LUTS in men is benign prostatic hyperplasia (BPH). Up to 50% of men over 50 years of age and 80% of men over 80 years of age experience LUTS caused by BPH [1]. Detrusor underactivity (DUA) is another very common cause of LUTS. One review found that between 9% and 28% of patients with LUTS under 50 years of age had DUA, while 48% of those over 70 years of age had DUA [2]. LUTS is a concept that encompasses both voiding dysfunction and storage dysfunction, features represented by DUA and bladder outlet obstruction (BOO), respectively [3]. It is critical to distinguish between these two conditions because their treatments and clinical responses differ.
Urodynamic studies (UDSs) are the gold standard for the diagnosis and evaluation of LUTS. However, the use of UDS is limited by its invasiveness. Porru et al. [4] found that 4% to 45% of patients experience UDS complications, mostly urinary tract infection and hematuria. In addition, several patients report feeling shame and discomfort during the test and post-test anxiety [5].
Simple uroflowmetry, one component of UDS, is a simple, noninvasive diagnostic screening procedure used to measure the flow rate of urine over time. Uroflowmetry produces a graph that contains information regarding the voided volume and maximum urine flow rate (Qmax) [6]. Several previous trials have attempted to categorize simple uroflowmetry graphs into groups; however, there have been insufficient evidence and objective standards, including the lack of pressure data, to achieve this end. In particular, there is a lack of evidence that uroflowmetry can distinguish obstructed voiding from DUA. However, as mentioned, this distinction is crucial in determining the appropriate treatment for LUTS.
Medical image analysis using deep learning algorithms has recently become more popular with the development of technologies such as image recognition [7,8]. Many studies have used deep learning algorithms to classify and diagnose diseases based on images [9]. For instance, convolutional neural networks (CNNs) have become a mainstay of image analysis, pattern recognition, and trend prediction. In 2012, the CNN proposed by Krizhevsky et al. [10] demonstrated high performance in an image classification task. Since then, researchers in the medical domain have exploited deep learning algorithms for various tasks to fully or partially automate disease diagnosis.
This study sought to develop a fully automated device to distinguish DUA and BOO from the patterns of simple uroflowmetry using a deep learning method.
MATERIALS AND METHODS
1. Ethics statement
This study was performed at a single center and was conducted according to the tenets of the Declaration of Helsinki. The Institutional Review Board of Samsung Medical Center (Seoul, Korea) approved this study (approval number: 2019-12-062). Informed consent was waived by the Institutional Review Board because of the study's retrospective design.
2. Patients
We retrospectively reviewed the clinical data of 4,835 men who underwent a pressure-flow study at Samsung Medical Center between December 2006 and December 2017. We analyzed all patients who were ≥40 years of age and who underwent a pressure-flow study, focusing on the pattern of uroflowmetry regardless of storage function. Patients with diseases that can affect lower urinary tract function, including bladder cancer and prostate cancer, were excluded. Patients who had undergone previous prostate, bladder, and/or urethral surgery and those with indwelling catheters (or needing regular catheterization) were also excluded, as were patients with a history of cerebrovascular accident, neurologic disorders, or spinal or pelvic bone trauma that could affect LUTS. Patients who voided less than 150 mL during simple uroflowmetry were also excluded. Finally, we excluded 77 patients whose study graphs were insufficient for analysis. Therefore, 1,792 patients were ultimately included (Fig. 1).
Fig. 1. Study design. Screening, n=4,835; exclusion criteria: catheterized (n=1,187), CVA history (n=274), bladder or prostate cancer or lower urinary tract surgery (n=664), voided volume <150 mL (n=841); enrollment, n=1,869; insufficient test (n=77); analysis, n=1,792. CVA, cerebrovascular accident.
3. Urodynamic examination
The UDSs were performed by experts according to the International Continence Society Good Urodynamic Practices protocol using an Aquarius TT UDS system (Laborie Medical Technologies, Toronto, ON, Canada) and a DORADO-KT (Laborie Medical Technologies) [11]. The UDS results were recorded by four software versions (7 Rel Z, 8 Rel A, 11 Rel 6, 12 Rel 0), each of which has a different output format.
DUA was defined as a bladder contractility index (BCI = PdetQmax + 5Qmax) <100 [12]. BOO was defined as a BOO index (BOOI = PdetQmax − 2Qmax) >40 [12].
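These index definitions are simple arithmetic on the pressure-flow values; as a minimal sketch of the labelling rules (function and variable names are ours, not the authors'):

```python
def bci(pdet_qmax, qmax):
    """Bladder contractility index: BCI = PdetQmax + 5 * Qmax."""
    return pdet_qmax + 5 * qmax

def booi(pdet_qmax, qmax):
    """Bladder outlet obstruction index: BOOI = PdetQmax - 2 * Qmax."""
    return pdet_qmax - 2 * qmax

def label_patient(pdet_qmax, qmax):
    """Apply the study's definitions: DUA if BCI < 100, BOO if BOOI > 40."""
    return {"DUA": bci(pdet_qmax, qmax) < 100,
            "BOO": booi(pdet_qmax, qmax) > 40}
```

For instance, PdetQmax=60 cmH2O with Qmax=8 mL/s gives BCI=100 (not DUA, as the cutoff is strict) and BOOI=44 (BOO).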
4. Data pre-processing
The patients' personal information and identification numbers were deleted according to the regulations, and the uroflowmetry graph was extracted separately. The original test sheet contained the graph together with numerical information and other data that were not necessary for the deep learning procedure. We separated the graph data using ABBYY Flexicapture® (ABBYY, Moscow, Russia), a program that permits the automated extraction of the necessary parts of an image, excluding text. Using the ABBYY program, we extracted a uroflowmetry graph from each simple uroflowmetry test sheet (Fig. 2).
Deep learning models typically require a fixed image size for training. Szegedy et al. [13] gained more accuracy with a 299×299-pixel input size while keeping the computational effort constant, and Zoph et al. [14] used both 299×299 and 331×331 pixels for training ImageNet models. Similarly, we resized all images to 299×299 pixels. Owing to the limited number of uroflowmetry graphs, we applied a data augmentation technique to improve the classification performance of the trained models. The aim of data augmentation is to expand the training dataset by generating modified versions of its images. The nature of uroflowmetry graphs differs greatly from that of natural images such as those of dogs, cars, and pedestrians; it is therefore impractical to apply popular augmentation techniques such as flipping and rotation, because the spatial correlation of the uroflowmetry graph must be maintained. Accordingly, we applied only a cropping approach as data augmentation, cropping the left and right top/bottom areas along with a central area, each retaining approximately 90% of the original graph.
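The corner-and-center cropping described above can be sketched with NumPy alone (the 90% ratio follows the text; the nearest-neighbor resize helper is our simplification, not the authors' pipeline):

```python
import numpy as np

def resize_nn(img, size):
    # Nearest-neighbor resize of an H x W (x C) array to size x size.
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def five_crops(img, ratio=0.90, size=299):
    # Crop the four corners and the center, each keeping ~90% of the
    # graph, then resize every crop back to the network input size.
    h, w = img.shape[:2]
    ch, cw = int(h * ratio), int(w * ratio)
    corners = {"tl": (0, 0), "tr": (0, w - cw),
               "bl": (h - ch, 0), "br": (h - ch, w - cw),
               "center": ((h - ch) // 2, (w - cw) // 2)}
    return {name: resize_nn(img[r:r + ch, c:c + cw], size)
            for name, (r, c) in corners.items()}
```

Flipping and rotation are deliberately absent here, matching the constraint above that the spatial (time) axis of the flow curve must be preserved.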
5. Deep neural network model implementation
We adopted ResNet-18 [15], Inception-V3 [16], and VGG16 [17] for the classification of the uroflowmetry images. After initializing with ImageNet-pretrained models, we extensively tuned hyperparameters such as the learning rate, batch size, and activation functions in the training process. We trained DUA classification models and BOO classification models separately with the corresponding datasets.
Fig. 2. An outline of the uroflowmetry graph extraction and data augmentation pipeline. The ABBYY program provides the extraction area from the original test sheet (A); image augmentations (C) are then made from the original crop (B).

To evaluate our models, 5-fold cross-validation was performed. Pre-processed images were randomly divided into five non-overlapping subsets: four were used for training and one was left for validation. This process was repeated for all five subsets so that each subset was evaluated as a test set once, and the results were averaged. The average value of the area under the receiver operating characteristic (AUROC) curve derived from 5-fold cross-validation and the corresponding mean average precision (mAP) values for both the DUA and BOO datasets were chosen as evaluation metrics [18].
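This evaluation protocol can be illustrated with scikit-learn on synthetic data (a minimal sketch with a stand-in linear classifier; the study itself trained CNNs on images, and the stratified split is our choice):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold

# Synthetic binary-classification data standing in for the image features.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

aucs, aps = [], []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores = clf.predict_proba(X[test_idx])[:, 1]
    aucs.append(roc_auc_score(y[test_idx], scores))            # per-fold AUROC
    aps.append(average_precision_score(y[test_idx], scores))   # per-fold AP

print(f"mean AUROC={np.mean(aucs):.3f}, mAP={np.mean(aps):.3f}")
```

Each fold contributes one AUROC and one average-precision value; the reported metrics are the means over the five folds.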
Keras, a high-level Python API, was used as our deep learning platform, enabling fast experimentation. The networks were implemented in an Ubuntu 16.04 LTS environment equipped with a GeForce GTX 1080 Ti GPU.
ResNet-18 has been widely used in the deep learning community over the last half decade; by simply adding identity mappings to every few stacked layers, it allows researchers to train deeper networks. We chose ResNet-18 because it is light and suitable for our dataset. Similarly, Inception-V3 is a CNN model that gained popularity for its approach to keeping the computational cost constant; it is also known to improve the training ability of a network through architectural variations. We employed Inception-V3 to determine whether it could capture low-level features of our uroflowmetry graphs. The last network we experimented with was VGG16, developed by the Visual Geometry Group at the University of Oxford. It presented a thoroughly evaluated network of increased depth built from 3×3 convolutional filters. The model is simpler than its ResNet and Inception counterparts and has achieved promising results in various tasks, so we adopted VGG16 for the DUA and BOO datasets as well. Since VGG16 outperformed the other networks, we present the detailed hyperparameter settings of the VGG network alone. The model was optimized for DUA classification using stochastic gradient descent with a learning rate of 0.003. The hyperparameters for BOO classification were tuned in the same way, except for a learning rate of 0.01. The input size of 299×299 pixels showed better results than smaller alternatives for both datasets.
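The fine-tuning setup above can be sketched in Keras (the classification head, global average pooling plus a single sigmoid unit, is our assumption; the text specifies only the VGG16 backbone, ImageNet initialization, SGD with the stated learning rates, and the 299×299 input):

```python
from tensorflow import keras

def build_vgg16_classifier(learning_rate=0.003, weights=None):
    # weights="imagenet" reproduces the paper's initialization; None keeps
    # this sketch runnable offline. learning_rate: 0.003 for DUA, 0.01 for BOO.
    base = keras.applications.VGG16(weights=weights, include_top=False,
                                    input_shape=(299, 299, 3))
    x = keras.layers.GlobalAveragePooling2D()(base.output)
    out = keras.layers.Dense(1, activation="sigmoid")(x)  # binary output
    model = keras.Model(base.input, out)
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=learning_rate),
                  loss="binary_crossentropy",
                  metrics=[keras.metrics.AUC(name="auroc")])
    return model

dua_model = build_vgg16_classifier(learning_rate=0.003)
boo_model = build_vgg16_classifier(learning_rate=0.01)
```

The two classifiers share an architecture and differ only in their training data and learning rate, mirroring the separate DUA and BOO models described above.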
6. Statistical analysis
Data analysis was performed using SPSS® Statistics version 25.0 (IBM Corp., Chicago, IL, USA), and Student's t-test was used to compare patient characteristics. Statistical significance was set at a p-value of <0.05.
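The group comparisons behind these p-values are independent two-sample t-tests; an illustrative sketch with SciPy (synthetic samples, not the study data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical Qmax samples (mL/s) for two patient groups.
group_a = rng.normal(loc=14.0, scale=4.0, size=100)
group_b = rng.normal(loc=10.0, scale=4.0, size=100)

t_stat, p_value = stats.ttest_ind(group_a, group_b)
significant = p_value < 0.05  # threshold used in the study
```

With clearly separated group means, as here, the test reports a p-value far below the 0.05 threshold.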
RESULTS
As shown in Table 1, among the 1,792 patients, 482 (26.90%) had BOO, and 893 (49.83%) had DUA. There were significant differences between BOO and non-BOO patients in the UDS parameters except time to voiding time. Between DUA and non-DUA patients, there were significant differences in all the pressure-flow study parameters except age and voided volume.
In the deep learning evaluations, the mean 5-fold cross-validation AUROC values for DUA classification with the ResNet-18 and Inception-V3 networks were 0.699 and 0.648, respectively. As mentioned, the best score of
Table 1. Baseline patient characteristics

Characteristic        | BOO No (n=1,310) | BOO Yes (n=482) | p-value | DUA No (n=899) | DUA Yes (n=893) | p-value
Age, y                |  66.41 |  64.01 | <0.001 |  64.39 |  64.93 | 0.229
BOOI                  |  18.06 |  61.08 | <0.001 |  33.01 |  26.22 | <0.001
BCI                   |  98.86 | 114.68 | <0.001 | 127.66 |  78.38 | <0.001
Voiding efficacy      |  86.35 |  77.78 | <0.001 |  86.16 |  81.92 | <0.001
Qmax, mL/s            |  13.95 |   9.99 | 0.001  |  14.67 |  11.09 | 0.001
Average flow, mL/s    |   6.38 |   4.58 | <0.001 |   6.86 |   4.95 | <0.001
Voiding time, s       |  66.13 |  72.83 | 0.022  |  54.38 |  81.58 | <0.001
Flow time, s          |  50.00 |  57.28 | <0.001 |  44.86 |  58.66 | 0.001
Time to peak flow, s  |  20.92 |  24.50 | 0.001  |  16.65 |  27.16 | <0.001
Voided volume, mL     | 272.91 | 233.89 | <0.001 | 262.04 | 262.79 | 0.881
Residual volume, mL   |  48.67 |  77.30 | <0.001 |  45.82 |  66.99 | <0.001

Values are presented as mean only; p-values by Student's t-test.
BOO, bladder outlet obstruction; DUA, detrusor underactivity; BOOI, bladder outlet obstruction index; BCI, bladder contractility index; Qmax, maximum urine flow rate.
0.733 was obtained with a fine-tuned VGG16 network. The AUROC values for BOO classification with the ResNet-18 and Inception-V3 networks were 0.661 and 0.560, respectively. The VGG16 network trained on the BOO dataset likewise achieved a higher AUROC, 0.722, than ResNet-18 and Inception-V3. Figs. 3 and 4 show the ROC curves and PR curves of the VGG16 network for the DUA and BOO datasets, respectively. We also calculated the sensitivity and specificity
Fig. 3. The mean ROC curve (A) and the mean PR curve (B) of the VGG16 network for DUA vs. non-DUA classification. Per-fold ROC AUCs: 0.762, 0.758, 0.723, 0.710, 0.711; mean ROC AUC=0.733±0.02. Per-fold PR-curve areas: 0.753, 0.710, 0.698, 0.685, 0.674; overall PR-curve area=0.698. ROC, receiver operating characteristic; DUA, detrusor underactivity; AUC, area under the curve; PR, precision-recall.
Fig. 4. The mean ROC curve (A) and the mean PR curve (B) of the VGG16 network for BOO vs. non-BOO classification. Per-fold ROC AUCs: 0.728, 0.722, 0.720, 0.727, 0.712; mean ROC AUC=0.722±0.00. Per-fold PR-curve areas: 0.484, 0.454, 0.503, 0.508, 0.516; overall PR-curve area=0.452. ROC, receiver operating characteristic; BOO, bladder outlet obstruction; AUC, area under the curve; PR, precision-recall.
Fig. 5. Model explainability with GRAD-CAM++. The first row presents samples from the VGG16 model trained with the DUA dataset, while the second row depicts samples from the VGG16 model trained with the BOO dataset; each input image is shown alongside its Grad-CAM++ map. BOO, bladder outlet obstruction.
values of the DUA and BOO models. The sensitivity and specificity of the VGG16 network for the DUA dataset were 65.9% and 68.9%, respectively, at the maximum Youden's index [19]. The sensitivity and specificity for the BOO dataset were 65.1% and 68.9% at the maximum Youden's index. Furthermore, because the fine-tuned VGG16 was the best of the three experimental models, we depict GRAD-CAM++ visualizations [20] for it alone. Visual explanation techniques such as GRAD-CAM++ produce rough localization maps by highlighting important regions in an image; GRAD-CAM++ uses the feature maps with respect to a specific class score to generate visual explanations. Fig. 5 illustrates sample uroflowmetry images and their respective maps. The GRAD-CAM++ activations concentrated on the signal graphs rather than the background regions, which implies that the models learned to attend to clinically relevant regions of the images.
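The reported operating points maximize Youden's index (J = sensitivity + specificity - 1) over the ROC curve; with scikit-learn this is a few lines (toy scores shown, not the study's predictions):

```python
import numpy as np
from sklearn.metrics import roc_curve

def youden_operating_point(y_true, y_score):
    # Youden's J = sensitivity + specificity - 1 = TPR - FPR;
    # pick the threshold where J is maximal.
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    best = np.argmax(tpr - fpr)
    return thresholds[best], tpr[best], 1.0 - fpr[best]  # threshold, sens, spec

# Toy example: higher scores should indicate the positive class.
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_score = np.array([0.1, 0.2, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9])
thr, sens, spec = youden_operating_point(y_true, y_score)
```

At the chosen threshold, sensitivity and specificity are read off the same ROC point, exactly as reported for the DUA and BOO models above.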
DISCUSSION
Since the introduction of simple uroflowmetry in 1948 [6], several attempts have been made to establish a pattern of analysis for this technique. Van de Beek et al. [21] attempted to classify and predict uroflowmetry, formalizing uroflowmetry curves and identifying diagnostic patterns among specialists; however, the predictive rate was only 36%. Gacci et al. [22] published a common flow pattern in 2007. They formulated uroflowmetric parameters and searched for items of diagnostic suspicion in uroflowmetry curves; however, their agreement was not satisfactory, with a kappa value of 0.05. Moreover, such analyses are hampered by the poor reproducibility of simple uroflowmetry, whose characteristics vary greatly depending on the environment.
There have also been other attempts to predict or diagnose BOO. Bladder wall thickness (BWT), which can be measured by ultrasound, is expected to increase with BOO [23]. Manieri et al. [24] first discussed this possibility. Using 5 mm as the reference point and finding a significant correlation (r>0.6), this group found that 63% of the normal group had values <5 mm, while 88% of patients with BOO had values >5 mm. In contrast, Hakenberg et al. [25] found that BWT increased slightly with age, but not significantly.
The penile cuff test has also been applied to assess BOO. This test measures detrusor contractility by detecting the isovolumetric bladder pressure [26]. An inflatable cuff is placed around the penile shaft and expands automatically until the urine flow is interrupted; the cuff then deflates rapidly to restart the flow. This cycle can be repeated until urination ends. The pressure required to interrupt urinary flow during the cycle is considered to represent bladder pressure (Pcuff.int) [27]. However, this method has several limitations, including its high cost and the need for patients to be seated during the test. The seated nature of the test may introduce bias, as most men void while standing [28]. We attempted to mitigate these limitations using deep learning.
The prediction of BPH through AI has also been suggested by other researchers. Torshizi et al. [29] predicted the severity of BPH using a hybrid fuzzy-ontology approach with an accuracy of about 90%. However, that study graded severity from questionnaire and clinical examination results, whereas our study differs substantially in that it examined the possibility of diagnosis from graph analysis alone. A noninvasive prediction of LUTS using an artificial neural network (ANN) has also been presented [30], but its accuracy did not satisfy expectations. In this study, we tried to overcome such limitations using a CNN, and to our knowledge this is the first attempt to predict DUA.
In this study, we proposed the use of a deep learning tool as a diagnostic alternative to invasive UDSs. To our knowledge, this is a novel approach, and we believe it can serve as the basis for developing a tool that compensates for the shortcomings of the UDS. This study sought to determine whether graph patterns could be used to predict disease. We compared patients with and without DUA and those with and without BOO; we did not account for patients who may have both DUA and BOO. Given the large number of patients with LUTS, the study attempted to identify these complex diseases. We used a CNN to confirm the accuracy of predictions for patients with BOO and DUA using only a simple uroflowmetry graph. The urodynamic test device did not provide the raw signal data underlying the test result graphs; hence, an image capture software program, ABBYY Flexicapture®, was used to extract the 1,792 data samples, with no error cases. We therefore believe that the image capture process was robust. This research is meaningful in that it used a deep learning method to approach areas that have not been investigated beyond prototype trials, and we consider it a cornerstone for further research. We experimented with other known classification algorithms such as ResNet-18, ResNet-50, Inception-V3, and EfficientNet-B0; however, their final predictions were not as good as VGG16's (data not shown). Moreover, with our dataset, VGG19 attained the same result as VGG16, so we selected the lighter model. As this is a feasibility study of deep learning models on urodynamic test data, a further study with a larger dataset will be needed. We will also consider experimenting with more recent models in future work.
This study has several limitations. First, the prediction rate is only slightly over 70%, and a higher prediction rate is required; the mean AUC scores of the fine-tuned VGG16 could be improved by increasing the number of training images. Second, the ability to establish the basis for the model's predictions is limited by the absence of external data. Although the visual interpretations of GRAD-CAM++ in Fig. 5 provide some evidence that the model discriminated between the signal graph and the gridlines in the background, full interpretability needs to be addressed in future work. Third, this study excluded patients who had both BOO and DUA; we included only patients with BOO alone or DUA alone. Further studies need to include this more complex situation to develop a useful device for diagnosing both BOO and DUA.
CONCLUSIONS
Our study suggests the possibility of an automated, noninvasive device to differentiate DUA from non-DUA and BOO from non-BOO using a simple uroflowmetry graph with a fine-tuned VGG16, a well-known CNN model.
CONFLICTS OF INTEREST
The authors have nothing to disclose.
FUNDING
This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (Ministry of Science and ICT) (No. 2017R1E1A1A01077487, 2020R1F1A1070952).
AUTHORS’ CONTRIBUTIONS
Research conception and design: Seokhwan Bang and
Kyu-Sung Lee. Data acquisition: Sokhib Tukhtaev and Baek
Hwan Cho. Statistical analysis: Seokhwan Bang, Sokhib
Tukhtaev, and Baek Hwan Cho. Data analysis and inter-
pretation: Seokhwan Bang and Deok Hyun Han. Drafting
of the manuscript: Seokhwan Bang and Sokhib Tukhtaev.
Critical revision of the manuscript: Deok Hyun Han and
Kyu-Sung Lee. Obtaining funding: Kyu-Sung Lee and Baek
Hwan Cho. Administrative, technical, or material support:
Minki Baek and Hwang Gyun Jeon. Supervision: Kwang Jin
Ko and Deok Hyun Han. Approval of the final manuscript:
Kyu-Sung Lee.
REFERENCES
1. Egan KB. The epidemiology of benign prostatic hyperplasia
associated with lower urinary tract symptoms: prevalence and
incident rates. Urol Clin North Am 2016;43:289-97.
2. Osman NI, Esperto F, Chapple CR. Detrusor underactivity and
the underactive bladder: a systematic review of preclinical and
clinical studies. Eur Urol 2018;74:633-43.
3. Han DH, Jeong YS, Choo MS, Lee KS. The efficacy of trans-
urethral resection of the prostate in the patients with weak
bladder contractility index. Urology 2008;71:657-61.
4. Porru D, Madeddu G, Campus G, Montisci I, Scarpa RM, Usai
E. Evaluation of morbidity of multi-channel pressure-flow
studies. Neurourol Urodyn 1999;18:647-52.
5. Yeung JY, Eschenbacher MA, Pauls RN. Pain and embarrass-
ment associated with urodynamic testing in women. Int Uro-
gynecol J 2014;25:645-50.
6. Chancellor MB, Rivas DA, Mulholland SG, Drake WM Jr. The
invention of the modern uroflowmeter by Willard M. Drake, Jr
at Jefferson Medical College. Urology 1998;51:671-4.
7. Shen D, Wu G, Suk HI. Deep learning in medical image analy-
sis. Annu Rev Biomed Eng 2017;19:221-48.
8. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Gha-
foorian M, et al. A survey on deep learning in medical image
analysis. Med Image Anal 2017;42:60-88.
9. Gulshan V, Peng L, Coram M, Stumpe MC, Wu D, Narayanas-
wamy A, et al. Development and validation of a deep learning
algorithm for detection of diabetic retinopathy in retinal fun-
dus photographs. JAMA 2016;316:2402-10.
10. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification
with deep convolutional neural networks. Adv Neural Inf Pro-
cess Syst 2012;25:1097.
11. Schäfer W, Abrams P, Liao L, Mattiasson A, Pesce F, Spangberg
A, et al. Good urodynamic practices: uroflowmetry, filling
cystometry, and pressure-flow studies. Neurourol Urodyn
2002;21:261-74.
12. Abrams P. Bladder outlet obstruction index, bladder contrac-
tility index and bladder voiding efficiency: three simple indices
to define bladder voiding function. BJU Int 1999;84:14-5.
13. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethink-
ing the inception architecture for computer vision. ArXiv.
1512.00567 [Preprint]. 2015 [cited 2021 Jun 23]. Available
from: https://arxiv.org/abs/1512.00567.
14. Zoph B, Vasudevan V, Shlens J, Le QV. Learning transferable
architectures for scalable image recognition. ArXiv. 1707.07012
[Preprint]. 2018 [cited 2021 Jun 23]. Available from: https://
arxiv.org/abs/1707.07012.
15. He K, Zhang X, Ren S, Sun J. Deep residual learning for image
recognition. ArXiv. 1512.03385 [Preprint]. 2015 [cited 2021
Aug 5]. Available from: https://arxiv.org/abs/1512.03385.
16. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, et al.
Going deeper with convolutions. ArXiv. 1409.4842 [Preprint].
2014 [cited 2021 Aug 5]. Available from: https://arxiv.org/
abs/1409.4842.
17. Simonyan K, Zisserman A. Very deep convolutional networks
for large-scale image recognition. ArXiv. 1409.1556 [Preprint].
2015 [cited 2021 Aug 5]. Available from: https://arxiv.org/
abs/1409.1556.
18. Hajian-Tilaki K. Receiver operating characteristic (ROC)
curve analysis for medical diagnostic test evaluation. Caspian J
Intern Med 2013;4:627-35.
19. Youden WJ. Index for rating diagnostic tests. Cancer
1950;3:32-5.
20. Chattopadhay A, Sarkar A, Howlader P, Balasubramanian VN.
Grad-CAM++: improved visual explanations for deep convo-
lutional networks. ArXiv. 1710.11063 [Preprint]. 2018 [cited
2018 Nov 9]. Available from: https://arxiv.org/abs/1710.11063.
21. Van de Beek C, Stoevelaar HJ, McDonnell J, Nijs HG, Casparie
AF, Janknegt RA. Interpretation of uroflowmetry curves by
urologists. J Urol 1997;157:164-8.
22. Gacci M, Del Popolo G, Artibani W, Tubaro A, Palli D, Vittori
G, et al. Visual assessment of uroflowmetry curves: description
and interpretation by urodynamists. World J Urol 2007;25:333-
7.
23. Lee HN, Lee YS, Han DH, Lee KS. Change of ultrasound esti-
mated bladder weight and bladder wall thickness after treat-
ment of bladder outlet obstruction with dutasteride. Low Urin
Tract Symptoms 2017;9:67-74.
24. Manieri C, Carter SS, Romano G, Trucchi A, Valenti M,
Tubaro A. The diagnosis of bladder outlet obstruction in men
by ultrasound measurement of bladder wall thickness. J Urol
1998;159:761-5.
25. Hakenberg OW, Linne C, Manseck A, Wirth MP. Bladder wall
thickness in normal adults and men with mild lower urinary
tract symptoms and benign prostatic enlargement. Neurourol
Urodyn 2000;19:585-93.
26. Van Mastrigt R, Pel JJ. Towards a noninvasive urodynamic di-
agnosis of infravesical obstruction. BJU Int 1999;84:195-203.
27. Griffiths CJ, Rix D, MacDonald AM, Drinnan MJ, Pickard RS,
Ramsden PD. Noninvasive measurement of bladder pressure
by controlled inflation of a penile cuff. J Urol 2002;167:1344-7.
28. Mangera A, Chapple C. Modern evaluation of lower urinary
tract symptoms in 2014. Curr Opin Urol 2014;24:15-20.
29. Torshizi AD, Zarandi MH, Torshizi GD, Eghbali K. A hy-
brid fuzzy-ontology based intelligent system to determine
level of severity and treatment recommendation for Benign
Prostatic Hyperplasia. Comput Methods Programs Biomed
2014;113:301-13.
30. Sonke GS, Heskes T, Verbeek AL, de la Rosette JJ, Kiemeney
LA. Prediction of bladder outlet obstruction in men with lower
urinary tract symptoms using artificial neural networks. J Urol
2000;163:300-5.