4th International Conference on Electrical Information and Communication Technology (EICT), 20-22
December 2019, Khulna, Bangladesh
Invasive Ductal Carcinoma Detection by A Gated
Recurrent Unit Network with Self Attention
Ananna Biswas, Zabir Al Nazi, Tasnim Azad Abir
Dept. of Electronics and Communication Engineering
Khulna University of Engineering & Technology
Khulna, Bangladesh.
ananna9265@gmail.com, zabiralnazi@yahoo.com, tasnim.abir@ece.kuet.ac.bd
Abstract—Representing around 80 percent of breast cancer cases,
Invasive Ductal Carcinoma is the most common type of breast
cancer. In this work, we propose a self-attention GRU model to
detect Invasive Ductal Carcinoma. Self-attention lets the
architecture attend to different locations of the sequence
generated from an image, effectively mapping regions of the
image. The model was trained on breast cancer specimens to
discriminate between cancerous and non-cancerous samples. The
self-attention mechanism improved the model's ability to learn
discriminative representations. Our proposed model achieved the
best average accuracy of 86% and a mean F1 score of 86% (note
that we used a 1:1 train-test split to achieve this score). We
also experimented with a baseline CNN, ResNets (ResNet-18,
ResNet-34, ResNet-50) and RNN variants (LSTM, LSTM + Attention).
Our simple recurrent architectures with the attention mechanism
outperformed convolutional networks, which are the traditional
choice for image classification tasks. By studying different RNN
and CNN variations for breast cancer detection, we demonstrate
how the scale of data can play a big role in model selection.
This result is expected to be helpful in the early detection of
breast cancer.
Index Terms—Gated Recurrent Unit, Recurrent Neural Network,
Self Attention, Invasive Ductal Carcinoma
I. INTRODUCTION
In recent years, deep learning has shown promising performance
in different domains, including biomedical applications and
medical diagnosis [1]. Deep convolutional neural networks
(DCNNs) are widely applied in medical disease classification. A
DCNN performs well when trained with an optimized hyperparameter
set, but it needs a significant amount of data to train. Though
RNN variants have shown reliable performance in image
classification tasks, research in this area is limited [2]. We
have therefore proposed an RNN-variant model for the detection
of Invasive Ductal Carcinoma.
Breast cancer is the most common cancer in women worldwide.
Invasive Ductal Carcinoma represents around 80 percent of breast
cancer cases. It is one of the leading causes of cancer
mortality in women. Early detection and accurate identification
of the cancer type can facilitate early diagnosis and timely
treatment of breast cancer, which can reduce the death rate.
Half a million breast cancer patients have already died, and
nearly 1.7 million new cases arise per year. These numbers are
expected to increase significantly in the upcoming years. So,
automatic detection of breast cancer is an important step toward
diagnosis. There are related approaches in this domain that
detect breast cancer with deep learning. In [3], the authors
used a semi-automated segmentation method to characterize all
microcalcifications. A discrimination classifier model was
constructed to classify breast lesions based on
microcalcifications and breast masses. They compared the
performance of SVM, KNN, LDA and deep learning (DL) models, and
the deep learning-based approaches showed superior results.
In [4], researchers proposed an end-to-end recognition method
based on a novel CSDCNN model. Augmentation was applied to the
BreaKHis dataset to boost the performance of the classifier. In
[5], the authors proposed a convolutional neural network-based
scheme for the classification of hematoxylin and eosin (HE)
stained breast specimens. Multiple feature lists were explored
with a sliding window approach for WSI classification, which
achieved an area under the ROC curve of 0.92. In [6], the
authors proposed an Inception Recurrent Residual Convolutional
Neural Network (IRRCNN) for breast cancer classification. The
BreaKHis and Breast Cancer Classification Challenge 2015
datasets were used for image-based and patch-based evaluation.
In [7], a CNN model with high generalized accuracy and minimal
complexity was used to detect Invasive Ductal Carcinoma (IDC),
malignant and benign tumors from histopathology and textual
image datasets. SVM, Decision Tree, Logistic Regression and KNN
were compared in terms of accuracy.
In [8], researchers proposed a CNN model based on the extraction
of image patches to classify breast cancer histopathological
images from BreaKHis. In that paper, model adaptation was
avoided to simplify the model architecture and reduce
computational costs. Some investigations were done to achieve a
high recognition rate. An effective deep CNN method was used for
the classification of HE stained histological breast cancer
images. Data augmentation and feature extraction were performed
to increase the robustness of the classifier.
In this work, we have proposed a Gated Recurrent Unit with
additive attention for the detection of Invasive Ductal
Carcinoma. The main goal of our work is to show that a Gated
Recurrent Unit (a special RNN) can be used effectively in
particular areas of image classification in place of a CNN. To
support this claim, we have compared different deep neural
networks, from CNN to RNN, for specific model structures, which
are described in the subsection named "Deep learning approaches
with Convolutional and Recurrent Networks" of the methodology
section. We have also compared our model with other approaches,
as shown in Fig. 8. In the result analysis section, we discuss
our results and compare findings to justify our claim. Finally,
the conclusion comprises a summary and the future objectives of
our work.
978-1-7281-6040-5/19/$31.00 ©2019 IEEE
II. METHODOLOGY
Deep learning has shown tremendous success in various domains
like image processing, computer vision, medical imaging, natural
language processing and many others [9], [10]. But there are
limited resources on RNN-based image classification. So, we have
decided to adopt a relatively new approach, a Gated Recurrent
Unit with an attention mechanism, to detect invasive ductal
carcinoma.
A. Dataset
In this article, Breast Histopathology Images were used for the
detection of Invasive Ductal Carcinoma (IDC). The dataset was
collected from [11]. The original dataset consisted of 162 whole
mount slide images of breast cancer specimens scanned at 40x.
There were 277,524 patches of size 50 × 50, of which 198,738
were IDC negative and 78,786 were IDC positive.
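The class counts above imply an imbalanced dataset. The short sketch below only restates that arithmetic; the "majority baseline" is an illustrative reference point (the accuracy of a classifier that always predicts IDC negative on the full dataset) and is not a figure reported in this paper.

```python
# Patch counts as stated in the Dataset subsection.
n_negative = 198_738
n_positive = 78_786
n_total = n_negative + n_positive

# Share of IDC positive patches in the whole dataset.
positive_fraction = n_positive / n_total

# Accuracy of a trivial classifier that always predicts "IDC negative".
# This is an illustrative reference point, not a result from the paper.
majority_baseline = n_negative / n_total

print(f"total patches:      {n_total}")
print(f"IDC positive share: {positive_fraction:.3f}")
print(f"majority baseline:  {majority_baseline:.3f}")
```

Roughly 28% of the patches are IDC positive, so accuracy alone can look acceptable even for a model that rarely predicts the positive class; precision, recall and F1 score help to expose that.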
B. Pre-processing
The images are normalized prior to training. A sample of the
images from each class is shown in Fig. 1.
Fig. 1: The samples of histology patches of IDC positive and
negative images
The dataset was split into training and testing subsets with a
ratio of 50:50, while 15% of the training data were used for
validation. After normalization, the images are randomly
shuffled. After shuffling, the shape of the training input
tensor is converted from (137162, 50, 50, 3) to (137162, 2500,
3) and that of the testing input tensor from (137161, 50, 50, 3)
to (137161, 2500, 3). In this way, each image is represented as
a signal, as shown in Fig. 2.
Fig. 2: IDC positive image and transformed signal representation
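The image-to-signal conversion described above can be sketched with NumPy as follows. This is a minimal illustration on a random stand-in batch, not the authors' code: the division by 255 is an assumed normalization (the text only says the images are normalized), and the row-major flattening order is also an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in batch of 8 RGB patches (50 x 50 pixels, 8-bit intensities).
patches = rng.integers(0, 256, size=(8, 50, 50, 3), dtype=np.uint8)
labels = rng.integers(0, 2, size=8)

# Assumed normalization: scale intensities to [0, 1].
x = patches.astype(np.float32) / 255.0

# Shuffle images and labels with the same permutation.
order = rng.permutation(len(x))
x, labels = x[order], labels[order]

# Flatten the two spatial axes: (N, 50, 50, 3) -> (N, 2500, 3).
# Each image becomes a sequence of 2500 time steps with 3 features.
signals = x.reshape(len(x), 50 * 50, 3)

print(signals.shape)
```

With row-major flattening, time step `i` of the signal corresponds to pixel `(i // 50, i % 50)` of the patch, so neighbouring pixels within a row stay adjacent in the sequence.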
C. Deep learning approaches with Convolutional and Recurrent Networks
Generally, CNN is the first choice for image classification,
including medical image analysis, as its hierarchical feature
extraction ability reduces the need for designing a separate
feature extractor [12]. It helps the convolutional layers to
learn effectively for classifying the images. In this regard,
Residual Neural Network (ResNet) is an updated version of CNN
that can train a deep neural network with 150+ layers using the
skip connection strategy. RNN, in turn, has a quite similar
ability to recognize image features by handling sequential data
across time [13]. Although RNNs have shown superior performance
in many sequential tasks, only a few research works can be found
on RNN-based image classification. So, in this paper, we have
considered using RNN variants with an attention mechanism for
classifying images, which have shown promising performance.
RNN suffers from exploding and vanishing gradient problems. LSTM
mitigates these problems by replacing the hidden units with
memory cells that control the flow of information and help
retain long-term dependencies in the input data. But LSTM has a
large number of parameters compared to a plain RNN, which
increases computational complexity. In this case, the Gated
Recurrent Unit (GRU) is a good option, as it uses fewer gates
than LSTM. Moreover, GRU controls the flow of information in a
similar way to LSTM but without a separate memory unit, which
makes GRU less complex than LSTM.
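The difference in complexity can be made concrete by counting trainable parameters with the textbook formulas for a single recurrent layer. The sketch below is illustrative only: the hidden size of 64 is a hypothetical value (the layer sizes are not stated in this paper), and some library implementations add extra bias terms, so their counts can differ slightly.

```python
def lstm_params(input_dim: int, hidden_dim: int) -> int:
    # Four weight sets: forget, input and output gates plus the
    # candidate cell content. Each has input weights, recurrent
    # weights and one bias vector.
    return 4 * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


def gru_params(input_dim: int, hidden_dim: int) -> int:
    # Three weight sets: update gate, reset gate and candidate state.
    return 3 * (hidden_dim * (input_dim + hidden_dim) + hidden_dim)


input_dim = 3     # three colour channels per time step
hidden_dim = 64   # hypothetical hidden size, not taken from the paper

n_lstm = lstm_params(input_dim, hidden_dim)
n_gru = gru_params(input_dim, hidden_dim)
print(n_lstm, n_gru, n_gru / n_lstm)
```

Under these formulas a GRU layer always has three quarters of the parameters of an LSTM layer with the same input and hidden sizes.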
Fig. 3: Baseline CNN Architecture
So, we have used GRU over LSTM for better computational
efficiency and faster training. Here, we have also used additive
attention with GRU, which extends the neural network's
capabilities in various predictions. Using self-attention, RNNs
focus on a specific part of the given input data. At every time
step, the mechanism focuses on different positions in the input.
The attention mechanism improves the GRU network's ability for
discriminative representation. The purpose of the attention
learning mechanism is to exploit the intrinsic self-attention
ability of GRU, which has enhanced the performance of the model
in image classification. To classify images, we have evaluated
different neural architectures: CNN, ResNet-18, ResNet-34,
ResNet-50, LSTM, LSTM + attention, and GRU + attention. The
architecture of each network is described below.
Baseline Convolutional Neural Network: The baseline
Convolutional Neural Network (CNN) consists of two
convolutional layers, two pooling layers, and two fully
connected layers, as shown in Fig. 3. ReLU has been used as the
activation in the intermediate layers, and sigmoid in the final
dense layer.
The input layer loads the input data and feeds it to the
convolutional layers. Our input image size was 50-by-50 with 3
channels. We have used two convolutional layers that apply a set
of learned filters to the input data. There is one pooling layer
after each convolutional layer. The pooling layers downsample
the spatial dimensions of their input. Two types of activation
functions have been used here, which introduce non-linear
properties to our network. The two dense layers receive all
outputs from the previous layer. The first dense layer has 50
neurons and the second has 2 neurons. A flatten layer sits
between the last convolutional layer and the first fully
connected layer. It transforms the feature maps into a vector
that can be fed into the fully connected classifier. Dropout has
been used for regularization and to reduce overfitting in the
CNN.
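To illustrate how the spatial size shrinks through such a network, the following sketch traces a 50 × 50 input through two convolution and pooling stages. Only the layer order, the input size and the dense layer widths (50 and 2) come from the text; the 3 × 3 kernels, unpadded stride-1 convolutions, 2 × 2 pooling windows and the 64 filters in the second convolution are hypothetical choices made for the example.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Output width of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size: int, window: int = 2) -> int:
    """Output width of non-overlapping pooling."""
    return size // window


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


size = 50
size = pool_out(conv_out(size))   # 50 -> 48 -> 24
size = pool_out(conv_out(size))   # 24 -> 22 -> 11

filters_last = 64                 # hypothetical filter count
flat_features = size * size * filters_last

print("flattened vector length:", flat_features)
print("dense(50) parameters:   ", dense_params(flat_features, 50))
print("dense(2) parameters:    ", dense_params(50, 2))
```

Under these assumptions most of the parameters sit in the first dense layer, which is where dropout is commonly applied.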
Fig. 4: ResNet Block
Residual Networks: ResNet-18 is a convolutional network that
exploits residual learning. Deep CNNs suffer from overfitting
and optimization problems that increase training error.
Residual Networks can train such deep networks through residual
modules [14]. The Residual Network follows the skip connection
technique, which simplifies training of the network. So, it
learns better than a baseline CNN. We have experimented with
ResNet-18, ResNet-34 and ResNet-50, of which ResNet-18 has shown
the best performance. ResNet-18 uses a basic block with an input
and an output layer. The basic block of a Residual Network is
shown in Fig. 4 [15].
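The skip connection can be sketched in a few lines. For brevity the example below uses dense matrices instead of convolutions and omits batch normalization, so it is a simplified stand-in for the basic block in Fig. 4 rather than the block used in the experiments.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def residual_block(x, w1, w2):
    """Simplified residual block: relu(F(x) + x), F(x) = relu(x @ w1) @ w2."""
    residual = relu(x @ w1) @ w2
    # The skip connection adds the unchanged input to the learned residual.
    return relu(residual + x)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
w1 = rng.normal(scale=0.1, size=(8, 8))
w2 = rng.normal(scale=0.1, size=(8, 8))

y = residual_block(x, w1, w2)
print(y.shape)

# With all-zero weights the block reduces to relu(x): the input still
# passes through, which is what keeps very deep stacks trainable.
identity_like = residual_block(x, np.zeros((8, 8)), np.zeros((8, 8)))
```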
Long Short-Term Memory Network: The LSTM is a Recurrent Neural
Network (RNN) that can solve the problem of short-term memory.
The LSTM network has three gates (forget gate, input gate and
output gate) and one cell state, which together regulate the
flow of information. A sigmoid function is used to squash values
between 0 and 1. The forget gate decides whether information is
important or not. Information from the current input and the
previous hidden state is passed through the sigmoid function.
Output closer to 0 will be forgotten and output closer to 1 will
be kept. The duty of the input gate is to update the cell state,
which works as a transport highway or as a memory. The output
gate selects the next hidden state that is used for the
prediction. The LSTM network used here for breast cancer
detection is shown in Fig. 5. An attention mechanism is also
added to this network for more accurate results in the detection
of invasive ductal carcinoma.
Fig. 5: LSTM+attention Architecture
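The gate description above corresponds to the standard LSTM update, which can be sketched for a single time step as follows. The weight layout (one stacked array per weight type) and the sizes are choices made for this example and are not taken from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. Shapes: W (4, d, n), U (4, n, n), b (4, n)."""
    f = sigmoid(x_t @ W[0] + h_prev @ U[0] + b[0])   # forget gate
    i = sigmoid(x_t @ W[1] + h_prev @ U[1] + b[1])   # input gate
    o = sigmoid(x_t @ W[2] + h_prev @ U[2] + b[2])   # output gate
    g = np.tanh(x_t @ W[3] + h_prev @ U[3] + b[3])   # candidate content
    c_t = f * c_prev + i * g      # cell state: keep part of old, add new
    h_t = o * np.tanh(c_t)        # hidden state exposed to the next layer
    return h_t, c_t


rng = np.random.default_rng(0)
d, n = 3, 4                       # 3 input features, 4 hidden units (example)
W = rng.normal(scale=0.1, size=(4, d, n))
U = rng.normal(scale=0.1, size=(4, n, n))
b = np.zeros((4, n))

h, c = np.zeros(n), np.zeros(n)
for x_t in rng.random((5, d)):    # a short sequence of 5 time steps
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```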
Fig. 6: GRU+attention Architecture
Gated Recurrent Unit Network: The GRU is a newer generation of
Recurrent Neural Network (RNN) which has almost the same
functionality as the LSTM network. The GRU network has two gates
(reset gate and update gate) and no cell state. The functions of
the different parts of the GRU are described below.
1. Update Gate: It determines the quantity of past information
that needs to be passed along into the future. Its role is
comparable to the combined forget and input gates of the LSTM
network.
2. Reset Gate: It determines how much of the past information to
forget when the new candidate state is computed.
3. Current Memory Gate: It is not regarded as an individual gate
but rather as the candidate state computed with the help of the
reset gate. It helps to reduce the effect that past information
has on the current information.
Though the functions of the two networks (GRU and LSTM) are
quite similar, the GRU network is less complex than the LSTM
network [16]. In this research work, the GRU network with an
attention mechanism, shown in Fig. 6, has been used for the
early detection of invasive ductal carcinoma. The attention has
focused on the discriminatory regions between the non-cancerous
and cancerous images, which helped to achieve better
performance. In the processing of breast cancer data, the
self-attention technique follows the context for each timestep.
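The roles of the update gate, the reset gate and the current memory content can be sketched for a single GRU time step as below. This follows one common formulation; the sign convention of the final interpolation differs between references, and the weight layout and sizes are choices made for this example rather than details from the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def gru_step(x_t, h_prev, W, U, b):
    """One GRU time step. Shapes: W (3, d, n), U (3, n, n), b (3, n)."""
    z = sigmoid(x_t @ W[0] + h_prev @ U[0] + b[0])   # update gate
    r = sigmoid(x_t @ W[1] + h_prev @ U[1] + b[1])   # reset gate
    # Current memory content: the reset gate scales down the past state.
    h_cand = np.tanh(x_t @ W[2] + (r * h_prev) @ U[2] + b[2])
    # The update gate mixes the old state with the candidate. Some
    # references swap the roles of z and (1 - z).
    return (1.0 - z) * h_prev + z * h_cand


rng = np.random.default_rng(0)
d, n = 3, 4                       # 3 input features, 4 hidden units (example)
W = rng.normal(scale=0.1, size=(3, d, n))
U = rng.normal(scale=0.1, size=(3, n, n))
b = np.zeros((3, n))

h = np.zeros(n)
for x_t in rng.random((5, d)):    # a short sequence of 5 time steps
    h = gru_step(x_t, h, W, U, b)
print(h.shape)
```

Compared with the LSTM there is no separate cell state and one fewer weight set, which is the source of the lower parameter count.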
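Attention: We have used additive local attention [17]–[19].
$$h_{t,t'} = \tanh\left(x_t^{T} W_t + x_{t'}^{T} W_x + b_t\right) \tag{1}$$
$$e_{t,t'} = \sigma\left(W_a h_{t,t'} + b_a\right) \tag{2}$$
$$a_t = \mathrm{softmax}(e_t) \tag{3}$$
$$l_t = \sum_{t'} a_{t,t'} x_{t'} \tag{4}$$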
A GRU or LSTM layer was used as an encoder for the hidden state
representations ($h_t$). The attention matrix $A$ determines the
similarity of any token with adjacent tokens from the input
signal representations of the images. The similarity between the
hidden state representations $h_t$ and $h_{t'}$ of tokens $x_t$
and $x_{t'}$ at timesteps $t$ and $t'$ is captured by the
attention element $a_{t,t'}$. The attention scheme is
implemented based on equations (1)-(4), where $W_t$ and $W_x$
denote the weight matrices for $h_t$ and $h_{t'}$, and $W_a$
represents the non-linear relationship between the hidden
states; $b_t$ and $b_a$ are the bias vectors. The point-wise
sigmoid operation is represented by $\sigma$. Finally, the
attention hidden state representation $l$ is calculated, where
$l_t$ is the representation of the token at timestep $t$,
calculated as the weighted summation of $h_{t'}$ of all other
tokens at timesteps $t'$ with weights $a_{t,t'}$. $l_t$ denotes
the amount to attend to a token based on its adjacent context
and token importance.
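Equations (1)-(4) can be sketched in NumPy as follows. This is an illustrative implementation under assumptions: it attends over all positions rather than a local window, `x` stands for the sequence fed to the attention layer (the encoder outputs), and the sizes `T`, `d` and `k` are small example values. For the full 2500-step signals the intermediate (T, T, k) tensor would be large, which is one practical reason to restrict attention to a local window.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def additive_self_attention(x, W_t, W_x, b_t, W_a, b_a):
    """x: (T, d); W_t, W_x: (d, k); b_t, W_a: (k,); b_a: scalar."""
    proj_t = x @ W_t                   # term indexed by t,  shape (T, k)
    proj_tp = x @ W_x                  # term indexed by t', shape (T, k)
    # Eq. (1): pairwise combination, shape (T, T, k).
    h = np.tanh(proj_t[:, None, :] + proj_tp[None, :, :] + b_t)
    # Eq. (2): scalar score for every pair (t, t'), shape (T, T).
    e = sigmoid(h @ W_a + b_a)
    # Eq. (3): softmax over t' for each t.
    e = e - e.max(axis=-1, keepdims=True)
    a = np.exp(e)
    a = a / a.sum(axis=-1, keepdims=True)
    # Eq. (4): weighted sum of the inputs, shape (T, d).
    l = a @ x
    return l, a


rng = np.random.default_rng(0)
T, d, k = 6, 4, 5                      # example sizes, not from the paper
x = rng.normal(size=(T, d))
l, a = additive_self_attention(
    x,
    rng.normal(size=(d, k)), rng.normal(size=(d, k)), np.zeros(k),
    rng.normal(size=k), 0.0,
)
print(l.shape, a.shape)
```

Each row of `a` sums to one, so `l[t]` is a convex combination of the inputs weighted by how strongly position `t` attends to every other position.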
TABLE I: PERFORMANCE COMPARISON
(The last four columns are the hyperparameters.)
Model            | Accuracy | Precision | Recall | F1 Score | Learning rate | Optimizer | Loss function            | Activation function
CNN              | 0.73     | 0.52      | 0.73   | 0.61     | 0.05          | Adam      | categorical crossentropy | relu, softmax
ResNet-50        | 0.73     | 0.68      | 0.73   | 0.70     | 0.001         | Adam      | categorical crossentropy | relu, softmax
ResNet-34        | 0.77     | 0.75      | 0.77   | 0.76     | 0.001         | Adam      | categorical crossentropy | relu, softmax
ResNet-18        | 0.79     | 0.81      | 0.79   | 0.80     | 0.001         | Adam      | categorical crossentropy | relu, softmax
LSTM             | 0.82     | 0.81      | 0.82   | 0.81     | 0.001         | RMSprop   | categorical crossentropy | sigmoid
LSTM + attention | 0.85     | 0.84      | 0.85   | 0.84     | 0.001         | Adam      | categorical crossentropy | sigmoid, softmax
GRU + attention  | 0.86     | 0.87      | 0.86   | 0.86     | 0.001         | Adam      | categorical crossentropy | sigmoid, softmax
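As a rough consistency check on the table, the F1 score can be recomputed as the harmonic mean of the reported precision and recall. This assumes the reported values can be combined directly; the averaging scheme over the two classes is not stated here, and for class-averaged metrics the relation need not hold exactly.

```python
# (precision, recall, reported F1) per model, copied from Table I.
rows = {
    "CNN": (0.52, 0.73, 0.61),
    "ResNet-50": (0.68, 0.73, 0.70),
    "ResNet-34": (0.75, 0.77, 0.76),
    "ResNet-18": (0.81, 0.79, 0.80),
    "LSTM": (0.81, 0.82, 0.81),
    "LSTM + attention": (0.84, 0.85, 0.84),
    "GRU + attention": (0.87, 0.86, 0.86),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for name, (p, r, reported) in rows.items():
    print(f"{name:17s} computed {f1_score(p, r):.3f}   reported {reported:.2f}")
```

Under this assumption the recomputed values agree with the reported F1 scores to within rounding of the two-decimal inputs.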
III. RESULT ANALYSIS
In this work, we compared CNN, ResNet and multiple RNN variants
with our model. The proposed attention + GRU model outperformed
the other models while the hyperparameters were almost the same,
as shown in Table I. As the dataset contained enough samples, we
decided to train on only 50% of the data. We have used 50% of
the input data (137162 samples) for training and the remaining
50% (137161 samples) for testing. Most of the experiments were
performed in Google Colab (footnote 1), where a maximum of 12 GB
of RAM is allowed (12GB NVIDIA Tesla K80 GPU), so we had to use
a smaller set for training. Even after training the models on
only 50% of the data, the performance was comparable to
benchmark results. The large test set has given us a better
estimate of our model accuracy. This 50:50 train/test split has
given us a measure of the classifier's strength compared to
other classifiers. From Table I, we can compare the performance
of all the models used in this work. The baseline CNN model and
the ResNet-50 model have shown quite similar performance,
whereas the ResNet-18 model has performed better than the
ResNet-34 and ResNet-50 models.
Fig. 7: Training curve of the attention RNN-GRU model
For the ResNet-50 model, the normalized values have to pass
through many layers, while our image size was only 50 by 50
pixels. So, due to the low spatial resolution and the smaller
training split, the deeper networks were not able to learn most
of the useful features, resulting in poor accuracy (0.73) and F1
score (0.70). We can observe this pattern in Table I: as the
residual network gets deeper, the performance drops. To address
this problem, we have used the LSTM model, which has achieved
better accuracy (0.82) and F1 score (0.81) than the ResNet-18
model. Furthermore, we have added a self-attention layer to
exploit the intrinsic self-attention ability of LSTM, which
increases both accuracy (0.85) and F1 score (0.84). But LSTM has
more computational complexity and takes a longer time to train
than other RNN variants. So, we have explored GRU with an
attention layer to make the model simpler and decrease the
training time. The training curve of the model is shown in
Fig. 7.
1. https://colab.research.google.com/
Fig. 8: Comparative performance analysis (chart title: "Comparison with other approaches"; horizontal axis from 0 to 1)
We have used 15% of the training data for validation. The Adam
optimizer, the categorical crossentropy loss function, and
sigmoid and softmax activation functions (in the attention) have
been used here. During training we have used regularization,
which is not applied during validation. Generally, the training
accuracy is greater than the validation accuracy. But in this
case, the validation accuracy is greater than the training
accuracy due to the regularization. When dropout is used in
training, some neurons are disabled and some of the information
about each sample is lost. So, we have observed lower
performance in training than in validation, where dropout is not
used. Overall, the GRU + attention model has achieved a
promising performance (accuracy = 0.86 and F1 score = 0.86). As
the spatial dimension of the data is limited, the results
suggest the images can be represented as 1-dimensional signals
and still be classified with good performance.
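The effect of dropout on training versus validation metrics can be illustrated with a minimal inverted-dropout function. This is a generic sketch, not the layer used in the experiments; the rate of 0.5 is a hypothetical value, since the dropout rate is not stated here.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: active only while training."""
    if not training:
        return x                      # validation/test: all units kept
    keep_mask = rng.random(x.shape) >= rate
    # Scale the surviving units so the expected activation is unchanged.
    return x * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)
activations = np.ones((4, 6))

train_out = dropout(activations, rate=0.5, training=True, rng=rng)
eval_out = dropout(activations, rate=0.5, training=False, rng=rng)

print("units zeroed during training:  ", int((train_out == 0).sum()))
print("units zeroed during validation:", int((eval_out == 0).sum()))
```

During training the model predicts with part of its units switched off, while validation uses the full network, which is consistent with the validation accuracy exceeding the training accuracy.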
Fig. 8 illustrates the performance comparison of different
models. In [7], a CNN architecture was used for breast cancer
classification using histopathological images and achieved 81%
accuracy. In [20], handcrafted features were used with a CNN
model to detect invasive ductal carcinoma, achieving 84%
accuracy. In comparison, our proposed architecture has achieved
86% accuracy, which is promising.
IV. CONCLUSION
In this paper, we have proposed an attention + GRU network for
classifying images that performs better than the CNN and other
RNN variants we evaluated. We have experimented with different
CNN and RNN models with similar hyperparameters that were chosen
very carefully for training. The main contribution of our
research work is that we have investigated the effect of spatial
dimension on model selection and showed that an efficiently
trainable Gated Recurrent Unit with attention can outperform
traditional neural networks. We have used the attention
mechanism with our proposed model so that it can focus on
specific parts of the input data. The attention enhanced the
performance of the model and improved learning for the
transformed data. The model has less computational complexity
than other RNN variants, which reduces the training time, an
important aspect of real-time image diagnosis. We are yet to
explore attention for explaining the predictions and the
optimization of the hyperparameters for improving the
performance in the IDC classification task.
REFERENCES
[1] D. Shen, G. Wu, and H.-I. Suk, “Deep learning in medical image
analysis,” Annual review of biomedical engineering, vol. 19, pp. 221–
248, 2017.
[2] L. Mou, P. Ghamisi, and X. X. Zhu, “Deep recurrent neural networks for
hyperspectral image classification,” IEEE Transactions on Geoscience
and Remote Sensing, vol. 55, no. 7, pp. 3639–3655, 2017.
[3] J. Wang, X. Yang, H. Cai, W. Tan, C. Jin, and L. Li, “Discrimination
of breast cancer with microcalcifications on mammography by deep
learning,” Scientific reports, vol. 6, p. 27327, 2016.
[4] Z. Han, B. Wei, Y. Zheng, Y. Yin, K. Li, and S. Li, “Breast cancer
multi-classification from histopathological images with structured deep
learning model,” Scientific reports, vol. 7, no. 1, p. 4172, 2017.
[5] B. E. Bejnordi, J. Lin, B. Glass, M. Mullooly, G. L. Gierach, M. E.
Sherman, N. Karssemeijer, J. Van Der Laak, and A. H. Beck, “Deep
learning-based assessment of tumor-associated stroma for diagnosing
breast cancer in histopathology images,” in 2017 IEEE 14th Interna-
tional Symposium on Biomedical Imaging (ISBI 2017). IEEE, 2017,
pp. 929–932.
[6] M. Z. Alom, C. Yakopcic, M. S. Nasrin, T. M. Taha, and V. K.
Asari, “Breast cancer classification from histopathological images with
inception recurrent residual convolutional neural network,” Journal of
digital imaging, pp. 1–13, 2019.
[7] P. Mohapatra, B. Panda, and S. Swain, “Enhancing histopathological
breast cancer image classification using deep learning,” vol. 8, 06 2019.
[8] F. A. Spanhol, L. S. Oliveira, C. Petitjean, and L. Heutte, “Breast
cancer histopathological image classification using convolutional neural
networks,” in 2016 international joint conference on neural networks
(IJCNN). IEEE, 2016, pp. 2560–2567.
[9] M. Z. Alom, T. M. Taha, C. Yakopcic, S. Westberg, P. Sidike, M. S.
Nasrin, M. Hasan, B. C. Van Essen, A. A. Awwal, and V. K. Asari,
“A state-of-the-art survey on deep learning theory and architectures,”
Electronics, vol. 8, no. 3, p. 292, 2019.
[10] Z. Al Nazi and T. A. Abir, “Automatic skin lesion segmentation and
melanoma detection: Transfer learning approach with u-net and dcnn-
svm,” in Proceedings of International Joint Conference on Computa-
tional Intelligence. Springer, 2019, pp. 371–381.
[11] P. Mooney, “Breast histopathology images,”
http://spie.org/Publications/Proceedings/Paper/10.1117/12.2043872,
2017.
[12] A. Khan, A. Sohail, U. Zahoora, and A. S. Qureshi, “A survey of
the recent architectures of deep convolutional neural networks,” arXiv
preprint arXiv:1901.06032, 2019.
[13] B. Chandra and R. K. Sharma, “On improving recurrent neural network
for image classification,” in 2017 International Joint Conference on
Neural Networks (IJCNN). IEEE, 2017, pp. 1904–1907.
[14] M. R. Mamun, Z. Al Nazi, and M. S. U. Yusuf, “Bangla handwritten
digit recognition approach with an ensemble of deep residual networks,”
in 2018 International Conference on Bangla Speech and Language
Processing (ICBSLP). IEEE, 2018, pp. 1–4.
[15] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image
recognition,” in Proceedings of the IEEE conference on computer vision
and pattern recognition, 2016, pp. 770–778.
[16] R. Jozefowicz, W. Zaremba, and I. Sutskever, “An empirical exploration
of recurrent network architectures,” in International Conference on
Machine Learning, 2015, pp. 2342–2350.
[17] G. Zheng, S. Mukherjee, X. L. Dong, and F. Li, “Opentag: Open attribute
value extraction from product profiles,” in Proceedings of the 24th ACM
SIGKDD International Conference on Knowledge Discovery & Data
Mining. ACM, 2018, pp. 1049–1058.
[18] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by
jointly learning to align and translate,” arXiv preprint arXiv:1409.0473,
2014.
[19] M.-T. Luong, H. Pham, and C. D. Manning, “Effective ap-
proaches to attention-based neural machine translation,” arXiv preprint
arXiv:1508.04025, 2015.
[20] A. Cruz-Roa, A. Basavanhally, F. González, H. Gilmore, M. Feldman,
S. Ganesan, N. Shih, J. Tomaszewski, and A. Madabhushi, “Automatic
detection of invasive ductal carcinoma in whole slide images with
convolutional neural networks,” in Medical Imaging 2014: Digital Pathology,
vol. 9041. International Society for Optics and Photonics, 2014, p.
904103.
Article
Breast cancer positions as the most well-known threat and the main source of malignant growth-related morbidity and mortality throughout the world. It is apical of all new cancer incidences analyzed among females. However, Machine learning algorithms have given rise to progress across different domains. There are various diagnostic methods available for cancer detection. However, cancer detection through histopathological images is considered to be more accurate. In this research, we have proposed the Stacked Generalized Ensemble (SGE) approach for breast cancer classification into Invasive Ductal Carcinoma+ and Invasive Ductal Carcinoma-. SGE is inspired by the stacking model which utilizes output predictions. Here, SGE uses six deep learning models as level-0 learner models or sub-models and Logistic regression is used as Level – 1 learner or meta – learner model. Invasive Ductal Carcinoma dataset for histopathology images is used for experimentation. The results of the proposed methodology have been compared and analyzed with existing machine learning and deep learning methods. The results demonstrate that the proposed methodology performed exponentially good in image classification in terms of accuracy, precision, recall, and F1 measure.
Article
Full-text available
In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and un-supervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and many others. This survey presents a brief survey on the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network (DNN). The survey goes on to cover Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). Additionally, we have discussed recent developments, such as advanced variant DL techniques based on these DL approaches. This work considers most of the papers published after 2012 from when the history of deep learning began. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey. We also included recently developed frameworks, SDKs, and benchmark datasets that are used for implementing and evaluating deep learning approaches. There are some surveys that have been published on DL using neural networks and a survey on Reinforcement Learning (RL). 
However, those papers have not discussed individual advanced techniques for training large-scale deep learning models and the recently developed method of generative models.
Conference Paper
Full-text available
Industrial pollution resulting in ozone layer depletion has influenced increased UV radiation in recent years which is a major environmental risk factor for invasive skin cancer Melanoma and other keratinocyte cancers. The incidence of deaths from Melanoma has risen worldwide in past two decades. Deep learning has been employed successfully for dermatologic diagnosis. In this work, we present a deep learning based scheme to automatically segment skin lesions and detect melanoma from dermoscopy images. U-Net was used for segmenting out the lesion from surrounding skin. The limitation of utilizing deep neural networks with limited medical data was solved with data augmentation and transfer learning. In our experiments, U-Net was used with spatial dropout to solve the problem of overfitting and different augmentation effects were applied on the training images to increase data samples. The model was evaluated on two different datasets. It achieved a mean dice score of 0.87 and a mean jaccard index of 0.80 on ISIC 2018 dataset. The trained model was assessed on PH² dataset where it achieved a mean dice score of 0.93 and a mean jaccard index of 0.87 with transfer learning. For classification of malignant melanoma, a DCNN-SVM model was used where we compared state of the art deep nets as feature extractors to find the applicability of transfer learning in dermatologic diagnosis domain. Our best model achieved a mean accuracy of 92% on PH² dataset. The findings of this study is expected to be useful in cancer diagnosis research.
Article
Full-text available
Deep Convolutional Neural Network (CNN) is a special type of Neural Networks, which has shown exemplary performance on several competitions related to Computer Vision and Image Processing. Some of the exciting application areas of CNN include Image Classification and Segmentation, Object Detection, Video Processing, Natural Language Processing, and Speech Recognition. The powerful learning ability of deep CNN is primarily due to the use of multiple feature extraction stages that can automatically learn representations from the data. The availability of a large amount of data and improvement in the hardware technology has accelerated the research in CNNs, and recently interesting deep CNN architectures have been reported. Several inspiring ideas to bring advancements in CNNs have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations. However, the significant improvement in the representational capacity of the deep CNN is achieved through architectural innovations. Notably, the ideas of exploiting spatial and channel information, depth and width of architecture, and multi-path information processing have gained substantial attention. Similarly, the idea of using a block of layers as a structural unit is also gaining popularity. This survey thus focuses on the intrinsic taxonomy present in the recently reported deep CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature-map exploitation, channel boosting, and attention. Additionally, the elementary understanding of CNN components, current challenges, and applications of CNN are also provided.
Conference Paper
This work presents a Bangla handwritten digit classification scheme based on an ensemble of Xception networks. Bangla handwritten digits are challenging to recognize because some classes share strongly similar features. In this study, heavy augmentation was applied to the training set, along with dropout in the model, to avoid overfitting. Competitive performance was achieved with an optimized number of model parameters. An ensemble of three Xception networks was evaluated on a hidden test set, where it showed promising performance with 96.69% accuracy and an F1 score of 97.14%.
Preprint
Extraction of missing attribute values aims to find values describing an attribute of interest from a free-text input. Most past related work on extraction of missing attribute values operates under a closed-world assumption, with the possible set of values known beforehand, or uses dictionaries of values and hand-crafted features. How can we discover new attribute values that we have never seen before? Can we do this with limited human annotation or supervision? We study this problem in the context of product catalogs that often have missing values for many attributes of interest. In this work, we leverage product profile information such as titles and descriptions to discover missing values of product attributes. We develop a novel deep tagging model, OpenTag, for this extraction problem with the following contributions: (1) we formalize the problem as a sequence tagging task, and propose a joint model exploiting recurrent neural networks (specifically, bidirectional LSTM) to capture context and semantics, and Conditional Random Fields (CRF) to enforce tagging consistency; (2) we develop a novel attention mechanism to provide interpretable explanations for our model's decisions; (3) we propose a novel sampling strategy exploring active learning to reduce the burden of human annotation. OpenTag does not use any dictionary or hand-crafted features as in prior works. Extensive experiments on real-life datasets in different domains show that OpenTag with our active learning strategy discovers new attribute values from as few as 150 annotated samples (a 3.3x reduction in annotation effort) with a high F-score of 83%, outperforming state-of-the-art models.
Conference Paper
Diagnosis of breast carcinomas has so far been limited to the morphological interpretation of epithelial cells and the assessment of epithelial tissue architecture. Consequently, most of the automated systems have focused on characterizing the epithelial regions of the breast to detect cancer. In this paper, we propose a system for classification of hematoxylin and eosin (H&E) stained breast specimens based on convolutional neural networks that primarily targets the assessment of tumor-associated stroma to diagnose breast cancer patients. We evaluate the performance of our proposed system using a large cohort containing 646 breast tissue biopsies. Our evaluations show that the proposed system achieves an area under ROC of 0.92, demonstrating the discriminative power of previously neglected tumor associated stroma as a diagnostic biomarker.
Article
The Deep Convolutional Neural Network (DCNN) is one of the most powerful and successful deep learning approaches. DCNNs have already provided superior performance in different modalities of medical imaging including breast cancer classification, segmentation, and detection. Breast cancer is one of the most common and dangerous cancers impacting women worldwide. In this paper, we have proposed a method for breast cancer classification with the Inception Recurrent Residual Convolutional Neural Network (IRRCNN) model. The IRRCNN is a powerful DCNN model that combines the strength of the Inception Network (Inception-v4), the Residual Network (ResNet), and the Recurrent Convolutional Neural Network (RCNN). The IRRCNN shows superior performance against equivalent Inception Networks, Residual Networks, and RCNNs for object recognition tasks. In this paper, the IRRCNN approach is applied for breast cancer classification on two publicly available datasets including BreakHis and Breast Cancer (BC) classification challenge 2015. The experimental results are compared against the existing machine learning and deep learning–based approaches with respect to image-based, patch-based, image-level, and patient-level classification. The IRRCNN model provides superior classification performance in terms of sensitivity, area under the curve (AUC), the ROC curve, and global accuracy compared to existing approaches for both datasets.
Article
Automated breast cancer multi-classification from histopathological images plays a key role in computer-aided breast cancer diagnosis or prognosis. Breast cancer multi-classification is to identify subordinate classes of breast cancer (Ductal carcinoma, Fibroadenoma, Lobular carcinoma, etc.). However, breast cancer multi-classification from histopathological images faces two main challenges from: (1) the great difficulties in breast cancer multi-classification methods contrasting with the classification of binary classes (benign and malignant), and (2) the subtle differences in multiple classes due to the broad variability of high-resolution image appearances, high coherency of cancerous cells, and extensive inhomogeneity of color distribution. Therefore, automated breast cancer multi-classification from histopathological images is of great clinical significance yet has never been explored. Existing works in literature only focus on the binary classification but do not support further breast cancer quantitative assessment. In this study, we propose a breast cancer multi-classification method using a newly proposed deep learning model. The structured deep learning model has achieved remarkable performance (average 93.2% accuracy) on a large-scale dataset, which demonstrates the strength of our method in providing an efficient tool for breast cancer multi-classification in clinical settings.
Article
In recent years, vector-based machine learning algorithms, such as random forests, support vector machines, and 1-D convolutional neural networks, have shown promising results in hyperspectral image classification. Such methodologies, nevertheless, can lead to information loss in representing hyperspectral pixels, which intrinsically have a sequence-based data structure. A recurrent neural network (RNN), an important branch of the deep learning family, is mainly designed to handle sequential data. Can a sequence-based RNN be an effective method of hyperspectral image classification? In this paper, we propose a novel RNN model that can effectively analyze hyperspectral pixels as sequential data and then determine information categories via network reasoning. As far as we know, this is the first time that an RNN framework has been proposed for hyperspectral image classification. Specifically, our RNN makes use of a newly proposed activation function, parametric rectified tanh (PRetanh), for hyperspectral sequential data analysis instead of the popular tanh or rectified linear unit. The proposed activation function makes it possible to use fairly high learning rates without the risk of divergence during the training procedure. Moreover, a modified gated recurrent unit, which uses PRetanh for hidden representation, is adopted to construct the recurrent layer in our network to efficiently process hyperspectral data and reduce the total number of parameters. Experimental results on three airborne hyperspectral images suggest competitive performance of the proposed model. In addition, the proposed network architecture opens a new window for future research, showcasing the huge potential of deep recurrent networks for hyperspectral data analysis.