Lung Nodule Detection using 3D Convolutional Neural
Networks Trained on Weakly Labeled Data
Rushil Anirudh1, Jayaraman J. Thiagarajan2, Timo Bremer2, and Hyojin Kim2
1School of Electrical, Computer and Energy Engineering, Arizona State University
2Center for Applied Scientific Computing, Lawrence Livermore National Laboratory
(This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Corresponding author email: ranirudh@asu.edu)
ABSTRACT
Early detection of lung nodules is currently one of the most effective ways to predict and treat lung cancer. As a result, the past decade has seen considerable focus on computer-aided diagnosis (CAD) of lung nodules, whose goal is to efficiently detect and segment lung nodules and classify them as benign or malignant. Effective detection of such nodules remains a challenge due to their variability in shape, size, and texture. In this paper, we propose to employ 3D convolutional neural networks (CNNs) to learn highly discriminative features for nodule detection in lieu of hand-engineered ones such as geometric shape or texture. While 3D CNNs are promising tools for modeling the spatio-temporal statistics of data, they are limited by their need for detailed 3D labels, which can be prohibitively expensive to obtain compared to 2D labels. Existing CAD methods rely on detailed nodule labels for model training, which is unrealistic and time consuming. To alleviate this challenge, we propose a solution wherein the expert needs to provide only a point label, i.e., the central pixel of the nodule, and its largest expected size. We use unsupervised segmentation to grow out a 3D region, which is then used to train the CNN. Through experiments on the SPIE-LUNGx dataset, we show that a network trained with these weak labels can produce reasonably low false positive rates with high sensitivity, even in the absence of accurate 3D labels.
1. INTRODUCTION
The last decade has seen significant advances in using machine learning for computer aided diagnosis (CAD),
which can significantly improve efficiency and reduce costs. The continued success of CAD tools can be attributed to the development of feature representations that work well under many different conditions, with
invariances to properties such as brightness, shape, size and geometric transformations. More recently, advances
in representation learning (e.g. deep neural networks) have enabled inference of features from training data
in lieu of hand-tuned feature design by an expert [1]. These have resulted in significant boosts in accuracy for tasks such as image recognition, natural language understanding, and speech recognition [1]. However, there are significant hurdles to overcome before such successes can be transferred to the medical imaging community. A major limiting factor is the difficulty of obtaining annotated data, which is significantly more expensive than in traditional computer vision. In this paper, we consider the problem of detecting early-stage lung nodules based on learned representations. This is a crucial problem in medical diagnosis: it is estimated that more people died of lung and bronchus cancer than of all other cancers combined in 2015 [2]. Classical approaches typically segment the lung, extract features from the training data, and train a classifier to detect potential nodules [3]. However, lung nodule detection is inherently more challenging due to the high variability of nodule shape,
size, and texture. As a result, nodule detection techniques that employ classifiers learned using hand-engineered
features often generalize poorly to novel test data. More recent approaches that employ deep neural networks in their pipeline have achieved state-of-the-art detection performance. For example, Kumar et al. [4] use an autoencoder (an unsupervised learning network) to extract features from annotated nodules; these features are then used to classify nodules as malignant or benign. Van Ginneken et al. [5] have shown promising results using an off-the-shelf convolutional neural network (CNN) pre-trained for an image recognition task; they use the network to obtain features, which are then used for classification. Two-dimensional CNNs have been used in other CAD tasks such as pancreas segmentation, lymph node detection, and colonic polyp detection [6]. Most of these methods are trained on individual 2D images with 2D convolutional filters, whereas the data at hand is inherently three dimensional. Roth et al. [7] have addressed this by considering a '2.5D' representation that takes slices of the image around a point of interest in three orthogonal views. These slices are combined
to be treated as a 3-channel image, which is used to train a deep network. In contrast, we propose to train a full 3D convolutional network that directly learns 3D convolutional filters from the data. Such filters are beneficial because they can capture the full range of variation expected in lung nodules. However, there are two crucial challenges in generalizing 2D convolutional networks to 3D. The foremost is the need for labeled training data, which can be prohibitively expensive to obtain for 3D images. In fact, most existing detection systems rely on detailed nodule segmentations provided by an expert for model training, which is particularly unrealistic in the case of convolutional neural networks, since they require a much larger training dataset to learn features with high representative power. In several computer vision applications, this problem is addressed by outsourcing labeling through services such as Amazon's MTurk; this cannot be readily adapted to medical image analysis, where experts are required to interpret the data. Consequently, the proposed detection system reduces the labeling burden on experts by working with "point labels", which are single-pixel locations indicating the approximate centers of nodules. By using unsupervised learning methods to estimate the true label from this weak information, we show that we can reduce the expert's labeling effort while still training 3D networks that discriminate effectively. The second challenge is the computational burden of 3D neural networks: 3D convolutions are expensive, particularly on full scans (typically 512×512×200 voxels), and building a single network that can handle nodules of varying sizes in different regions of the scan is hard. To circumvent this, we train our network on smaller 3D regions centered around the nodule instead of the whole image, and we simultaneously build two networks with different context sizes, 41×41×7 and 25×25×7. The final detection is obtained as the consensus of the two networks.
Contributions:
1. We present a modular system that leverages the robustness of 3D convolutional neural networks for the
problem of lung nodule detection. Our system learns the most discriminative features for nodule detection
instead of relying on hand-engineered features such as shape and texture. To the best of our knowledge,
we are the first to explore lung nodule detection using 3D convolutional filters.
2. Our system works with point labels, which specify a single voxel location that indicates the presence of
a nodule, and its largest cross-sectional area. This is far more time efficient than providing detailed annotations for every nodule in the training set, which is highly impractical since experts are needed to produce them. Using unsupervised learning methods, we estimate a full 3D label, which is used to train our 3D
CNN.
3. By learning two different networks with varying context sizes, our detection system achieves improved
generalization.
4. We demonstrate promising results on the AAPM-SPIE-LungX nodule classification dataset.
2. METHODOLOGY
In this section, we outline the components of our system, which makes predictions on 3D volumes of CT scans. First, we describe the label estimation procedure for the training data. Next, we use these estimated labels to train our 3D convolutional network.
2.1 Estimating weak labels
A limiting factor for using 3D CNNs is the cost of obtaining detailed 3D labels, which are significantly harder
to obtain than 2D labels. This is exacerbated by the fact that experts such as radiologists are needed to label
lung nodules, as opposed to crowd-sourcing platforms such as Amazon MTurk, which have become the norm in computer vision. Therefore, we begin with only a single voxel location, or point label, which indicates the
presence of a nodule. Such point labels are a natural way for experts to annotate lung scans efficiently.
[Figure 1: label-estimation pipeline: original slice → region of interest → basic thresholding → filtering → superpixels → final estimated label. Estimating the ground truth per slice from a point label given by the expert; these 3D labels are used to train the 3D CNN.]
We process the slices in 2D and combine them using 3D Gaussian filtering. First, we obtain 2D SLIC superpixels [8] to oversegment each slice, as shown in Figure 1. These superpixels identify contiguous regions in the image, which
are used to eliminate obvious regions that are not nodules based on size and intensity. The 3D Gaussian filtering
reduces noise and combines the 2D slices to form a coherent 3D nodule. We are able to do this accurately because
we are looking at a local neighborhood around the nodule. The size of the local neighborhood is determined
by the largest cross-sectional area of the nodule, as given by the expert. The superpixels are particularly helpful in capturing nodules that are otherwise hard to distinguish, such as those touching a lung wall.
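To make this pipeline concrete, the following Python sketch implements a version of the label-growing step. The function name, intensity and size thresholds, and SLIC parameters are illustrative assumptions, not values given in the paper, and the input volume is assumed to be intensity-normalized to [0, 1].

```python
# A minimal sketch of the weak-label estimation in Figure 1, assuming a CT
# volume normalized to [0, 1]. All thresholds and SLIC parameters below are
# illustrative assumptions; boundary handling is omitted for brevity.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.segmentation import slic

def estimate_weak_label(volume, point, half_size, n_segments=100):
    """Grow an approximate 3D nodule mask around an expert point label.

    volume    : 3D CT array indexed (z, y, x), normalized to [0, 1]
    point     : (z, y, x) voxel given by the expert
    half_size : half of the largest expected nodule extent, in voxels
    """
    z, y, x = point
    s = half_size
    roi = volume[z - 2:z + 3, y - s:y + s, x - s:x + s]  # local neighborhood

    mask = np.zeros_like(roi)
    for k in range(roi.shape[0]):
        sl = roi[k]
        # Oversegment the 2D slice into SLIC superpixels.
        segments = slic(sl, n_segments=n_segments, compactness=0.1,
                        channel_axis=None)
        for label in np.unique(segments):
            region = segments == label
            # Keep superpixels that are bright and small enough to be
            # nodule-like; both thresholds are illustrative.
            if sl[region].mean() > 0.5 and region.sum() < 0.5 * (2 * s) ** 2:
                mask[k][region] = 1.0

    # 3D Gaussian filtering fuses the per-slice masks into one coherent blob.
    return (gaussian_filter(mask, sigma=1.0) > 0.5).astype(np.uint8)
```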
2.2 Training
The 3D CNN is trained to predict whether a given voxel is likely to be part of a nodule, based on the spatio-temporal statistics around it. For example, if a nodule is located at V(x, y, z), where V is the entire CT volume, we choose the input volume to be v̂ = V(x−w : x+w, y−w : y+w, z−h : z+h), where w is the window size in the X, Y planes and h in the Z plane. We used values in the range w = 10–25 and h = 3, 5. The volume is thinner in the Z direction because CT scans are typically sampled much more densely in the X, Y planes than in Z. There are at most two nodules per scan, but training a 3D CNN requires many examples to learn the filters effectively. Therefore, to inflate our training set, we treat different voxels within the same nodule as different positive examples. A typical nodule ranges from 3–28 pixels wide at its largest cross section and typically spans 3–7 slices. We center our volume at several randomly sampled voxels within the nodule and use the resulting volume, for a given w and h, as a positive training example. Inflating the training set in this way has proved useful for training networks that are robust and avoid overfitting [7, 9]. The negative set is much harder to obtain than the positive set because it is hard to define: the negative class contains every example from the lung that is not a nodule. Ideally, we would choose as negative samples exactly those regions that our network is likely to mistake for nodules. Therefore, we restrict the negative space to lie within the lung, since it is highly unlikely for a nodule to be found outside it, and we randomly sample locations whose intensity is above a threshold (400–500 on the Hounsfield scale). These sampling methods yielded about 15K positive samples and around 20K negative samples from the AAPM-SPIE-LungX dataset [10, 11].
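A sketch of this sampling scheme is given below, under illustrative assumptions: the helper names are hypothetical, the boolean nodule and lung masks are assumed precomputed, and patch centers are assumed far enough from the volume boundary that every crop has full size.

```python
# Illustrative sketch of training-set inflation: several randomly shifted
# positive patches per nodule, plus negative patches sampled inside the lung
# above an intensity threshold. Masks are assumed boolean and precomputed.
import numpy as np

rng = np.random.default_rng(0)

def crop(volume, center, w, h):
    """Extract the patch V(x-w:x+w, y-w:y+w, z-h:z+h) around a center voxel."""
    x, y, z = center
    return volume[z - h:z + h, y - w:y + w, x - w:x + w]

def sample_patches(volume, nodule_mask, lung_mask, w=20, h=3,
                   n_pos=50, n_neg=50, hu_threshold=400):
    pos, neg = [], []
    # Positives: patches centered at random voxels inside the nodule mask.
    zz, yy, xx = np.nonzero(nodule_mask)
    for i in rng.choice(len(zz), size=n_pos):
        pos.append(crop(volume, (xx[i], yy[i], zz[i]), w, h))
    # Negatives: random lung voxels above the intensity threshold, i.e. the
    # locations most likely to be confused with nodules.
    zz, yy, xx = np.nonzero(lung_mask & (volume > hu_threshold))
    for i in rng.choice(len(zz), size=n_neg):
        neg.append(crop(volume, (xx[i], yy[i], zz[i]), w, h))
    return np.stack(pos), np.stack(neg)
```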
2.3 Architecture of the Convolutional Neural Net
We trained a 3D CNN using the MatConvNet toolbox for MATLAB [12]. The toolbox allows us to specify the types of layers and the number of filters needed. The network was designed to be similar to popular models for image recognition [9]. As shown in Figure 2, our network contains 5 convolutional layers followed by Rectified Linear Unit (ReLU) activations [9], 2 max-pooling layers, and a final 2-way softmax layer for classification. We also use dropout [13] to regularize the learning problem. Of the five convolutional layers, two are fully connected (FC) with convolution kernels of size 1×1. The generalization from 2D convolutional networks to 3D is straightforward, in that the learned filters are three dimensional. Since we use a multiscale approach,
[Figure 2: overall design of the 3D convolutional neural network trained for lung nodule detection. CT volume (41×41×7) → conv 16×16×7×50 → max-pooling → conv 7×7×50×50 → max-pooling → conv 3×3×50×80 → FC layers (2×2×80×100) → 2-way softmax classifier → P(nodule).]
we train a separate network for each scale; the convolutional filter sizes for the larger 3D CNN are shown in Figure 2. For the smaller scale, we use the same architecture and modify the kernel sizes accordingly.
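The paper implements the network in MatConvNet; as an illustration, the following PyTorch sketch reconstructs the larger-scale network from the filter shapes listed in Figure 2. The exact placement of the pooling layers and the final kernel bookkeeping are not fully specified in the paper, so the shapes below are assumptions chosen so that the dimensions are consistent.

```python
# A runnable reconstruction of the network sketched in Figure 2: five
# convolutional layers (the last two acting as 1x1 "fully connected"
# convolutions), ReLU activations, two max-pooling layers, dropout, and a
# 2-way softmax. Pooling placement and kernel bookkeeping are assumptions.
import torch
import torch.nn as nn

class Nodule3DCNN(nn.Module):
    def __init__(self, dropout=0.5):
        super().__init__()
        self.features = nn.Sequential(
            # Input: (N, 1, 7, 41, 41); depth collapses after the first conv.
            nn.Conv3d(1, 50, kernel_size=(7, 16, 16)), nn.ReLU(),  # -> 50x1x26x26
            nn.MaxPool3d(kernel_size=(1, 2, 2)),                   # -> 50x1x13x13
            nn.Conv3d(50, 50, kernel_size=(1, 7, 7)), nn.ReLU(),   # -> 50x1x7x7
            nn.MaxPool3d(kernel_size=(1, 2, 2)),                   # -> 50x1x3x3
            nn.Conv3d(50, 80, kernel_size=(1, 3, 3)), nn.ReLU(),   # -> 80x1x1x1
            nn.Conv3d(80, 100, kernel_size=1), nn.ReLU(),          # FC layer
            nn.Dropout3d(dropout),
            nn.Conv3d(100, 2, kernel_size=1),                      # FC layer
        )

    def forward(self, x):
        logits = self.features(x).flatten(1)  # (N, 2)
        return torch.softmax(logits, dim=1)   # per-patch nodule probability

net = Nodule3DCNN()
print(net(torch.randn(4, 1, 7, 41, 41)).shape)  # torch.Size([4, 2])
```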
2.4 Testing and candidate generation
Our network is trained end to end, i.e., it makes predictions about the presence of a nodule directly from a CT volume of the appropriate size. However, a typical scan is 512×512×200 voxels, and searching the entire 3D volume to make predictions is highly impractical. Instead, we reduce the search space significantly by ruling out parts of the scan that are very unlikely to contain a lung nodule. Since nodules are expected to lie inside the lung, we perform lung segmentation using morphological operations on each 2D slice. Lung segmentation is itself a hard problem, and there are dedicated systems that perform effective segmentation in 2D and 3D. We also observed that most of the system's false positive detections were caused by airways, which are part of the lung but look very much like nodules when observed locally. A robust 3D segmentation could therefore significantly reduce the false positive rate and improve detection speed. Next, for each voxel we apply a dot enhancement filter based on the 3D Hessian. The resulting "dot score" is high if the region around the current voxel is spherical [6]. The dot score map is thresholded in each local neighborhood to provide the final list of candidates.
This method can be very effective when the nodules are expected to be approximately round in shape. The dot score is computed as |λ3|²/|λ1|, where λ1 and λ3 are the first and third eigenvalues of the 3D Hessian, ordered by magnitude. The dot score for a given volume essentially estimates its roundness, with a high score indicating a tendency towards roundness. We set a low threshold to eliminate obviously non-nodule-like elements and run a 3D Gaussian smoothing filter to remove small stray particles within the volume. These steps significantly reduce the number of false positives, resulting in around 80–200 3D nodule-like candidates per scan, which after smoothing can be identified efficiently using a 3D connected-component algorithm. Finally, we center a test volume at multiple locations inside each candidate and obtain a prediction from the deep network at each location. Running a smoothing filter over these predictions then acts as a vote that eliminates noisy predictions.
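A hedged sketch of this candidate-generation stage follows, computing the Hessian-based dot score, thresholding, smoothing, and extracting connected components with SciPy. Threshold values and function names are illustrative, and computing dense per-voxel eigenvalues this way is memory-hungry on a full 512×512×200 scan.

```python
# Illustrative candidate generation: Hessian dot score |l3|^2 / |l1|,
# thresholding, 3D Gaussian smoothing, and 3D connected components.
# All threshold values below are assumptions, not from the paper.
import numpy as np
from scipy import ndimage

def dot_score(volume, sigma=1.0):
    """Per-voxel roundness score from the 3D Hessian eigenvalues."""
    v = ndimage.gaussian_filter(volume.astype(np.float32), sigma)
    grads = np.gradient(v)                     # first derivatives along z, y, x
    H = np.empty(v.shape + (3, 3), dtype=np.float32)
    for i in range(3):
        for j in range(3):
            H[..., i, j] = np.gradient(grads[i], axis=j)
    eig = np.linalg.eigvalsh(H)                # per-voxel eigenvalues, ascending
    order = np.argsort(-np.abs(eig), axis=-1)  # reorder so |l1| >= |l2| >= |l3|
    eig = np.take_along_axis(eig, order, axis=-1)
    l1, l3 = eig[..., 0], eig[..., 2]
    return np.abs(l3) ** 2 / (np.abs(l1) + 1e-6)

def generate_candidates(volume, lung_mask, score_thresh=0.5, min_size=8):
    score = dot_score(volume) * lung_mask      # search inside the lung only
    blobs = ndimage.gaussian_filter(
        (score > score_thresh).astype(np.float32), 1.0) > 0.3  # drop stray specks
    labels, n = ndimage.label(blobs)           # 3D connected components
    centroids = ndimage.center_of_mass(blobs, labels, range(1, n + 1))
    sizes = ndimage.sum(blobs, labels, range(1, n + 1))
    return [c for c, s in zip(centroids, sizes) if s >= min_size]
```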
3. EXPERIMENTS
In this section we describe the dataset, experimental conditions, and results obtained for lung nodule detection.
SPIE-AAPM-LUNGx dataset: The dataset was published for nodule classification, which requires labeling each nodule as benign or malignant. We use it for detection; since it does not contain detailed nodule labels, it provides a realistic test case. Of the 70 scans, we used 20 for training and 47 for testing. Three scans were discarded because there was ambiguity regarding the presence of a nodule at the specified location. Each label is provided as an (x, y, z) location along with the largest cross-sectional area of the nodule. We did not use the cross-sectional area information; however, it could be used to estimate better labels.
3.1 Evaluation settings
For the test scans, we first generate ground-truth labels in the same fashion as described for the training data.
These estimated labels are used to evaluate the performance of our system on the dataset.
Multiscale CNN: Lung nodules vary significantly in size, typically from around 3 mm to 20 mm, and many successful detection systems employ a multiscale architecture. Since we are interested in 3D volumes, there are several ways to choose the scales; we chose two scales, 25×25×7 and 41×41×7, experimentally. We train the networks separately and combine the predictions from each CNN to obtain the final result. As expected, the combination performed much better in terms of sensitivity and accuracy. Finally, we generate free-response receiver operating characteristic (FROC) curves at various detection thresholds. At a particular threshold, we declare a match if there is a detected nodule within a small radius (typically 5–10 voxels) of the ground truth. This is done by first estimating the centroid of each 3D blob in the test prediction that exceeds the threshold. Next, we compute the distances from each centroid to the ground truth. Only the centroid that is closest and within the distance threshold is considered a true positive; the rest are considered false positives. For each threshold, the total number of false positives divided by the number of scans gives the average number of false positives per scan.
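The matching and counting logic described above can be summarized in a short sketch; the function name and the default distance threshold are illustrative assumptions.

```python
# Per-scan FROC bookkeeping at one detection threshold: a detection is a true
# positive only if it is the closest detection within a distance threshold of
# a ground-truth nodule; all unmatched detections count as false positives.
import numpy as np

def froc_point(pred_centroids, gt_centroids, dist_thresh=10.0):
    """Return (true positives, false positives) for one scan."""
    pred = [np.asarray(p, dtype=float) for p in pred_centroids]
    matched = [False] * len(pred)
    tp = 0
    for g in gt_centroids:
        d = [np.linalg.norm(p - np.asarray(g, dtype=float)) for p in pred]
        if d and min(d) <= dist_thresh:
            tp += 1
            matched[int(np.argmin(d))] = True  # only the closest one counts
    return tp, matched.count(False)

# Sensitivity = total TPs / total nodules; average FPs per scan = total FPs /
# number of scans, swept over detection thresholds to trace the FROC curve.
```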
Results: We compute the free-response receiver operating characteristic (FROC) for our system, which plots the sensitivity against the average number of false positives per scan. The results are shown in Figure 3a. As can be seen, even when trained with weak labels, the system achieves a sensitivity of 80% at 10 false positives per scan. Sample predictions are shown in Figure 3b.
[Figure 3: Detection performance on the SPIE-AAPM LUNGx dataset. (a) Free-response receiver operating characteristic curves (sensitivity vs. average false positives per scan) for the 41×41 scale, the 25×25 scale, and the 2-scale CNN; (b) sample results.]
4. CONCLUSION & FUTURE WORK
We have presented a system for lung nodule detection that works with 3D convolutional networks trained using weak label information. While the initial results look promising, there are several ways to further improve the system. Our current system processes superpixels in 2D; this could be improved with a 3D superpixel method that clusters coherent spatio-temporal regions. A 3D lung segmentation approach could also eliminate the air tracts, which are a primary cause of false positives but cannot be differentiated when observed in 2D. Finally, the training set could be inflated even further using 3D transforms of existing labels, as has been done for 2D CNNs; such a technique should also ensure that there is little overlap between the original and transformed labels to avoid overfitting.
REFERENCES
[1] LeCun, Y., Bengio, Y., and Hinton, G., “Deep learning,” Nature 521(7553), 436–444 (2015).
[2] http://www.lung.org/lung-disease/lung-cancer/lung-cancer-screening-guidelines/lung-cancer-screening-for-patients.pdf. [Accessed 13-Aug-2015].
[3] Dhara, A. K., Mukhopadhyay, S., and Khandelwal, N., "Computer-aided detection and analysis of pulmonary nodule from CT images: A survey," IETE Technical Review 29(4), 265–275 (2012).
[4] Kumar, D., Wong, A., and Clausi, D. A., "Lung nodule classification using deep features in CT images," in
[Computer and Robot Vision (CRV), 2015 12th Conference on], 133–138, IEEE (2015).
[5] van Ginneken, B., Setio, A. A., Jacobs, C., and Ciompi, F., “Off-the-shelf convolutional neural network
features for pulmonary nodule detection in computed tomography scans,” in [Biomedical Imaging (ISBI),
2015 IEEE 12th International Symposium on], 286–289, IEEE (2015).
[6] Choi, W.-J. and Choi, T.-S., "Automated pulmonary nodule detection based on three-dimensional shape-based feature descriptor," Computer Methods and Programs in Biomedicine 113(1), 37–54 (2014).
[7] Roth, H. R., Lu, L., Seff, A., Cherry, K. M., Hoffman, J., Wang, S., Liu, J., Turkbey, E., and Summers,
R. M., "A new 2.5D representation for lymph node detection using random sets of deep convolutional neural
network observations,” in [Medical Image Computing and Computer-Assisted Intervention–MICCAI 2014],
520–527, Springer International Publishing (2014).
[8] Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., and Süsstrunk, S., "SLIC superpixels compared to state-of-the-art superpixel methods," IEEE Transactions on Pattern Analysis and Machine Intelligence 34(11), 2274–2282 (2012).
[9] Krizhevsky, A., Sutskever, I., and Hinton, G. E., "ImageNet classification with deep convolutional neural
networks,” in [Advances in neural information processing systems], 1097–1105 (2012).
[10] "SPIE-AAPM-NCI Lung Nodule Classification Challenge Dataset." https://wiki.cancerimagingarchive.net/display/DOI/SPIE-AAPM-NCI+Lung+Nodule+Classification+Challenge+Dataset. [Accessed 13-Aug-2015].
[11] Armato III, S. G., Hadjiiski, L., Tourassi, G. D., Drukker, K., Giger, M. L., Li, F., Redmond, G., Farahani, K., Kirby, J. S., and Clarke, L. P., "Guest editorial: LUNGx Challenge for computerized lung nodule classification: reflections and lessons learned," Journal of Medical Imaging 2(2), 020103 (2015).
[12] Vedaldi, A. and Lenc, K., "MatConvNet: Convolutional neural networks for MATLAB," arXiv preprint arXiv:1412.4564 (2014).
[13] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R., “Dropout: A simple way
to prevent neural networks from overfitting,” The Journal of Machine Learning Research 15(1), 1929–1958
(2014).