Lung Image Segmentation Using Deep Learning Methods and
Convolutional Neural Networks
Alexander Kalinovsky, Vassili Kovalev
United Institute of Informatics Problems, Belarus National Academy of Sciences
Surganova St., 6, 220012 Minsk, Belarus
gakarak@gmail.com, vassili.kovalev@gmail.com, http://imlab.grid.by/
Abstract: This paper presents results of the first,
exploratory stage of research and development on
segmentation of lungs in chest X-Ray images (chest
radiographs) using Deep Learning methods and
Encoder-Decoder Convolutional Neural Networks (ED-CNN).
Computational experiments were conducted on an Nvidia
TITAN X GPU equipped with 3072 CUDA cores and 12 GB of
GDDR5 memory. Comparison of the automatic segmentation
with manual segmentation using Dice’s score revealed an
average accuracy of 0.962, with minimum and maximum
Dice’s score values of 0.926 and 0.974 respectively, and
a standard deviation of 0.008. The study was performed
in the context of large-scale screening of the
population for lung and heart diseases as well as the
development of computational services for an
international portal on lung tuberculosis. The results
obtained suggest that ED-CNN networks may be considered
a promising tool for automatic lung segmentation in
large-scale projects.
Keywords: Image segmentation, Deep Learning,
Convolutional Neural Networks, Lung.
1. INTRODUCTION
The image segmentation problem. Medical image
segmentation is known to be one of the most complicated
problems in the image processing and image analysis
field [1]. Typically, segmentation of target image
objects comes before other image analysis stages, and
therefore any errors in detecting object borders
severely affect all subsequent steps. This paper deals
with chest X-Ray images, which are also known as chest
radiographs.
Although the problem of segmenting the lung component
in chest X-Ray images has been addressed in several
studies (see, for example, [2, 3]), the results of
fully automatic extraction of the lung region remain
unsatisfactory on many occasions. This is especially
true for lungs affected by various pathological
processes and/or severe age-related changes. The
problem of automatic and accurate segmentation becomes
even harder in the scenario of massive screening of the
population [4], where it moves into the Big Data
domain [5].
Deep Learning and Convolutional Neural Networks.
Recently, there has been an explosion of interest in
the Deep Learning methodology. This methodology is
commonly understood as a branch of machine learning
that capitalizes on algorithms which attempt to model
high-level abstractions in data using multiple
processing layers. The corresponding computational
architecture and multiple processing/abstraction layers
are typically represented using Convolutional Neural
Networks (CNN). The great interest in deep neural
networks in general and CNN in particular can be partly
explained by the fact that since 2009 they have won
many official international pattern recognition
competitions, achieving the first superhuman visual
pattern recognition results in limited domains (see [6]
for an up-to-date review of the field).
Encoder-Decoder Convolutional Neural Networks.
The first approaches employing deep learning methods
for image segmentation were similar to those already
examined in earlier image processing and pattern
recognition works. They tried to directly adopt deep
learning architectures for categorizing small image
patches or pixel neighborhoods into certain classes
[7]. More recently, Vijay Badrinarayanan and colleagues
from the University of Cambridge have presented a novel
and practical deep fully convolutional neural network
architecture for semantic pixel-wise segmentation
termed SegNet [8, 9]. This core trainable segmentation
engine consists of an encoder network and a
corresponding decoder network followed by a pixel-wise
classification layer. The role of the decoder network
is to map the low-resolution encoder feature maps to
full input resolution feature maps for pixel-wise
classification. The resultant Encoder-Decoder
Convolutional Neural Network (ED-CNN) can be viewed as
a next step in the generalization of neural networks.
The purpose of this study was to examine the ability
of deep learning methods and ED-CNN neural networks to
segment the lung component in chest X-Ray images. From
the application point of view, this study was performed
in the context of large-scale screening of the
population for lung and heart diseases, which has
resulted in X-Ray image databases containing up to
millions of items [10], as well as the development of
computational services [11] for an international portal
on Lung Tuberculosis hosted by Amazon [12]. Since it is
impractical to assess the efficiency of ED-CNN networks
immediately on the whole image database, this study was
subdivided into the following three consecutive stages:
(1) An exploratory trial based on a small set
containing a few hundred manually segmented chest
images, which are used for both training and testing.
Drawing conclusions regarding the potential utility of
ED-CNN networks.
(2) Modification of ED-CNN networks and extensive
testing on separate training and test sets containing
thousands of cases each. Adaptation of the network
architecture for lung segmentation in 3D Computed
Tomography (CT) images and testing.
(3) Porting the resultant software solutions to a
powerful workstation equipped with modern GPUs and
incorporation into the target environment.
Kalinovsky A. and Kovalev V. Lung image segmentation using Deep Learning methods and convolutional neural networks. In:
XIII Int. Conf. on Pattern Recognition and Information Processing, 3-5 October, Minsk, Belarus State University, 2016, pp. 21-24
Thus, this paper is dedicated to the first, exploratory
stage of a series of prospective research and
development efforts on lung segmentation in
radiological images using deep learning methods and
recent neural network approaches.
2. MATERIALS
The image set consisted of 354 chest X-Ray images,
each accompanied by lung masks obtained by manual
segmentation. These images originated from two
different sources:
107 images from the tuberculosis portal [12] (image
Source 1),
247 images from the open Japanese JSRT Database [13]
(image Source 2).
Original images from both sources and the corresponding
masks of the lung component are illustrated in Fig. 1.
Fig.1 Example of original chest X-Ray images from two
image sources and their lung masks obtained manually.
It was hoped that the use of an inhomogeneous dataset
containing chest images acquired in different countries
with different scanners would help to obtain more
objective and conclusive testing results.
3. METHODS
The basic elements of the SegNet neural network
architecture (Fig. 2) can be viewed as a stack of
convolution layers (Encoder) with their corresponding
deconvolution layers (Decoder). The network
architecture used in this work had 4 encoding and 4
decoding layers. Every encoder layer reduces the input
feature map size by a factor of 2. Therefore, the
combined sub-sampling rate was equal to 2^4 = 16. It is
commonly known that large scaling factors can
potentially improve the desired properties of
displacement, rotation and scale invariance of the
convolution network in the spatial domain. Also, in the
case of chest X-Ray image segmentation, the original
input images are already partly aligned due to the
natural top-bottom orientation of the patient’s body
within the scanner. Consequently, the lung area is
typically located near the image center and the top
part of the lung is situated in the upper half of the
image. Thus, a relatively large scale factor such as
16-fold provides good spatial tolerance for the problem
at hand.
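The arithmetic behind the 16-fold figure can be checked with a short sketch. The 512-pixel input size and the function name below are illustrative assumptions only; the paper does not state the resolution of the images fed to the network.

```python
def encoder_feature_sizes(input_size, num_stages=4, pool=2):
    """Spatial size of the feature map after each non-overlapping pooling stage."""
    sizes = [input_size]
    for _ in range(num_stages):
        # Each 2x2 pooling stage halves the side length of the feature map.
        sizes.append(sizes[-1] // pool)
    return sizes


sizes = encoder_feature_sizes(512)   # 512 is a hypothetical input side length
factor = sizes[0] // sizes[-1]       # combined sub-sampling rate
print(sizes, factor)
```

With four stages of factor 2, the side length of the deepest feature map is the input side length divided by 16.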
Fig.2 Architecture of Deep Encoder-Decoder
Convolutional Neural Networks
In this work we used ReLU as the nonlinear activation
function [13]. MaxPooling sub-sampling was used at the
encoding stage and MaxPooling up-sampling (unpooling)
was used at the decoding stage. At every stage, the
window size was set to a small patch of 2x2 pixels,
without overlapping.
It is known that the problem of unpooling in decoder
layers is not uniquely defined. In order to solve this
problem in SegNet, the upsampling of the feature map in
a decoder layer was implemented using the max-pool
indices from the corresponding encoder layer (see
Fig. 3). Every convolution and deconvolution layer
maintains a fixed number of filters (Fig. 3), which was
set to 64.
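The index-based unpooling described above can be illustrated with a minimal NumPy sketch for a single 2D feature map. This is not the Caffe implementation used in the study; the function names are hypothetical and the code only shows the idea of remembering where each maximum came from and writing it back to the same position.

```python
import numpy as np


def max_pool_with_indices(x, k=2):
    """Non-overlapping k x k max pooling; also returns the argmax inside each window."""
    h, w = x.shape
    assert h % k == 0 and w % k == 0, "map size must be divisible by the window size"
    # Rearrange so that the last axis lists the k*k values of one window.
    windows = (x.reshape(h // k, k, w // k, k)
                .transpose(0, 2, 1, 3)
                .reshape(h // k, w // k, k * k))
    idx = windows.argmax(axis=-1)
    pooled = np.take_along_axis(windows, idx[..., None], axis=-1)[..., 0]
    return pooled, idx


def max_unpool(pooled, idx, k=2):
    """Place each pooled value back at its recorded position; all other cells stay zero."""
    ph, pw = pooled.shape
    windows = np.zeros((ph, pw, k * k), dtype=pooled.dtype)
    np.put_along_axis(windows, idx[..., None], pooled[..., None], axis=-1)
    # Invert the rearrangement done in max_pool_with_indices.
    return (windows.reshape(ph, pw, k, k)
                   .transpose(0, 2, 1, 3)
                   .reshape(ph * k, pw * k))


x = np.array([[1., 2., 0., 0.],
              [3., 4., 0., 5.],
              [9., 0., 1., 1.],
              [0., 0., 1., 7.]])
pooled, idx = max_pool_with_indices(x)
restored = max_unpool(pooled, idx)
print(pooled)
print(restored)
```

The restored map is sparse: only the positions of the original maxima are non-zero, which is why the decoder follows unpooling with trainable convolution filters that densify the map.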
On the final layer of the ED-CNN neural network we used
the SoftMax function of the following form:
$$y_i = \frac{\exp(x_i)}{\sum_k \exp(x_k)}.$$
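As an illustration, the SoftMax above can be applied independently at every pixel of a stack of class score maps. The sketch below assumes a (classes, height, width) array layout and a hypothetical function name; subtracting the per-pixel maximum is a standard numerical-stability step that leaves the result unchanged.

```python
import numpy as np


def pixelwise_softmax(scores):
    """Apply SoftMax over the class axis (axis 0) of a (classes, H, W) score array."""
    # Subtracting the maximum avoids overflow in exp() without changing the ratio.
    shifted = scores - scores.max(axis=0, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=0, keepdims=True)


# Two classes (background, lung) on a tiny 2x2 image of hypothetical scores.
scores = np.array([[[2.0, 0.0], [0.0, -1.0]],
                   [[0.0, 0.0], [3.0, 1.0]]])
probs = pixelwise_softmax(scores)
labels = probs.argmax(axis=0)   # per-pixel class decision
print(labels)
```

Taking the class with the highest probability at each pixel gives the binary lung mask.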
At the classification stage, the following two
techniques were used to reduce the influence of X-Ray
intensity variations in the original images on the
neural network:
(a) At the preprocessing stage, the intensity of each
input image was transformed using the histogram
equalization technique [14].
(b) The Local Contrast Normalization (LCN)
procedure [15] was applied at the input of the encoding
layers.
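Step (a) can be sketched as follows. The code implements plain global histogram equalization for 8-bit images, which is an assumption made here for illustration; the paper cites [14], which covers generalizations of the technique, and does not specify the exact variant or its parameters.

```python
import numpy as np


def equalize_histogram(image):
    """Global histogram equalization of an 8-bit grayscale image (uint8 array)."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest occupied gray level
    total = image.size
    if total == cdf_min:               # constant image: nothing to equalize
        return image.copy()
    # Map each gray level through the normalized cumulative histogram.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


image = np.array([[50, 50], [100, 150]], dtype=np.uint8)  # hypothetical tiny image
out = equalize_histogram(image)
print(out)
```

After equalization the darkest occupied level maps to 0 and the brightest to 255, so images from different scanners occupy a comparable intensity range.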
At the experimentation step, the ED-CNN neural network
was trained on an Nvidia TITAN X graphics processor
equipped with 3072 CUDA cores and 12 GB of GDDR5
memory. The network training parameters were set to:
Batch size: 6 (the minimum batch size to place
network into GPU memory),
Type of Solver: SGD Caffe solver,
Number of iterations: 5000,
Number of epochs: 85.
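The relation between the listed batch size, iteration count and epoch count can be checked with a few lines of arithmetic. The sketch assumes that all 354 images were used in each training epoch, which the text suggests for this exploratory stage but does not state explicitly; the function name is hypothetical.

```python
import math


def iterations_for(num_images, batch_size, num_epochs):
    """Number of solver iterations needed to pass over the data num_epochs times."""
    iters_per_epoch = math.ceil(num_images / batch_size)
    return iters_per_epoch * num_epochs


total = iterations_for(num_images=354, batch_size=6, num_epochs=85)
print(total)
```

Under this assumption one epoch takes 59 iterations, so 85 epochs correspond to 5015 iterations, in line with the rounded figure of 5000.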
Fig.3 Interaction scheme between the encoding and
decoding neural network layers.
The network training required 11 gigabytes of GPU
memory, while the full training time was approximately
3 hours. The resultant automatic segmentation accuracy
was assessed by comparison with the results of manual
segmentation using the well-known Dice’s score, which
is calculated as:
$$D_{SCORE} = \frac{2\,|T \cap S|}{|T| + |S|},$$
where T is the “true” lung area resulting from manual
segmentation, which was treated here as ground truth,
and S is the lung area obtained by automatic
segmentation using the ED-CNN neural network. In all
cases, the lung area was measured as the number of
image pixels constituting the lung image component.
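A minimal sketch of Dice's score in its standard form, 2|T ∩ S| / (|T| + |S|), computed on binary masks with areas counted in pixels. The function name and the convention for two empty masks are assumptions made here for illustration.

```python
import numpy as np


def dice_score(truth, seg):
    """Dice's score between two binary masks; areas are counted in pixels."""
    truth = np.asarray(truth, dtype=bool)
    seg = np.asarray(seg, dtype=bool)
    denom = truth.sum() + seg.sum()
    if denom == 0:
        return 1.0   # assumed convention: two empty masks agree perfectly
    overlap = np.logical_and(truth, seg).sum()
    return 2.0 * overlap / denom


T = np.array([[1, 1], [0, 0]])   # hypothetical manual mask
S = np.array([[1, 0], [0, 0]])   # hypothetical automatic mask
score = dice_score(T, S)
print(score)
```

The score is 1 for identical masks and 0 for masks that do not overlap at all.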
4. RESULTS
At the testing stage, the average accuracy was
estimated as 0.962, with minimum and maximum Dice’s
score values of 0.926 and 0.974 respectively, and a
standard deviation of 0.008.
Typical examples of automatic segmentation results
obtained using the ED-CNN neural network with the best
and worst scores are shown in Fig. 4 and Fig. 5
respectively.
5. CONCLUSION
The results reported in this study allow drawing the
following conclusions.
(1) Encoder-Decoder Deep Convolutional Neural Networks
may be considered a promising tool for automatic lung
segmentation in chest X-Ray images in large-scale
projects. The segmentation accuracy obtained was
comparable with the accuracy provided by specialized
segmentation methods based on the known “segmentation
by registration” technique [4]. This technique was
implemented by the authors earlier and made public on a
dedicated web site [11].
(2) The main advantage of the method considered in
this work is that the Deep Learning approach followed
here is uniform enough to be applied to a wide range of
different medical image segmentation tasks with minimum
modifications. Furthermore, the method can be
generalized to segmentation of 3D tomography images and
to other medical image analysis problems such as
detection of “atypical” image regions, which are often
associated with lesions and other kinds of
abnormalities.
Fig.4 Example of segmentation results with the maximum
(best) Dice’s score.
Fig.5 Example of segmentation results with the minimum
(worst) Dice’s score.
Acknowledgements. This work was partly funded by
the National Institute of Allergy and Infectious Diseases,
National Institutes of Health, U.S. Department of Health
and Human Services, USA through the CRDF project
OISE-15-61772-1.
6. REFERENCES
[1] Handbook of Medical Image Processing and Analysis,
2nd Edition, I.H.Bankman (Ed.), Academic Press,
ISBN 978-0-12-373904-9, San Diego, USA, 2009,
985 p.
[2] S. Candemir et al. Lung Segmentation in Chest
Radiographs Using Anatomical Atlases With Nonrigid
Registration, IEEE Transactions on Medical Imaging,
33 (2), 2014, p. 577-590.
[3] A. Prus, V. Kovalev, P. Vankevich. A method for lung
segmentation in massive X-ray screening of the
population. International Journal of Computer
Assisted Radiology and Surgery, 3(1), 2009, p. 367-
368.
[4] S. Jaeger et al. Automatic tuberculosis screening using
chest radiographs. IEEE Transactions on Medical
Imaging, 33 (2), 2014, p. 233-245.
[5] V.A. Kovalev, A.A. Kalinovsky. Big Medical Data:
Image mining, retrieval and analytics, Proceedings of
the International Conference on Big Data and
Predictive Analytics, Belarus State University of
Informatics and Radioelectronics, ISBN 978-985-543-
146-7, Minsk, Belarus, June 2015, pp. 33-46.
[6] J. Schmidhuber. Deep Learning in Neural Networks:
An Overview, Neural Networks, vol. 61, 2015, pp.
85-117.
[7] C.Farabet, C.Couprie, L.Najman, and Y.LeCun,
Learning hierarchical features for scene labeling,
IEEE PAMI, vol. 35 (8), 2013, pp. 1915-1929.
[8] V.Badrinarayanan, A.Kendall, R.Cipolla. SegNet: A
Deep Convolutional Encoder-Decoder Architecture
for Robust Semantic Pixel-Wise Labelling, arXiv
preprint, 2015, arXiv:1505.07293.
[9] V.Badrinarayanan, A.Kendall, R.Cipolla. SegNet: A
Deep Convolutional Encoder-Decoder Architecture
for Image Segmentation, arXiv preprint, 2015,
arXiv:1511.00561.
[10] V.A.Kovalev, V.A.Lapizky, A.A.Dmitruk,
A.A.Kalinovsky. Big Data in Medicine: A database of
chest radiographs for diagnosis, treatment, and
scientific research goals, Proceedings of the
International Conference on Big Data and Predictive
Analytics, Belarus State University of Informatics and
Radioelectronics, ISBN 978-985-543-146-7, Minsk,
June 2015, pp. 66-71 (in Russian).
[11] http://imlab.grid.by/ Last visited 11.04.2016.
[12] http://tuberculosis.by/ Last visited 11.04.2016. Under
construction.
[13] X.Glorot, A.Bordes, Y.Bengio. Deep sparse rectifier
neural networks. Proceedings of the 14th
International Conference on Artificial Intelligence
and Statistics (AISTATS), 2011, Fort Lauderdale, FL,
USA, vol 15 of JMLR, 2011, pp. 315-323.
[14] J.A. Stark. Adaptive image contrast enhancement
using generalizations of histogram equalization, IEEE
Transactions on Image Processing, vol. 9 (5), 2000,
pp. 889-896.
[15] S.Lyu, E.P. Simoncelli. Nonlinear image
representation using divisive normalization. Proceedings of the
IEEE Computer Society Conference on Computer
Vision and Pattern Recognition (CVPR-2008),
Anchorage, Alaska, USA, 23-28 June 2008, pp. 1-8.
[16] http://imlab.grid.by/appsegmxr/ Last visited
11.04.2016.
... Peruasan bidang paru-paru pada CXR telah mendapat perhatian yang besar dalam kajian perpustakaan sehingga kini (Mittal et al. 2017). Kebanyakannya menggunakan pangkalan data awam JSRT sebagai imej mereka (Candemir et al. 2014;Hassen & Ben 2011;Kalinovsky & Kovalev 2016;Li et al. 2016;Novikov et al. 2018;Shao et al. 2014;van Ginneken et al. 2006;Wan Ahmad et al. 2015;Wu et al. 2015;Zhang et al. 2014), iaitu CXR yang disediakan secara terbuka oleh Persatuan Teknologi Radiologi Jepun (JSRT), berserta data kesahihan klinikal (Shiraishi et al. 2000). Set data ini hanya terdiri daripada CXR posterior-anterior (PA), yang diambil oleh mesin radiograf pegun. ...
... Kaedah cadangan juga dapat dibandingkan dengan baik dengan kaedah berasaskan model yang diselia dalam (Li et al. 2016;Shao et al. 2014;Wu et al. 2015;Zhang et al. 2014) dengan perbezaan skor-bertindih dari 0.033 hingga 0.077. Perbandingan dengan kaedah rangkaian neural terbaru (Kalinovsky & Kovalev 2016;Novikov et al. 2018) pula ialah masing-masing 0.092 dan 0.08, namun lebih daripada 50% imej JSRT digunakan daalam proses latihan. Di samping itu, kaedah cadangan adalah tanpa penyeliaan dan sepenuhnya automatik di mana tiada latihan atau peringkat pembelajaran diperlukan. ...
... Shao et al. 2014;Wu et al. 2015;Zhang et al. 2014) dengan perbezaan skor-bertindih dari 0.033 hingga 0.077. Perbandingan dengan kaedah rangkaian neural terbaru (Kalinovsky & Kovalev 2016;Novikov et al. 2018) pula ialah masingmasing 0.092 dan 0.08, namun lebih daripada 50% imej JSRT digunakan daalam proses latihan. Di samping itu, kaedah cadangan adalah tanpa penyeliaan dan sepenuhnya automatik di mana tiada latihan atau peringkat pembelajaran diperlukan. ...
Article
Unsupervised lung segmentation method is one of the mandatory processes in order to develop a Content Based Medical Image Retrieval System (CBMIRS) of CXR. There is limited study found on segmentation of mobile chest radiographs, that is relatively important especially for very sick patients whenever their radiographs will be taken using portable X-Ray machine.The purpose of the study is to present a solution for lung segmentation of standard and mobile chest radiographs using fully automated unsupervised method, based on oriented Gaussian derivatives filter with seven orientations, combined with Fuzzy C-Means clustering and thresholding to refine the lung region. A new algorithm to automatically generate a threshold value for each Gaussian response is also proposed. The algorithms are applied to both PA and AP chest radiographs from both public JSRT and private datasets from collaborative hospital. Two pre-processing blocks are introduced to standardize the images from different machines. Comparisons with the previous works found in the literature on JSRT dataset shows that our method gives a reasonably good result. Performance measures (accuracy, F-score, precision, sensitivity and specificity) for the segmentation of lung in public JSRT dataset are above 0.90 except for the overlap measure is 0.87. The median of overlap score for the private image database is 0.83 (standard machine) and 0.75 (mobile machines). The algorithm is also fast, with the average execution time of 12.5s. Our proposed method is fully automated, unsupervised, with no training or learning stage is necessary to segment the lungs taken using both a standard and mobile machines, and useful for the application of the CBMIRS.
... Besides, the MRI can present several problems, like intensity inhomogeneity [8], or varying intensities between the same sequences [9]. Due to the high performance provided by deep learning (DL) in various areas, medical researchers have also exploited this technique in different axes such as the brain [7,10,11], the lung [12], the pancreas [13,14], the prostate [15], and multi-organ [16,17]. The schemes based on DL have provided superior performance compared to traditional segmentation methods. ...
... The manual analysis of these MRIs is fastidious and time-consuming. It is very important to develop an accurate and fully automatic In this paper, we proposed a fully automatic CNN due to the high performance provided in various areas [7][8][9][10][11][12][13][14][15][16][17]. The schemes based on DL have provided superior performance compared to traditional segmentation methods. ...
Article
Full-text available
The quantitative analysis of brain magnetic resonance imaging (MRI) represents a tiring routine and enormously on accurate segmentation of some brain regions. Gliomas represent the most common and aggressive brain tumors. In their highest grade, it can lead to a very short life. The treatment planning is decided after the analysis of MRI data to assess tumors. This treatment is manually performed which needs time and represents a tedious task. Automatic and accurate segmentation technique becomes a challenging problem since these tumors can take a variety of sizes, contrast, and shape. For these reasons, we are motivated to suggest a new segmentation approach using deep learning. A new segmentation scheme is suggested using Convolutional Neural Networks (CNN). The presented scheme is tested using recent datasets (BraTS 2017, 2018, and 2020). It achieves good performances compared to new methods, with Dice scores of 0.86 for the Whole Tumor, 0.82 for Tumor Core, and 0.6 for Enhancing Tumor based on the first dataset. According to the second dataset, the three regions had an average of 0.88, 0.77, and 0.65, respectively. The new dataset provides 0.87, 0.91, and 0.79 for the three regions, respectively.
... In recent years the CNNs have been employed in medical field for diagnosing chest problems 33 . Kalinovsky et al. 34 recently adopted the four-layered encoder-decoder architecture, which is called SegNet 35 , for the lung area segmentation. Novikov et al. 16 adopted U-Net 22 for multi-class segmentation in chest radiographs. ...
Article
Full-text available
Automated multi-organ segmentation plays an essential part in the computer-aided diagnostic (CAD) of chest X-ray fluoroscopy. However, developing a CAD system for the anatomical structure segmentation remains challenging due to several indistinct structures, variations in the anatomical structure shape among different individuals, the presence of medical tools, such as pacemakers and catheters, and various artifacts in the chest radiographic images. In this paper, we propose a robust deep learning segmentation framework for the anatomical structure in chest radiographs that utilizes a dual encoder–decoder convolutional neural network (CNN). The first network in the dual encoder–decoder structure effectively utilizes a pre-trained VGG19 as an encoder for the segmentation task. The pre-trained encoder output is fed into the squeeze-and-excitation (SE) to boost the network’s representation power, which enables it to perform dynamic channel-wise feature calibrations. The calibrated features are efficiently passed into the first decoder to generate the mask. We integrated the generated mask with the input image and passed it through a second encoder–decoder network with the recurrent residual blocks and an attention the gate module to capture the additional contextual features and improve the segmentation of the smaller regions. Three public chest X-ray datasets are used to evaluate the proposed method for multi-organs segmentation, such as the heart, lungs, and clavicles, and single-organ segmentation, which include only lungs. The results from the experiment show that our proposed technique outperformed the existing multi-class and single-class segmentation methods.
... Many different automated lung segmentation methods for CRs have been proposed in the literature [4,5]. These include rule-based methods [6][7][8], pixel classification [9], active shape models [1,[10][11][12], hybrid methods [13][14][15], and deep learning methods [16][17][18][19][20]. Recently, deep learning methods using convolutional neural networks (CNNs) have emerged as among of the best performing approaches for lung segmentation in CRs. ...
Article
Full-text available
Lung segmentation plays an important role in computer-aided detection and diagnosis using chest radiographs (CRs). Currently, the U-Net and DeepLabv3+ convolutional neural network architectures are widely used to perform CR lung segmentation. To boost performance, ensemble methods are often used, whereby probability map outputs from several networks operating on the same input image are averaged. However, not all networks perform adequately for any specific patient image, even if the average network performance is good. To address this, we present a novel multi-network ensemble method that employs a selector network. The selector network evaluates the segmentation outputs from several networks; on a case-by-case basis, it selects which outputs are fused to form the final segmentation for that patient. Our candidate lung segmentation networks include U-Net, with five different encoder depths, and DeepLabv3+, with two different backbone networks (ResNet50 and ResNet18). Our selector network is a ResNet18 image classifier. We perform all training using the publicly available Shenzhen CR dataset. Performance testing is carried out with two independent publicly available CR datasets, namely, Montgomery County (MC) and Japanese Society of Radiological Technology (JSRT). Intersection-over-Union scores for the proposed approach are 13% higher than the standard averaging ensemble method on MC and 5% better on JSRT.
... The output of our model is seen in Table 1. Table 1: Correlation of dice coefficients of our suggested method with the prior state-of-the-art Models Dice Coefficients ED-CNN [14] 97.4 FCN [15] 97.7 Our Model 98.1 ...
Preprint
Full-text available
An essential stage in computer aided diagnosis of chest X rays is automated lung segmentation. Due to rib cages and the unique modalities of each persons lungs, it is essential to construct an effective automated lung segmentation model. This paper presents a reliable model for the segmentation of lungs in chest radiographs. Our model overcomes the challenges by learning to ignore unimportant areas in the source Chest Radiograph and emphasize important features for lung segmentation. We evaluate our model on public datasets, Montgomery and Shenzhen. The proposed model has a DICE coefficient of 98.1 percent which demonstrates the reliability of our model.
... The output of our model is seen in Table 1. Table 1: Correlation of dice coefficients of our suggested method with the prior state-of-the-art Models Dice Coefficients ED-CNN [14] 97.4 FCN [15] 97.7 Our Model 98.1 ...
Article
Full-text available
An essential stage in computer-aided diagnosis of chest X-rays is automated lung segmentation. Due to rib cages and the unique modalities of each person's lungs, it is essential to construct an effective automated lung segmentation model. This paper presents a reliable model for the segmentation of lungs in chest radiographs. Our model overcomes the challenges by learning to ignore unimportant areas in the source Chest Radiograph and emphasize important features for lung segmentation. We evaluate our model on public datasets, Montgomery[1] and Shenzhen [2]. The proposed model has a DICE coefficient of 98.1% which demonstrates the reliability of our model.
... As well, in [22] the author anticipated human activity based detection for individual health monitoring of elderly people with sensor classification associated to certain significant activities in smart home environment. Data of smart meters are utilized in [23] for activity recognition with Non-intrusive appliance based load monitoring and D-S evidence theory. This work gathers all pre-processed data from homes to describe electrical appliance utilize patterns and machine learning based approaches to segregate foremost activities inside home. ...
Article
At present, there is a constant migration of people is encountered in urban regions. Health care services are considered as a confronting challenging factors, there is an extremely influenced by huge arrival of people to city centre. Subsequently, places all around the world are spending in digital evolution in an attempt to offer healthy eco-system for huge people. With this transformation, enormous homes are equipped with smarter devices (for example, sensors, smart sensors and so on) which produce huge amount of indexical data and fine-grained that is examined to assist smart city services. In this work, a model has been anticipated to utilize smart home big data analysis as a discovering and learning human activity patterns for huge health care applications. This work describes and highlights the experimentation with the analysis of vigorous data analysis process that assists healthcare analytics. This procedure comprises of subsequent stages: understanding, collection, cleaning, validation, enrichment, integration and storage. It has been resourcefully utilized to processing of data types variety comprising clinical data from EHR.
Article
Purpose: Lung cancer can evolve into one of the deadliest diseases whose early detection is one of the major survival factors. However, early detection is a challenging task due to the unclear structure, shape, and the size of the nodule. Hence, radiologists need automated tools to make accurate decisions. Methods: This paper develops a new approach based on generative adversarial network (GAN) architecture for nodule detection to propose a two-step GAN model containing lung segmentation and nodule localization. The first generator comprises a U-net network, while the second utilizes a mask R-CNN. The task of lung segmentation involves a two-class classification of the pixels in each image, categorizing lung pixels in one class and the rest in the other. The classifier becomes imbalanced due to numerous non-lung pixels, decreasing the model performance. This problem is resolved by using the focal loss function for training the generator. Moreover, a new loss function is developed as the nodule localization generator to enhance the diagnosis quality. Discriminator nets are implemented in GANs as an ensemble of convolutional neural networks (ECNNs), using multiple CNNs and connecting their outputs to make a final decision. Results: Several experiments are designed to assess the model on the well-known LUNA dataset. The experiments indicate that the proposed model can reduce the error of the state-of-the-art models on the IoU criterion by about 35 and 16% for lung segmentation and nodule localization, respectively. Conclusion: Unlike recent studies, the proposed method considers two loss functions for generators, further promoting the goal achievements. Moreover, the network of discriminators is regarded as ECNNs, generating rich features for decisions.
Article
Full-text available
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network . The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies is in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the fully convolutional network (FCN) architecture and its variants. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. The design of SegNet was primarily motivated by road scene understanding applications. Hence, it is efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than competing architectures and can be trained end-to-end using stochastic gradient descent. We also benchmark the performance of SegNet on Pascal VOC12 salient object segmentation and the recent SUN RGB-D indoor scene understanding challenge. We show that SegNet provides competitive performance although it is significantly smaller than other architectures. We also provide a Caffe implementation of SegNet and a webdemo at http://mi.eng.cam.ac.uk/projects/segnet/
Conference Paper
This paper is devoted to the key issues associated with handling of the content of very large databases of natively digital medical images. A particular attention is drawn to the problem of examining image content in order to generate new knowledge, which is lately referred to as the image mining problem. Other important questions discussed in the paper are the content-based medical image retrieval and image analytics that are aimed at automation of recent patient diagnosis and treatment technologies. It is also argued that in many occasions such tasks of big medical image data analysis can be solved using co-occurrence image descriptors of different sorts.
Article
We propose a novel deep architecture, SegNet, for semantic pixel wise image labelling. SegNet has several attractive properties; (i) it only requires forward evaluation of a fully learnt function to obtain smooth label predictions, (ii) with increasing depth, a larger context is considered for pixel labelling which improves accuracy, and (iii) it is easy to visualise the effect of feature activation(s) in the pixel label space at any depth. SegNet is composed of a stack of encoders followed by a corresponding decoder stack which feeds into a soft-max classification layer. The decoders help map low resolution feature maps at the output of the encoder stack to full input image size feature maps. This addresses an important drawback of recent deep learning approaches which have adopted networks designed for object categorization for pixel wise labelling. These methods lack a mechanism to map deep layer feature maps to input dimensions. They resort to ad hoc methods to upsample features, e.g. by replication. This results in noisy predictions and also restricts the number of pooling layers in order to avoid too much upsampling and thus reduces spatial context. SegNet overcomes these problems by learning to map encoder outputs to image pixel labels. We test the performance of SegNet on outdoor RGB scenes from CamVid, KITTI and indoor scenes from the NYU dataset. Our results show that SegNet achieves state-of-the-art performance even without use of additional cues such as depth, video frames or post-processing with CRF models.
Article
The National Library of Medicine (NLM) is developing a digital chest x-ray (CXR) screening system for deployment in resource constrained communities and developing countries worldwide with a focus on early detection of tuberculosis. A critical component in the computer-aided diagnosis of digital CXRs is the automatic detection of the lung regions. In this paper, we present a non-rigid registration-driven robust lung segmentation method using image retrieval-based patient specific adaptive lung models that detects lung boundaries, surpassing state-of-the-art performance. The method consists of three main stages: (i) a content-based image retrieval approach for identifying training images (with masks) most similar to the patient CXR using a partial Radon transform and Bhattacharyya shape similarity measure, (ii) creating the initial patient-specific anatomical model of lung shape using SIFT-flow for deformable registration of training masks to the patient CXR, and (iii) extracting refined lung boundaries using a graph cuts optimization approach with a customized energy function. Our average accuracy of 95.4% on the public JSRT database is the highest among published results. A similar degree of accuracy of 94.1% and 91.7% on two new CXR datasets from Montgomery County, Maryland (USA) and India, respectively, demonstrates the robustness of our lung segmentation approach.
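Stage (i) of this method ranks training images by a Bhattacharyya similarity measure. The sketch below shows only the generic Bhattacharyya coefficient between two non-negative profiles, and it rests on assumptions: the abstract does not specify how the partial Radon projections are turned into comparable profiles, so the normalization to unit sum and the function name `bhattacharyya_coefficient` are illustrative choices.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Bhattacharyya coefficient of two non-negative profiles of equal length.

    Both profiles are normalized to sum to 1 first (an assumption of this
    sketch). The result lies in [0, 1]; 1 means identical distributions.
    """
    if len(p) != len(q):
        raise ValueError("profiles must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("profiles must have positive total mass")
    return sum(math.sqrt((a / sum_p) * (b / sum_q)) for a, b in zip(p, q))
```

In a retrieval setting, the training images whose profiles give the highest coefficient against the patient image would be the ones selected.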
Article
Tuberculosis is a major health threat in many regions of the world. Opportunistic infections in immunocompromised HIV/AIDS patients and multi-drug-resistant bacterial strains have exacerbated the problem, while diagnosing tuberculosis still remains a challenge. When left undiagnosed and thus untreated, mortality rates of patients with tuberculosis are high. Standard diagnostics still rely on methods developed in the last century. They are slow and often unreliable. In an effort to reduce the burden of the disease, this paper presents our automated approach for detecting tuberculosis in conventional posteroanterior chest radiographs. We first extract the lung region using a graph cut segmentation method. For this lung region, we compute a set of texture and shape features, which enable the x-rays to be classified as normal or abnormal using a binary classifier. We measure the performance of our system on two datasets: a set collected by the tuberculosis control program of our local county's health department in the United States, and a set collected by Shenzhen Hospital, China. The proposed computer-aided diagnostic system for TB screening, which is ready for field deployment, achieves a performance that approaches the performance of human experts. We achieve an area under the ROC curve (AUC) of 87% (78.3% accuracy) for the first set, and an AUC of 90% (84% accuracy) for the second set. For the first set, we compare our system performance with the performance of radiologists. When trying not to miss any positive cases, radiologists achieve an accuracy of about 82% on this set, and their false positive rate is about half of our system's rate.
Article
Scene labeling consists of labeling each pixel in an image with the category of the object it belongs to. We propose a method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel. The method alleviates the need for engineered features, and produces a powerful representation that captures texture, shape, and contextual information. We report results using multiple postprocessing methods to produce the final labeling. Among those, we propose a technique to automatically retrieve, from a pool of segmentation components, an optimal set of components that best explain the scene; these components are arbitrary, for example, they can be taken from a segmentation tree or from any family of oversegmentations. The system yields record accuracies on the SIFT Flow dataset (33 classes) and the Barcelona dataset (170 classes) and near-record accuracy on Stanford background dataset (eight classes), while being an order of magnitude faster than competing approaches, producing a $(320\times 240)$ image labeling in less than a second, including feature extraction.
Conference Paper
In this paper, we describe a nonlinear image representation based on divisive normalization that is designed to match the statistical properties of photographic images, as well as the perceptual sensitivity of biological visual systems. We decompose an image using a multi-scale oriented representation, and use Student's t as a model of the dependencies within local clusters of coefficients. We then show that normalization of each coefficient by the square root of a linear combination of the amplitudes of the coefficients in the cluster reduces statistical dependencies. We further show that the resulting divisive normalization transform is invertible and provide an efficient iterative inversion algorithm. Finally, we probe the statistical and perceptual advantages of this image representation by examining its robustness to added noise, and using it to enhance image contrast.
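The normalization step in this abstract can be sketched for a 1-D run of coefficients. Several details are assumptions, not taken from the abstract: the sketch uses squared amplitudes (a common form of divisive normalization), adds a positive constant `b` to keep the denominator non-zero, treats the "cluster" as a sliding window centered on each coefficient, and uses the hypothetical name `divisive_normalize`.

```python
import math

def divisive_normalize(coeffs, weights, b=1.0):
    """Divide each coefficient by sqrt(b + weighted sum of squared neighbors).

    coeffs  -- 1-D list of transform coefficients
    weights -- odd-length list of non-negative weights centered on the
               current coefficient (assumed window layout)
    b       -- positive constant (assumed) that bounds the denominator
    """
    radius = len(weights) // 2
    out = []
    for i, x in enumerate(coeffs):
        energy = b
        for k, w in enumerate(weights):
            j = i + k - radius
            if 0 <= j < len(coeffs):   # neighbors outside the signal are skipped
                energy += w * coeffs[j] ** 2
        out.append(x / math.sqrt(energy))
    return out
```

Each output keeps the sign of its input, while coefficients surrounded by high-amplitude neighbors are scaled down more strongly.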
Article
While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabelled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labelled data sets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and closing the performance gap between neural networks learnt with and without unsupervised pre-training.
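The "sparse representations with true zeros" mentioned in this abstract follow directly from the definition of the rectifier, max(0, x). The sketch below is illustrative only; the helper names `relu` and `sparsity` and the sample pre-activations are assumptions.

```python
def relu(values):
    """Rectifier activation: negative inputs become exactly 0.0."""
    return [max(0.0, v) for v in values]

def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)

pre_activations = [-1.0, 0.5, -0.2, 2.0]
activations = relu(pre_activations)
```

A hyperbolic tangent unit applied to the same inputs would give small non-zero values for the negative entries instead of exact zeros.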
Article
In recent years, deep neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.