
Automatic Acute Lymphocytic Leukemia Diagnosis Based on Kernel Ridge Regression Method

AWERProcedia Information Technology & Computer Science 2 (2012) 50-55
2nd World Conference on Innovation and Computer Sciences 2012
Merve Ayyuce Kizrak, Figen Ozen
Halic University, Electrical and Electronics Engineering Department, Sıracevizler St. Nr. 29, Bomonti, Şişli,
Istanbul, 34363 Turkey
Abstract
Leukemia is a type of cancer caused by abnormal increase of the white blood cells. Each year hundreds of thousands of
people die of leukemia throughout the world. Leukemic cells become out of control and they spread independently. They
cause structural and irreversible damage in the organs where blood cells are produced, in other organs and in tissues. If not
treated properly, leukemia costs the life of the patient very quickly. Early diagnosis is vital and it should be followed by a
treatment applied to the correct cells. It is possible to achieve successful results in treatment, if leukemic and non-leukemic
cells are classified correctly. This work is focused on acute lymphocytic leukemia, which affects young children and has a
higher expectation of survival rate as compared to acute myelogenous leukemia. A new algorithm that combines
morphological image processing techniques with Kernel Ridge Regression is developed. The feature set is extracted using
Gray Level Co-occurrence Matrices. The features used in classification are average, skewness, kurtosis, correlation, energy,
cluster prominence and inverse-difference moment. The algorithm is tested on a large data set. The performance is
measured based on the ensemble average and the root-mean-square error criteria. The results are compared with the
diagnoses of the physician. Random runs are ensemble averaged and the success rates of the algorithm are found to lie in the
range of 94.75-96.43%.
Keywords: Biomedical image processing; pattern recognition; feature extraction; kernel based methods
Selection and/or peer review under responsibility of Prof. Dr. Dogan Ibrahim.
©2012 Academic World Education & Research Center. All rights reserved.
1. Introduction
Leukemia is a type of cancer caused by abnormal increase of the white blood cells. Each year
hundreds of thousands of people die of leukemia throughout the world. Leukemic cells become out of
control and they spread independently. They cause structural and irreversible damage in the organs
where blood cells are produced, in other organs and in tissues. If not treated properly, leukemia
costs the life of the patient very quickly. Early diagnosis is vital and it should be followed by a
treatment applied to the correct cells. It is possible to achieve successful results in treatment, if
leukemic and non-leukemic cells are classified correctly.
ADDRESS FOR CORRESPONDENCE: Merve Ayyuce Kizrak, Halic University, Electrical and Electronics Engineering Department,
Sıracevizler St. Nr. 29, Bomonti, Şişli, Istanbul, 34363 Turkey
E-mail address: figenozen@halic.edu.tr / Tel.: +90-212-343-0887
M. A. Kızrak, F. Özen/ AWERProcedia Information Technology & Computer Science (2012) 50-55
Leukemia can be either acute or chronic. Acute leukemia spreads very rapidly and has to be treated
promptly. Treatment of the chronic leukemia does not have to be prompt. Acute leukemia can be
either lymphocytic (ALL) or myelogenous (AML), depending on which cells are under threat. Chronic
leukemia can be either lymphocytic (CLL) or myelogenous (CML) [1].
This work focuses on acute lymphocytic leukemia (ALL), which affects young children and has a
higher expected survival rate than AML.
Researchers have applied many methods to identify leukemic cells. In [2], image
segmentation is combined with classification using a tree structure. The authors found that the
segmentation-by-areas technique performs better than gradient-based contour-following algorithms.
In [3], morphological operations are combined with Watershed algorithm for image segmentation
and textural, geometrical and statistical features are generated. Support vector machines are used for
classification.
In [4], linear kernel support vector machines are used for classification of the textural, geometrical
and statistical features of the myelogenous leukemic cells. Then radial kernel support vector machines
are used as the classifier.
Scotti has isolated the lymphocytes in [5], and categorized them as either blast or normal. After
morphological operations, the classification has been done using k-nearest neighbor, feedforward
neural networks and linear Bayes normal classifier. The results are compared. He has studied the cell
images in L*a*b* (lightness, red/green, yellow/blue) space in [6]. Background is suppressed, mean cell
diameter is estimated and the white cells are segmented for automatic diagnosis.
In [7], computer generated hologram technique has been used. The Zernike moments of the
computer reconstructed holographic images are used as features due to their rotation invariance
property. Minimum mean distance and the k-nearest neighbor methods are used for classification of
the leukemic cells.
In [8], Sobel mask edge detection and recursive image segmentation techniques have been
combined to separate foreground from background. The algorithm classifies single cells from clusters.
In [9], various contrast enhancement techniques have been applied to blurred and noisy cell
images. Out of the tested enhancement techniques, namely local, global, partial, bright and dark
contrast stretching, the partial contrast stretching technique has performed the best.
In [10], segmentation of acute leukemic cells in the RGB and HSI colour spaces is compared.
The results show that the blasts are extracted more successfully in HSI.
In [11], Khashman and Al-Zgoul have segmented leukemic cell images using bimodal thresholding,
dilation, region filling and boundary tracing, filtering and elimination of the unwanted particles
and objects, and restoration of the separated cell regions (cytoplasm and nucleus). The proposed
algorithm has been tested on various leukemia types (ALL, AML, CLL, CML). Successful segmentation
results are reported.
2. Recognition of Leukemia Cells
Leukemia is a type of cancer caused by abnormal increase of the white blood cells. Early diagnosis is
the first step for the treatment. It is also important to apply the treatment to the correct cells. If
leukemic and non-leukemic cells are classified correctly, successful treatment is possible.
In this work, cells have been classified as leukemic and non-leukemic. The algorithm first
preprocesses the image by splitting it into its color bands (red, green, blue), applying filters and
performing morphological operations. It then extracts the features using gray-level co-occurrence
matrices and classifies the cells as leukemic or non-leukemic using the Kernel Ridge Regression
method. As the final step, the performance of the algorithm is evaluated. The flowchart of the
proposed algorithm is given in Fig. 1.
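The preprocessing of a single color band can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the flat square structuring element, the window size and the threshold value are all assumptions, and the region-filling step is omitted for brevity. It uses NumPy only (`sliding_window_view` needs NumPy 1.20 or later).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def split_bands(rgb):
    """Split an H x W x 3 image into its red, green and blue bands."""
    return rgb[..., 0], rgb[..., 1], rgb[..., 2]


def _window_reduce(band, size, reducer):
    """Apply min or max over a size x size window (size must be odd)."""
    pad = size // 2
    padded = np.pad(band, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return reducer(windows, axis=(-2, -1))


def top_hat(band, size=5):
    """White top-hat: the band minus its morphological opening."""
    eroded = _window_reduce(band, size, np.min)    # erosion
    opened = _window_reduce(eroded, size, np.max)  # dilation of the erosion
    return band - opened


def stretch_contrast(band):
    """Linearly rescale intensities to the range [0, 1]."""
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.zeros_like(band, dtype=float)
    return (band - lo) / (hi - lo)


def preprocess_band(band, size=5, threshold=0.5):
    """Top-hat filter, contrast adjustment, then binary thresholding."""
    enhanced = stretch_contrast(top_hat(band.astype(float), size))
    return enhanced > threshold  # boolean mask of candidate cell pixels
```

The top-hat step suppresses structures larger than the window, so the window size would have to be chosen relative to the cell size in the actual images.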
3. Simulation Results
The algorithm has been applied to a data set and the steps of the program for one sample are
illustrated in Fig. 2. The features used are the average, skewness, kurtosis, correlation, energy,
cluster prominence and homogeneity (the inverse-difference moment), extracted with the GLCM
(gray-level co-occurrence matrix) technique. The GLCM
technique is preferred due to its versatility and sound application in extracting second order textural
statistics of the cell images. As compared to many other methods, the mathematics of the technique is
very well established. A detailed discussion on GLCM technique can be found in [12]. Classification is
done using Kernel Ridge Regression method. It is preferred due to its robustness and its ability to
provide a closed-form solution for the problem under consideration. A detailed discussion on
mathematical background can be found in [13].
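The seven features named above can be sketched in NumPy as follows. The paper does not give its formulas, so several choices here are assumptions: the average, skewness and kurtosis are taken as first-order statistics of the gray levels, energy is the sum of squared GLCM entries, homogeneity is the inverse-difference moment, and a single pixel offset is used. The function names are hypothetical.

```python
import numpy as np


def glcm(image, levels, offset=(0, 1)):
    """Normalized, symmetric gray-level co-occurrence matrix.

    `image` holds integer gray levels in [0, levels); `offset` is the
    non-negative (row, column) displacement between paired pixels.
    """
    dr, dc = offset
    rows, cols = image.shape
    ref = image[: rows - dr, : cols - dc]
    nbr = image[dr:, dc:]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    counts += counts.T  # count every pair in both directions
    return counts / counts.sum()


def texture_features(image, levels, offset=(0, 1)):
    """Seven features; the first three are histogram statistics (assumed)."""
    g = image.ravel().astype(float)
    mean, std = g.mean(), g.std()
    z = (g - mean) / std if std > 0 else np.zeros_like(g)

    p = glcm(image, levels, offset)
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    denom = sd_i * sd_j
    # Correlation is undefined for a constant image; it is reported as 0 here.
    corr = ((i - mu_i) * (j - mu_j) * p).sum() / denom if denom > 0 else 0.0

    return {
        "average": mean,
        "skewness": (z ** 3).mean(),
        "kurtosis": (z ** 4).mean(),
        "correlation": corr,
        "energy": (p ** 2).sum(),
        "cluster_prominence": (((i + j - mu_i - mu_j) ** 4) * p).sum(),
        "homogeneity": (p / (1.0 + (i - j) ** 2)).sum(),
    }
```

In practice the features would be computed on the segmented cell region after quantizing it to a fixed number of gray levels.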
Fig. 1. Flowchart of the proposed algorithm. Blocks: cell image; red, green and blue bands; Top-Hat filter; contrast adjustment; binary thresholding and region filling; extraction of the relevant region; extraction of the color band features; segmented image; unification of the color bands; GLCM feature extraction (average, skewness, kurtosis, correlation, energy, cluster prominence, homogeneity); Kernel Ridge Regression algorithm; mean square error and performance calculation; output image
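The closed-form solution that motivates the choice of Kernel Ridge Regression can be sketched as below. The paper does not state the kernel, the regularization constant or the label encoding, so the Gaussian kernel, the parameters `lam` and `gamma`, the +1/-1 labels and all function names are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def krr_fit(x_train, y_train, lam=1e-2, gamma=1.0):
    """Solve (K + lam * I) alpha = y for the dual coefficients alpha."""
    k = rbf_kernel(x_train, x_train, gamma)
    return np.linalg.solve(k + lam * np.eye(len(x_train)), y_train)


def krr_predict(x_train, alpha, x_new, gamma=1.0):
    """Real-valued score f(x) = sum_i alpha_i * k(x_i, x)."""
    return rbf_kernel(x_new, x_train, gamma) @ alpha


def classify(scores):
    """Label +1 (leukemic, assumed encoding) for scores >= 0, else -1."""
    return np.where(scores >= 0.0, 1, -1)
```

Training reduces to one linear solve of size equal to the number of training cells, which is what makes the method attractive for small feature sets such as the seven features used here.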
4. Performance Evaluation
After morphological operations on the cell images in the dataset under consideration, features are
extracted using Gray Level Co-occurrence Matrices. Later, cells are classified as healthy and leukemic,
based on the decision obtained by Kernel Ridge Regression technique of pattern recognition. The
performance of the algorithm is tested by a comparison with the laboratory results. The program is
run several times, using 10 random cell images each time. Some of the results are shown in Table 1.
After 100 runs with the random data selection, the ensemble average of the success rate is
calculated as 96.43%.
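The two performance criteria can be sketched as follows. This is an illustration under assumptions: the paper does not say how the per-image success percentages are obtained, so the code only shows the averaging step and a generic root-mean-square error between classifier scores and reference labels. The function names are hypothetical; the sample values are the Run #1 column of Table 1.

```python
import numpy as np


def ensemble_average(success_rates):
    """Mean of the success rates (in %) collected over random selections."""
    return float(np.mean(success_rates))


def rmse(scores, reference):
    """Root-mean-square error between scores and reference labels."""
    diff = np.asarray(scores, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Run #1 of Table 1: ten success rates from one random selection.
run_1 = [93.17, 95.89, 98.32, 96.95, 94.76, 95.89, 95.89, 92.86, 97.43, 97.66]
print(round(ensemble_average(run_1), 2))  # 95.88, the Run #1 average in Table 1
```

The same averaging applied across all 100 runs would give the overall figure reported above.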
5. Conclusions
It is a well-known fact that early diagnosis increases the success rate in acute lymphocytic leukemia.
The results obtained by the developed algorithm show that it is feasible to help speed up the
diagnosis by extracting the features using Gray Level Co-occurrence Matrices and classifying the
cells using the Kernel Ridge Regression method.
The next step will focus on the comparison of the proposed algorithm with the other methods.
Fig. 2. Results of different steps of the algorithm, shown for the red, green and blue bands: (a) Gaussian filter; (b) Top-hat filter; (c) contrast adjustment; (d) binary thresholding and region filling; (e) extraction of the relevant regions; (f) unification of the three color bands and the final image
Table 1. The success rates (%) of the program for random data selection

#        Run #1  Run #2  Run #3  Run #4  Run #5  Run #6
1        93.17   93.84   97.43   93.84   94.80   97.78
2        95.89   93.84   99.28   95.78   99.28   92.82
3        98.32   95.89   98.97   92.82   98.97   95.97
4        96.95   97.78   97.66   98.76   95.89   98.32
5        94.76   92.86   95.97   96.95   93.17   94.85
6        95.89   93.17   97.43   93.84   97.43   96.28
7        95.89   96.95   93.84   95.97   97.66   96.95
8        92.86   97.43   95.97   94.76   95.89   97.66
9        97.43   93.17   98.43   98.79   92.86   94.89
10       97.66   92.82   92.84   98.32   94.85   97.43
Average  95.88   94.74   96.78   95.98   96.08   96.20
References
[1] wikipedia.org, online access in May 2011.
[2] Serbouti S, Duhamel A, Harms H, Gunzer U, Aus HHM, Mary JY, Beuschart R. Image segmentation and
classification methods to detect leukemias. Proc. of the Annual International Conference of the IEEE
Engineering in Medicine and Biology Society, USA, Vol. 13, No. 1; 1991, pp. 260-261.
[3] Osowski S, Markiewicz T, Marianska B, Moszczynski L. Feature generation for the cell image recognition of
myelogenous leukemia. Proc. of the EUSIPCO 2004, Vienna, Austria; 2004, pp. 753-756.
[4] Markiewicz T, Osowski S, Marianska B, Moszczynski L. Automatic recognition of the blood cells of myelogenous
leukemia using SVM. Proc. of Int. Joint Conference on Neural Networks, Montreal, Canada; 2005, pp. 2496-2501.
[5] Scotti F. Automatic morphological analysis for acute leukemia identification in peripheral blood microscope
images. Proc. of the IEEE Int. Conf. on Computational Intelligence for Measurement Systems and Applications,
CIMSA 2005, Naxos, Italy; 2005, pp. 96-101.
[6] Scotti F. Robust segmentation and measurement techniques of white cells in blood microscope images. Proc.
of the IMTC 2006, Instrumentation and Measurement Technology Conf., Sorrento, Italy; 2006, pp. 43-48.
[7] Asadi MR, Vahedi A, Amindavar H. Leukemia cell recognition with Zernike moments of holographic
images. Proc. of the NORSIG 2006, pp. 214-217.
[8] Prasad B, Choi JSI, Badawy W. A high throughput screening algorithm for leukemia cells. Proc. of the CCECE
2006, Ottawa, Canada; 2006, pp. 2094-2097.
[9] Mokhtar NR, Harun NH, Mashor MY, Roseline H, Mustafa N, Adollah R, Adilah H, Nasir NFM. Image enhancement
techniques using local, global, bright, dark and partial contrast stretching for acute leukemia images. Proc. of
the World Congress on Eng., WCE 2009, London, UK, Vol. 1; 2009.
[10] Nor HH, Mashor MY, Mokhtar NR, Salihah ANA, Hassan R, Raof RAA, Osman MK. Comparison of acute leukemia
image segmentation using HSI and RGB color space. Proc. of the 10th Int. Conf. on Information Science, Signal
Processing and their Applications, Malaysia; 2010, pp. 749-752.
[11] Khashman A, Al-Zgoul E. Image segmentation of blood cells in leukemia patients. Proc. of the CEA’10,
Cambridge, USA; 2010, pp. 104-109.
[12] Albregtsen F. Statistical texture measures computed from gray level coocurrence matrices. Monograph,
Image Processing Laboratory, Department of Informatics, University of Oslo, Nov. 5, 2008, pp. 1-6.
[13] Theodoridis S, Koutroumbas K. Pattern Recognition. 4th ed. Academic Press; 2009, pp. 215-216.