Tooth Color Detection Using PCA and KNN Classifier Algorithm Based on Color Moment

Matching a suitable color for tooth reconstruction is an important step that is difficult for dentists because color selection is subjective. An accurate color matching system relies mainly on the image analysis and processing techniques of a recognition system. The system consists of three parts: data collection from digital tooth color images; data preparation, which applies the color analysis technique and extracts the features; and data classification, which includes feature selection to reduce the number of features. The tooth images used in this research cover 16 tooth types and were taken at RSGM UNAIR Surabaya. Features are extracted from the RGB, HSV and LAB color spaces using color moment calculations: mean, standard deviation, skewness and kurtosis. Because each color space yields many features, an additional method such as Principal Component Analysis (PCA) is required to reduce the number of features while retaining the essential information. Combining PCA feature selection with classification by the K Nearest Neighbour (KNN) algorithm can improve the accuracy of the system. In the experiments, KNN alone achieved an accuracy of up to 97.5 % in the learning process and 92.5 % in the testing process, while combining PCA with KNN reduced the 36 features to 26 and improved the accuracy to 98.54 % in the learning process and 93.12 % in the testing process. Adding PCA as the feature selection method therefore improves the accuracy of this color matching system with a smaller number of features.
EMITTER International Journal of Engineering Technology
Vol. 5, No. 1, June 2017
ISSN: 2443-1168
Copyright © 2017 EMITTER International Journal of Engineering Technology ‐ Published by EEPIS
Tooth Color Detection Using PCA and KNN Classifier
Algorithm Based on Color Moment
Justiawan, Riyanto Sigit, Zainal Arief
Dept. of Electrical Engineering, Magister Program of Engineering Technology
Politeknik Elektronika Negeri Surabaya
Kampus PENS, Jalan Raya ITS, Sukolilo 60111, Surabaya
E‐mail: n7us@pasca.student.pens.ac.id,{riyanto,zar}@pens.ac.id
Abstract
Matching a suitable color for tooth reconstruction is an important
step that is difficult for dentists because color selection is
subjective. An accurate color matching system relies mainly on the
image analysis and processing techniques of a recognition system.
The system consists of three parts: data collection from digital
tooth color images; data preparation, which applies the color
analysis technique and extracts the features; and data
classification, which includes feature selection to reduce the
number of features. The tooth images used in this research cover 16
tooth types and were taken at RSGM UNAIR Surabaya. Features are
extracted from the RGB, HSV and LAB color spaces using color moment
calculations: mean, standard deviation, skewness and kurtosis.
Because each color space yields many features, an additional method
such as Principal Component Analysis (PCA) is required to reduce the
number of features while retaining the essential information.
Combining PCA feature selection with classification by the K Nearest
Neighbour (KNN) algorithm can improve the accuracy of the system. In
the experiments, KNN alone achieved an accuracy of up to 97.19 % in
the learning process using 10 fold cross validation, while combining
PCA with KNN reduced the 12 features to 8 and improved the accuracy
to 97.81 %. Adding PCA as the feature selection method therefore
improves the accuracy of this color matching system with a smaller
number of features.
Keywords: Color Matching, Feature Selection, Teeth Images, PCA,
KNN, RGB, HSV, LAB, Color Moment
1. INTRODUCTION
Feature extraction and classification are parts of an image processing
system that have many applications in the medical field. A color matching
system is one such application in clinical dentistry. At present, dentists
use shade guides as the color reference standard for describing tooth
shades. However, matching a suitable color for tooth reconstruction is an
important step that is difficult for dentists because color selection is
subjective. Traditionally, dentists select suitable shade tabs with the
naked eye, which makes the results unreliable and inconsistent. A color
matching system based on digital images can narrow the color communication
gap between the dentist and the patient's teeth, which affects the
aesthetic value of dental care treatment [1][2]. However, because tooth
colors are almost identical, the ambient lighting intensity influences the
color matching system. A color description should describe the color
distribution of the tooth surface in detail. The central area of the tooth
is chosen for matching color on natural teeth and is used as the effective
content for shade comparison [1]. Therefore, the main requirement of this
system is a simple process with a high level of accuracy in tooth color
matching.
Basically, teeth are white with varying values. A color is produced by
a combination of basic elements, called the color space parameters. Each
color is described by at least three basic elements, as in RGB, CMY(K),
HSV, CIE XYZ, Lab, Luv and YCrCb [3,4]. Several color spaces have been
widely used in research. In terms of their basic elements, RGB, HSV and
Lab are the simplest parameters used in color analysis systems [3].
However, because the color properties of teeth are nonuniform and involve
a complex layering of tooth structure [5], an additional technique is
required to determine the specific features of each tooth. Color moment
analysis, which uses simple mathematical calculations, can be applied to
determine the specific features from the color space properties.
Image classification is a set of techniques and methods for identifying
images according to their content or features [6]. To identify the
suitable tooth color image from its specific features, an additional
algorithm is required to classify the features formed from each color
space. On the other hand, because the color moment calculation yields
many features for each color space, reducing the number of features is an
essential step before any classification can be performed. Principal
Component Analysis (PCA) is one of the popular methods; it reduces the
features while preserving most of the relevant information of the
original features according to some optimality criteria [7].
In this paper, we propose a color matching system for tooth color
images that uses PCA as the feature selection method on the RGB, HSV and
LAB color space properties. The features formed from each color space are
classified with the KNN algorithm into the 16 types of dental color
images, which were captured beforehand with a digital camera under 288 lux
lighting. Combining PCA with the KNN classifier can improve the accuracy
while also reducing the number of features. This paper compares the
performance of the KNN classifier alone with that of the combination of
PCA and KNN.
The rest of the paper is organized as follows: related work on feature
selection and tooth color matching systems is reviewed in Section 2,
Section 3 states the originality of this work, Section 4 presents the
proposed feature selection for tooth color images, and Section 5 shows the
experimental results, including the performance analysis of the proposed
system. Finally, we conclude the results in the last section.
2. RELATED WORKS
Feature selection or feature extraction is a process that creates new
variables as combinations of others in order to reduce the dimensionality
of the selected features [8]. Dimensionality reduction of a feature set is
a common preprocessing step in pattern recognition and classification
applications and in compression schemes [7]. The best‐known
dimensionality reduction algorithm is PCA. Using the covariance matrix and
its eigenvalues and eigenvectors, PCA finds the “principal components” of
the data: uncorrelated eigenvectors, each representing a proportion of the
variance in the data [8]. PCA and many of its variations have been applied
to reduce the principal features in face tracking and content‐based image
retrieval problems [7].
A face recognition system based on skin detection, using PCA as the
reduction method followed by a KNN classifier, was proposed in [9]. The
results show a recognition rate of 88 % to 90 % for the RGB and YCbCr
color spaces in skin detection. An enhancement of face detection using
skin color was proposed in [10]. The recognition part consists of three
steps: Gabor feature extraction, feature selection using PCA, and
KNN‐based classification. That research reports a recognition rate of up
to 96 % for the face detection system.
Feature selection for a Network Intrusion Detection System (NIDS) was
applied in [11]. Using Genetic Algorithms (GA) and Particle Swarm
Optimization (PSO) significantly reduced the number of features and
improved the classification accuracy up to 99.7 %.
Using several features extracted from orthodontic images, such as
facial features and skin color in the YCbCr color space, gives more
accurate performance. However, many complicated features can increase the
computation time of the system when it is applied to real hardware. Many
color space parameters have been used in previous research to obtain the
features of a color matching system. The HSV and CIE L*a*b* color spaces
provide suitable features for shade matching using a digital camera,
because they are less sensitive to the lighting conditions and have
achieved high accuracy in some dental shade matching systems [12].
Color moments, as an indexing technique, are encoded in the color index
by dividing the image horizontally into three equal non‐overlapping
regions. Three moments (mean, variance and skewness) are extracted from
each region in the chosen color space. This technique improves the color
indexing process and yields the specific features of the system. The
accuracy of a tooth color matching system that includes feature reduction
depends on the algorithm used to model the system from its features. The
output of the modeling system is applied to the matching system in the
learning and testing processes. In this paper, we propose feature
selection for tooth images applied to a color matching system. The
performance analysis is organized by the type of color space and by the
type of modeling algorithm, namely KNN alone or the combination of PCA and
KNN. From the results, we analyze the performance in terms of accuracy in
the learning and testing processes, and obtain the best color matching
system, by color space and modeling system, that can be applied to real
hardware.
3. ORIGINALITY
From related research, it is known that there are many color matching
systems for dental applications. In Indonesia, many dentists still use the
conventional method for shade matching. The result of shade matching is
influenced by the different color characteristics of Indonesian people's
teeth and by other main factors. By combining and comparing previous
results, we propose a comprehensive comparison of color matching systems
using three color spaces followed by feature selection. The color spaces
are RGB, HSV and LAB. The color indexing technique uses color moments,
which are suitable for digital image systems. The main result of this
system is the combination of the PCA algorithm, which reduces the number
of features, with the KNN classifier, which determines the specific type
of tooth color. The dataset of the system therefore gathers information
from a dental shade guide database and compares it with the tooth color
condition of Indonesian patients. The Indonesian tooth colors are captured
with a digital camera from each patient's teeth under 288 lux lighting.
They are then compared with the suitable shade guide database from RSGM
UNAIR Surabaya. Sixteen shade guide colors are used in this system,
according to the real conditions. The system is built for tooth
recognition in color matching, which is useful to dentists when treating
patients and also to science through the results of the performance
analysis.
4. SYSTEM DESIGN
In this section, we explain the system design of our proposed method.
The system has three main steps, as depicted in Fig. 1. The first step is
collecting image data from the shade guide database and taking sample
tooth images from hospital (RSGM) patients. The images are obtained with a
digital camera under 288 lux lighting intensity. The sample images are
compared with the shade guide database to determine the color
characteristics suited to Indonesian people. This step yielded 16 shade
guide images that were compatible with the patients' tooth condition. The
database contains the shade guide tooth images as the reference parameter
and the sample tooth images of RSGM patients.
Figure 1. System design of feature selection for teeth images in dental color matching
The second step is data preparation, which determines the color space
model of the system. Three color space models are used: RGB, HSV and LAB.
Each color model is applied to all image data. From the color model
results, we analyze the characteristics of each color model using the
color moment technique. The output of this step is the set of color
features of the teeth in the database. The modeling system has two main
processes. The first is feature selection using PCA, which reduces the
number of features based on their eigenvalues. After that, the resulting
features are classified by the tooth ID number from the shade guide
database using the KNN classifier. Given the advantages and disadvantages
of each color space and modeling step, we compare them on the performance
results of the learning and testing processes.
4.1 Parameter Data
This research uses patient data obtained from RSGM UNAIR Surabaya,
which were matched beforehand against the shade guide database. The shade
guide color images used in this system are illustrated in Figure 2.
Figure 2. Images of color shade guide database
There are 16 shade guide color images applied to this system. Each
image is resized to 40 x 40 pixels, as illustrated in Fig. 3. The image is
divided into 3 parts in both the horizontal and the vertical direction.
The middle cell of the resulting grid is the image region that is
processed in the next step.
Figure 3. Images of color shade guide database at 40 x 40 pixel size
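The selection of the middle region can be sketched in Python as follows. This is only an illustration of the 3 x 3 division described above: the function name `center_cell` and the integer rounding of the cell borders (40 is not divisible by 3) are assumptions, not details given in the paper.

```python
def center_cell(image):
    """Return the middle cell of a 3 x 3 grid laid over `image`.

    `image` is a list of rows; each row is a list of pixels.
    Cell borders use integer division (an assumption for sizes
    such as 40 that are not divisible by 3).
    """
    height = len(image)
    width = len(image[0])
    top, bottom = height // 3, (2 * height) // 3
    left, right = width // 3, (2 * width) // 3
    return [row[left:right] for row in image[top:bottom]]


# Synthetic 40 x 40 image whose pixel value encodes its (row, column) position.
image = [[(r, c) for c in range(40)] for r in range(40)]
patch = center_cell(image)
print(len(patch), len(patch[0]))    # 13 13
print(patch[0][0], patch[-1][-1])   # (13, 13) (25, 25)
```

With this rounding, a 40 x 40 image yields a 13 x 13 patch covering rows and columns 13 to 25.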
4.2 Color Analysis Technique
In this phase, the color features of the system are obtained. The
feature values are determined by capturing each tooth shade type several
times with different angles and lighting intensities, in order to reduce
the noise caused by lighting effects. The color models that can be applied
in a color matching system are RGB, HSV and LAB. RGB uses the basic
elements Red, Green and Blue to produce colors; all other colors are
obtained from their combination. HSV stands for Hue, Saturation and Value.
In some research, the HSV color model is used as the human color
parameter, because HSV is perceived more easily by the human eye than the
RGB color model. The HSV values are calculated from the RGB values as
follows:
$$ r = \frac{R}{R+G+B}, \quad g = \frac{G}{R+G+B}, \quad b = \frac{B}{R+G+B} \qquad (1) $$

$$ V = \max(r, g, b) \qquad (2) $$
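After calculating the value (V) of HSV, the Hue (H) and Saturation (S)
parameters are determined from the values of r, g, b as [13]:

$$ S = \begin{cases} 0, & \text{if } V = 0 \\ 1 - \dfrac{\min(r, g, b)}{V}, & \text{if } V > 0 \end{cases} \qquad (3) $$

$$ H = \begin{cases} 0, & \text{if } S = 0 \\ \dfrac{60\,(g - b)}{S \cdot V}, & \text{if } V = r \\ 60\left[2 + \dfrac{b - r}{S \cdot V}\right], & \text{if } V = g \\ 60\left[4 + \dfrac{r - g}{S \cdot V}\right], & \text{if } V = b \\ H + 360, & \text{if } H < 0 \end{cases} \qquad (4) $$

LAB is a color model based on the wavelength of light. The transformation
from the RGB color model to LAB is calculated using the following
equations [4]:

[Equations (5)‐(7), the RGB to XYZ transformation, could not be recovered
from the source.]

The value of LAB can be defined by the following equations:

$$ L^{*} = 116\, f\!\left(\frac{Y}{Y_n}\right) - 16 \qquad (8) $$

$$ a^{*} = 500\left[ f\!\left(\frac{X}{X_n}\right) - f\!\left(\frac{Y}{Y_n}\right) \right] \qquad (9) $$

$$ b^{*} = 200\left[ f\!\left(\frac{Y}{Y_n}\right) - f\!\left(\frac{Z}{Z_n}\right) \right] \qquad (10) $$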
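The RGB to HSV conversion of eqs. (1)‐(4) can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code. The function name `rgb_to_hsv` and the handling of a black pixel (R + G + B = 0, where eq. (1) is undefined) are assumptions, and the hue branch for V = b uses the common (r - g) form.

```python
def rgb_to_hsv(R, G, B):
    """Convert one RGB pixel to (H, S, V) following eqs. (1)-(4)."""
    total = R + G + B
    if total == 0:
        # Black pixel: eq. (1) is undefined, so return zeros (assumption).
        return 0.0, 0.0, 0.0
    r, g, b = R / total, G / total, B / total          # eq. (1)
    V = max(r, g, b)                                   # eq. (2)
    S = 0.0 if V == 0 else 1.0 - min(r, g, b) / V      # eq. (3)
    if S == 0:                                         # eq. (4)
        H = 0.0
    elif V == r:
        H = 60.0 * (g - b) / (S * V)
    elif V == g:
        H = 60.0 * (2.0 + (b - r) / (S * V))
    else:
        H = 60.0 * (4.0 + (r - g) / (S * V))
    if H < 0:
        H += 360.0
    return H, S, V


print(rgb_to_hsv(255, 0, 0))      # hue 0 for pure red
print(rgb_to_hsv(0, 255, 0))      # hue 120 for pure green
print(rgb_to_hsv(255, 0, 255))    # hue 300 for magenta
```

Because r, g and b are normalized by their sum in eq. (1), V here lies between 1/3 and 1 rather than spanning the full 0 to 1 range of the usual HSV value channel.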
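The function f(q) is determined using the following equation:

$$ f(q) = \begin{cases} q^{1/3}, & \text{if } q > 0.008856 \\ 7.787\,q + \dfrac{16}{116}, & \text{otherwise} \end{cases} \qquad (11) $$

The values of $X_n$, $Y_n$ and $Z_n$ are obtained from R = G = B = 1, with
R, G and B in the range (0, 1) [4].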
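As an illustration of how f(q) enters the LAB values, the sketch below uses the standard CIE definitions L* = 116 f(Y/Yn) - 16, a* = 500 [f(X/Xn) - f(Y/Yn)] and b* = 200 [f(Y/Yn) - f(Z/Zn)]. The function names and the example white point are assumptions. The RGB to XYZ step is deliberately left out, because its coefficients are not given in the text, so the function starts from XYZ values.

```python
def f(q):
    """Piecewise CIE function used in the L*a*b* definitions."""
    if q > 0.008856:
        return q ** (1.0 / 3.0)
    return 7.787 * q + 16.0 / 116.0


def xyz_to_lab(X, Y, Z, white):
    """Convert XYZ to (L*, a*, b*) for a reference white (Xn, Yn, Zn)."""
    Xn, Yn, Zn = white
    fx, fy, fz = f(X / Xn), f(Y / Yn), f(Z / Zn)
    L = 116.0 * fy - 16.0
    a = 500.0 * (fx - fy)
    b = 200.0 * (fy - fz)
    return L, a, b


# Illustrative D65-like white point (an assumption; the paper derives
# Xn, Yn, Zn from R = G = B = 1 through its own RGB to XYZ transform).
WHITE = (0.9505, 1.0, 1.089)
print(xyz_to_lab(0.9505, 1.0, 1.089, WHITE))   # the white point itself
print(xyz_to_lab(0.0, 0.0, 0.0, WHITE))        # black
```

The reference white maps to L* = 100 with a* = b* = 0, and black maps to L* close to 0, which is a quick sanity check for any chosen white point.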
After obtaining the color model from the database, the color moment
features are determined from statistical calculations: mean, standard
deviation, skewness and kurtosis. The mean gives the average level of the
distribution, based on this equation:

$$ \mu = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} P_{ij} \qquad (12) $$
Variance measures the spread of the distribution, and the square root of
the variance is called the standard deviation. The standard deviation is
given by:

$$ \sigma = \sqrt{ \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( P_{ij} - \mu \right)^{2} } \qquad (13) $$
Skewness measures the asymmetry of the distribution. A negative skewness
means the distribution leans to the left, while a positive skewness means
it leans to the right. A normal or symmetric distribution has zero
skewness. Skewness is calculated using this equation:

$$ \theta = \frac{ \sum_{i=1}^{M} \sum_{j=1}^{N} \left( P_{ij} - \mu \right)^{3} }{ MN\,\sigma^{3} } \qquad (14) $$
The last parameter is kurtosis, which shows whether the data distribution
is peaked or flat. Kurtosis is given by:

$$ \gamma = \frac{ \sum_{i=1}^{M} \sum_{j=1}^{N} \left( P_{ij} - \mu \right)^{4} }{ MN\,\sigma^{4} } - 3 \qquad (15) $$

A positive value indicates a peaked distribution, while a negative value
indicates a flat distribution. With the moment invariant technique for
determining the color features of each image in the database, the
features are obtained from the image height, width and color values [14].
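The four moments of eqs. (12)‐(15) can be sketched in Python as follows, giving 12 features per color space (4 moments for each of 3 channels). The function names, the flattened pixel list and the zero returned for a perfectly flat channel (where skewness and kurtosis are undefined) are assumptions for illustration.

```python
import math


def color_moments(values):
    """Return (mean, standard deviation, skewness, kurtosis) of one channel."""
    n = len(values)
    mean = sum(values) / n                                    # eq. (12)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # eq. (13)
    if sd == 0:
        # Flat channel: eqs. (14)-(15) divide by zero, return 0 (assumption).
        return mean, 0.0, 0.0, 0.0
    skew = sum((v - mean) ** 3 for v in values) / (n * sd ** 3)        # eq. (14)
    kurt = sum((v - mean) ** 4 for v in values) / (n * sd ** 4) - 3.0  # eq. (15)
    return mean, sd, skew, kurt


def feature_vector(pixels):
    """Build the 12 features from a list of 3-channel pixels."""
    features = []
    for channel in range(3):
        features.extend(color_moments([p[channel] for p in pixels]))
    return features


# Hypothetical pixel values of a small tooth patch (not data from the paper).
pixels = [(200, 190, 170), (210, 195, 175), (205, 185, 160), (220, 200, 180)]
features = feature_vector(pixels)
print(len(features))   # 12
print(features[:4])    # mean, sd, skewness, kurtosis of the first channel
```

The feature order per channel (mean, sd, skewness, kurtosis) follows the naming Rm, Rsd, Rs, Rk used later in the paper.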
4.3 KNN Classification Algorithm
K‐Nearest Neighbors (KNN) is a non‐parametric method used for
classification and regression. The input of KNN consists of the k closest
training examples in the feature space. KNN belongs to the group of
instance‐based learning methods [15].
Figure 4. 3D scatter graph of 16 color teeth types using RGB color space
According to the data distribution in the scatter graph of Fig. 4, there
are some clusters grouped by tooth type and some mixed groups. KNN is a
suitable algorithm for separating the clusters; it starts by finding the
group of k objects in the training data that are nearest to the new
(testing) data. There are many methods for measuring the distance between
the testing data (the new data) and the training data (the old data). One
of them is the Euclidean distance [9,10]:

$$ dist(a, b) = \sqrt{ (a_1 - b_1)^{2} + (a_2 - b_2)^{2} + \dots + (a_n - b_n)^{2} } \qquad (16) $$
$a_n$ and $b_n$ are the feature values of the two records. When the
feature values of two records are compared, a value of 0 means the data
are the same or almost the same, while a value of 1 means the data are not
the same. The similarity value of the features can be calculated using the
following equation:

$$ similarity(p, q) = \frac{ \sum_{i=1}^{n} f(p_i, q_i) \times w_i }{ \sum_{i=1}^{n} w_i } \qquad (17) $$

p is the new case, q is the case in storage, n is the number of features,
f is the feature similarity function and w is the feature weight. All
classification algorithms in this system are run in the RapidMiner
simulator on the color analysis features, which were calculated beforehand
using MATLAB. The system performance is reported automatically in the
RapidMiner output. The performance results are explained in the next
section.
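A minimal KNN classifier using the Euclidean distance of eq. (16) can be sketched as follows. The paper runs KNN in RapidMiner, so this Python version is only an illustration; the function names, the toy two-feature data and the tie-breaking rule (ties go to the label of the nearest neighbour) are assumptions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two feature vectors, eq. (16)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k=1):
    """Label `query` by majority vote among its k nearest training samples."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    # most_common keeps first-seen order on ties, so the nearest label wins.
    return votes.most_common(1)[0][0]


# Toy two-feature training set with shade labels (hypothetical values).
train_X = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8)]
train_y = ["A1", "A1", "A2", "A2"]

print(knn_predict(train_X, train_y, (1.1, 1.0), k=1))   # A1
print(knn_predict(train_X, train_y, (1.1, 1.0), k=3))   # A1
print(knn_predict(train_X, train_y, (4.9, 5.0), k=3))   # A2
```

In the paper's setting each training vector would hold the 12 color moment features (or the PCA-reduced features) of one image, and the label would be its shade guide class.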
4.4 PCA-KNN Tooth Color Detection
Principal Component Analysis (PCA) is a dimensionality reduction method
used in compression and recognition problems [7‐10][16,17]. Because white
is the main color of tooth images, it is difficult for machines to
classify each tooth type. PCA creates reduced‐dimension image features
that carry almost the same information. The reduced‐dimension features
are obtained by identifying the few most influential parameters, which
explain the greatest amount of variation in the dataset.
Using the covariance matrix, eigenvalues and eigenvectors, PCA finds the
“principal components” of the data: uncorrelated eigenvectors, each
representing some proportion of the variance in the data. The PCA basis
vectors are defined as the eigenvectors of the total scatter matrix $S_T$
of M images [9]:

$$ S_T = \sum_{i=1}^{M} (x_i - F)(x_i - F)^{T} \qquad (18) $$
F is the feature vector of each image, based on the color moment
calculations in eq. (12‐15), while $x_i$ (i = 1, 2, 3, ..., 12) is the
feature number with its column concatenated into a vector. From the $S_T$
basis vectors, a threshold is set on the cumulative variance of the data.
In this system, the cumulative variance threshold is ±95 % of all the
data. From the features chosen at ±95 %, the projection matrix $W_{PCA}$
is formed from the k eigenvectors corresponding to the k largest
eigenvalues. PCA yields the projection that maximizes the total scatter.
That scatter can arise from environmental conditions such as lighting
noise in the captured tooth images. The new feature vectors $Y_k$ are
defined by the following linear transformation [9]:

$$ Y_k = W_{PCA}^{T}\, x_k \qquad (19) $$
W is chosen to maximize the determinant of the total scatter matrix of the
projected samples [9]:

$$ W_{PCA} = \arg\max_{W} \left| W^{T} S_T W \right| = [\, w_1 \;\; w_2 \;\; \dots \;\; w_M \,] \qquad (20) $$

W is the set of M‐dimensional eigenvectors of $S_T$ corresponding to the M
largest eigenvalues.
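The choice of how many eigenvectors to keep under the cumulative variance threshold can be sketched as follows. The sketch assumes the eigenvalues and eigenvectors of the scatter matrix have already been computed elsewhere; the function names, the example eigenvalues and the mean-centring inside `project` are assumptions for illustration.

```python
def components_for_threshold(eigenvalues, threshold=0.95):
    """Smallest k whose k largest eigenvalues reach `threshold` of the total."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


def project(sample, mean, components):
    """Project a mean-centred sample onto the kept eigenvectors.

    `components` holds one eigenvector per row, so the result is W^T x.
    """
    centred = [s - m for s, m in zip(sample, mean)]
    return [sum(w * c for w, c in zip(vector, centred)) for vector in components]


# Hypothetical eigenvalues of a 5-feature scatter matrix (not from the paper).
eigenvalues = [6.0, 3.0, 0.7, 0.2, 0.1]
k = components_for_threshold(eigenvalues)
print(k)   # 3: the three largest eigenvalues carry about 97 % of the variance

# Hypothetical orthonormal eigenvectors for a 2-feature case.
print(project((3.0, 4.0), (1.0, 1.0), [(1.0, 0.0), (0.0, 1.0)]))   # [2.0, 3.0]
```

Applied to the 12 color moment features, this kind of rule is what reduces the feature count once 95 % of the variance is covered.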
After determining the main parameters with the PCA algorithm, the next
step is the training phase, based on the color moment values calculated
for each image in the tooth image database. The training set of M tooth
images can be represented for each color space type as:

$$ F_{RGB} = [\, R_m \; R_{SD} \; R_s \; R_k \; G_m \; G_{SD} \; G_s \; G_k \; B_m \; B_{SD} \; B_s \; B_k \,]^{T} $$

$$ F_{HSV} = [\, H_m \; H_{SD} \; H_s \; H_k \; S_m \; S_{SD} \; S_s \; S_k \; V_m \; V_{SD} \; V_s \; V_k \,]^{T} \qquad (21) $$

$$ F_{LAB} = [\, L_m \; L_{SD} \; L_s \; L_k \; A_m \; A_{SD} \; A_s \; A_k \; B_m \; B_{SD} \; B_s \; B_k \,]^{T} $$
Here m denotes the mean, SD the standard deviation, k the kurtosis and s
the skewness. For the training set of each color space, the average over
the tooth type images is calculated as:

$$ \bar{F} = \frac{1}{M} \sum_{n=1}^{M} F_{RGB,HSV,LAB} \qquad (22) $$
Each image is then expressed as its deviation from the average tooth color
by the vector:

$$ \Phi_i = \left[ F_{RGB,HSV,LAB} - \bar{F} \right] \qquad (23) $$
where i = 1, ..., M indexes the captured images of each tooth type, with
N x M dimension. A set of M orthogonal vectors describing the input data
is obtained from the covariance matrix, as shown in the following
equation [9]:

$$ C = \frac{1}{M} \sum_{i=1}^{M} \Phi_i \Phi_i^{T} \qquad (24) $$
The eigenvalues $\lambda_i$ and the eigenvectors $u_i$ are determined from
the covariance matrix C, which is real and symmetric, as:

$$ \lambda_i = \frac{1}{M} \sum_{n=1}^{M} \mathrm{var}\left( u_i \times F_{RGB,HSV,LAB}^{T} \right) \qquad (25) $$
In PCA, only the best k eigenvectors (those with the k largest
eigenvalues) are chosen. The smallest eigenvalues are eliminated, keeping
the minimum number of eigenvectors whose cumulative variance exceeds the
95 % threshold. From the new features formed by the PCA algorithm, the
images can be reconstructed in the new space with the best eigenvectors,
as described by this equation:

$$ \left[ F_{RGB,HSV,LAB} - \bar{F} \right]_{final} = \sum_{k=1}^{L} \left[ F_{RGB,HSV,LAB} - \bar{F} \right]_{initial} \times u_{ik} \qquad (26) $$
Two types of features are formed in this phase: the initial features,
which are the features without the PCA algorithm, and the final features,
which are the result of the PCA algorithm reducing k features to L
features. The output features of the PCA algorithm are used to establish
the predefined tooth type class that best describes a new tooth. The new
tooth type is found with the Euclidean distance calculation of the KNN
algorithm in eq. (16) for each value of k in KNN. The new tooth type is
identified by its class label, such as A1, A2, A3 and so on, for each
color space type, as defined by the color shade guide. The value of k in
the KNN algorithm is influenced by the distribution of the data. The
results of the PCA‐KNN classifier for tooth color detection are discussed
in the next section.
5. EXPERIMENTAL RESULT
In this section, the performance analysis of the system is described.
The performance is evaluated by learning accuracy and validation accuracy
for each color space and each modeling scenario. Two modeling systems are
analyzed in this section: KNN as the classifier alone, and the combination
of PCA and KNN for dental color matching. From the color analysis phase,
there are 640 data items in 16 classes, which are classified using KNN in
every color space (RGB, HSV and LAB). The accuracy of the learning process
is evaluated with 10 fold cross validation.
The 10 fold cross validation uses stratified sampling, which builds
random subsets whose class distribution follows that of the real data. The
purpose of cross validation is to avoid overlap between the training and
testing data. Cross validation has two steps. First, the data are divided
into k subsets of equal size; then each subset in turn is used as testing
data and the remainder as training data. In this system, 10 fold cross
validation therefore iterates 10 times. With 640 data items in total, the
first iteration uses 64 items for testing and 576 items for learning, and
the process is repeated up to 10 times. The cross validation result is the
average over the 10 iterations. The experimental results of this system
are listed in Table 1. Two parameters are analyzed: the accuracy
percentage and the error (RMSE).
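The stratified 10 fold split described above can be sketched in Python as follows. The paper uses RapidMiner for this step, so the code is only an illustration; the function name, the fixed random seed and the assumption of 40 images per class (640 items spread evenly over 16 classes) are not stated in the paper.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=10, seed=0):
    """Split sample indices into folds that keep the class proportions."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled samples of this class round-robin over the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# 16 classes with 40 samples each (balanced classes are an assumption).
labels = [cls for cls in range(16) for _ in range(40)]
folds = stratified_folds(labels)

test_idx = folds[0]
held_out = set(test_idx)
train_idx = [i for i in range(len(labels)) if i not in held_out]
print(len(test_idx), len(train_idx))   # 64 576
```

Each of the 10 folds serves once as the testing set, and the reported accuracy is the mean over the 10 runs.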
Table 1. Experimental result accuracy and RMSE level performance in KNN and
PCA+KNN combination using 10 fold cross validation

| Color Space | KNN Features | KNN Accuracy | KNN RMSE | PCA + KNN Features | PCA + KNN Accuracy | PCA + KNN RMSE |
|---|---|---|---|---|---|---|
| RGB | 12 | 97.19 % | 0.161 | 8 | 97.81 % | 0.130 |
| HSV | 12 | 64.06 % | 0.598 | 11 | 65.00 % | 0.590 |
| LAB | 12 | 91.56 % | 0.280 | 9 | 94.84 % | 0.219 |
According to the experimental results in Table 1, the learning accuracy
for the RGB, HSV and LAB color space models improves when PCA is combined
with KNN. Each color space has 12 features. For example, with the RGB
color space, the mean (m), standard deviation (sd), skewness (s) and
kurtosis (k) are calculated for each of the R, G and B components. The
features formed in the RGB color space are Rm, Rsd, Rs, Rk, Gm, Gsd, Gs,
Gk, Bm, Bsd, Bs and Bk. This process is repeated for the other color
spaces (HSV and LAB).
After that, the features of each color space are selected and reduced
using PCA. In this system, PCA feature selection is based on a ±95 %
threshold on the cumulative variance of the data. The RGB and LAB features
are reduced from 12 to 8 and 9 features respectively, while the HSV
features are only reduced from 12 to 11. The number of features influences
the KNN classification, as listed in Table 1. The results show that adding
feature selection with PCA improves the learning accuracy even though the
number of features has been reduced. The RMSE is inversely related to the
accuracy percentage: when the accuracy increases, the RMSE decreases, as
the results in Table 1 confirm.
Figure 5. The accuracy level performance of the learning process based on
the k value modification using 10 fold cross validation (x axis: the
number of k in the KNN algorithm, 1 to 7; y axis: learning accuracy
validation percentage, 60 to 100 %; series: RGB, Reduced RGB, HSV, Reduced
HSV, LAB, Reduced LAB)
The value of k in the KNN classifier also influences the learning
accuracy with 10 fold cross validation, as shown in Fig. 5. The results
show that the accuracy fluctuates with the value of k. Without feature
selection, the RGB color space accuracy drops from 97.19 % to 94.38 % as k
increases. With feature selection, the accuracy at k = 1, up to 97.81 %,
is better than at k = 3, 5, 7. The HSV color space reaches its highest
accuracy, 65.00 %, at k = 1. The LAB color space model achieves its
highest accuracy, 95.94 %, at k = 3 when PCA is added to the KNN
classifier. This shows that the value of k in the KNN classifier reaches a
saturation condition that depends on the data distribution.
The value of k in the KNN classifier also has a fluctuating influence
on the learning accuracy with leave one out cross validation, as
illustrated in Fig. 6. Without PCA, the RGB, HSV and LAB color spaces all
perform best at k = 1. When PCA feature selection is added, the LAB color
space achieves its highest accuracy at k = 3, while the RGB and HSV color
spaces perform best at k = 1.
When the results are compared by color model, RGB achieves the best
performance with the PCA+KNN algorithm. This is because the images in this
system are always captured under constant 288 lux lighting. The RGB color
model performs better when the lighting is stable.
6. CONCLUSION
In this paper, we propose feature selection of tooth images for a dental
color matching system using PCA and the KNN classifier algorithm. The
choice of modeling system and color model influences the performance of
the system. Two comparisons are made: one by color model and one by
modeling system. Under stable 288 lux lighting, the best color matching
performance is achieved by the RGB color space with the combination of PCA
and KNN, which reaches 97.81 % learning accuracy and an RMSE of 0.130 in
10 fold cross validation.
REFERENCES
[1] W.K. Tam, H.J. Lee, Dental Shade Matching Using Digital Camera,
The Journal of Dentistry Elsevier, pp. e3‐e10, June 2012.
[2] Stephen J. Chu, Richard D. T., and Rade D. Paravina, Dental Color
Matching Instruments and Systems (Review of Clinical and
Research Aspects), Journal of Dentistry Elsevier, pp. e2‐e16, July
2010.
[3] S. Mangijao Singh, K. Hemachandran, Image Retrieval Based on The
Combination of Color Histogram and Color Moment, International
Journal of Computer Applications, vol. 58, No.3, November 2012.
[4] Shuli Wang, Weiting Wang, and Fan Wu, A Computer-Aided Analysis
on Dental Prosthesis Shade Matching, 4th International Conference
on Biomedical Engineering and Informatics (BMEI), pp. 1950‐1954,
2011.
[5] Dan S., Marius D. P., Marius D. S., Vladimir B., Tudor C., and Simona B.,
A Software Application to Detect Dental Color, Applied Medical
Informatics Journal, vol. 37, no. 3, pp: 31‐38, 2015.
[6] Hicham R., Abdelmajid E., and Farid B., Classification and
Recognition of Dental Images Using Decisional Tree, 13th
International Conference Computer Graphics, Imaging and
Visualization(CgiV), pp. 390‐393, April 2016.
[7] Fengxi Song, Zhongwei Guo, Dayong Mei, Feature Selection Using
Principal Component Analysis, International Conference on System
Science, Engineering Design and Manufacturing
Informatization(ICSEM), pp. 27‐30, November 2010.
[8] Zena M. Hira and Duncan F. Gillies, A Review of Feature Selection
and Feature Extraction Methods Applied on Microarray Data,
Hindawi Publishing Corporation Advances in Bioinformatics, vol.
2015, 13 pages, 18 May 2015.
[9] Fatma Zohra C., Noureddine C., and Amar D., Face Recognition
System Using Skin Detection in RGB and YCbCr Color Space, 2nd
World Symposium of Web Applications and Networking (WSWAN),
21‐23 March 2015.
[10] B. Dhivakar, C. Sridevi, S. Selvakumar, and P. Guhan, Face Detection
and Recognition Using Skin Color, 3rd International Conference on
Signal Processing, Communication and Networking (ICSCN), 2015.
[11] Iwan Syarif, Feature Selection on Network Intrusion Data Using
Genetic Algorithm and Particle Swarm Optimization, Emitter
International Journal of Engineering Technology, Vol. 4, No. 2,
December 2016.
[12] Hasan Suat G., Bulent P., Dogan C., Sila M. G., and Volkan A., Shade
Matching Performance of Normal and Color Vision-Deficient
Dental Professionals with Standard Daylight and Tungsten
Illuminants, The Journal of Prosthetic Dentistry, vol. 103, Issue 3, pp.
139‐147, March 2010.
[13] S. Mangijao Singh, K. Hemachandran, Content-Based Image Retrieval
Using Color Moment and Gabor Texture Feature, IJCSI Issues, Vol.9,
No.1, September 2012.
[14] Riyanto Sigit, Achmad Basuki, Nana Ramadijanti, Dadet Pramadihanto,
Step By Step Pengolahan Citra Digital, 2014
[15] Jihong Liu, Na Zhao, Runnan H., Study of Color Matching System for
Porcelain Teeth, ICMIPE, 19‐20 October 2013.
[16] Hongtan Sun, K-Nearest Neighbour and SVM Classifier with
Feature extraction and Feature Selection, final project for CSCI
6967 : Foundation of Data Sciences, May 11, 2015.
[17] Alex Norko, Simple Image Classification Using Principal
Component Analysis, GMU Volgenau School of Engineering USA,
December 9, 2015.