All content in this area was uploaded by Fereshteh Nayyeri on Aug 05, 2016
Table of Contents
1. INTRODUCTION 1
1.1 Introduction 1
1.2 Background Of Study 1
1.3 Motivation 4
1.4 Problem Statement 4
1.5 Research Objectives 5
1.6 Research Methodology 5
1.7 Thesis Outline 6
1.8 Summary 8
2. LITERATURE REVIEW 9
2.1 Introduction 9
2.2 Distance Measurements 10
2.2.1 Euclidean Distance 12
2.2.2 Manhattan Distance (City Block Distance) 13
2.2.3 Hausdorff Distance 14
2.3 Similarity Measurements 16
2.4 Bin-By-Bin Similarity Measures 16
2.4.1 Minkowski-form Distance 16
2.4.2 Histogram Intersection Distance 18
2.4.3 Jeffrey Divergence 19
2.4.4 Cosine Distance 20
2.5 Cross-Bin Similarity Measures 22
2.5.1 Quadratic Distance 22
2.5.2 Mahalanobis Distance 23
2.5.3 Match Distance 24
2.5.4 Kolmogorov-Smirnov Divergence 25
2.5.5 Earth Mover's Distance (EMD) 26
2.6 Evaluation Of Similarity Measurements 28
2.7 The EMD as a Metric For Image Retrieval 35
2.7.1 Transportation Problem 35
2.7.2 Appearance of EMD 35
2.7.3 Uses of EMD 36
2.8 Embedding Of EMD Into Normed Spaces 42
2.9 Summary 43
3. RESEARCH METHODOLOGY 44
3.1 Introduction 44
3.2 Research Methodology Scheme 44
3.2.1 Literature Review 46
3.2.2 Proposed Methods 47
3.2.3 Hausdorff Distance 48
3.2.4 Implementation 52
3.3 Dataset 52
3.4 Summary 54
4. DIMENSION REDUCTION 55
4.1 Introduction 55
4.2 Earth Mover's Distance (EMD) Similarity Measure 55
4.2.1 Weighted Matching 56
4.2.2 The Hungarian Method For Assignment Problem 57
4.2.3 The Matrix Form of The Hungarian Method 58
4.2.4 Using The Hungarian Method In EMD 66
4.3 Embedded Earth Mover's Distance (EEMD) 68
4.3.1 The Embedding 69
4.4 Dimension Reduction 73
4.4.1 The Sampling Method 73
4.4.2 The Sketching Method 76
4.4.3 The DREAT Method 77
5. EXPERIMENTAL RESULTS AND DISCUSSION 79
5.1 Introduction 79
5.2 Earth Mover's Distance (EMD) 79
5.3 Embedded EMD (EEMD) 81
5.4 Sampling Method 81
5.5 Sketching Method 82
5.6 DREAT 82
5.7 Discussion 86
6. CONCLUSION AND FUTURE WORKS 88
6.1 Introduction 88
6.2 Objectives And Achievements 88
6.3 Research Contribution 89
6.4 Research Limitation 90
6.5 Future Work 90
6.6 Summary 91
7. PERFORMANCE 92
PREFACE
Images similar to a given query image can be retrieved using different distance
measures. One of the general distance measures is the Earth Mover's Distance
(EMD). Although EMD has proven its ability to retrieve similar images with an average
precision of around 95%, its high execution time is a major drawback. Embedding
EMD into L1 is a solution that addresses this problem by sacrificing performance;
however, it generates a heavy-tailed image feature vector. We aimed to reduce the
execution time of embedded EMD and increase its performance using three
dimension reduction methods: sampling, sketching, and DREAT (Dimension
Reduction in Embedding by Adjustment in Tail). Sampling is a method that
randomly picks a small fraction of the image features. Sketching, on the other hand, is
a distance estimation method that is based on specific summary statistics. The last
method, DREAT, randomly selects an equally distributed fraction of the image
features. We tested the methods on handwritten Persian digit images.
Our first proposed method, sampling, reduces execution time by sacrificing
recognition performance. The sketching method outperforms sampling in
recognition, but it records a higher execution time. DREAT outperforms sampling
and sketching in both execution time and performance.
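As a rough illustration of the sampling idea described above, the following Python sketch estimates the L1 distance between two feature vectors from a random subset of their coordinates. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the uniform choice of coordinates, and the d/k rescaling are illustrative choices that do not come from the text.

```python
import random


def l1_distance(u, v):
    """Exact L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def sampled_l1_distance(u, v, fraction, rng):
    """Estimate the L1 distance from a uniformly random subset of coordinates.

    Assumption (not from the text): k = fraction * d coordinates are drawn
    without replacement, and the partial sum is rescaled by d / k so that
    the estimate is unbiased for the full L1 distance.
    """
    d = len(u)
    k = max(1, int(d * fraction))
    indices = rng.sample(range(d), k)
    partial = sum(abs(u[i] - v[i]) for i in indices)
    return partial * d / k


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical feature vectors, used only to exercise the functions.
    u = [rng.random() for _ in range(1000)]
    v = [rng.random() for _ in range(1000)]
    print("exact L1:    ", l1_distance(u, v))
    print("sampled (10%):", sampled_l1_distance(u, v, 0.1, rng))
```

The estimate touches only a fraction of the coordinates, which is where the time saving comes from. When a few coordinates carry most of the distance, as in a heavy-tailed vector, a uniform sample can easily miss them, which is consistent with the loss in recognition performance reported for sampling.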
ACKNOWLEDGEMENT
First, we express our endless thanks to almighty ALLAH for his great blessings on us
to complete this project.
We would also like to thank the publisher, "LAP LAMBERT Academic Publishing", for
providing the opportunity for us to publish.
We are also grateful to all the staff and academic members of the
Faculty of Information Science & Technology of UKM. We also want to give thanks
to all our friends who have helped us to prepare this book.
The development of this book is a collaborative effort involving many individuals
who contributed in a variety of ways. We want to thank all of them for their kind help.
Finally, we beg pardon for our unintentional errors and omissions, if any.
Authors
Fereshteh Nayyeri
Dr. Mohammad Faidzul Nasrudin
CHAPTER I
INTRODUCTION
1.1 INTRODUCTION
An image retrieval system is a computer system for browsing, searching, and
retrieving images from a large database of digital images. Content-Based Image
Retrieval (CBIR) is the application of computer vision to image retrieval. CBIR
retrieves images based on the similarity of their contents (textures, colours, shapes, etc.)
to a user-supplied query image or user-specified image features. CBIR systems are
widely applied in many scientific fields. Owing to the many advantages of image retrieval
systems, the use of this technology in daily life has increased sharply. This
chapter first gives a brief background of CBIR technology, followed by the
motivation of the study, the problem statement, the research objectives, the research
methodology, and lastly, the thesis outline.
1.2 BACKGROUND OF STUDY
In general, CBIR technology, also known as Query by Image Content (QBIC)
and Content-Based Visual Information Retrieval (CBVIR), is the application of
computer vision techniques to the image retrieval problem, that is, the problem of
searching for digital images in large databases.
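To make the retrieval problem concrete, the following Python sketch ranks a small database of images by the distance between their feature vectors and that of a query image. It is only an illustration under stated assumptions: the function names, the choice of the L1 distance, and the toy three-bin histograms are hypothetical and are not taken from the text.

```python
def l1_distance(u, v):
    """L1 (Manhattan) distance between two equal-length feature vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def rank_by_l1(query, database):
    """Return image names ordered from most to least similar to the query.

    `database` maps an image name to its feature vector (for example a
    normalised colour histogram); a smaller distance means more similar.
    """
    return sorted(database, key=lambda name: l1_distance(query, database[name]))


if __name__ == "__main__":
    # Hypothetical three-bin histograms standing in for real image features.
    database = {
        "sunset.png": [0.7, 0.2, 0.1],
        "forest.png": [0.1, 0.8, 0.1],
        "ocean.png": [0.1, 0.2, 0.7],
    }
    query = [0.6, 0.3, 0.1]
    print(rank_by_l1(query, database))
```

Any of the distance or similarity measures reviewed later could in principle be passed in place of the L1 distance; the ranking step itself stays the same.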
Users in many professional fields are exploiting the opportunities offered by the
ability to access and manipulate remotely stored images in all kinds of new and
exciting ways. However, they are also discovering that the process of locating a
desired image in a large and varied collection can be a source of considerable