About
188
Publications
264,226
Reads
3,286
Citations
Introduction
Khairuddin Omar previously worked at the Center for Artificial Intelligence Technology, National University of Malaysia, and is now retired. Khairuddin does research in Artificial Intelligence and Artificial Neural Networks. Their current project is 'Segmentation Arabic words using Voronoi diagrams'.
Current institution
retired
Additional affiliations
December 2009 - September 2021
January 2007 - September 2021
Publications (188)
Automatic Emotion Speech Recognition (ESR) is considered an active research field in the Human-Computer Interface (HCI). Typically, an ESR system consists of two main parts: Front-End (feature extraction) and Back-End (classification). However, most previous ESR systems have focused on the feature extraction part only and ignored th...
This paper discusses the problem of ambiguity in Jawi-Rumi machine transliteration for Jawi homograph words. Machine transliteration (MT) is the process of converting a script from source text to target text automatically. In the context of Malay MT for Jawi-Rumi, there are difficulties in obtaining high-accuracy transliteration of homographic...
The increasing generation of MRI dataset and the recent cloud deployment of deep learning (DL) algorithms have enabled timely remote classifications of discrepancies in neural-biomarkers of critical health conditions such as dyslexia. Using these untrusted platforms to implement a secure DL algorithm will identify and resolve potential security att...
Systems of nonlinear equations are known as the basis for many models of engineering and data science, and their accurate solutions are very critical in achieving progress in these fields. However, solving a system with multiple nonlinear equations, usually, is not an easy task. Consequently, finding a robust and accurate solution can be a very cha...
A word stemmer removes the affixes of a word to produce its root word. Stemming is widely used in natural language processing (NLP) tasks such as machine transliteration, machine translation, and document retrieval. With stemming, the dictionary size can be reduced because words in morph...
Achieving biologically interpretable neural-biomarkers and features from neuroimaging datasets is a challenging task in an MRI-based dyslexia study. This challenge becomes more pronounced when the needed MRI datasets are collected from multiple heterogeneous sources with inconsistent scanner settings. This study presents a method of improving the b...
Dyslexia is a neurological disorder that is characterized by imprecise comprehension of words and generally poor reading performance. It affects a significant population of school-age children, with more occurrences in males, thus, putting them at risk of poor academic performance and low self-esteem for a lifetime. The long-term hope is to have a...
The study of malware behaviors, over the last years, has received tremendous attention from researchers for the purpose of reducing malware risks. Most of the investigating experiments are performed using either static analysis or behavior analysis. However, recent studies have shown that both analyses are vulnerable to modern malware files that us...
Offline Jawi handwritten recognition is very important for efficient archiving and retrieval of original documents and for increasing the availability of their content. It is a challenging task and is still considered an open problem because state-of-the-art recognizer performance is considered sub-par. The traditional trace transform feature extracto...
Handwritten word recognition is one of the hot topics in automatic handwritten text recognition that received a lot of attention in recent years. Unlike character recognition, word recognition deals with considerable variations in word shape and written style. This paper proposes a novel deep model for language-independent handwritten word recognit...
Stemming techniques are very useful in information and document retrieval. They can also reduce dictionary size. Malay has two writing scripts, using either the Rumi spelling system or the Jawi spelling system. Most Malay word stemmers cover only Rumi writing and some modern Jawi. This pap...
This paper presents a critical assessment analysis on mental health detection in Online Social Networks (OSNs) based on the data sources, machine learning techniques, and feature extraction method. The appropriateness of the mental health detection was also investigated by identifying its data analysis method, comparison, challenges, and limitation...
The rapid increase in data volume and features dimensionality have a negative influence on machine learning and many other fields, such as decreasing classification accuracy and increasing computational cost. Feature selection technique has a critical role as a preprocessing step in reducing these issues. It works by eliminating the features that m...
Image segmentation of brain magnetic resonance imaging (MRI) plays a crucial role among radiologists in diagnosing brain disease. Parts of the brain such as white matter, gray matter and cerebrospinal fluid (CSF) have to be clearly determined by the radiologist during the detection of brain abnormalities. Manual segmentation is g...
Document image binarization is the first essential step in digitalizing images and is considered an essential technique in both document image analysis applications and optical character recognition operations. The binarization process is used to obtain a binary image from the original image; a binary image is the proper presentation for image segmen...
The need to detect malware before it harms computers, mobile phones and other electronic devices has caught the attention of researchers and the anti-malware industry for many years. To protect users from malware attacks, anti-virus software products are downloaded on the computer. The anti-virus mainly uses signature-based techniques to detect mal...
Writer’s identification from a handwritten text is one of the most challenging machine learning problems because of the variable handwritten sources, various languages, the similarity between writer’s pattern, context variation, and implicit characteristics of handwriting styles. In this paper, a combination of the deep and hand-crafted descriptor...
Offline Character segmentation of text images is an important step in many document image analysis and recognition (DIAR) applications. However, the character segmentation of both writing styles (printed and handwritten) remains an open problem. Moreover, the manual segmentation is time-consuming and impractical for large numbers of documents. Base...
In this era of digitization, most hardcopy documents are being transformed into digital formats. In the process of transformation, large quantities of documents are stored and preserved through electronic scanning. These documents are available from various sources such as ancient documentation, old legal records, medical reports, music scores, pal...
This paper briefly presents the processing of ear identification and improves ear recognition by combining the Iterative Closest Point (ICP) algorithm with the Stochastic Clustering Method (SCM). The objective of this paper is to enhance the template matching scheme using MLPNN based on the modified ICP algorithm combined with the SCM method....
Prostate cancer is one of the most common cancers: the fourth in Malaysia and the sixth in the world, with 307,000 deaths in 2012. Early detection is important to reduce the death rate; thus, this research is carried out to develop an automated prostate cancer classification from the bone scan of pelvis CT images. Preliminary experiment has been...
The Quadratic Assignment Problem (QAP) is an optimization problem that arises in many applications. This paper presents the definition of the QAP, its formulation, and the exact and heuristic methods used to solve it.
Handwritten Jawi manuscripts kept at the Malaysia National Library (MNL) have aged over decades. Despite the intensive sustainable process conducted by MNL, these manuscripts are still not maintained in good quality and can neither be read easily nor viewed well. Even though many state-of-the-art methods have been developed for image enha...
Breast Cancer & Surgery
Nowadays, the threat of malware is increasing rapidly. Malware is software that sneaks into your computer system without your knowledge, with harmful intent to disrupt your computer operations. Due to the vast amount of malware, it is impossible for human engineers to handle it manually. Therefore, security researchers are making great efforts to develop accurate...
Harmony Search (HS) imitates the behaviour of a musician searching for a balanced harmony. HS struggles to find the best parameter tuning, especially for the Pitch Adjustment Rate (PAR). PAR plays a crucial role in selecting a historical solution and adjusting it using the Bandwidth (BW) value. However, PAR in HS must be initialized with a constant valu...
Handwritten Jawi manuscripts kept at the Malaysia National Library (MNL) have aged over decades. Despite the intensive sustainable process conducted by MNL, these manuscripts are still not maintained in good quality and can neither be read easily nor viewed well. Even though many state-of-the-art methods have been developed for image enh...
The problems and challenges in Jawi handwritten recognition are inherited from Arabic script: its cursive nature, the large variety of writing styles arising from its morphological richness, ligatures, overlapping characters, dialects, and the low quality of the manuscript images. Word segmentation is difficult because of the existence of sub wor...
Mental health detection in Online Social Networks (OSNs) has been widely studied in recent years. OSNs have encouraged new ways to communicate and share information, and they are used regularly by millions of people. They generate a massive amount of information that can be utilised to develop mental health detection. The rich content provided by OSN should not...
The Malay language is written in the Jawi and Roman scripts. In the past, Jawi writing was widely used by the Malay community and foreigners, as can be seen in old documents. Old documents face the risk of background damage. To preserve this valuable information, there is a significant need to automate Jawi materials. Based on previous...
Document Image Analysis and Recognition (DIAR) technique is used to recognize text component and translate it into editable format. Scripts are a set of graphical representations used to express a particular writing system as well as subsets belonging to a particular writing system. The writing styles of more than one script family may then be adop...
The Firefly Algorithm (FA) is one of the new nature-inspired optimization algorithms. It is inspired by the flashing behavior of fireflies. The Firefly Algorithm has some drawbacks, such as getting trapped in local optima and having parameters that stay fixed throughout the iterations. Besides that, it does not memorize or remember the hi...
Document binarization is an important technique in document image analysis and recognition. Generally, binarization methods are ineffective for degraded images. Several binarization methods have been proposed; however, none of them are effective for historical and degraded document images. In this paper, a new binarization method is proposed for de...
Many segmentation methods detect segmentation points, or the locations where segmentation points are expected, before the segmentation process, and verify the validity of those points using ANNs. This paper applies a novel method to correctly detect the locations of segmentation points by detecting peaks with neural networks fo...
In this paper, a Shearing Invariant Texture Descriptor (SITD) is proposed, which is a theoretically and computationally simple method based on the Rotation invariant Local Binary Pattern (Rot-LBP) descriptor. In real-world applications using flatbed scanners, such as paper texture fingerprinting, it’s common for a sheet of paper to rotate during th...
In this paper, an image binarization method for separating text from the background of degraded textual images is proposed. The proposed method is based on a combination of the Window Tracking Method (WTM) and Dynamic Image Histogram (DIH). The WTM and DIH methods work on an image that has been pre-processed. The WTM method searches for the largest pi...
A vehicle license plate recognition (LPR) system is useful in many applications, such as entrance admission, security, parking control, airport and cargo, traffic and speed control. This paper describes an adaptive threshold for image segmentation applied to a system for Malaysian intelligent license plate recognition (MyiLPR). Due to the different...
This paper describes the design and creation of a monolingual parallel corpus for the Malay language written in Jawi. This paper proposes a new corpus called the National University of Malaysia Word Tokenization (NUWT) corpora. To the best of our knowledge, currently, there is no sufficiently comprehensive, well-designed standard corpus that is anno...
Objectives: The objectives are to use a mathematical model to define a region-based segmentation method. This study determines whether the Connected Component (CC) is one or more than one character. Method: Other methods tend to ignore the solid foundation of describing characters and connection points. This proposed method adopts...
Purpose: The purpose of this paper is to identify whether any financial integration exists among ASEAN+5 members and some East Asian countries, including China, Japan, Korea, Hong Kong, and Taiwan, through interest rate, exchange rate, level of prices, and real output. Design/methodology/approach: Therefore, the authors intend to identify any lo...
The Voronoi neighbourhood comes from the Graphs (G), accompanied with the details concerning the correlations and neighbouring Gs lower, upper and centre points, with regards to the word group is performed. The problem entails determining the neighbours for segmenting G to Voronoi Diagrams' (VD) usage for acquisition of Voronoi Edge (VE) which with...
The prevalence of handheld devices such as mobile phones for image capturing nowadays is uncontested, mainly owing to their high image quality, low cost and portability. Hence, it is only natural that users would prefer the convenience of document image capturing to photocopying or scanning. Images taken from these devices however,...
Proliferation rate estimation (PRE) is clinically performed from Ki-67 histopathology images. As brain tumor tissues are very complex, accurate PRE determination requires manual cell counting that is tedious, time consuming and inherently inaccurate due to inter-personal variations. Therefore, pathologists usually determine the PRE based on their e...
The Holy Quran is the central religious verbal text of Islam. The Muslims are expected to read, understand, and apply the teachings of the Holy Quran. Holy Quran was translated to Braille code as a normal Arabic text without having its reciting rules included. It is obvious that the users of this transliteration will not be able to recite the Quran...
Tokenization is a fundamental task focused on text processing. Among other tasks, the segmentation process is used to identify information units, such as sentences and words. In this paper, we discuss the Natural Language ToolKit (NLTK) tokenizer as a step to manipulate patterns within text. The purpose of this work is to build up Natural Language...
Analysis of whole-slide tissue for digital pathology images has been clinically approved to provide a second opinion to pathologists. Localization of focus points from Ki-67-stained histopathology whole-slide tissue microscopic images is considered the first step in the process of proliferation rate estimation. Pathologists use eye pooling or eagle...
Quadratic Assignment Problem (QAP), like Vehicle Routing Problem, is one of those optimization problems that interests many researchers in the last decades. The Quay Management Problem is a specific problem which could be presented as a QAP which involves a double assignment of customers and products toward loading positions using lifting trucks. T...
Object localization from whole slide images is one of the main issues that can be handled by image processing techniques, and it is important for both the medical and computer science fields. In this study, a random patch probabilistic density method is proposed for localizing the tissue in whole slide histology images. The proposed method is a...
The Malay language has two types of writing script, known as Rumi and Jawi. Most previous stemmer results have reported on Malay Rumi characters and only a few have tested Jawi characters. In this article, a new Jawi stemmer has been proposed and tested for document retrieval. A total of 36 queries and datasets from the transliterated Jawi Quran we...
Thinning, or skeletonization, has become a crucial step in many document image analysis and recognition applications as a pre-processing stage. Applications such as optical character recognition (OCR), optical script recognition (OSR), and optical font recognition (OFR) widely adopt many existing methods; some thinning challenges affect the performance of th...
Segmentation and counting of blood cells are considered as an important step that helps to extract features to diagnose some specific diseases like malaria or leukemia. The manual counting of white blood cells (WBCs) and red blood cells (RBCs) in microscopic images is an extremely tedious, time consuming, and inaccurate process. Automatic analysis...
Thinning “Skeletonization” is a very crucial stage in the Arabic Character Recognition (ACR) system. It simplifies the text shape and reduces the amount of data that needs to be handled and it is usually used as a pre-processing stage for recognition and storage systems. The skeleton of Arabic text can be used for: baseline detection, character seg...
In this paper, a Shearing Invariant Texture Descriptor (SITD) is presented and the across-bin matching techniques Quadratic Distance (QD) and the Earth Movers Distance (EMD) are used. The shearing and 180° rotation are the main deformations generated during the image acquisition process from the physical paper using scanners. It is very common that...
Conflicts often arise between farmers and experts regarding the solution for handling issues on paddy abnormality. Although multiple and serial courses are periodically conducted by the experts, yet the farmers have different ideas when dealing with paddy diseases. The disagreement regarding paddy diseases has made it difficult for Malaysia to meet...
Due to the advancements on multimedia and communication technologies, multimedia assets are vulnerable to illegitimate manipulations and replications. To address this type of problem, ownership identification as an effective application of digital watermarking is a key solution. However, among illegitimate operations, geometric attacks are much mor...
Nowadays, due to the advanced technologies in telecommunication and broadcasting networks, there are ever-increasing business demands for immediate watermarking schemes in many real-time applications to protect their digital properties against misuse or piracy. Studies revealed that the spatial-domain watermarking schemes such as the EISB (Enhanced...
An effective approach is a necessity for many digital watermarking applications including medical image ownership identification. To address this necessity, many watermarking techniques have been introduced by the researchers. Among the techniques introduced, spatial domain watermarking techniques are simpler and lower in computational cost due to...
Ear recognition is a new technology and future trend for personal identification. However, the false detection rate and matching recognition are very challenging due to the complex geometry of the ear. The scope of the study is to introduce a combination of the Iterative Closest Point (ICP) and Stochastic Clustering Matching (SCM) algorithms for 3D ears match...
The process of baseline detection has an important role in optical recognition systems and document image analysis systems. It is widely used in various preprocessing stages as a text normalization step, including skew, slant and slope corrections, writing line straightening and character segmentation, as well as in the feature extraction process. In th...
Skeletonization, also known as thinning, is an important step in the pre-processing phase. Skeletonization is a crucial process for many applications such as OCR, writer identification, etc. However, recent research shows that there is still room for improvement in this area. A new skeletonization algorithm is proposed in this paper. The algorithm i...
Skeletonization, also known as thinning, is an important step in the pre-processing phase of many pattern recognition techniques. The output of the skeletonization process is the skeleton of the pattern in the images. Skeletonization is a crucial process for many applications such as OCR and writer identification. However, the improvements in this a...
Assigning lexical categories to words is an important step in the automated analysis of a text. Modern Natural Language Processing (NLP) algorithms are based on machine learning; they learn rules automatically through the analysis of large corpora of typical real-world examples. The Buckwalter transliteration has become a standard to be followed in natu...
Machine transliteration is the process of automatically converting a script from source text to target text. It is widely used in Cross-Language Information Retrieval (CLIR), machine translation, and information extraction. The main issue in machine transliteration research is how to obtain transliteration results with high accur...
The Holy Quran is the central religious verbal text of Islam. Muslims are expected to read, understand, and apply the teachings of the Holy Quran. The Holy Quran was translated to Braille code as normal Arabic text without including its reciting rules. This obviously means that the users of this translation are not able to recite the Quran the right...
Text normalization is an important technique in document image analysis and recognition. It consists of many preprocessing stages, which include slope correction, text padding, skew correction, and straightening the writing line. In this regard, text normalization has an important role in many procedures such as text segmentation, feature extraction and c...
Off-line writer identification requires transferring the text under consideration into an image file. This represents the only available solution to bring the printed materials to the electronic media. However, the transferring process causes the system to lose the temporal information of that text, which can be gathered in on-line writer identi...
Baseline detection is an important process in document image analysis and recognition systems. It is extensively used in various preprocessing stages such as text normalization, skew correction, character segmentation, and slant and slope correction, as well as in feature extraction. In this work, we propose a new method for baseline detection bas...
This paper presents a method to classify new objects with SURF descriptors and the shape skeleton of objects in a dataset. The objective of the research is to classify all objects which exist in all images. The method consists of three main stages: image segmentation, object recognition and object class recognition. The region of interes...
Circle detection is one of the fundamental problems in image processing. There are many algorithms used for circle detection; the best known are the Circular Hough Transform (CHT), Randomized Hough Transform (RHT) and Randomized Circle Detection (RCD). These algorithms perform differently as they depend heavily on edge imag...
The ear recognition techniques in image processing become a key issue in ear identification and analysis for many geometric applications. Some current specialized feature extraction methods attempted to examine the effects of pose variation and lighting changes that potentially alter the visual characteristics of the structure of the ear. In additi...
Stemmers are important for Information Retrieval and essential for Natural Language Processing. Many researchers have focused on the effect of stemming on document retrieval and measured their stemmers with recall and precision. Applications in text mining (i.e., document indexing and clustering) require great accuracy in index terms. The aim of th...
The process of assessing the outcomes obtained by various groups of researchers is heavily facilitated by conventional databases. This paper introduces a database (AHDB/FTR) comprising Arabic handwritten text images, which helps research associated with the recognition of Arabic handwritten text with open vocabulary, word segmentation and write...
Jawi is a script with Arabic influence. In the past, it was widely used by the Malay community as well as by foreigners who had diplomatic, business, missionary and similar relations. At that time, the Malay language was the lingua franca of this region. So there are many Malay heritages such as manuscripts, religious books, le...
The triangle is a basic geometric shape. There are six types of triangle, but the scalene triangle was chosen for this research, which is based on the coordinates of corners generated by our proposed algorithm. In this paper, nine features are proposed. Six of the features were derived from coordinates and sides of triangle whereas three others are angle of...
A novel method is proposed to recognize the Arab/Jawi and Roman digits. This new method is based on features from the triangle geometry, normalized into nine features. The features are used for zoning which results in five and 25 zones. The algorithm is validated by using three standard datasets which are publicly available and used by researchers...
This study aims to compare the old and new Jawi spelling systems found in the kitab Hidayah al Salikin, as printed in Pulau Pinang by Maktabah wa Matba`ah Dar al-Ma`arif (old Jawi spelling system) and in the new Jawi spelling edition published by Jabal Maraqy Enterprise, Kota Bharu, Kelantan, by Abdul Ghani Jabal Maraqy. Kitab Hidayah al Salikin is...
Research in Malay Part-of-Speech (POS) tagging has increased considerably in the past few years. In the literature, POS tagging is known as the first stage in automated text analysis, and the development of language technologies can scarcely begin without this initial phase. The Malay language can be written in Roman or Jawi. Three different spelling between Roman a...
Document image analysis and recognition (DIAR) techniques are a primary application of pattern recognition. OFR is one of the most important DIAR techniques. The information about font type indicates important information to support human knowledge and other document analysis and recognition techniques. In this paper, a new optical font recognition...
Experiments were carried out using images from the Batu Bersurat Terengganu (Terengganu Inscription Stone) that had been segmented into single letters, because the proposed Triangle Geometry Model uses a local approach. The study was conducted on the triangle geometry model, which was modelled and matched with the geometry obtained from the letters on...
This book constitutes the refereed proceedings of the 16th FIRA Robo World Congress, FIRA 2013, held in Kuala Lumpur, Malaysia, in August 2013. The congress consisted of the following three conferences: 5th International Conference on Advanced Humanoid Robotics Research (ICAHRR), 5th International Conference on Education and Entertainment Robotics...
The ear recognition techniques in image processing become a key issue in ear identification and analysis for many geometric applications. This paper first reviews the source of ear image identification, compares the different applied models being currently used for the ear image modeling, details the algorithms, methods and processing steps and fin...
Image acquisition has great influence on the performance of any computer vision application. Different methods can be utilized to acquire the digital image of a paper, whilst scanning scheme is among the most attractive methods. This attractiveness is because of the fewer types of potential deformations and the low cost of the scanning devices, e.g...
Knowledge Management (KM) implementation is mainly linked to soft issues such as organizational culture and people. Recently, most organizations are struggling to effectively use KM tools and techniques. Previous study shows that there is a relationship between KM and Customer Relationship Management (CRM). KM, in particular, has been defined as th...
The paper aims to do a comparative analysis between old and modern Jawi spelling used in the Kitab Hidayah al-Salikin by Shaykh 'Abd al-Samad al-Falimbani written in 1192AH. This Kitab is the first Malay Kitab printed in Mecca. The major elements of this Kitab are Sufism, ‘Aqidah and Fiqh. This study examines the comparison of the printed Kitab in...
Questions
Question (1)
I have read a lot of articles, but most of them focus only on accuracy. I would appreciate it if anyone could share some articles that focus on performance measurement, if any.