About
Publications: 54
Reads: 88,418
Citations: 288
Publications (54)
Online social networks have become a necessity for everyone around the world. In particular, online social networks have enabled us to connect to one another regardless of time, for as long as we have social media and social networking as platforms for broadcasting information and communicating, respectively. However, this evolution has resulted in p...
Machine learning is implemented extensively in various applications. Machine learning algorithms teach computers to do what comes naturally to humans. The objective of this study is to compare the predictive models for cyberbullying detection between the basic machine learning system and the proposed system with the involvement of featu...
The popularity of social networking sites (SNS) has facilitated communication between users. The usage of SNS helps users in their daily life in various ways such as sharing of opinions, keeping in touch with old friends, making new friends, and getting information. However, some users misuse SNS to belittle or hurt others using profanities, which...
Recently, recommender systems have become a crucial application in the online market and e-commerce, as users are often overwhelmed by choices and preferences and need help finding the best of what they are looking for. Recommender systems have proven to overcome information overload issues in the retrieval of information, but still suffer from...
A Quranic optical character recognition (OCR) system based on convolutional neural network (CNN) followed by recurrent neural network (RNN) is introduced in this work. Six deep learning models are built to study the effect of different representations of the input and output, and the accuracy and performance of the models, and compare long short-te...
Information need has been one of the main motivations for a person using a search engine. Queries can represent very different information needs. Ironically, a query can be a poor representation of the information need because the user can find it difficult to express the information need. Query Expansion (QE) is being popularly used to address thi...
Authorship Attribution (AA) is a task that aims to recognize the authorship of unknown texts based on writing style. Out of the various approaches to solve the AA problem, Stylometry is a promising one. This paper explores the use of a K-Nearest Neighbor (KNN) classifier combined with stylometry features to perform AA. This study indicates...
In recent years, a significant boost in data availability for persistent data streams has been observed. These data streams are continually evolving, with the clusters frequently forming arbitrary shapes instead of regular shapes in the data space. This characteristic leads to an exponential increase in the processing time of traditional clustering...
Objectives: Cloud computing technology is in continuous development and faces numerous security challenges. In this context, one of the main concerns for cloud computing is the trustworthiness of cloud services in the Pakistan Information Technology (IT) industry. This problem requires prompt resolution because IT organizat...
Building a robust Optical Character Recognition (OCR) system for languages, such as Arabic with cursive scripts, has always been challenging. These challenges increase if the text contains diacritics of different sizes for characters and words. Apart from the complexity of the used font, these challenges must be addressed in recognizing the text of...
Density-based methods have appeared as a valuable category for the clustering of evolving data streams. Although a number of density-based algorithms have recently been developed for the clustering of data streams, these algorithms are not without their issues. The quality of the clustering is dramatically reduced when the distance function is used...
Phishing is an attack that uses social engineering techniques to steal users' confidential information, such as passwords and banking information. It happens when cyber criminals disguise themselves as a trusted entity and deceive users into clicking on fake links in e-mails they receive. Cyber criminals also act to target phishing attacks from individuals to...
The study of online romance scams is still in its infancy in Malaysia, despite the increase in the number of reported cases in this country. This research primarily aims to identify the steps and strategies involved in online romance scams in Malaysia. Apart from that, it also aims to identify the pattern of deceptive language used in online roma...
A language model encapsulates semantic, syntactic and pragmatic information about a specific task. Intelligent systems, especially natural language processing systems, can show different results in terms of performance and precision when moving among genres and domains. Therefore, researchers have explored different language model adaptation strategies in...
Bio-Named Entity Recognition (Bio-NER) is the process of identifying and semantically classifying biomedical technical terms and named entities in the biomedical literature. It is therefore a major task in biomedical knowledge acquisition. Meanwhile, Natural Language Processing (NLP) plays an important role in Bio-NER in the biomedical domain. The fi...
This paper is an overview of cyberbullying, which occurs mostly on social networking sites, and of the issues and challenges in detecting it. The paper starts with an introduction to cyberbullying: definition, categories and roles. Then, in the discussion of cyberbullying detection, available data sources, features and cla...
Biomedical Named Entity Recognition (Bio-NER) is an essential step of biomedical information extraction and biomedical text mining. Although much research has gone into the design of rule-based and supervised tools for general NER, Bio-NER remains a challenge and an area of active research, as there is still a huge difference in F-sc...
Finding good and relevant information in crime news is one of the most challenging tasks faced by users. An increase in the amount of information from news media has caused difficulties for users in obtaining relevant information. Hence, visualization is one of the important aspects to enhance user’s understanding when browsing or searching for new...
The information and communication technology landscape continues to evolve rapidly, giving rise to various issues that require moral, legal and social consideration. Imagine the sheer amount of personal data collected by social networking sites such as Facebook and by other large search engine companies. Should we be worried about the information that...
The conventional information retrieval (IR) framework consists of four primary phases, namely, pre-processing, indexing, querying and retrieving results. Some phases of the current Arabic IR (AIR) framework have several drawbacks. This research aims to enhance an AIR by improving the processes in a conventional IR framework. We introduce an enhance...
The proliferation of many interactive Topic Detection and Tracking (iTDT) systems has motivated researchers to design systems that can track and detect news better. iTDT focuses on user interaction, user evaluation, and user interfaces. Recently, increasing effort has been devoted to user interfaces to improve TDT systems by investigating not just...
Stemming is the process of reducing inflected words to their stem or root from a generally written word form. Arabic is one of the most highly inflected languages in the world. Stemming improves retrieval performance by reducing word variants and increasing the similarity between related words. However, an Arabic Information Retrie...
Factors affecting the re-finding of personal photographs are often difficult to define, given their sheer number. The literature highlights the role of human behaviors in the personal photograph lifecycle, such as capturing, keeping, managing and re-finding personal photographs, without any progress toward handling all of these factors. More...
This study presents the results of an experimental study of two document clustering techniques, k-means and k-means++. In particular, we compare the two main approaches in crime document clustering. The drawback of k-means is that the user needs to define the centroid point. This becomes more critical when dealing with document clustering b...
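To illustrate the contrast drawn in this abstract, the following is a minimal sketch of k-means++ seeding, which picks the initial centroids automatically instead of asking the user for them. It is not the study's implementation: the function names, the toy 2-D points (standing in for document feature vectors) and the fixed random seed are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centroids with k-means++ seeding (hypothetical helper)."""
    # The first seed is drawn uniformly at random.
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        # Weight each point by its squared distance to the nearest seed so far.
        weights = [min(squared_distance(p, s) for s in seeds) for p in points]
        # Draw the next seed with probability proportional to that weight,
        # so far-away points are more likely to become centroids.
        seeds.append(rng.choices(points, weights=weights, k=1)[0])
    return seeds


# Toy data: three loose groups of 2-D points (assumed, not from the study).
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0), (9.2, 0.1)]
seeds = kmeans_pp_seeds(points, 3, random.Random(0))
```

Plain k-means would start from centroids supplied by the user or drawn uniformly at random; the seeding above only replaces that first step, and the usual assign-and-update iterations follow unchanged.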
Since most Internet users limit their search scope to the first page of search results and use the obtained information for decision making, search engines should logically give higher ranks to websites with high-quality data. In this research, a case study is conducted on a dataset of 44 web portals of universities in Malaysia. The data quality level...
The usage of e-learning in education has become a medium to connect and to enhance the online communications between students and their lecturers as well as their friends in college. Students use the element of social networking as a medium for their studies to seek or share information with other students. This study investigates students’ percept...
Quranic text Information Retrieval (IR) is quite demanding yet very trivial, because users will not always use the exact keywords needed to retrieve the relevant Quranic text (verse). Many have tried to overcome this problem by expanding or reformulating the query entered by users using semantic approaches with resources such as ontologies and thesauri....
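The idea of expanding a query with a thesaurus, as mentioned in this abstract, can be sketched roughly as follows. This is only an illustration under assumed names: `expand_query` and the tiny English `toy_thesaurus` are hypothetical and do not reflect the resources or method used in the paper.

```python
def expand_query(query, thesaurus):
    """Return the query terms plus any related terms listed in the thesaurus."""
    expanded = []
    for term in query.lower().split():
        # Keep the original term first, then append its related terms.
        for candidate in [term, *thesaurus.get(term, [])]:
            if candidate not in expanded:  # avoid duplicate terms
                expanded.append(candidate)
    return expanded


# Hypothetical thesaurus entries, assumed for illustration only.
toy_thesaurus = {"mercy": ["compassion", "forgiveness"], "charity": ["alms"]}
expanded = expand_query("mercy and charity", toy_thesaurus)
```

The expanded term list would then be passed to the retrieval step in place of the original query, so that verses using a related word can still be matched.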
Most of the crimes committed today are reported on the Internet through news, blogs and social networking sites. These sources have provided a huge amount of crime data, presenting a need for a means to extract useful information. In this research, the evaluation of Direct and Indirect extraction of nationality from crime news, along with the addit...
The development of the digital library seems to be underpinned by variably defined concepts accorded to the term. These varied definitions have posed problems as to what actually constitutes a digital library, ranging from operations, holdings or collections and activities, as there is as yet no standard approach accepted by all. Thus, digital library i...
Information and communication technology is changing rapidly. It makes everyday affairs easier but also brings threats if its use is not accompanied by good ethics and morals. This book attempts to explore questions concerning the law, ethics, and the impact of information technology on social life.
Ethical issues are increasingly...
Information and Communication Technology is changing rapidly. It makes everyday affairs easier but also brings threats if its use is not accompanied by good ethics and morals. This book attempts to explore questions concerning the law, ethics, and the impact of information technology on social life. In the era of information technology,...
Preface. This book is a revised edition of the edition written in 2003. It is written specifically for undergraduates in information science and technology taking the course TM1813 Pengantar Sains Sosial (Introduction to Social Science). Although the course is introductory in nature, the issues discussed are current issues relevant to contemporary life. Based on...
This paper offers a critical examination of the variably defined concepts accorded to the digital library. These varied and controversial definitions have posed problems as to what actually forms a digital library, its operations, holdings and activities. Literature in the area reveals that the absence of a commonly acceptable definition of digital lib...
Preface. This book is written specifically for students of information science and technology taking the course TM1813 Pengantar Sains Sosial (Introduction to Social Science). Although the course is introductory in nature, the issues discussed are not limited to the basics of social science but also cover current issues relevant to contemporary life. Based on the author's experience...
Projects
Project (1)
We propose to investigate the performance of a KNN classifier on short historical Arabic texts ranging between 1289 and 1785 words per author. The text length varies from 290 to 800 words per document. Thus, we aim to train a KNN classifier on limited data. For this purpose, different stylometric features are used. Additionally, n-fold cross-validation and Feature Selection (FS) were used to enhance KNN performance.
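As a rough illustration of the setup described here, the sketch below trains and evaluates a KNN classifier with n-fold cross-validation on two simple style features. Everything in it is an assumption made for illustration: the two features (mean word length and type-token ratio), the English toy sentences, the fold scheme and the function names are not taken from the project, and feature selection is not shown.

```python
from collections import Counter


def stylometric_features(text):
    """Two illustrative style features: mean word length and type-token ratio."""
    words = text.lower().split()
    mean_length = sum(len(w) for w in words) / len(words)
    type_token_ratio = len(set(words)) / len(words)
    return (mean_length, type_token_ratio)


def knn_predict(train, query, k):
    """Majority label among the k training samples nearest to the query."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validate(samples, n_folds, k):
    """Accuracy over n folds; fold i holds out every n-th sample starting at i."""
    correct = 0
    for i in range(n_folds):
        held_out = samples[i::n_folds]
        train = [s for j, s in enumerate(samples) if j % n_folds != i]
        correct += sum(knn_predict(train, f, k) == label for f, label in held_out)
    return correct / len(samples)


# Toy corpus (assumed): author A uses short words, author B uses long words.
texts = [
    ("a cat sat on a mat", "A"),
    ("extraordinary circumstances necessitate considerable deliberation", "B"),
    ("a dog sat on a log", "A"),
    ("comprehensive investigations frequently demonstrate remarkable consistency", "B"),
    ("a man ran to a van", "A"),
    ("philosophical considerations occasionally overshadow practical requirements", "B"),
    ("a hen sat in a pen", "A"),
    ("international organisations consistently emphasise collaborative governance", "B"),
]
samples = [(stylometric_features(text), author) for text, author in texts]
accuracy = cross_validate(samples, n_folds=4, k=3)
```

With so few documents per author, every fold leaves only a handful of training samples, which is the limited-data situation the project description refers to; a small k keeps the vote from being dominated by the other author's texts.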