Riyad Al-Shalabi
Amman Arab University · Department of Management Information Systems
PhD
About
56
Publications
30,344
Reads
1,162
Citations
Publications (56)
Satisfaction detection is one of the most common issues affecting the business world. This study proposes an application that detects the satisfaction tone that leads to customer happiness in Big Data generated by online businesses on social media, in particular Facebook and Twitter, using two well-known methods, machine learnin...
The Gender Identification (GI) problem is concerned with determining the gender of a given text's author. It has a wide range of academic/commercial applications in various fields including literature, security, forensics, electronic markets and trading, etc. To address this problem, researchers have proposed that the writing styles of authors of t...
Sentiment Analysis (SA) is a field in computational linguistics concerned with determining the sentiment conveyed in a piece of text towards certain entities (such as people, organizations, products, services, events, etc.) using NLP tools. The considered sentiments can be as simple as positive vs. negative. A more fine-grained approach known as Mult...
Sentiment Analysis (SA) is a computational study of the sentiments expressed in text toward entities (such as news, products, services, organizations, events, etc.) using NLP tools. The conveyed sentiments can be quantified using a simple positive/negative model. A more fine-grained approach known as Multi-Way SA (MWSA) uses a ranking system like t...
Sentiment Analysis (SA) is one of the hottest fields in data mining (DM) and natural language processing (NLP). The goal of SA is to extract the sentiment conveyed in a certain text based on its content. While most current works focus on the simple problem of determining whether the sentiment is positive or negative, Multi-Way Sentiment Analysis (MWSA)...
An Arabic light stemmer removes affixes from words, as well as stop words. It is considered a text pre-processing task for many Natural Language Processing (NLP) applications such as text categorization, information retrieval, opinion mining, etc. Many Arabic light stemmers have been presented that depend on several techniques like the grammar-based,...
The Gender Identification (GI) problem is concerned with determining the gender of the author of a given text based on its contents. The GI problem is one of the authorship profiling problems which have a wide range of applications in various fields such as marketing and security. Due to its importance, extensive research efforts have been invested...
Text classification has facilitated the involvement of users in various computer-based applications and environments whose data scale increases continuously. We therefore study the scalability behavior of several classifiers to identify the most scalable one. We used the Weka and RapidMiner tools to run the experiment on an Arabic dataset. The dataset...
Feature selection is necessary for effective text classification. Dataset preprocessing is essential for sound results and effective performance. This paper investigates the effectiveness of using feature selection. In this paper we compare the performance of different classifiers in different situations using feature selection...
The prevalent use of Online Social Networks (OSN) and the anonymity and lack of accountability they inherit from being online give rise to many problems related to finding the connection between the massive amount of text data on OSN and the people who actually wrote them. Analyzing text data for such purposes is called authorship analysis. This w...
The Holy Quran is the greatest miracle for Muslims everywhere and at all times; therefore, it is valid for every time and place. Research and studies into the Holy Quran that aim to uncover new miracles within it are considered a kind of worship for Muslim researchers, since they facilitate the Islamic mission and clarify the vague pict...
Many Natural Language Processing (NLP) techniques have been used in Information Retrieval, but the results are not encouraging. Proper names are problematic for cross-language information retrieval (CLIR); detecting and extracting proper nouns in the Arabic language is key to improving the effectiveness of the system. The value of informatio...
Recently [1] presented a new heuristic optimization approach, called 2D-3D Continuous Ant Colony Approach (2D-3D-CACA), based on Ant Colony Optimization Algorithm for solving 3D continuous real world space problems. Ant colony algorithms are a subset of swarm intelligence and consider the ability of simple ants to solve complex problems by cooperat...
Many algorithms have been implemented for the problem of text classification. Most of the work in this area was carried out for English text. Very little research has been carried out on Arabic text. The nature of Arabic text is different than that of English text, and preprocessing of Arabic text is more challenging. This paper presents an impleme...
Stemming is one of many tools used in information retrieval to combat the vocabulary mismatch problem, in which query words do not match document words. Stemming in the Arabic language does not fit into the usual mold, because stemming in most research in other languages so far depends only on eliminating prefixes and suffixes from the word, but Ar...
We depict the architecture of a question answering system and methodically evaluate contributions of different system components to accuracy. The system differs from most question answering systems in its dependency on data redundancy rather than complicated linguistic analyses of either questions or contender answers. Because a wrong answer is oft...
Building an effective stemmer for the Arabic language has always been a hot research topic in the IR field, since Arabic has a very different and more difficult structure than other languages: it is a very rich language with complex morphology. Many linguistic and light stemmers have been developed for the Arabic language but still there...
Root extraction is one of the most important topics in information retrieval (IR), natural language processing (NLP), text summarization, and many other important fields. In the last two decades, several algorithms have been proposed to extract Arabic roots. Most of these algorithms dealt with triliteral roots only, and some with fixed length words...
Many Natural Language Processing (NLP) techniques have been used in Information Retrieval, but the results are not encouraging. Proper names are problematic for cross-language information retrieval (CLIR); detecting and extracting proper nouns in the Arabic language is key to improving the effectiveness of the system. The value of information i...
This paper provides an improvement to Arabic Information Retrieval Systems. The proposed system relies on the stem-based query expansion method, which adds different morphological variations to each index term used in the query. This method is applied to an Arabic corpus. Roots of the query terms are derived, then for each derived root from the query...
This chapter presents an enhanced, effective, and simple approach to text classification. The approach uses an algorithm to automatically classify documents. The main idea of the algorithm is to select feature words from each document; those words cover all the ideas in the document. The results of this algorithm are a list of the main subjects founde...
Information retrieval systems utilize user feedback for generating optimal queries with respect to a particular information need. However, the methods that have been developed in IR for generating these queries do not memorize information gathered from previous search processes, and hence cannot use such information in new search processes. Thus, a...
Much attention has been paid to the relative effectiveness of Interactive Query Expansion (IQE) versus Automatic Query Expansion (AQE). This research has shown that the automatic query expansion (collection-dependent) strategy gives better performance than no query expansion. The percentage of queries that are improved by the AQE strategy is 57% with ave...
Text classification is the task of assigning a document to one or more pre-defined categories based on its contents. This paper presents the results of classifying Arabic language documents by applying the KNN classifier, once using N-grams (namely unigrams and bigrams) for document indexing, and once using traditional single term...
Much attention has been paid to the relative effectiveness of interactive query expansion (IQE) versus automatic query expansion (AQE). This research has shown that the automatic query expansion (collection-dependent) strategy gives better performance than no query expansion. The percentage of queries that are improved by the AQE strategy is 57% with...
The paper presents an enhanced, effective, and simple approach to text classification. The approach uses an algorithm to automatically classify documents. The main idea of the algorithm is to select feature words from each document; those words cover all the ideas in the document. The results of this algorithm are a list of the main subjects found i...
The paper describes a new stemmer algorithm to find the roots and patterns for Arabic words based on excessive letter locations. The algorithm locates the triliteral root, the quadriliteral root, and the pentaliteral root. The algorithm is written with the goal of supporting natural language processing programs such as parsers and information r...
This study analyzes the performance of the Enhanced Associativity Based Routing protocol (EABR) based on two factors: Operation Complexity (OC) and Communication Complexity (CC). OC can be defined as the number of steps required in performing a protocol operation, while CC can be defined as the number of messages exchanged in perf...
The development of an efficient compression scheme to process the Arabic language represents a difficult task. This paper employs the dynamic Huffman coding on data compression with variable length bit coding, on the Arabic language. Experimental tests have been performed on both Arabic and English text. A comparison was made to measure the efficiency...
Many algorithms have been implemented for the problem of text categorization. Most of the work in this area was carried out for English text; on the other hand, very little research has been carried out for Arabic text. In this project we have implemented the k-Nearest Neighbor (kNN) algorithm, which is known to be one of the top performing cla...
This project presents an implementation of an automatic KNN Arabic text categorizer. Six hundred Arabic text documents belonging to 6 categories were tested using the classifier. The main objective of this project is to build an automatic KNN Arabic text categorizer and test the effectiveness of the information gain (IG) feature selection which is used for f...
This study investigated the use of neural networks in function approximation, data fitting and prediction. Due to its superior performance, the counterpropagation network was considered and an attempt was made to enhance its performance. As a result of this research, we proposed a new neural network architecture named Single Layer Linear Counterpro...
Automatic Text Categorization (ATC) refers to the process of building software tools capable of assigning unseen documents to predefined categories or subjects. This study aims to automatically classify the verses (Ayat, sentences) of the Fatiha and Yaseen Surahs (Chapters) in the Quran according to the classifications of Islamic scholars. Our autom...
This study explores the implementation of a text classification method to classify the prophet Mohammed (PBUH) hadiths (sayings) using Sahih Al-Bukhari classification. The sayings explain the Holy Qur'an, which is considered by Muslims to be the direct word of Allah. The present method adopts TF/IDF (Term Frequency-Inverse Document Frequency) which is used...
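As a rough illustration of the TF/IDF weighting named in the abstract above, the following Python sketch computes term weights for a toy collection. The weighting variant (raw term count times log(N/df)), the function name `tf_idf`, and the sample documents are all assumptions made for illustration; the abstract does not state which variant the study uses.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant (not taken from the study):
    weight = raw term count * log(N / document frequency).
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy documents, already tokenized.
docs = [
    ["prayer", "fasting", "prayer"],
    ["fasting", "charity"],
    ["charity", "trade"],
]
weights = tf_idf(docs)
```

Terms that occur often in one document but in few documents overall receive the highest weights, which is why such weights are commonly used as features for a classifier.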
In this paper we investigate the use of neural networks in function approximation, data fitting, and prediction. Due to its superior performance, the counterpropagation network was considered and an attempt was made to enhance its performance. As a result of this work, we propose a new neural network architecture named single layer linear counterpr...
This algorithm provides a new method for extracting the quadriliteral Arabic root (a four consonant string) from its morphological derivatives. Our stemming algorithm starts by excluding prefixes and checking the word starting from the last letter back to the first. A temporary vector is used to store the suffix letters being removed, and another v...
In the present study a system was developed that uses the Successor Variety Stemming Algorithm to find stems for Arabic words. A corpus of 242 abstracts was obtained from the Saudi Arabian National Computer Conference. All of these abstracts involve computer science and information systems. The study set out to discover whether the Successor...
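The general idea of successor variety stemming can be sketched as follows: for each prefix of a word, count how many distinct letters follow that prefix in the vocabulary, then cut the word where that count peaks. This is a minimal sketch, not the system described in the abstract. The English toy vocabulary, the function names, and the "first peak" cutting rule are assumptions for illustration; the abstract does not say which segmentation criterion the study applied to Arabic.

```python
def successor_varieties(word, vocabulary):
    """For each prefix of `word`, count the distinct letters that follow it in `vocabulary`."""
    varieties = []
    for i in range(1, len(word) + 1):
        prefix = word[:i]
        successors = {w[i] for w in vocabulary if len(w) > i and w.startswith(prefix)}
        varieties.append(len(successors))
    return varieties

def first_peak_stem(word, vocabulary):
    """Cut after the first prefix whose successor variety exceeds both neighbours.

    The peak rule is an assumed criterion; returns the whole word if no peak exists.
    """
    sv = successor_varieties(word, vocabulary)
    for i in range(1, len(sv) - 1):
        if sv[i] > sv[i - 1] and sv[i] > sv[i + 1]:
            return word[: i + 1]
    return word

# Hypothetical English vocabulary, used only to keep the example easy to follow.
vocabulary = ["able", "ape", "beatable", "fixable", "read", "readable",
              "reading", "reads", "red", "rope", "ripe"]
varieties = successor_varieties("readable", vocabulary)
stem = first_peak_stem("readable", vocabulary)
```

In this toy vocabulary the prefix "read" is followed by three different letters while its neighbouring prefixes are followed by one, so the sketch cuts the word there.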
The objective of Information Retrieval is to retrieve all relevant documents for a user query and only those relevant documents. Much research has focused on achieving this objective with little regard for storage overhead or performance. This study evaluated the use of Part of Speech Tagging to improve the index storage overhead and general s...
This study provides a technique for extracting the triliteral Arabic root for an unvocalized Arabic corpus. It provides an efficient way to remove suffixes and prefixes from the inflected words. Then it matches the resulting word with the available patterns to find the suitable one and then extracts the three letters of the root by removing all inf...
In the present study a system was developed that uses the Successor Variety Stemming Algorithm to find stems for Arabic words. A corpus of 242 abstracts was obtained from the Saudi Arabian National Computer Conference. All of these abstracts involve computer science and information systems. The study set out to discover whether the Successor Var...
The objective of this research is to study the process of examining documents by computing comparisons between the representation of the information need (the queries) and the representations of the documents. Also, we will automate the process of representing information needs as user profiles by computing the comparison between the user profile a...
Summary form only given. We have designed and implemented an efficient stop-word removal algorithm for the Arabic language based on a finite state machine (FSM). An efficient stop-word removal technique is needed in many natural language processing applications such as spelling normalization, stemming and stem weighting, question answering systems and...
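One way to picture FSM-based stop-word removal is a deterministic automaton built from the stop-word list, where a token is dropped if reading it character by character ends in an accepting state. The sketch below is an assumption-laden illustration, not the algorithm from the abstract: the trie-shaped automaton, the function names, and the short English stop-word list are all hypothetical. The same code works unchanged on Arabic strings.

```python
def build_fsm(stop_words):
    """Build a trie-shaped DFA: a list of transition dicts plus a set of accepting states."""
    transitions = [{}]  # state 0 is the start state
    accepting = set()
    for word in stop_words:
        state = 0
        for ch in word:
            if ch not in transitions[state]:
                transitions.append({})
                transitions[state][ch] = len(transitions) - 1
            state = transitions[state][ch]
        accepting.add(state)  # the state reached at the end of a stop word accepts
    return transitions, accepting

def is_stop_word(fsm, token):
    """Run the automaton over `token`; accept only if it ends in an accepting state."""
    transitions, accepting = fsm
    state = 0
    for ch in token:
        state = transitions[state].get(ch)
        if state is None:  # no transition: the token cannot be a stop word
            return False
    return state in accepting

def remove_stop_words(fsm, tokens):
    """Keep only the tokens the automaton does not accept."""
    return [t for t in tokens if not is_stop_word(fsm, t)]

# Hypothetical stop-word list and input, for illustration only.
fsm = build_fsm(["in", "on", "into"])
kept = remove_stop_words(fsm, "put it in on into ink".split())
```

Each token is checked in time proportional to its length, independent of how many stop words are in the list.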
Summary form only given. We present a new stemming algorithm to extract quadriliteral Arabic roots. The algorithm starts by excluding the prefixes and then checks the word characters starting from the last letter backward to the first one. A temporary matrix is used to store the suffix letters of the Arabic word, and another matrix is used to stor...
Part-of-Speech tagging is the process of assigning grammatical part-of-speech tags to words based on their context. Many automated tagging systems have been developed for English and many other western languages, and for some Asian languages, and have achieved accuracy rates ranging from 95% to 98%. A tagged corpus has more useful information than...
This paper describes a new algorithm for morphological analysis of Arabic words, which has been tested on a corpus of 242 abstracts from the Saudi Arabian National Computer Conference. It runs an order of magnitude faster than other algorithms in the literature.
Word sense ambiguity is widespread in all natural languages; a word may carry several distinct meanings. Humans can figure out the suitable meaning according to the context in which the word occurs. The Arabic language is highly polysemous; in many situations it is necessary to disambiguate the word senses. This paper studies and c...
Text classification is getting more attention, and there is an increased need for text classification techniques that provide automatic, fast, and accurate semi-supervised classification with minimal human interaction. In our work we incorporated a well-tested technique for classification that makes use of the famous EM alg...