Mohsen A. Rashwan

Cairo University | CU · Department of Electronics and Communication Engineering

PhD

About

152
Publications
59,372
Reads
1,049
Citations
Citations since 2017
24 Research Items
581 Citations
Additional affiliations
May 1977 - present
Cairo University
Position
  • Professor (Full)
Description
  • Degrees: 1. BSc, Cairo University, 1977; 2. MSc, Cairo University, 1980; 3. MSc, Carleton University, 1985; 4. PhD, Queen's University, 1987; 5. MBA, AASTMT, 2002
Education
September 1985 - May 1987
Queen's University
Field of study
  • Electrical Engineering

Publications (152)
Article
Full-text available
Semantic Textual Similarity (STS) is the task of identifying the semantic correlation between two sentences of the same or different languages. STS is an important task in natural language processing because it has many applications in different domains such as information retrieval, machine translation, plagiarism detection, document categorizatio...
Article
The problem of region of interest (RoI) in document layout analysis and document recognition has recently become an essential topic in OCR systems. Arabic manuscript layout analysis and OCR using language detection, document category, and region of interest (RoI) with Keras and TensorFlow are terms of the state-of-the-art that sh...
Article
Full-text available
The way in which people speak reveals a lot about where they are from, where they were raised, and also where they have recently lived. When communicating in a foreign language or second language, accents from one’s first language are likely to emerge, giving an individual a ‘strange’ accent. This is a great and challenging problem. Not particularl...
Article
Full-text available
Any natural language may have dozens of accents. Even when a word has the same phonemic form, speakers with different accents produce audio signals that are distinct from one another. Among the most common issues in speech processing are discrepancies in pronunciation, accent, and enunciation. This research...
Article
In this paper, an automatic phoneme-level segmentation system was built using the Kaldi toolkit for a data set of Quran verses, with a total speech corpus of 80 hours and its corresponding text corpus, comprising 1100 recorded Quran verses from 100 non-Arab reciters. Initiated with the extraction of Mel Frequency Cepstral Coefficients MF...
Article
Full-text available
The recent surge of social media networks has provided a channel to gather and publish vital medical and health information. The focal role of these networks has become more prominent in periods of crisis, such as the recent pandemic of COVID-19. These social networks have been the leading platform for broadcasting health news updates, precaution i...
Article
Full-text available
Due to the rapid developments in technology and the sudden expansion of social media use, Dialect Arabic has become an important source of data that needs to be addressed when building Arabic corpora. In this paper, thirty-three Arabic corpora are surveyed to show that despite all of the developments in the literature, Saudi dialect (SD) corpora st...
Conference Paper
Full-text available
In this paper, we describe the Automatic Speech Recognition (ASR) system developed by the RDI team in the framework of the 2019 Multi-Genre Broadcast (MGB-5) challenge for the Arabic language. This year's challenge is the task of building a system for transcribing Moroccan Dialect Arabic speech, using a big audio corpus of primarily M...
Article
Full-text available
Different approaches have been used to estimate language models from a given corpus. Recently, researchers have used different neural network architectures to estimate the language models from a given corpus using unsupervised learning neural networks capabilities. Generally, neural networks have demonstrated success compared to conventional n-gram...
Article
Full-text available
In this paper, we describe a detailed approach to develop a botnet detection system using machine learning (ML) techniques. Detecting botnet member hosts or identifying botnet traffic has been the main subject of many research efforts. This research aims to overcome two serious limitations of current botnet detection systems: First, the need for D...
Article
Full-text available
Modern Arabic texts are commonly written without diacritics, although diacritization is critical for other Arabic processing tasks such as word sense disambiguation, automatic speech recognition, and text to speech, where word meaning or pronunciation is decided based on the diacritic signs assigned to each letter. This paper presents a novel approach for aut...
Article
This paper presents a system for improving the quality of pronunciation error detection and correction for Qur'an recitation by non-Arabic speakers. Most classical speech recognition systems are built using the Hidden Markov Model (HMM) with a Gaussian Mixture Model (GMM). This paper attempts to enhance the GMM-HMM model's performance by...
Article
Full-text available
Analytical-based approaches in Optical Character Recognition (OCR) systems can suffer from a significant amount of segmentation errors, especially when dealing with cursive languages such as Arabic, with frequent overlapping between characters. Holistic-based approaches that consider whole words as single units were introduced as an effectiv...
Article
Full-text available
Document layout analysis is a key step in the process of converting document images into text. Arabic script is cursive and written in different styles, which causes some challenges in the analysis of Arabic text documents. In this paper, we introduce an approach for Arabic document layout analysis. In that approach, the document is segment...
Article
Full-text available
This paper proposes a new optical camouflage system that uses RGB-D cameras to acquire a point cloud of the background scene and to track the observer's eyes. This system enables a user to conceal an object located behind a display that is surrounded by 3D objects. If we consider the tracked point of the observer's eyes to be a light source, the system will...
Conference Paper
Full-text available
In this paper, we introduce an enhancement for speech recognition systems using an unsupervised speaker clustering technique. The proposed technique is mainly based on I-vectors and a Self-Organizing Map (SOM) neural network. The input to the proposed algorithm is a set of speech utterances. For each utterance, we extract a 100-dimensional I-vector and...
Article
Full-text available
Automated segmentation of speech signals has been under research for over 30 years. Many speech processing systems require segmentation of the speech waveform into principal acoustic units. Segmentation is the process of breaking down a speech signal into smaller units. Segmentation is the first step in any voice-activated system, such as speech rec...
Research
This issue includes the following articles; P1121606475, Author="M. AbdelDayem and H. Hemeda and A. Sarhan", Title="Enhanced User Authentication through Keystroke Biometrics for Short-Text and Long-Text Inputs" P1121602462, Author="Eslam Mahmoud and Ahmed M. Elmogy and Amany Sarhan", Title="Enhancing Grid Local Outlier Factor Algorithm for better O...
Conference Paper
Full-text available
In this paper, a system is proposed to prepare a digital or scanned Quran version for a verification process. The system handles skew errors in the scanned image, text extraction from ornamentation, successful line segmentation for Arabic scripts, verse pattern detection for different versions, and a powerful diacritics classifier. The propos...
Conference Paper
Gaussian Mixture Models (GMMs) have been the most commonly used models in pronunciation verification systems. The recently introduced Deep Neural Networks (DNNs) have proved to provide significantly better discriminative models of the acoustic space. In this paper, we introduce our efforts to upgrade the models of a Computer Aided Language Learner (CAPL)...
Article
Full-text available
Zone segmentation and classification is an important step in document layout analysis. It decomposes a given scanned document into zones. Zones need to be classified into text and non-text, so that only text zones are provided to a recognition engine. This eliminates garbage output resulting from sending non-text zones to the engine. This paper pro...
Article
Ranking is an important task in the field of information retrieval. Ranking may be used in different modules in natural language processing such as search engines. In this paper, we introduce a competitive ranking system which combines three different modules. The system participated in SemEval 2016 question ranking task for the Arabic language. Th...
Conference Paper
Full-text available
Vector-based approaches have proved their validity during the past few years as promising techniques for word and sentence representation. Automatic short answer grading is a challenging problem in natural language processing that can reduce a lot of human effort; accordingly, research has focused on exploiting several vector representations to sol...
Article
Full-text available
Arabic spelling errors occur in different types of documents, such as documents handwritten by inexperienced users, optical character recognition (OCR) documents, and machine-translated documents. Many researchers have tried to solve this problem, but so far there is no definitive solution. This paper proposes a hybrid system based on the confusion matrix an...
Conference Paper
Full-text available
Many researchers have been investigating the task of plagiarism detection lately. In this paper, we present the RDI system for intrinsic plagiarism detection (RDI_RID). The RDI_RID system was the only system that participated in the intrinsic track of the Arabic language plagiarism detection competition. The RDI_RID system achieved a PlagDet (Plagiarism Detection s...
Conference Paper
Full-text available
Extrinsic plagiarism detection has attracted the attention of many researchers lately. Plagiarism has become more and more difficult to detect due to the appearance of sophisticated approaches other than direct copy and paste, such as phrase rephrasing, word shuffling, and semantic substitution. In this paper, we present RD...
Conference Paper
Full-text available
In this work, we propose a fully automatic pre-processing technique to enhance images captured by digital cameras and rectify the different known types of distortion, in order to improve the performance of OCR applications. Our proposed approach depends on the features of the text lines and letters and does not need any special equipment. Experimental results...
Conference Paper
Full-text available
Polysemous words acquire different senses and meanings from their contexts. Representing words in vector space as a function of their contexts captures some semantic and syntactic features for words and introduces new useful relations between them. In this paper, we exploit different vectorized representations for words to solve the problem of Cros...
Conference Paper
Full-text available
A lot of work has been done to give the individual words of a certain language adequate representations in vector space so that these representations capture semantic and syntactic properties of the language. In this paper, we compare different techniques to build vectorized space representations for Arabic, and test these models via intrinsic and...
Conference Paper
Full-text available
In this paper, we aim to move ontology-based Arabic NLP forward by experimenting with the generation of a comprehensive Arabic lexical ontology using multiple language resources. We recommend a combination of MUHIT, WordNet and SUMO and use a simple method to link them, which results in the generation of an Arabic-lexicalized version of the SUMO on...
Article
Emotion conversion using a small speech corpus is very important for expressive text to speech systems. Applying the unit selection paradigm for intonation conversion has been widely used for different languages using different intonation units. In this paper, an emotion conversion system is proposed for expressive Arabic speech. This system combin...
Article
This paper describes a speech-enabled Computer Aided Pronunciation Learning (CAPL) system. This system was developed for teaching Arabic pronunciations to non-native speakers. A challenging application of that system is teaching the correct recitation of the Holy Qur'an. This system uses a state of the art speech recognizer to detect errors in the...
Article
The Arabic language belongs to a group of languages that require diacritization over their characters. Modern Standard Arabic (MSA) transcripts omit the diacritics, which are essential for many machine learning tasks like Text-To-Speech (TTS) systems. In this work Arabic diacritics restoration is tackled under a deep learning framework that include...
Article
Full-text available
This paper presents an optical character/text recognition (OCR) system for cursive scripts like those of Arabic, Urdu, Persian, Kurdish, etc. This OCR system is a large-scale one in the sense of architecture, training data size, and state-of-the-art performance. The paper introduces the theoretical derivation and experimental assessment of our two...
Conference Paper
Full-text available
Although datasets represent a critical part of research and development activities, botnet research suffers from a serious shortage of reliable and representative datasets. In this paper, we explain a new approach to build a botnet experimentation platform completely from off-the-shelf open sources. This work aims to fill the gap in botnet research...
Conference Paper
Full-text available
Traditional keyword-based search has some limitations, such as word sense ambiguity and query intent ambiguity, which can hurt precision. Semantic search uses the contextual meaning of terms in addition to semantic matching techniques in order to overcome these limitations. This paper introduces a query expansion approach u...
Conference Paper
Full-text available
In this paper, the Arabic diacritics restoration problem is tackled under a deep learning framework, presenting the Confused Subset Resolution (CSR) method to improve classification accuracy, in addition to an Arabic Part-of-Speech (PoS) tagging framework using deep neural nets. Special focus is given to syntactic diacritization, which still suffers from low a...
Article
Full-text available
Most opinion mining work needs lexical resources that recognize the polarity of words (positive/negative) regardless of their contexts, which is called prior polarity. A word's prior polarity may change when it is considered in context; for example, positive words may be used in phrases expressing negative sentiments, or vice ver...
Article
Full-text available
In this paper, a phonetic editor system for learning spoken English is introduced. The methods and architecture of the systems used to edit new lessons into the proposed dictionary are discussed, taking pronunciation effects into consideration. The Speak Correct system is presented, which uses state of the art automatic speech recognition (ASR) and...
Article
Full-text available
Literacy and adult education are essential objectives for realizing development and increasing production in any country. Egypt is one of the countries that still has a high rate of illiteracy, around 30% of the adult population (age range 15-45). In Saudi Arabia, the remote regions face a similar challenge. Traditional literacy classes proved...
Article
Full-text available
In this paper, we introduce the SpeakCorrect system, a Computer Aided Pronunciation Training (CAPT) system for native Arabic students of English. The system is designed with optimized performance for the target user group. It is an L1-dependent system, and only the frequent pronunciation errors of native Arabic speakers are examined. Several ad...
Conference Paper
Full-text available
The aim of this paper is to introduce a new technique that enhances online translation from English to Arabic for a specific domain. This enhancement is achieved by training a new "Arabic online engine translation" to "Arabic manual translation" model that corrects common errors in the online translation. This paper focuses on two popular online tr...
Conference Paper
Full-text available
Language resources are an important factor in any NLP application. However, language resource support for Arabic is poor because the existing Arabic language resources are scattered, inconsistent, or even incomplete. In this paper, we discuss the notion of having an integrated Arabic resource leveraging various pre-existing ones. We present a...
Conference Paper
Full-text available
Large amounts of ground truth data are vital for building, testing, analyzing, and improving the performance of character recognizers, especially those using segmentation-based routines. Ground truth information, the annotation, can be associated with the document images at the paragraph level, the sentence level, the word level, and up until the char...
Article
Full-text available
In this paper, we propose a segmentation system for unconstrained Arabic online handwriting, an essential problem addressed by analytical-based word recognition systems. The system is composed of two stages: the first is a new, specially designed hidden Markov model (HMM), and the second is a rule-based stage. In our system, handwritten words are b...
Article
Full-text available
Email has become an essential communication tool in modern life, creating the need to manage the huge amount of information generated. Email classification is a desirable feature in an email client to manage email messages and categorize them into semantic groups. Statistical artificial intelligence and machine learning is a typical approach to solve suc...
Conference Paper
Recognizing old documents is highly desirable since the demand for quickly searching millions of archived documents has recently increased. Hidden Markov Models (HMMs) have been proven to be a good solution to the main problems of recognizing typewritten Arabic characters. These attempts, however, achieved remarkable success for omn...
Article
Full-text available
Adaptation is the property of intelligent machines to update their knowledge according to the actual situation. Self-learning machines (SLM), as defined in this paper, are those that learn by observation under limited supervision and continuously adapt by observing the surrounding environment. The aim is to mimic the behavior of the human brain learning from surr...