Chapter

Expansive Simple Arabic Sentence Parsing Using NooJ Platform: 12th International Conference, NooJ 2018, Palermo, Italy, June 20–22, 2018, Revised Selected Papers

Abstract

All Arabic sentences, both verbal and nominal, share the same main structure, which consists of two required components, the predicate and the subject, and two optional components, the head and the complement. Simple sentences are based on the most basic noun phrases (simple nouns) and can be expanded in the predicate, the subject, or the complement. The expansion leads to compound parts rather than simple ones. The aim of this work is to merge our two previous parsers [2, 3] and to extend the merged parser, at the noun phrase level, so that it can parse expansive simple sentences. To this end, we have implemented a set of syntactic grammars modeling Arabic noun phrase structures. These grammars are enriched with the agreement constraints of the noun phrase components. Using our enhanced and extended grammar, we have syntactically parsed several sentences, recognized both nominal and verbal expansive sentences, and generated their possible syntactic trees regardless of the order of the sentence’s components. The results were satisfactory.
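The shared structure described in the abstract can be sketched as a small data model. This is only an illustration in Python, not the authors' NooJ grammars: the names `Component`, `Sentence`, and `is_expansive`, the tag strings, and the rule "more than one token means compound" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


# Hypothetical data model; the paper itself encodes this structure as NooJ grammars.
@dataclass(frozen=True)
class Component:
    # Assumption for this sketch: one token is a simple part,
    # several tokens form a compound (expanded) part.
    tokens: Tuple[str, ...]

    @property
    def is_compound(self) -> bool:
        return len(self.tokens) > 1


@dataclass(frozen=True)
class Sentence:
    predicate: Component                    # required
    subject: Component                      # required
    head: Optional[Component] = None        # optional
    complement: Optional[Component] = None  # optional

    def is_expansive(self) -> bool:
        """True when the predicate, subject, or complement is compound."""
        parts = (self.predicate, self.subject, self.complement)
        return any(p is not None and p.is_compound for p in parts)


# Placeholder tags stand in for real words.
simple = Sentence(predicate=Component(("V",)), subject=Component(("N",)))
expanded = Sentence(predicate=Component(("V",)), subject=Component(("N", "ADJ")))
```

Because the components are stored by role rather than by position, the model is independent of the order in which they appear in the sentence.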

... This segment plays a pivotal role characterized by its versatility in both compound and recursive structures [16]-[18]. In recent years, most of the existing research in Arabic sentence parsing has focused on simple sentences [1], [16], [19], [20]. ...
... We achieved this by creating 40 syntactic grammars within the NooJ platform [35]. This extension follows the principles of the expansive simple Arabic sentence parsing methodology as introduced by Bourahma et al. [18]. Our newly constructed grammar covers a wide range of sentence structures. ...
... By parsing and annotating the input sentences systematically in this way, our approach aims to provide a comprehensive understanding of the complex structures of Arabic psychological verbs. Then, we proceeded with the implementation of NooJ grammars designed to identify NPs in their general form, inspired by the extensive simple Arabic sentence grammar developed by Bourahma et al. [18]. The graphs shown in Figures 9, 10, and 11 illustrate the basic structure of Arabic NPs in both definite and indefinite forms. ...
Article
Complex Arabic sentences, especially those containing Arabic psychological verbs, follow a common underlying structure characterized by two essential components: the predicate and the subject. In addition, there are two optional elements: the head and the complement. These sentences, rooted in basic noun phrases (NPs), can be expanded within the predicate, subject, or complement, resulting in compound structures. This study aims to develop a syntactic analyzer for parsing complex sentences containing Arabic psychological verbs. To achieve this, we will use the dictionary generated from the lexicon-grammar table of Arabic psychological verbs, which contains all lexical, syntactic, semantic, and transformational information related to these verbs. Then, we will extend an existing analyzer to recognize and label all grammatical structures within complex sentences containing Arabic psychological verbs. Finally, we will evaluate the efficiency of this analyzer through tests on different texts and corpora.
Article
The research focus of our paper is twofold: (a) to examine the extent to which simple Arabic sentence structures comply with the Government and Binding (GB) theory, and (b) to implement a simple Arabic Context-Free Grammar (CFG) parser that analyzes input sentence structures in order to improve some Arabic Natural Language Processing (ANLP) applications. Here we present a parser that employs Chomsky’s GB theory to better understand the syntactic structure of Arabic sentences. We consider different simple word orders in Arabic and show how they are derived. We analyze different sentence orders, including Subject-Verb-Object (SVO), Verb-Object-Subject (VOS), Verb-Subject-Object (VSO), nominal sentences, nominal sentences beginning with inna (and sisters), and question sentences. We analyze these structures to develop syntactic rules for a fragment of Arabic grammar. We include two sets of rules: (1) rules on sentence structures that do not account for case and (2) rules on sentence structures that account for the case of Noun Phrases (NPs). We present an implementation of the grammar rules in Prolog. The experiments revealed high accuracy in case assignment in Modern Standard Arabic (MSA) in the light of GB theory, especially when the input sentences are tagged with identification of end cases.
Article
This paper presents a simple parser for Arabic sentences. The aim of this parser is to check whether the syntax of an Arabic sentence is grammatically correct by constructing a new, efficient Context-Free Grammar that makes the top-down technique more valuable. A set of experiments was run on a dataset containing 150 Arabic sentences. The system achieved an average accuracy of 95%.
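As a rough illustration of top-down grammaticality checking with a context-free grammar, the sketch below implements a backtracking top-down recognizer in Python. It is not the parser from the paper: the toy grammar, the part-of-speech tags, and the names `GRAMMAR`, `expand`, and `is_grammatical` are assumptions chosen for the example, and the input is assumed to be already tagged.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Toy grammar over part-of-speech tags (assumed for illustration only).
# It has no left recursion, so the top-down search always terminates.
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "S": [("V", "NP"), ("V", "NP", "NP"), ("NP", "NP")],
    "NP": [("N", "ADJ"), ("N",)],
}


def expand(symbol: str, tags: Sequence[str], pos: int) -> Iterator[int]:
    """Yield every end position reachable by deriving `symbol` from tags[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal symbol: it must match the tag at the current position.
        if pos < len(tags) and tags[pos] == symbol:
            yield pos + 1
        return
    # Non-terminal: try each production in turn (backtracking search).
    for production in GRAMMAR[symbol]:
        positions = [pos]
        for child in production:
            positions = [
                end for start in positions for end in expand(child, tags, start)
            ]
        yield from positions


def is_grammatical(tags: Sequence[str]) -> bool:
    """True when the whole tag sequence can be derived from the start symbol S."""
    return any(end == len(tags) for end in expand("S", tags, 0))
```

The recognizer starts from the start symbol and expands productions toward the input, which is what makes it top-down; a sequence is accepted only when some derivation consumes every tag.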
Chapter
Natural Language Processing (NLP) applications such as machine translation, question answering, knowledge extraction, and information retrieval require parsing as an essential step. In this paper, we present a parser to analyze simple Arabic nominal sentences using the NooJ platform. To this end, we propose a well-classified NooJ dictionary that includes most syntactic and semantic features. We also present the rule describing the Arabic sentence. Then, we implement the parser, which recognizes and annotates all possible grammatical structures of simple Arabic nominal sentences. We implement a set of transducers modeling Arabic lexical and syntactic constraints; these constraints reduce parsing ambiguity. Our parser is tested on many sentences extracted from real texts. The experimental results show the effectiveness of the proposed parser for analyzing simple Arabic nominal sentences.
Chapter
In this paper, we present a NooJ parser for simple Arabic verbal sentences. This parser is based on a dependency grammar established by the attribution (al-’isnād) concept in the Arabic language. In the first part of this paper, we present a syntactic and semantic classification of Arabic words that allows Arabic sentence parsing. In the second part, we present the structure shared by simple Arabic sentences. Furthermore, we use this structure to implement a simple Arabic verbal sentence grammar using the NooJ platform. Our parser is applied to the input sentence after two required steps: morphological analysis and morphological disambiguation. The proposed parser generates the possible parse tree(s) of the input sentence and annotates all sentence components with their grammatical functions. The implemented parser is tested on a selected text; experimental results show its efficacy.
Book
This book is at the very heart of linguistics. It provides the theoretical and methodological framework needed to create a successful linguistic project. The author provides linguists with tools to help them formalize natural languages and aid in the building of software able to automatically process texts written in natural language (Natural Language Processing, or NLP). Computers are a vital tool for this, as characterizing a phenomenon using mathematical rules leads to its formalization. NooJ – a linguistic development environment software developed by the author – is described and practically applied to examples of NLP.
Article
The aim of this paper is to describe a technique for identifying the sources of several types of syntactic ambiguity in Arabic Sentences with a single parse only. Normally, any sentence with two or more structural representations is said to be syntactically ambiguous. However, Arabic sentences with only one structural representation may be ambiguous. Our technique for identifying Syntactic Ambiguity in Single-Parse Arabic Sentences (SASPAS) analyzes each sentence and verifies the conditions that govern the existence of certain types of syntactic ambiguities in Arabic sentences. SASPAS is integrated with the syntactic parser, which is based on Definite Clause Grammar (DCG) formalism. The system accepts Arabic sentences in their original script.