Abed Alhakim Freihat

Università degli Studi di Trento | UNITN · Department of Information Engineering and Computer Science

Doctor of Philosophy

About

31
Publications
20,692
Reads
228
Citations
Introduction
Abed Alhakim Freihat currently works at the ICT, Università degli Studi di Trento. Abed Alhakim does research in lexical semantics and natural language processing. His current project is 'ALP: an Arabic Linguistic Pipeline.'
Additional affiliations
August 2016 - present
Università degli Studi di Trento
Position
  • Researcher
Description
  • My research involves lexical database research, concept visualization, multilingual natural language processing tools, and development of user interfaces for language and data visualization
September 2003 - September 2008
Arab American University
Position
  • Instructor
Description
  • Taught undergraduate courses such as Programming Fundamentals I and II, Object-Oriented Programming, Data Structures, Algorithms, and Theory of Computation.
Education
January 2010 - April 2014
Università degli Studi di Trento
Field of study
  • Information and Communication Technologies
January 1996 - June 2003
Universität des Saarlandes
Field of study
  • Computational Linguistics

Publications (31)
Preprint
Full-text available
This paper describes a method to enrich lexical resources with content relating to linguistic diversity, based on knowledge from the field of lexical typology. We capture the phenomenon of diversity through the notions of lexical gap and language-specific word and use a systematic method to infer gaps semi-automatically on a large scale. As a first...
Conference Paper
Full-text available
Measuring the quality of lexical-semantic resources is a challenging problem. In this paper, we describe a general approach for quality evaluation in lexical-semantic resources in terms of the quality of their synsets. We also introduce a complete definition for the quality of lexical-semantic resources as a set of synset incorrectness, incomplete...
Conference Paper
Full-text available
With the increase of the lexical-semantic resources built over time, lexicon content quality has gained significant attention from Natural Language Processing experts such as lexicographers and linguists. Estimating lexicon quality components like synset lemmas, synset gloss, or synset relations is a challenging research problem for Natural Languag...
Conference Paper
The emergence of Multi-task learning (MTL) models in recent years has helped push the state of the art in Natural Language Understanding (NLU). We strongly believe that many NLU problems in Arabic are especially poised to reap the benefits of such models. To this end, we propose the Arabic Language Understanding Evaluation Benchmark (ALUE), based o...
Conference Paper
Full-text available
We present a new wordnet resource for Scottish Gaelic, a Celtic minority language spoken by about 60,000 speakers, most of whom live in Northwestern Scotland. The wordnet contains over 15 thousand word senses and was constructed by merging ten thousand new, high-quality translations, provided and validated by language experts, with an existing wordn...
Preprint
Full-text available
This paper describes the SemEval-2016 Task 3 on Community Question Answering, which we offered in English and Arabic. For English, we had three subtasks: Question-Comment Similarity (subtask A), Question-Question Similarity (B), and Question-External Comment Similarity (C). For Arabic, we had another subtask: Rerank the correct answers for a ne...
Conference Paper
Full-text available
This paper describes our system and results on the NSURL 2019 Semantic Question Similarity in Arabic task. We considered the solution to this problem from three points of view, adopting three approaches: lexical, statistical and neural. For the lexical approach we applied a set of text similarity measures from the textdistance tools, where the b...
Conference Paper
Full-text available
This paper describes the solution that we propose for the MADAR 2019 Arabic Fine-Grained Dialect Identification task. The proposed solution utilized a set of classifiers that we trained on character and word features. These classifiers are: Support Vector Machines (SVM), Bernoulli Naive Bayes (BNB), Multinomial Naive Bayes (MNB), Logistic Regression (L...
Preprint
This paper presents ALP, an entirely new linguistic pipeline for natural language processing of text in Modern Standard Arabic. Contrary to the conventional pipeline architecture, we solve common NLP operations of word segmentation, POS tagging, and named entity recognition as a single sequence labeling task. Based on this single component, we al...
Conference Paper
Full-text available
In this paper we discuss several models we used to classify 25 city-level Arabic dialects in addition to Modern Standard Arabic (MSA) as part of the MADAR shared task (sub-task 1). We propose an ensemble model of a group of experimentally designed best-performing classifiers on a varied set of features. Our system achieves an accuracy of 69.3% macro F1-scor...
Presentation
Full-text available
Lemmatization—computing the canonical forms of words in running text—is an important component in any NLP system and a key preprocessing step for most applications that rely on natural language understanding. In the case of Arabic, lemmatization is a complex task because of the rich morphology, agglutinative aspects, and lexical ambiguity due to th...
Conference Paper
Full-text available
Lemmatization (computing the canonical forms of words in running text) is an important component in any NLP system and a key preprocessing step for most applications that rely on natural language understanding. In the case of Arabic, lemmatization is a complex task because of the rich morphology, agglutinative aspects, and lexical ambiguity due to t...
Conference Paper
Full-text available
In this paper, we investigate a set of methods for textual Arabic Dialect Identification, considering both word-level and sentence-level approaches. We used three classifiers: Linear Support Vector Machine (L-SVM), Bernoulli Naive Bayes (BNB) and Multinomial Naive Bayes (MNB). We then combined them by using a voting procedure. We carried out e...
Data
The presentation of the paper "A Single-Model Approach for Arabic Segmentation, POS-Tagging and Named Entity Recognition", given at the 2nd International Conference on Natural Language and Speech Processing (ICNLSP 2018), 25-26 April 2018, Algiers, Algeria.
Conference Paper
Full-text available
This paper presents an entirely new, one-million-word annotated corpus for a comprehensive, machine-learning-based preprocessing of text in Modern Standard Arabic. Contrary to the conventional pipeline architecture, we solve the NLP tasks of word segmentation, POS tagging and named entity recognition as a single sequence labeling task. This singl...
Data
The presentation of the paper "Using Grice Maxims In Ranking Community Question Answers" given at "The Tenth International Conference on Information, Process, and Knowledge Management, eKNOW 2018, March 25, 2018 to March 29, 2018 - Rome, Italy"
Presentation
Full-text available
Presenting the paper "Using Grice Maxims In Ranking Community Question Answers"
Conference Paper
Full-text available
Community question answering portals and forum websites are becoming prominent resources of knowledge and experience exchange, and such platforms are becoming invaluable information mines. Getting to this information in such knowledge mines is not trivial and is fraught with difficulties and challenges. One of these difficulties is to discover the rel...
Presentation
Full-text available
A tutorial presented at "the Tenth International Conference on Information, Process, and Knowledge Management eKNOW 2018"
Conference Paper
Full-text available
We present a large scale multilingual lexical resource, the Universal Knowledge Core (UKC), which is organized like a Wordnet with, however, a major design difference. In the UKC the meaning of words is represented not only with synsets but also using language independent concepts which cluster together the synsets which, in different languages, co...
Article
Full-text available
Lemmatization—computing the canonical forms of words in running text—is an important component in any NLP system and a key preprocessing step for most applications that rely on natural language understanding. In the case of Arabic, lemmatization is a complex task because of the rich morphology, agglutinative aspects, and lexical ambiguity due to th...
Conference Paper
Full-text available
In this paper, we present a community answers ranking system which is based on Grice Maxims. In particular, we describe a ranking system which is based on answer relevancy scores, assigned by three main components: named entity recognition, similarity score, and sentiment analysis.
Presentation
Full-text available
Presentation of the paper "A Taxonomic Classification of WordNet Polysemy Types"
Conference Paper
Full-text available
WordNet represents polysemous terms by capturing the different meanings of these terms at the lexical level, but without giving emphasis on the polysemy types such terms belong to. The state of the art polysemy approaches identify several polysemy types in WordNet but they do not explain how to classify and organize them. In this paper, we presen...
Conference Paper
Full-text available
Sense enumeration in WordNet is one of the main reasons behind the problem of the high polysemous nature of WordNet. Sense enumeration refers to misconstruction that results in wrongly assigning a synset to a term. In this paper, we propose a semi-automatic process to discover and solve the problem of sense enumerations in compound noun polysemy...
Thesis
Full-text available
Polysemy in WordNet corresponds to various kinds of linguistic phenomena that can be grouped into five classes. One of them is homonymy that refers to the cases, where the meanings of a term are unrelated, and three of the classes refer to the polysemy cases, where the meanings of a term are related. These three classes are specialization polysemy,...
Article
Full-text available
WordNet represents the polysemous terms by capturing the different meanings of them at lexical level implicitly without giving emphasis on the polysemy types they belong to. This problem affects the usability of WordNet as a suitable knowledge representation resource for Natural Language Processing applications. The current work presents pattern ba...
Conference Paper
Full-text available
Specialization polysemy refers to the type of polysemy, when a term is used to refer to either a more general meaning or to a more specific meaning. Although specialization polysemy represents a large set of the polysemous terms in WordNet, no comprehensive solution has been introduced yet. In this paper we present a novel approach that discovers a...
Conference Paper
Full-text available
WordNet has been used widely in natural language processing and semantic applications. Despite the reputation of WordNet, the polysemy problem that leads to insufficient quality of applications results is still unsolved. Many approaches have been suggested. However, none of them give a comprehensive solution to the problem. In this paper, we introd...

Projects (10)
Project
The 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022) aims to attract contributions related to natural language and speech processing, in both basic theories and applications. Regular and poster sessions will be organized, in addition to keynotes presented by senior international researchers. icnlsp.org
Project
ICNLSP 2021, the fourth edition of the International Conference on Natural Language and Speech Processing, will be organized by the University of Trento. ICNLSP aims to attract contributions related to natural language and speech processing, in both basic theories and applications. Regular and poster sessions will be organized, in addition to keynotes presented by senior international researchers. The second edition of the NSURL workshop will be colocated with ICNLSP 2021. Authors are invited to present their work relevant to the topics of the conference.
TOPICS: The topics of ICNLSP 2021 include, but are not limited to:
  • Signal processing, acoustic modeling
  • Architecture of speech recognition system
  • Deep learning for speech recognition
  • Analysis of speech
  • Paralinguistics in Speech and Language
  • Pathological speech and language
  • Speech coding
  • Speech comprehension
  • Summarization
  • Speech Translation
  • Speech synthesis
  • Speaker and language identification
  • Phonetics, phonology and prosody
  • Cognition and natural language processing
  • Text categorization
  • Sentiment analysis and opinion mining
  • Computational Social Web
  • Arabic dialects processing
  • Under-resourced languages: tools and corpora
  • New language models
  • Arabic OCR
  • Lexical semantics and knowledge representation
  • Requirements engineering and NLP
  • NLP tools for software requirements and engineering
  • Knowledge fundamentals
  • Knowledge management systems
  • Information extraction
  • Data mining and information retrieval
  • Machine translation
PUBLICATION: All the accepted papers will be published in ACL Anthology, and indexed in DBLP.
KEYNOTE SPEAKERS
  • PD Dr. Valia Kordoni, Humboldt University, Germany.
  • Dr. Ahmed Abdelali, QCRI, Qatar.
  • Dr. Hussein Al-Natsheh, Beyond Limits.
  • Dr. Kareem Darwish, DCRI, Qatar.
IMPORTANT DATES
  • New Submission deadline: 12 August 2021
  • Notification of acceptance: 30 September 2021
  • Camera-ready paper due: 15 October 2021
  • Conference dates: 12, 13 November 2021