Mahmoud EL-Haj
PhD candidate at the School of Computer Science and Electronic Engineering at Essex University, UK.
Main field: Arabic multi-document summarisation.
Research skills
-
Technical: SVM, ROUGE, Vibrating Sample Magnetometer
-
IT: Senior Java programmer.
-
Statistical: R statistical package.
-
Other: Microsoft PowerShell, Perl, C++, C#, VB, VBA, VB.NET, ASP.NET.
Research interests
-
Interests: Information Retrieval, Arabic Natural Language Processing, Text Summarization, Natural Language Processing, Corpus Linguistics, Machine Translation
Research experience
-
Teaching: Graduate Teaching/Lab Assistant at Essex University for the CE161, CE154 and CE203 computer courses, which include Java Programming, Web Development and Digital Systems Architecture.
-
Sep 2011–Nov 2011
Research: MEDIE Search Engine
National Institute of Informatics · Natural Language Processing (NLP) · Tokyo
Working on clustering and summarising the results of the syntactic and semantic search engine (MEDIE) that searches millions of medical journals.
-
Jul 2011–Jul 2011
Research: Hadoop Hackathon
Edinburgh University · Informatics · NLP · Edinburgh
Working on extracting useful information from a sea of garbage data (a corpus of billions of words). The project was a two-day hackathon; we were a group of five people from five different universities. Our group was successful in extracting (statistically) useful information before the end of the hackathon. There was no competition between groups, as the idea was to learn how to use Hadoop.
-
Feb 2011–May 2012
Research: Upgrade the Archive's Systems and Preservation Service
UK Data Archive · Digital Preservation and Systems
UK's largest collection of social and economic data
Education
-
Jan 2009–Nov 2011
Essex University
Arabic Multi-document Text Summarisation · PhD
United Kingdom · Colchester
Awards & achievements
-
Nov 2009
Award: Best Paper Award at LTC 2009, Poznań, Poland.
Other
-
Languages: Arabic, English
-
Scientific Memberships: Language and Computation Group (LAC), Essex University.
http://cswww.essex.ac.uk/LAC/
Publications
-
Assessing Crowdsourcing Quality through Objective Tasks
Language Resources and Evaluation (LREC 2012); 01/2012
Exploring the possibilities and limits of crowd sourcing methods for creating NLP resources is a major research task at the moment. The paper presents experiments on the influence of the presentation method and the payment of the workers, respectively.
-
Exploring Clustering for Multi-document Arabic Summarisation.
Information Retrieval Technology - 7th Asia Information Retrieval Societies Conference, AIRS 2011, Dubai, United Arab Emirates, December 18-20, 2011. Proceedings; 01/2011
-
TAC 2011 MultiLing Pilot Overview
Text Analysis Conference (TAC 2011), MultiLing Summarisation Pilot, Maryland, USA; 01/2011
The Text Analysis Conference MultiLing Pilot of 2011 posed a multi-lingual summarization task to the summarization community, aiming to quantify and measure the performance of multi-lingual, multi-document summarization systems. The task was to create a 240-250 word summary from 10 news texts, describing a given topic. The texts of each topic were provided in seven languages (Arabic, Czech, English, French, Greek, Hebrew, Hindi) and each participant generated summaries for at least 2 languages. The evaluation of the summaries was performed using automatic (AutoSummENG, Rouge) and manual processes (Overall Responsiveness score). The participating systems were 8, some of which providing summaries across all languages. This paper provides a brief description for the collection of the data, the evaluation methodology, the problems and challenges faced, and an overview of participation and corresponding results.
-
University of Essex at the TAC 2011 MultiLingual Summarisation Pilot
Text Analysis Conference (TAC 2011), MultiLing Summarisation Pilot, Maryland, USA; 01/2011
We present the results of our Arabic and English runs at the TAC 2011 Multilingual summarisation (MultiLing) task. We participated with centroid-based clustering for multidocument summarisation. The automatically generated Arabic and English summaries were evaluated by human participants and by two automatic evaluation metrics, ROUGE and AutoSummENG. The results are compared with the other systems that participated in the same track on both Arabic and English languages. Our Arabic summariser performed particularly well in the human evaluation.
-
Multi-Document Arabic Text Summarisation
3rd Computer Science and Electronic Engineering Conference (CEEC'11), Colchester, UK; 01/2011
-
Understanding the Quran: A New Grand Challenge for Computer Science and Artificial Intelligence
Grand Challenges in Computing Research for 2010 and beyond, Edinburgh University; 01/2010
-
Using Mechanical Turk to Create a Corpus of Arabic Summaries
Language Resources (LRs) and Human Language Technologies (HLT) for Semitic Languages workshop held in conjunction with the 7th International Language Resources and Evaluation Conference (LREC 2010), Valletta, Malta; 01/2010
-
Experimenting with Automatic Text Summarisation for Arabic.
Human Language Technology. Challenges for Computer Science and Linguistics - 4th Language and Technology Conference, LTC 2009, Poznan, Poland, November 6-8, 2009, Revised Selected Papers; 01/2009
-
Experimenting with Automatic Text Summarization for Arabic
Proceedings of the 4th Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, LTC'09, Poznan, Poland; 01/2009
-
Enhancing Retrieval Effectiveness of Diacritisized Arabic Passages Using Stemmer And Thesaurus
Proceedings of the 19th Midwest Artificial Intelligence and Cognitive Science Conference, Ohio, USA; 01/2008
-
Evaluation of Query-Based Arabic Text Summarization System
Proceedings of the IEEE International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE'08, Beijing, China; 01/2008
-
Effectiveness of Query Expansion in searching the Holy Quran
Proceedings of the Second International Conference on Arabic Language Processing, Rabat, Morocco; 01/2007