Hazem M. Hajj

American University of Beirut | AUB · Department of Electrical and Computer Engineering

About

139
Publications
55,767
Reads
2,226
Citations
Additional affiliations
January 2008 - present
American University of Beirut
Position
  • Professor (Associate)
September 1996 - January 2008
Intel Corporation
Position
  • Research Director

Publications (139)
Article
Residential short-term load forecasting has become an essential process to develop successful demand response strategies, and help utilities and customers optimize energy production and consumption. Most previous works focused on capturing the spatial and temporal characteristics of residential load data but fell short in accurately comprehending i...
Article
Traditionally, feature selection is conducted by first deriving a candidate list of features, then ranking and selecting the top features based on a predefined threshold. These methods are highly dependent on the choice of the threshold, and therefore lead to sub-optimal text categorization results. In this paper, we address the selection problem by...
Article
In many real-world scenarios, machine learning models fall short in prediction performance due to data characteristics changing from training on one source domain to testing on a target domain. There has been extensive research to address this problem with Domain Adaptation (DA) for learning domain invariant features. However, when considering adva...
Article
Full-text available
Background: In the last decade, much attention has been given to developing artificial intelligence (AI) solutions for mental health using machine learning. To build trust in AI applications, it is crucial for AI systems to give practitioners and patients the reasons behind the AI decisions. This is referred to as Explainable AI. While there...
Article
Full-text available
Mobile devices and sensors have limited battery lifespans, limiting their feasibility for context recognition applications. As a result, there is a need to provide mechanisms for energy-efficient operation of sensors in settings where multiple contexts are monitored simultaneously. Past methods for efficient sensing operation have been hierarchical...
Conference Paper
Full-text available
During seizures, different types of communication between different parts of the brain are characterized by many state-of-the-art connectivity measures. We propose to employ a set of undirected (spectral matrix, the inverse of the spectral matrix, coherence, partial coherence, and phase-locking value) and directed features (directed coherence, the...
Preprint
Full-text available
Enabling empathetic behavior in Arabic dialogue agents is an important aspect of building human-like conversational models. While Arabic Natural Language Processing has seen significant advances in Natural Language Understanding (NLU) with language models such as AraBERT, Natural Language Generation (NLG) remains a challenge. The shortcomings of NL...
Preprint
Full-text available
Advances in English language representation enabled a more sample-efficient pre-training task by Efficiently Learning an Encoder that Classifies Token Replacements Accurately (ELECTRA). Instead of training a model to recover masked tokens, ELECTRA trains a discriminator model to distinguish true input tokens from corrupted tokens that were replac...
Preprint
Recently, pretrained transformer-based architectures have proven to be very efficient at language modeling and understanding, given that they are trained on a large enough corpus. Applications in language generation for Arabic are still lagging in comparison to other NLP advances, primarily due to the lack of advanced Arabic language generation model...
Conference Paper
Full-text available
Conversational models have witnessed a significant research interest in the last few years with the advancements in sequence generation models. A challenging aspect in developing human-like conversational models is enabling the sense of empathy in bots, making them infer emotions from the person they are interacting with. By learning to develop emp...
Article
Full-text available
The success of Natural Language Processing (NLP) models, just like that of all advanced machine learning models, relies heavily on large-scale lexical resources. For English, English WordNet (EWN) is a leading example of a large-scale resource that has enabled advances in Natural Language Understanding (NLU) tasks such as word sense disambiguation, question ans...
Article
There have been significant advances in machine learning due to the profusion in data collection and computing resources. However, the need for large annotated datasets to train machine learning models remains a problematic constraint. To address the limitation of annotated data for personalized prediction, we propose a framework to enrich annotate...
Article
In this paper, we address the problem of recognizing semantic human activities through the analysis of a large dataset collected from users’ sensor-based smartphones. Our approach is unique in terms of covering a large number of activities that users could possibly engage in, and considering the multi-level-based classification model. Our model h...
Article
Full-text available
Epilepsy is a chronic medical condition that involves abnormal brain activity causing patients to lose control of awareness or motor activity. As a result, detection of pre-ictal states, before the onset of a seizure, can be life-saving. The problem is challenging since it is difficult to discern between EEG signals in pre-ictal states versus signa...
Conference Paper
Full-text available
The Arabic language is a morphologically rich language with relatively few resources and a less explored syntax compared to English. Given these limitations, Arabic Natural Language Processing (NLP) tasks like Sentiment Analysis (SA), Named Entity Recognition (NER), and Question Answering (QA), have proven to be very challenging to tackle. Recently...
Conference Paper
Full-text available
This paper presents state of the art methods for addressing three important challenges in automated fake news detection: fake news detection, domain identification, and bot identification in tweets. The proposed solutions achieved first place in a recent international competition on fake news. For fake news detection, we present two models. The win...
Preprint
Full-text available
The Arabic language is a morphologically rich and complex language with relatively few resources and a less explored syntax compared to English. Given these limitations, tasks like Sentiment Analysis (SA), Named Entity Recognition (NER), and Question Answering (QA) have proven to be very challenging to tackle. Recently, with the surge of transf...
Article
With the advances in wireless technologies, there has been a tremendous increase in user demand for resource-intensive mobile Internet services. This has been coupled with the limited capabilities of mobile devices and the high level of user expectations in terms of both quality of experience (QoE) and cost. An attractive enhancement technique is to...
Conference Paper
Domain Adaptation techniques remain limited in accuracy and robustness due to data sparsity. In this paper, we present a new approach called Domain Adversarial network with Representation Learning (DARL), to improve domain adaptation by introducing an encoding layer as part of DARL model learning. We integrate a Stacked Denoising Autoencoder and Ad...
Conference Paper
Full-text available
Image Captioning (IC) is the process of automatically augmenting an image with semantically-laden descriptive text. While English IC has made remarkable strides forward in the past decade, very little work exists on IC for other languages. One possible solution to this problem is to bootstrap off of existing English IC systems for image understandin...
Article
In recent years, concerns about global warming have encouraged researchers to incorporate optimization techniques into the design of energy-efficient buildings. To design a building with energy consumption in mind, design teams should apply the appropriate considerations at the early architectural conceptual design stage. Furthermore, constructing...
Patent
Full-text available
The Multi-Device Continuum and Seamless Sensing Platform for Context Aware Analytics provides a platform for continuous sensing across multiple devices towards a unified target. The Multi-Device Continuum and Seamless Sensing Platform provides a platform for extracting, loading, integrating, and tracking related data across multiple smart devices ca...
Conference Paper
While transfer learning for text has been very active in the English language, progress in Arabic has been slow, including the use of Domain Adaptation (DA). Domain Adaptation is used to generalize the performance of any classifier by trying to balance the classifier's accuracy for a particular task among different text domains. In this paper, we...
Conference Paper
Full-text available
Arabic is a complex language with limited resources, which makes it challenging to perform text classification tasks such as sentiment analysis accurately. The utilization of transfer learning (TL) has recently shown promising results for advancing the accuracy of text classification in English. TL models are pre-trained on large corpora, and then fine-tu...
Conference Paper
Full-text available
A significant portion of data generated on blogging and microblogging websites is non-credible, as shown in many recent studies. To filter out such non-credible information, machine learning can be deployed to build automatic credibility classifiers. However, as is the case with most supervised machine learning approaches, a sufficiently large and a...
Preprint
Full-text available
This paper tackles the problem of open domain factual Arabic question answering (QA) using Wikipedia as our knowledge source. This constrains the answer of any question to be a span of text in Wikipedia. Open domain QA for Arabic entails three challenges: annotated QA datasets in Arabic, large scale efficient information retrieval and machine readi...
Article
Full-text available
Opinion mining or sentiment analysis continues to gain interest in industry and academia. While there has been significant progress in developing models for sentiment analysis, the field remains an active area of research for many languages across the world, and in particular for the Arabic language which is the 5th most spoken language, and has b...
Article
Convolutional Neural Network (ConvNet or CNN) algorithms are characterized by a large number of model parameters and high computational complexity. These two requirements have made it challenging for implementations on resource-limited FPGAs. The challenges are magnified when considering designs for low-end FPGAs. While previous work has demonstrat...
Article
In recent years, there has been a significant growth of context-aware applications, which extract the user's context from multiple embedded sensors in smartphones and wearable sensors. However, running multiple context-aware applications simultaneously causes extensive battery drainage for mobile devices. To alleviate the energy limitation in multi...
Article
Full-text available
With advancements in compute-intensive and memory-bound applications, the need for faster and more energy-efficient processing platforms continues. In support of these advancements, heterogeneous platforms have been proposed to enhance the performance and efficiency in the cloud. These platforms include field programmable gate arrays (FPGAs) and gra...
Article
Traffic offloading via device-to-device communications is expected to play a major role to meet the exponential data traffic growth in wireless networks. In this work, we focus on the problem of user capacity maximization in ultra dense heterogeneous networks with device-to-device cooperation, where a large number of users in a given geographical a...
Conference Paper
Full-text available
While significant progress has been achieved for Opinion Mining in Arabic (OMA), very limited efforts have been put towards the task of Emotion mining in Arabic. In fact, businesses are interested in learning a fine-grained representation of how users are feeling towards their products or services. In this work, we describe the methods used by the...
Conference Paper
Full-text available
With the advancement of Web 2.0, social networks experienced a great increase in the number of active users reaching 2 billion active users on Facebook at the end of 2017. Consequently, the size of text data on the Internet increased tremendously. This textual data is rich in knowledge, which attracted many data scientists as well as computational...
Conference Paper
Full-text available
Sentiment analysis is a highly subjective and challenging task. Its complexity further increases when applied to the Arabic language, mainly because of the large variety of dialects that are unstandardized and widely used in the Web, especially in social media. While many datasets have been released to train sentiment classifiers in Arabic, most of...
Preprint
Sentiment analysis is a highly subjective and challenging task. Its complexity further increases when applied to the Arabic language, mainly because of the large variety of dialects that are unstandardized and widely used in the Web, especially in social media. While many datasets have been released to train sentiment classifiers in Arabic, most of...
Article
In this work, we propose a high performance distributed system that consists of several middleware servers each connected to a number of FPGAs with extended solid state storage that we call reconfigurable active solid state device (RASSD) nodes. A full data communication solution between middleware and RASSD nodes is presented. We use seismic data...
Article
Full-text available
Sentiment analysis in Arabic is challenging due to the complex morphology of the language. The task becomes more challenging when considering Twitter data that contain significant amounts of noise such as the use of Arabizi, code-switching and different dialects that vary significantly across the Arab world, the use of non-textual objects to expr...
Chapter
This work addresses the problem of activity recognition using data collected from the user's mobile phone. We begin by reviewing and assessing the limitations of common activity recognition approaches for mobile phones. We then present our approach for recognizing a large number of act...
Conference Paper
Full-text available
While sentiment analysis in English has achieved significant progress, it remains a challenging task in Arabic given the rich morphology of the language. It becomes more challenging when applied to Twitter data that comes with additional sources of noise including dialects, misspellings, grammatical mistakes, code switching and the use of non-textu...
Article
Full-text available
Accurate sentiment analysis models encode the sentiment of words and their combinations to predict the overall sentiment of a sentence. This task becomes challenging when applied to morphologically rich languages (MRL). In this article, we evaluate the use of deep learning advances, namely the Recursive Neural Tensor Networks (RNTN), for sentiment...
Article
Full-text available
While research on English opinion mining has already achieved significant progress and success, work on Arabic opinion mining is still lagging. This is mainly due to the relative recency of research efforts in developing natural language processing (NLP) methods for Arabic, handling its morphological complexity, and the lack of large-scale opinion...
Article
Monitoring context depends on continuous collection of raw data from sensors which are either embedded in smart mobile devices or worn by the user. However, continuous sensing constitutes a major source of energy consumption; on the other hand, lowering the sensing rate may lead to missing the detection of critical contextual events. In this paper,...
Chapter
Full-text available
Degradomics is a novel discipline that involves determination of the proteases/substrate fragmentation profile, called the substrate degradome, and has been recently applied in different disciplines. A major application of degradomics is its utility in the field of biomarkers where the breakdown products (BDPs) of different proteases have been inves...
Conference Paper
Full-text available
Opinion mining in Arabic is a challenging task given the rich morphology of the language. The task becomes more challenging when it is applied to Twitter data, which contains additional sources of noise, such as the use of unstandardized dialectal variations, the non-conformation to grammatical rules, the use of Arabizi and code-switching, and the...
Article
While cloud computing has provided major benefits by maximizing the use of resources within a cloud, the current solutions still face many challenges. In this paper, we propose performance enhancements for cloud computations, provided by integrating hardware acceleration into the computation services. We extend the Hadoop framework by adding provis...
Article
Due to the exploding traffic demands with the ubiquitous anticipated spread of 5G and Internet of Things, research has been active to devise mechanisms for meeting these demands while maintaining high quality user experience. In support of this direction, 3GPP is working towards cellular/WiFi interworking in heterogeneous networks to boost throughp...
Article
Full-text available
This article introduces a sentiment analysis approach that adopts the way humans read, interpret, and extract sentiment from text. Our motivation builds on the assumption that human interpretation should lead to the most accurate assessment of sentiment in text. We call this automated process Human Reading for Sentiment (HRS). Previous research in...
Conference Paper
Since human emotions play a central role in everyday decisions and well-being, developing systems for recognizing and managing human emotions has captured significant research interest in the last decade. However, there is limited research on studying emotion recognition from human-computer interaction (HCI) in natural settings. This work aims at p...
Article
Huge amounts of data and personal information are being sent to and retrieved from web applications on a daily basis. Every application has its own confidentiality and integrity policies. Violating these policies can have a broad negative impact on the involved company's financial status, while enforcing them is very hard even for the developers with g...
Article
We propose a high performance distributed system that consists of several middleware servers (MWS) each connected to a number of FPGAs with extended solid state storage that we call reconfigurable active solid state device (RASSD) nodes. A MWS manages a group of RASSD nodes and bridges the connection between a client and the RASSD nodes within a co...
Article
Full-text available
Programming FPGAs requires advanced hardware design skills which limits their adoption in data centres. FPGA vendors have provided high level synthesis (HLS) tools to build register transfer level (RTL) specifications from designs provided in high level languages. We present a suite of C and C++-based hardware accelerators for the Purdue MapReduce...