Amir Atiya

Cairo University | CU · Department of Computer Engineering

B.S. Cairo University, Ph.D. Caltech

About

222
Publications
135,607
Reads
8,076
Citations
Additional affiliations
January 2001 - present
Texas A&M University
March 1993 - December 2009
Cairo University
September 1985 - September 1990
California Institute of Technology


Publications (222)
Preprint
Imbalanced data is a frequently encountered problem in machine learning. Despite a vast amount of literature on sampling techniques for imbalanced data, there is a limited number of studies that address the issue of the optimal sampling ratio. In this paper, we attempt to fill the gap in the literature by conducting a large scale study of the effec...
Article
Full-text available
Word embedding is the mapping of words into vectors in an N-dimensional space. ArSphere is an approach that designs word embeddings for the Arabic language. This approach overcomes one of the shortcomings of word embeddings (for the English language too), namely their inability to handle opposites (and differentiate those from unrelated word pairs)...
Article
Full-text available
The price demand relation is a fundamental concept that models how price affects the sale of a product. It is critical to have an accurate estimate of its parameters, as it will impact the company’s revenue. The learning has to be performed very efficiently using a small window of a few test points, because of the rapid changes in price demand para...
Article
Full-text available
In this work, we propose a novel recommender system model based on a technology commonly used in natural language processing called word vector embedding. In this technology, a word is represented by a vector that is embedded in an n-dimensional space. The distance between two vectors expresses the level of similarity/dissimilarity of their underly...
Article
Full-text available
Dynamic pricing is a beneficial strategy for firms seeking to achieve high revenues. It has been widely applied to various domains such as the airline industry, the hotel industry, and e-services. Dynamic pricing is basically the problem of setting time-varying prices for a certain product or service for the purpose of optimizing revenue. However,...
Chapter
Full-text available
Hotel reviews are an important driving factor for the hotel business. They can help guests make informed hotel selections, and help hotels address their deficiencies and improve their performance. In this paper, we propose an opinion mining approach that is applied to hotel reviews. The approach combines both lexical and word-vector methods to clas...
Article
Scaling up support vector machine (SVM) for large data sets remains one of its main challenges. One way to achieve this is to break down the problem into smaller ones using clustering techniques where local SVM models are constructed. Although this approach is considerably fast compared to the standard SVM, its performance is sometimes inferior eve...
Article
Full-text available
A variety of screening approaches have been proposed to diagnose epileptic seizures, using electroencephalography (EEG) and magnetic resonance imaging (MRI) modalities. Artificial intelligence encompasses a variety of areas, and one of its branches is deep learning (DL). Before the rise of DL, conventional machine learning algorithms involving feat...
Preprint
Understanding data and reaching valid conclusions are of paramount importance in the present era of big data. Machine learning and probability theory methods have widespread application for this purpose in different fields. One critically important yet less explored aspect is how data and model uncertainties are captured and analyzed. Proper quanti...
Preprint
Full-text available
Deep neural networks (DNNs) have achieved state-of-the-art performance in numerous fields. However, DNNs require long computation times, and people always expect better performance with lower computation. Therefore, we study the human somatosensory system and design a neural network (SpinalNet) to achieve higher accuracy with lower computation tim...
Preprint
A variety of screening approaches have been proposed to diagnose epileptic seizures, using Electroencephalography (EEG) and Magnetic Resonance Imaging (MRI) modalities. Artificial intelligence encompasses a variety of areas, and one of its branches is deep learning. Before the rise of deep learning, conventional machine learning algorithms involvin...
Article
Full-text available
Embedding words from a dictionary as vectors in a space has become an active research field, due to its many uses in several natural language processing applications. Distances between the vectors should reflect the relatedness between the corresponding words. The problem with existing word embedding methods is that they often fail to distinguish b...
Article
Full-text available
Recently, active learning has been considered a promising approach for data acquisition due to the significant cost of the data labeling process in many real-world applications, such as natural language processing and image processing. Most active learning methods are designed merely to enhance the accuracy of the learning model. However, the model accuracy may...
Article
Full-text available
Forecast combinations were big winners in the M4 competition. This note reflects on and analyzes the reasons for the success of forecast combination. We illustrate graphically how and in what cases forecast combinations produce good results. We also study the effects of forecast combination on the bias and the variance of the forecast.
Article
Support vector machine (SVM) has been recently considered as one of the most efficient classifiers. However, the time complexity of kernel SVM, which is quadratic in the number of training patterns, makes it impractical to be applied to large data sets. In such a case, the complexity is further increased when an exhaustive grid search is used to fi...
Article
The maximum drawdown (MDD) is a well-known risk measure extensively used in financial markets. It measures the maximum loss from peak to subsequent valley for a stochastic process. In this work we consider discrete time processes, and derive the probability density of the maximum drawdown in terms of integral equation recursions. This is one of the...
Data
The code for the Exponential smoothing method with maximum likelihood estimation.
Chapter
Full-text available
The growth of social media has made Arabic sentiment analysis an active research area. The challenges lie in the fact that most users write unstructured dialect texts instead of writing in Modern Standard Arabic (MSA). In this paper we address these challenges by comparing two strategies: applying sentiment analysis algorithms directly on t...
Article
Dynamic pricing is the science of pricing a product in a time-varying way for optimising revenue. There is a slow but steady tendency over the last three decades for major businesses to move from fixed pricing to dynamic pricing. In this paper, we consider the problem of dynamic pricing for wireless broadband data. We propose a novel dynamic pricin...
Conference Paper
Full-text available
The growth of social media has made Arabic sentiment analysis an active research area. The challenges lie in the fact that most users write unstructured dialect texts instead of writing in Modern Standard Arabic (MSA). In this paper we address these challenges by comparing two strategies: applying sentiment analysis algorithms directly on...
Article
In this paper, we consider the problems of state estimation and parameter estimation. The goal is to consider robust unscented Kalman filters and demonstrate their successful application to a coupled tank system. Traditional unscented Kalman filters are limited in estimating the state and parameters of a time-varying parameter system due to making...
Research
This issue includes the following articles: P1121606475, Author="M. AbdelDayem and H. Hemeda and A. Sarhan", Title="Enhanced User Authentication through Keystroke Biometrics for Short-Text and Long-Text Inputs"; P1121602462, Author="Eslam Mahmoud and Ahmed M. Elmogy and Amany Sarhan", Title="Enhancing Grid Local Outlier Factor Algorithm for better O...
Article
Text diacritic restoration is a vital problem for languages that use diacritics in their orthographic systems. It plays an important role in improving the performance of many NLP tasks. In this paper, we handle the problem of Arabic text diacritization, such that our system diacritizes an input sequence of words both morphologically a...
Conference Paper
Full-text available
This paper compares seven greedy sparse approximation algorithms with L0-norm regularization for the purpose of time series forecasting. Sparse approximation is used as a method of memory-based learning, where a dictionary is created from the time series lagged vectors, along with their corresponding targets. This dictionary is then used fo...
Conference Paper
In this work, we propose using sparse coding techniques to learn models for time series forecasting. Training data are extracted from the input time series as a set of time-lagged predictors along with their corresponding targets. These time-lagged predictors are sparsely decomposed and transformed into the sparse domain. The...
Article
In this work we consider dynamic pricing for the case of continuous replenishment. An essential ingredient in such a formulation is the use of time normalized revenue or profit function, in other words revenue or profit per unit time. This provides the incentive to sell many items in the shortest time (and of course at a high price). Moreover, for...
Conference Paper
The spread of breast cancer and its high fatality have spurred a lot of research into its causes and treatments. Since the discovery of gene extraction methods, many biomarkers have been investigated and related to cancer. The large number of genes and their intertwining relations necessitate advanced machine learning models, rath...
Conference Paper
Keyphrase extraction has considerable importance in many applications such as search engine optimization, clustering, summarization, and sentiment analysis. The importance of keyphrases comes from the semantic meaning they provide, as they can be used as descriptors for documents. In this paper we compare four approaches for extracting keyphr...
Article
Multistep-ahead forecasts can either be produced recursively by iterating a one-step-ahead time series model or directly by estimating a separate model for each forecast horizon. In addition, there are other strategies; some of them combine aspects of both aforementioned concepts. In this paper, we present a comprehensive investigation into the bia...
Conference Paper
Full-text available
The telecommunications industry is a highly competitive one where operators’ strategies usually rely on significantly reducing the minute rate in order to acquire more subscribers and thus gain a higher market share. However, in the last few years, the number of customers has been increasing noticeably, leading to more stress on the network and higher congestion...
Article
This article introduces a new scheme to express a rectangular function as a linear combination of Gaussian functions. The main idea of this scheme is based on fitting samples of the rectangular function by adapting the well-known clustering algorithm, Gaussian mixture models (GMM). This method has several advantages compared to other existing fitti...
Conference Paper
Full-text available
In this paper a new time-dependent pricing scheme is proposed for revenue management in mobile calls. The proposed scheme considers many essential parameters that affect pricing such as time-of-day seasonality, weekday/weekend seasonality and price demand elasticity for call arrivals and call duration. In this model, each day is partitioned int...
Conference Paper
Full-text available
In this paper we tackle the overbooking problem in hotel revenue management (RM). We propose a simulation-based approach for the overbooking problem. It is based on accurately estimating all the hotel's processes, such as reservation arrivals, cancellations, length of stay, demand seasonality, etc. Subsequently, all these processes are simulated...
Conference Paper
The presence of missing data in time series is a big impediment to the successful performance of forecasting models, as it leads to a significant reduction of useful data. In this work we propose a multiple-imputation-type framework for estimating the missing values of a time series. This framework is based on iterative and successive forward and bac...
Conference Paper
Full-text available
In this work, we address the problem of spelling correction in the Arabic language utilizing the new corpus provided by the QALB (Qatar Arabic Language Bank) project, which is an annotated corpus of sentences with errors and their corrections. The corpus contains edit, add before, split, merge, add after, move and other error types. We are concerned wit...
Article
In this article, we derive a series expansion of the multivariate normal probability integrals based on Fourier series. The basic idea is to transform the limits of each integral from h_i to ∞ to be from -∞ to ∞ by multiplying the integrand by a periodic square wave that approximates the domain of the integral. This square wave is expressed by its...
Article
This paper derives the value of the integral of the product of the error function and the normal probability density as a series of the Hermite polynomial and the normalized incomplete Gamma function. This expression is beneficial, and can be used for evaluating the bivariate normal integral as a series expansion. This expansion is a good alternati...
Article
Full-text available
In this paper, we present a method of determining the parameters of a dynamic system using a state estimate filter. State estimate filters such as the Extended Kalman filter and the Unscented Kalman filter are widely used to estimate the status in robot and GPS navigation systems. However, in dynamic systems, determining parameters is difficult because m...
Conference Paper
Full-text available
We introduce LABR, the largest sentiment analysis dataset to date for the Arabic language. It consists of over 63,000 book reviews, each rated on a scale of 1 to 5 stars. We investigate the properties of the dataset, and present its statistics. We explore using the dataset for two tasks: sentiment polarity classification and rating classifica...
Article
Full-text available
In this article we propose a new dynamic pricing approach for the hotel revenue management problem. The proposed approach is based on having ‘price multipliers’ that vary around ‘1’ and provide a varying discount/premium over some seasonal reference price. The price multipliers are a function of certain influencing variables (for example, hotel occ...
Article
The k-center problem arises in many applications such as facility location and data clustering. Typically, it is solved using a branch and bound tree traversed using the depth first strategy. The reason is its linear space requirement compared to the exponential space requirement of the breadth first strategy. Although the depth first strategy gain...
Article
A method for solving quasiconvex nondifferentiable unconstrained multiobjective optimization problems is proposed in this paper. This method extends the classical subgradient method for real-valued minimization to the multiobjective case. Assuming ...
Article
Gaussian process is a very promising novel technology that has been applied to both the regression problem and the classification problem. While for the regression problem it yields simple exact solutions, this is not the case for the classification problem, because we encounter intractable integrals. In this paper we develop a new derivation that...
Article
Full-text available
In this paper, a novel algorithm is proposed for sampling from discrete probability distributions using the probability proportional to size sampling method, which is a special case of Quota sampling method. The motivation for this study is to devise an efficient sampling algorithm that can be used in stochastic optimization problems --when there i...
Chapter
Quality of Service (QoS) of telecommunication networks could be enhanced by applying predictive control methods. Such controllers rely on utilizing good and fast (real-time) predictions of the network traffic and quality parameters. Accuracy and recall speed of the traditional Neural Network models are not satisfactory to support such critical real...
Chapter
Full-text available
This chapter reviews a recent HONN-like model called the Symbolic Function Network (SFN). This model is designed with the goal of imparting more flexibility than both traditional neural networks and HONNs. The main idea behind this scheme is the fact that different functional forms suit different applications and that no specific architecture is best for...
Conference Paper
Full-text available
In this paper we propose a new unconstraining method for demand forecasting. Since true demand forecasting is a key aspect of hotel room revenue management systems, inaccurate forecasts will significantly impact the performance of these systems. We propose a method based on a Monte Carlo simulation forecasting model and an Expectation-Maximization...
Conference Paper
Penalized likelihood is a general approach whereby an objective function is defined, consisting of the log likelihood of the data minus some term penalizing non-smooth solutions. Subsequently, this objective function is maximized, yielding a solution that achieves some sort of trade-off between the faithfulness and the smoothness of the fit. In thi...
Article
In this paper, the states and parameters in a dynamic system are estimated by applying an Unscented Kalman Filter (UKF). The UKF is widely used in various fields such as sensor fusion, trajectory estimation, and learning of Neural Network weights. These estimations are necessary and important in determining the stability of a mobile system, monitor...
Article
Stock market forecasting has been considered an extremely challenging problem, and its predictability caused a debate that lasted for years. In this paper, we present a hidden Markov model that models the up-trend behavior of stocks. This model makes use of the fact that during up-trends the market frequently pauses and undergoes a pull-back. Altho...
Article
Multi-step ahead forecasting is still an open challenge in time series forecasting. Several approaches that deal with this complex problem have been proposed in the literature but an extensive comparison on a large number of tasks is still missing. This paper aims to fill this gap by reviewing existing strategies for multi-step ahead forecasting an...