Upal Mahbub

Qualcomm · Multimedia R&D

Doctor of Philosophy
Developing Hardware-friendly Deep Learning Solutions for Different XR Use-cases

About

54
Publications
15,637
Reads
751
Citations
Citations since 2017
22 Research Items
597 Citations
[Chart: citations by year, 2017 to 2023]
Additional affiliations
September 2018 - present
Qualcomm
Position
  • Engineer
Description
  • Working at the QCT Multimedia R&D and Standards Department.
May 2017 - August 2017
Comcast
Position
  • Research Intern
January 2015 - April 2015
Google Inc. Sunnyvale
Position
  • Visiting Researcher
Education
November 2009 - November 2011
Bangladesh University of Engineering and Technology
Field of study
  • Electrical and Electronic Engineering
December 2004 - October 2009
Bangladesh University of Engineering and Technology
Field of study
  • Electrical and Electronic Engineering

Publications (54)
Preprint
As a beloved sport worldwide, dancing is getting integrated into traditional and virtual reality-based gaming platforms nowadays. It opens up new opportunities in the technology-mediated dancing space. These platforms primarily rely on passive and continuous human pose estimation as an input capture mechanism. Existing solutions are mainly based on...
Patent
Systems and techniques are provided for registering three-dimensional (3D) images to deformable models. An example method can include determining, based on an image of a target and associated depth information, a 3D mesh of the target; determining different sets of rotation and translation parameters based on modifications to rotation and translati...
Patent
Systems, methods, and non-transitory media are provided for capturing a region of interest (ROI) with a multi-camera system. An example method can include initializing image sensors of an electronic device, each image sensor being initialized in a lower-power mode having a lower power consumption than a higher-power mode supported by one or more of...
Patent
Embodiments include systems and methods that may be performed by a processor of a computing device. Embodiments may be applied for keypoint detection in an image. In embodiments, the processor of the computing device may apply to an image a first-stage neural network to define and output a plurality of regions, apply to each of the plurality of reg...
Preprint
Full-text available
In this paper, we present DIREG3D, a holistic framework for 3D Hand Tracking. The proposed framework is capable of utilizing camera intrinsic parameters, 3D geometry, intermediate 2D cues, and visual information to regress parameters for accurately representing a Hand Mesh model. Our experiments show that information like the size of the 2D hand, i...
Article
A set of advanced approaches and models on human action, activity, gesture, behavior and related aspects are summarized in this note. There are a number of challenges on these domains, and some of these are addressed here. Notably, Video-based human activity recognition, sensor-based activity analysis, skeleton-based activity recognition, assisted...
Patent
A method is presented. The method includes determining a number of landmarks in an image comprising multiple pixels. The method also includes determining a number of channels for the image based on a function of the number of landmarks. The method further includes determining, for each one of the number of channels, a confidence of each pixel of th...
Preprint
Full-text available
Background: The COVID-19 pandemic is rapidly expanding throughout the world right now. Caused by a novel strain of the coronavirus, the manifestation of this pandemic shows a unique level of disease burden and mortality rate in different countries. Objective: In this paper, we investigated the effects of several socioeconomic, environmental, and health...
Chapter
Full-text available
Human activity recognition and analysis have a great number of important applications in numerous fields including computer vision, ubiquitous computing, human-computer interactions, healthcare, robotics, and surveillance. Video-based and sensor-based human activity recognition have progressed tremendously in the last two decades. In this chapter,...
Chapter
Full-text available
Monitoring human activities from a distance without actively interacting with the subjects to make a decision is a fascinating research domain given the associated challenges and prospects of building more robust artificial intelligence systems. In recent years, with the advancement of deep learning and high-performance computing systems, contactle...
Book
This book is a truly comprehensive, timely, and very much needed treatise on the conceptualization of analysis, and design of contactless & multimodal sensor-based human activities, behavior understanding & intervention. From an interaction design perspective, the book provides views and methods that allow for more safe, trustworthy, efficient, and...
Article
Full-text available
Given the relevance of smartphones for accessing personalized services in smart cities, Continuous Authentication (CA) mechanisms are attracting attention to avoid impersonation attacks. Some of them leverage Data Stream Mining (DSM) techniques applied over sensorial information. Injection attacks can undermine the effectiveness of DSM-based CA by...
Preprint
BACKGROUND: The COVID-19 pandemic is rapidly expanding throughout the world right now. Caused by a novel strain of the coronavirus, the manifestation of this pandemic is showing different levels of disease burden and mortality rate in different countries. OBJECTIVE: In this paper, we investigated the effects of several socio-economic, environmental,...
Article
An empirical investigation of active/continuous authentication for smartphones is presented by exploiting users’ unique application usage data, i.e., distinct patterns of use, modeled by a Markovian process. Specifically, variations of Hidden Markov Models (HMMs) are evaluated for continuous user verification, and challenges due to the sparsity of...
Preprint
Full-text available
An empirical investigation of active/continuous authentication for smartphones is presented in this paper by exploiting users' unique application usage data, i.e., distinct patterns of use, modeled by a Markovian process. Variations of Hidden Markov Models (HMMs) are evaluated for continuous user verification, and challenges due to the sparsity of...
Article
Full-text available
State-of-the-art methods of attribute detection from faces almost always assume the presence of a full, unoccluded face. Hence, their performance degrades for partially visible and occluded faces. In this paper, we introduce SPLITFACE, a deep convolutional neural network-based method that is explicitly designed to perform attribute detection in par...
Article
Full-text available
In this paper, targeted fooling of high performance image classifiers is achieved by developing two novel attack methods. The first method generates universal perturbations for target classes and the second generates image specific perturbations. Extensive experiments are conducted on MNIST and CIFAR10 datasets to provide insights about the propose...
Poster
Full-text available
Generic face detection algorithms do not perform very well in the mobile domain due to the significant presence of occluded and partially visible faces. One promising technique to handle the challenge of partial faces is to design face detectors based on facial segments. In this paper, two such face detectors, namely SegFace and DeepSegFace, are propose...
Article
Generic face detection algorithms do not perform well in the mobile domain due to the significant presence of occluded and partially visible faces. One promising technique to handle the challenge of partial faces is to design face detectors based on facial segments. In this paper, two different approaches to facial segment-based face detection are discu...
Article
Full-text available
Generic face detection algorithms do not perform very well in the mobile domain due to the significant presence of occluded and partially visible faces. One promising technique to handle the challenge of partial faces is to design face detectors based on facial segments. In this paper, two such face detectors, namely SegFace and DeepSegFace, are propose...
Article
Full-text available
In this paper, a solution to the problem of Active Authentication using trace histories is presented. Specifically, the task is to perform user verification on mobile devices using historical location traces of the user as a function of time. Considering the movement of a human as a Markovian motion, a modified Hidden Markov Model (HMM)-based solut...
Article
Full-text available
In this paper, automated user verification techniques for smartphones are investigated. A unique non-commercial dataset, the University of Maryland Active Authentication Dataset 02 (UMDAA-02) for multi-modal user authentication research is introduced. This paper focuses on three sensors - front camera, touch sensor and location service while provid...
Conference Paper
In this paper, automated user verification techniques for smartphones are investigated. A unique non-commercial dataset, the Active Authentication Dataset 02 (AA02) for multi-modal user authentication research is introduced. This paper focuses on three sensors - front camera, touch sensor and location service while providing a general description f...
Conference Paper
Full-text available
We consider the problem of handling binary constraints in optimization problems. We review methods for solving binary quadratic programs (BQPs), such as the spectral method and semidefinite programming relaxations. We then discuss two new methods for handling these constraints. The first involves the introduction of an extra unconstrained variable...
Article
Full-text available
In this paper, a part-based technique for real time detection of users' faces on mobile devices is proposed. This method is specifically designed for detecting partially cropped and occluded faces captured using a smartphone's front-facing camera for continuous authentication. The key idea is to detect facial segments in the frame and cluster the r...
Article
Full-text available
In this paper, we propose a novel approach towards human action recognition using spectral domain feature extraction. Action representations can be considered as image templates, which can be useful for understanding various actions or gestures as well as for recognition and analysis. An action recognition scheme is developed based on extracting sp...
Article
Full-text available
In this paper, we propose a novel approach towards human action recognition using multi-resolution feature extraction. It is based on 2D Discrete Wavelet Transform (2D-DWT), where features are extracted from sequential video frames. The proposed feature selection algorithm offers an advantage of very low feature dimension and therefore, lower compu...
Article
Full-text available
An offline single-channel acoustic echo cancellation (AEC) scheme is proposed based on a gradient-based adaptive least mean squares (LMS) algorithm, considering a major practical application of echo cancellation systems: enhancing recorded, echo-corrupted speech data. The unavailability of a reference signal makes the problem of single channel adapti...
Article
Full-text available
In this paper, a two-stage scheme is proposed to deal with the difficult problem of acoustic echo cancellation (AEC) in a single-channel scenario in the presence of noise. In order to overcome the major challenge of getting a separate reference signal in the adaptive filter-based AEC problem, the delayed version of the echo and noise suppressed signal is...
Article
Full-text available
In this paper, a single-channel acoustic echo cancellation (AEC) scheme is proposed using a gradient-based adaptive least mean squares (LMS) algorithm. Unlike the conventional dual-channel problem, by considering a delayed version of the echo-suppressed signal as a reference, a modified objective function is formulated and thereby an LMS update equ...
Article
A new technique for action clustering-based human action representation on the basis of optical flow analysis and random sample consensus (RANSAC) method is proposed in this paper. The apparent motion of the human subject with respect to the background is detected and localized by using optical flow analysis. The next task is to characterize the ac...
Article
This paper proposes a novel approach for gesture recognition from motion depth images based on template matching. Gestures can be represented with image templates, which in turn can be used to compare and match gestures. The proposed method uses a single example of an action as a query to find similar matches and thus termed one-shot-learning gestu...
Conference Paper
This paper proposes a novel approach for human action recognition using multi-resolution feature extraction based on the two-dimensional discrete wavelet transform (2D-DWT). Action representations can be considered as image templates, which can be useful for understanding various actions or gestures as well as for recognition and analysis. An actio...
Conference Paper
This paper deals with the problem of temporal segmentation present in practical applications of action and gesture recognition. In order to separate different gestures from gesture sequences a novel method utilizing depth information, oriented gradients and supervised learning techniques is proposed in this paper. The temporal segmentation task is...
Conference Paper
Full-text available
A novel approach for gesture recognition based on motion history images is proposed in this paper for one-shot learning gesture recognition task. The challenge here is to perform satisfactory recognition operations with only one training example of each action, while no prior knowledge about actions, foreground/background segmentation, or any motio...
Conference Paper
Full-text available
Conventional adaptive echo cancellation schemes are generally suitable for dual-channel communication systems, and their performance degrades significantly in the presence of noise. This paper deals with the challenging task of cancelling both echo and noise in a single-channel communication system. In this regard, first, a gradient based single channe...
Conference Paper
Full-text available
This paper presents an approach to detect voice disorders based on wavelet and prosody-related voice properties. First, several statistical measures of the normalized energy contents of the Discrete Wavelet Transform (DWT) coefficients over all voice frames are determined. Then, similar statistical measures of some prosody-related voice properties,...
Conference Paper
Full-text available
This paper presents the design and construction of a fuzzy-logic-based proportional-integral-derivative (PID) controller for the control of a pulse width modulation (PWM)-based DC-DC buck converter working in Continuous Conduction Mode (CCM). The converter operates at a switching frequency of 100 kHz. Computer simulation is done to fuzzify the inputs...
Conference Paper
Full-text available
A new technique for action clustering-based human action representation on the basis of optical flow analysis and random sample consensus (RANSAC) method is proposed in this paper. The apparent motion of the human subject with respect to the background is detected using optical flow analysis, while the RANSAC algorithm is used to filter out unwante...
Conference Paper
Full-text available
This paper deals with the design and implementation of a series active power filter for power quality improvement. The problem of harmonics due to non-linear loads can be reduced by a series active power filter. P-Q theory is used as the control algorithm in the proposed series active power filter. The performance of the filter is evaluated by monitori...
Conference Paper
Full-text available
A novel approach for gesture recognition is developed in this paper based on template matching from motion depth image. The proposed method uses a single example of an action as a query to find similar matches from a good number of test samples. No prior knowledge about the actions, the foreground/background segmentation, or any motion estimation o...
Article
Full-text available
A new approach for motion-based representation on the basis of optical flow analysis and random sample consensus (RANSAC) method is proposed in this paper. Optical flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene. It is intui...
Conference Paper
Power quality means maintaining a purely sinusoidal current waveform in phase with a purely sinusoidal voltage waveform. Power quality improvement using traditional compensation methods has many disadvantages, such as electromagnetic interference, possible resonance, fixed compensation, and bulkiness. So power system and power electronic engineers...
Conference Paper
Full-text available
In this paper, an efficient time-domain scheme for echo cancellation of speech signals in an acoustic environment is proposed. The task of echo cancellation is commonly handled by using state-of-the-art adaptive filters, which may lack the flexibility of controlling the number of iterations, convergence rate and the range of variation of filter co...
Article
Full-text available
This paper proposes a new technique for motion-based representation on the basis of optical flow analysis and random sample consensus (RANSAC) method. The method is based on the fact that an action can be characterized by the frequent movement of the optical flow points or interest points at different areas of the human figure. A combination of opt...
Conference Paper
This paper proposes a novel approach towards human action recognition based on optical flow and random sample consensus (RANSAC) by utilizing frequency domain feature extraction. Action representations can be considered as image templates, which can be useful for understanding various actions or gestures as well as for recognition and analysis. Opt...
Conference Paper
Full-text available
This paper deals with the problem of echo cancellation of speech signals in an acoustic environment. In this regard, generally, different adaptive filter algorithms are employed, which may lack the flexibility of controlling the convergence rate, number of iterations, range of variation of filter coefficients, and tolerance consistency. In order to...
Conference Paper
Full-text available
This paper deals with the problem of noise cancellation of speech signals in an acoustic environment. In this regard, generally, different adaptive filter algorithms are employed, many of which may lack the flexibility of controlling the convergence rate, range of variation of filter coefficients, and consistency in error within the tolerance limit. In...
Article
Full-text available
This paper outlines the field-programmable gate array (FPGA) implementation of real-time speech enhancement by spectral subtraction of acoustic noise using a dynamic moving average method. It describes a stand-alone algorithm for speech enhancement and presents an architecture for the implementation. The traditional spectral subtractio...

Projects (4)
Project
Continuous (or active) authentication provides complementary security mechanisms to any "obtrusive" authentication scheme, like passwords, PIN codes or biometrics. To be more specific, the aim is to cover the failure modes and limitations of the traditional login schemes and to develop effective solutions for performing unobtrusive active authentication by continuously monitoring the smartphone sensors and usage in order to detect whether the device has been hijacked by an intruder after initial (or otherwise prompted) authentication. This research project intends to propose more robust methods for unobtrusive continuous authentication and fast intrusion detection using multiple physiological and behavioral traits, like face and touch.
Archived project
Single-channel acoustic echo cancellation (AEC) and acoustic noise cancellation (ANC) schemes are developed using a gradient-based adaptive least mean squares (LMS) and other algorithms.
Project
This project aims to recognize human action, activity, and gesture using multi-modal sensor data for real-world applications; to study, develop, and implement traditional and deep-learning-based methods; and to discuss current and future research trends in this domain.