# Benyamin GhojoghUniversity of Waterloo | UWaterloo · Department of Electrical & Computer Engineering

Benyamin Ghojogh

Machine Learning Engineer

## About

97

Publications

55,199

Reads

**How we measure 'reads'**

A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more

225

Citations

Introduction

In addition to my coding skills, my abilities and skills include machine learning, deep learning, feature extraction (dimensionality reduction & manifold learning), data visualization, data science, computer vision, image processing, and biomedical engineering in both application and theory. I have very good and in-depth understanding of machine learning algorithms.

## Publications

Publications (97)

After the development of different machine learning and manifold learning algorithms, it may be a good time to put them together to make a powerful mind for machine. In this work, we propose affective manifolds as components of a machine's mind. Every affective manifold models a characteristic group of mind and contains multiple states. We define t...

Locally Linear Embedding (LLE) is a nonlinear spectral dimensionality reduction and manifold learning method. It has two main steps which are linear reconstruction and linear embedding of points in the input space and embedding space, respectively. In this work, we look at the linear reconstruction step from a stochastic perspective where it is ass...

Consider a set of $n$ data points in the Euclidean space $\mathbb{R}^d$. This set is called dataset in machine learning and data science. Manifold hypothesis states that the dataset lies on a low-dimensional submanifold with high probability. All dimensionality reduction and manifold learning methods have the assumption of manifold hypothesis. In t...

This is a tutorial and survey paper on metric learning. Algorithms are divided into spectral, probabilistic, and deep metric learning. We first start with the definition of distance metric, Ma-halanobis distance, and generalized Mahalanobis distance. In spectral methods, we start with methods using scatters of data, including the first spectral met...

This is a tutorial and survey paper on Generative Adversarial Network (GAN), adversarial autoencoders, and their variants. We start with explaining adversarial learning and the vanilla GAN. Then, we explain the conditional GAN and DCGAN. The mode collapse problem is introduced and various methods, including minibatch GAN, unrolled GAN, BourGAN, mix...

This is a tutorial and survey paper on various methods for Sufficient Dimension Reduction (SDR). We cover these methods with both statistical high-dimensional regression perspective and machine learning approach for dimensionality reduction. We start with introducing inverse regression methods including Sliced Inverse Regression (SIR), Sliced Avera...

This is a tutorial and survey paper on Karush-Kuhn-Tucker (KKT) conditions, first-order and second-order numerical optimization, and distributed optimization. After a brief review of history of optimization, we start with some preliminaries on properties of sets, norms, functions, and concepts of optimization. Then, we introduce the optimization pr...

This work concentrates on optimization on Riemannian manifolds. The Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm is a commonly used quasi-Newton method for numerical optimization in Euclidean spaces. Riemannian LBFGS (RLBFGS) is an extension of this method to Riemannian manifolds. RLBFGS involves computationally expensive vecto...

Uniform Manifold Approximation and Projection (UMAP) is one of the state-of-the-art methods for dimensionality reduction and data visualization. This is a tutorial and survey paper on UMAP and its variants. We start with UMAP algorithm where we explain probabilities of neighborhood in the input and embedding spaces, optimization of cost function, t...

This is a tutorial and survey paper on the Johnson-Lindenstrauss (JL) lemma and linear and nonlinear random projections. We start with linear random projection and then justify its correctness by JL lemma and its proof. Then, sparse random projections with $\ell_1$ norm and interpolation norm are introduced. Two main applications of random projecti...

This is a tutorial and survey paper on Boltzmann Machine (BM), Restricted Boltzmann Machine (RBM), and Deep Belief Network (DBN). We start with the required background on probabilistic graphical models, Markov random field, Gibbs sampling, statistical physics, Ising model, and the Hopfield network. Then, we introduce the structures of BM and RBM. T...

Data often have nonlinear patterns in machine learning. One can unfold the nonlinear manifold of a dataset for low-dimensional visualization and feature extraction. Locally Linear Embedding (LLE) is a nonlinear spectral method for dimensionality reduction and manifold unfolding. It embeds data using the same linear reconstruction weights as in the...

This is a tutorial and survey paper on unification of spectral dimensionality reduction methods, kernel learning by Semidefinite Programming (SDP), Maximum Variance Unfolding (MVU) or Semidefinite Embedding (SDE), and its variants. We first explain how the spectral dimensionality reduction methods can be unified as kernel Principal Component Analys...

This is a tutorial and survey paper on kernels, kernel methods, and related fields. We start with reviewing the history of kernels in functional analysis and machine learning. Then, Mercer kernel, Hilbert and Banach spaces, Reproducing Kernel Hilbert Space (RKHS), Mercer's theorem and its proof, frequently used kernels, kernel construction from dis...

This is a tutorial and survey paper for nonlinear dimensionality and feature extraction methods which are based on the Laplacian of graph of data. We first introduce adjacency matrix, definition of Laplacian matrix, and the interpretation of Laplacian. Then, we cover the cuts of graph and spectral clustering which applies clustering in a subspace o...

We propose a new embedding method, named Quantile–Quantile Embedding (QQE), for distribution transformation and manifold embedding with the ability to choose the embedding distribution. QQE, which uses the concept of quantile–quantile plot from visual statistical tests, can transform the distribution of data to any theoretical desired distribution...

Histopathology image embedding is an active research area in computer vision. Most of the embedding models exclusively concentrate on a specific magnification level. However, a useful task in histopathology embedding is to train an embedding space regardless of the magnification level. Two main approaches for tackling this goal are domain adaptatio...

Locally Linear Embedding (LLE) is a nonlinear spectral dimensionality reduction and manifold learning method. It has two main steps which are linear reconstruction and linear embedding of points in the input space and embedding space, respectively. In this work, we propose two novel generative versions of LLE, named Generative LLE (GLLE), whose lin...

Raw data are usually required to be pre-processed for better representation or discrimination of classes. This pre-processing can be done by data reduction, i.e., either reduction in dimensionality or numerosity (cardinality). Dimensionality reduction can be used for feature extraction or data visualization. Numerosity reduction is useful for ranki...

The YouTube presentation of slides:
https://www.youtube.com/watch?v=bqtNjwI60dQ

The YouTube presentation of slides:
https://www.youtube.com/watch?v=S8lGRykashc

The YouTube presentation of slides:
https://www.youtube.com/watch?v=XuFBoRp4ZAM

The YouTube presentation of slides:
https://www.youtube.com/watch?v=iYwW9XdWPZw

The YouTube presentation of slides:
https://www.youtube.com/watch?v=LscooUyktz4

The YouTube presentation of slides:
https://www.youtube.com/watch?v=4Zk2UFGIrd0

The YouTube presentation of slides:
https://www.youtube.com/watch?v=3rXYXZiH-pA

Histopathology image embedding is an active research area in computer vision. Most of the embedding models exclusively concentrate on a specific magnification level. However, a useful task in histopathology embedding is to train an embedding space regardless of the magnification level. Two main approaches for tackling this goal are domain adaptatio...

Metric learning is a technique in manifold learning to find a projection subspace for increasing and decreasing the inter- and intra-class variances, respectively. Some metric learning methods are based on triplet learning with anchor-positive-negative triplets. Large margin metric learning for nearest neighbor classification is one of the fundamen...

Variants of Triplet networks are robust entities for learning a discriminative embedding subspace. There exist different triplet mining approaches for selecting the most suitable training triplets. Some of these mining methods rely on the extreme distances between instances, and some others make use of sampling. However, sampling from stochastic di...

This is a tutorial and survey paper on factor analysis, probabilistic Principal Component Analysis (PCA), variational inference, and Variational Autoencoder (VAE). These methods, which are tightly related, are dimensionality reduction and generative models. They asssume that every data point is generated from or caused by a low-dimensional latent f...

This is a tutorial and survey paper on the attention mechanism, transformers, BERT, and GPT. We first explain attention mechanism, sequence-to-sequence model without and with attention, self-attention, and attention in different areas such as natural language processing and computer vision. Then, we explain transformers which do not use any recurre...

Vehicle platooning has been shown to be quite fruitful in the transportation industry to enhance fuel economy, road throughput, and driving comfort. Model Predictive Control (MPC) is widely used in literature for platoon control to achieve certain objectives, such as safely reducing the distance among consecutive vehicles while following the leader...

We analyze the effect of offline and online triplet mining for colorectal cancer (CRC) histopathology dataset containing 100,000 patches. We consider the extreme, i.e., farthest and nearest patches to a given anchor, both in online and offline mining. While many works focus solely on selecting the triplets online (batch-wise), we also study the eff...

This is a tutorial and survey paper for Locally Linear Embedding (LLE) and its variants. The idea of LLE is fitting the local structure of manifold in the embedding space. In this paper, we first cover LLE, kernel LLE, inverse LLE, and feature fusion with LLE. Then, we cover out-of-sample embedding using linear reconstruction, eigenfunctions, and k...

This paper is a tutorial and literature review on sampling algorithms. We have two main types of sampling in statistics. The first type is survey sampling which draws samples from a set or population. The second type is sampling from probability distribution where we have a probability density or mass function. In this paper, we cover both types of...

Metric learning is one of the techniques in manifold learning with the goal of finding a projection subspace for increasing and decreasing the inter- and intra-class variances, respectively. Some of the metric learning methods are based on triplet learning with anchor-positive-negative triplets. Large margin metric learning for nearest neighbor cla...

Stochastic Neighbor Embedding (SNE) is a manifold learning and dimensionality reduction method with a probabilistic approach. In SNE, every point is consider to be the neighbor of all other points with some probability and this probability is tried to be preserved in the embedding space. SNE considers Gaussian distribution for the probability in bo...

Multidimensional Scaling (MDS) is one of the first fundamental manifold learning methods. It can be categorized into several methods, i.e., classical MDS, kernel classical MDS, metric MDS, and non-metric MDS. Sammon mapping and Isomap can be considered as special cases of metric MDS and kernel classical MDS, respectively. In this tutorial and surve...

We consider the problem of distributed multi-choice voting in a setting that each node can communicate with its neighbors merely by sending beep signals. Given its simplicity, the beep communication model is of practical importance in different applications such as system biology and wireless sensor networks. Yet, the distributed majority voting ha...

Variants of Triplet networks are robust entities for learning a discriminative embedding subspace. There exist different triplet mining approaches for selecting the most suitable training triplets. Some of these mining methods rely on the extreme distances between instances, and some others make use of sampling. However, sampling from stochastic di...

We analyze the effect of offline and online triplet mining for colorectal cancer (CRC) histopathology dataset containing 100,000 patches. We consider the extreme, i.e., farthest and nearest patches with respect to a given anchor, both in online and offline mining. While many works focus solely on how to select the triplets online (batch-wise), we a...

As many algorithms depend on a suitable representation of data, learning unique features is considered a crucial task. Although supervised techniques using deep neural networks have boosted the performance of representation learning, the need for a large sets of labeled data limits the application of such methods. As an example, high-quality deline...

Human action recognition is one of the important fields of computer vision and machine learning. Although various methods have been proposed for 3D action recognition, some of which are basic and some use deep learning, the need of basic methods based on generalized eigenvalue problem is sensed for action recognition. This need is especially sensed...

In this paper, we introduce the Hinduism religion and philosophy. We start with introducing the holy books in Hinduism including Vedas and Upanishads. Then, we explain the simplistic Hinduism, Brahman, gods and their incarnations, stories of apocalypse, karma, reincarnation, heavens and hells, vegetarianism, and sanctity of cows. Then, we switch to...

We propose a new embedding method, named Quantile-Quantile Embedding (QQE), for distribution transformation, manifold embedding, and image embedding with the ability to choose the embedding distribution. QQE, which uses the concept of quantile-quantile plot from visual statistical tests, can transform the distribution of data to any theoretical des...

We present a new method which generalizes subspace learning based on eigenvalue and generalized eigenvalue problems. This method, Roweis Discriminant Analysis (RDA) named after Sam Roweis, is a family of infinite number of algorithms where Principal Component Analysis (PCA), Supervised PCA (SPCA), and Fisher Discriminant Analysis (FDA) are special...

Fisher Discriminant Analysis (FDA) is a subspace learning method which minimizes and maximizes the intra- and inter-class scatters of data, respectively. Although, in FDA, all the pairs of classes are treated the same way, some classes are closer than the others. Weighted FDA assigns weights to the pairs of classes to address this shortcoming of FD...

Generative models and inferential autoencoders mostly make use of \(\ell _2\) norm in their optimization objectives. In order to generate perceptually better images, this short paper theoretically discusses how to use Structural Similarity Index (SSIM) in generative models and inferential autoencoders. We first review SSIM, SSIM distance metrics, a...

After the tremendous development of neural networks trained by backpropagation, it is a good time to develop other algorithms for training neural networks to gain more insights into networks. In this paper, we propose a new algorithm for training feedforward neural networks which is fairly faster than backpropagation. This method is based on projec...

Human action recognition has been one of the most active fields of research in computer vision for last years. Two dimensional action recognition methods are facing serious challenges such as occlusion and missing the third dimension of data. Development of depth sensors has made it feasible to track positions of human body joints over time. This p...

As many algorithms depend on a suitable representation of data, learning unique features is considered a crucial task. Although supervised techniques using deep neural networks have boosted the performance of representation learning, the need for a large set of labeled data limits the application of such methods. As an example, high-quality delinea...

We propose a novel approach to anomaly detection called Curvature Anomaly Detection (CAD) and Kernel CAD based on the idea of polyhedron curvature. Using the nearest neighbors for a point, we consider every data point as the vertex of a polyhedron where the more anomalous point has more curvature. We also propose inverse CAD (iCAD) and Kernel iCAD...

We propose a novel approach to anomaly detection called Curvature Anomaly Detection (CAD) and Kernel CAD based on the idea of polyhedron curvature. Using the nearest neighbors for a point, we consider every data point as the vertex of a polyhedron where the more anomalous point has more curvature. We also propose inverse CAD (iCAD) and Kernel iCAD...

Siamese neural network is a very powerful architecture for both feature extraction and metric learning. It usually consists of several networks that share weights. The Siamese concept is topology-agnostic and can use any neural network as its backbone. The two most popular loss functions for training these networks are the triplet and contrastive l...

After the tremendous development of neural networks trained by backpropagation, it is a good time to develop other algorithms for training neural networks to gain more insights into networks. In this paper, we propose a new algorithm for training feedforward neural networks which is fairly faster than backpropagation. This method is based on projec...

Fisher Discriminant Analysis (FDA) is a subspace learning method which minimizes and maximizes the intra- and inter-class scatters of data, respectively. Although, in FDA, all the pairs of classes are treated the same way, some classes are closer than the others. Weighted FDA assigns weights to the pairs of classes to address this shortcoming of FD...

Generative models and inferential autoencoders mostly make use of $\ell_2$ norm in their optimization objectives. In order to generate perceptually better images, this short paper theoretically discusses how to use Structural Similarity Index (SSIM) in generative models and inferential autoencoders. We first review SSIM, SSIM distance metrics, and...

Vehicle platooning has been shown to be quite fruitful in the transportation industry to enhance fuel economy, road throughput, and driving comfort. Model Predictive Control (MPC) is widely used in literature for platoon control to achieve certain objectives, such as safely reducing the distance among consecutive vehicles while following the leader...

We propose a new method, named isolation Mondrian forest (iMondrian forest), for batch and online anomaly detection. The proposed method is a novel hybrid of isolation forest and Mondrian forest which are existing methods for batch anomaly detection and online random forest, respectively. iMondrian forest takes the idea of isolation, using the dept...