
Joseph Alejandro Gallego Mejia
- Doctor of Engineering
- Professor (Assistant) at Drexel University
About
43
Publications
3,103
Reads
78
Citations
Introduction
My current research interests are density estimation using density matrices and random Fourier features, and how to apply these techniques to anomaly detection tasks.
https://www.youtube.com/playlist?list=PLBsv0esePMjEw19dtzQj9kehVZoojLRBT
Current institution
Additional affiliations
January 2014 - present
Position
- PhD Student
Description
- Joseph A. Gallego is currently pursuing a Ph.D. degree in Systems and Computing Engineering at the National University of Colombia, Colombia. He earned a Computing Systems Engineer degree, an Industrial Engineer degree, and an MSc in Systems and Computer Engineering from the National University of Colombia. He is currently a student associated with the MindLab research group of the National University of Colombia, Bogotá, conducting research in Ma
Education
January 2014 - September 2017
January 2008 - December 2013
January 2008 - January 2014
Publications (43)
Tree height estimation serves as an important proxy for biomass estimation in ecological and forestry applications. While traditional methods such as photogrammetry and Light Detection and Ranging (LiDAR) offer accurate height measurements, their application on a global scale is often cost-prohibitive and logistically challenging. In contrast, remo...
Accurately estimating forest biomass is crucial for global carbon cycle modelling and climate change mitigation. Tree height, a key factor in biomass calculations, can be measured using Synthetic Aperture Radar (SAR) technology. This study applies machine learning to extract forest height data from two SAR products: Single Look Complex (SLC) images...
This paper introduces a novel anomaly detection framework that combines the robust statistical principles of density-estimation-based anomaly detection methods with the representation-learning capabilities of deep learning models. The method derived from this framework is presented in two different versions: a shallow approach employing a densit...
Satellite-based remote sensing has revolutionised the way we address global challenges in a rapidly evolving world. Huge quantities of Earth Observation (EO) data are generated by satellite sensors daily, but processing these large datasets for use in ML pipelines is technically and computationally challenging. Specifically, different types of EO d...
Density estimation is a central task in statistics and machine learning. This problem aims to determine the underlying probability density function that best aligns with an observed dataset. Some of its applications include statistical inference, unsupervised learning, and anomaly detection. Despite its relevance, few works have explored the applic...
Potential of SSL for remote sensing applications
Complex data could lead to larger distances and therefore to higher errors when predicting (also the different spatial resolutions)
Areas of Interest (AOIs) used in this study. Bands indicate the splits for train (yellow), validation (blue) and test (pink). In total there are 167K image chips for CONUS, 163K chips for Middle East, 147K chips for Pakistan-India, 285K chips for China and 83K chips for South America, which aggregates to 845K chips covering a surface of 16.9M km²....
Foundational models, in which a pretext task is performed, have shown enormous advantages in problems with small or sparsely labeled data sets. The attention mechanism used by most foundational models has shown remarkable properties not only in large language models but in computer vision. The potential usefulness of these foundational models could...
Satellite-based remote sensing is instrumental in the monitoring and mitigation of the effects of anthropogenic climate change. Large scale, high resolution data derived from these sensors can be used to inform intervention and policy decision making, but the timeliness and accuracy of these interventions is limited by use of optical data, which ca...
In this work we pre-train a DINO-ViT based model using two Synthetic Aperture Radar datasets (S1GRD or GSSIC) across three regions (China, CONUS, Europe). We fine-tune the models on smaller labeled datasets to predict vegetation percentage, and empirically study the connection between the embedding space of the models and their ability to generaliz...
In this work we pretrain a CLIP/ViT based model using three different modalities of satellite imagery across five AOIs covering over 10% of the Earth's total landmass, namely Sentinel 2 RGB optical imagery, Sentinel 1 SAR amplitude and Sentinel 1 SAR interferometric coherence. This model uses ∼250M parameters. Then, we use the embeddings produced f...
This paper presents a novel density estimation method for anomaly detection using density matrices (a powerful mathematical formalism from quantum mechanics) and Fourier features. The method can be seen as an efficient approximation of Kernel Density Estimation (KDE). A systematic comparison of the proposed method with eleven state-of-the-art anoma...
The main goal of this thesis is to develop efficient non-parametric density estimation methods that can be integrated with deep learning architectures, for instance, convolutional neural networks and transformers. Density estimation methods can be applied to different problems in statistics and machine learning. They may be used to solve tasks such...
This paper presents an anomaly detection model that combines the strong statistical foundation of density-estimation-based anomaly detection methods with the representation-learning ability of deep-learning models. The method combines an autoencoder, that learns a low-dimensional representation of the data, with a density-estimation model based on...
This paper presents a novel approach to probabilistic deep learning (PDL), quantum kernel mixtures, derived from the mathematical formalism of quantum density matrices, which provides a simpler yet effective mechanism for representing joint probability distributions of both continuous and discrete random variables. The framework allows for the cons...
Anomaly detection is an important task in several applications. Classical approaches, such as the isolation forest, struggle with high-dimensional data such as images. Deeper models, such as autoencoders, do not directly optimize an anomaly loss function, but exploit the proxy task by optimizing the reconstruction error. In this paper, we propose a n...
The main goal of this thesis is to propose efficient non-parametric density estimation methods that can be integrated with deep learning architectures, for instance, convolutional neural networks and transformers. Density estimation methods can be applied to different problems in statistics and machine learning. They may be used to solve tasks such...
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. The fact that it is a memory-based method, i.e., it uses the entire training data set for prediction, makes it unsuitable for most current big data applications. Several strategies, such as tree-based or hashing-based estimators, have been propo...
Density estimation is a fundamental task in statistics and machine learning that aims to estimate, from a set of samples, the probability density function of the distribution that generated them. There are different methods for addressing this problem but recently deep-neural density estimation methods have emerged as a powerful alternative. This p...
The Colombian Association of Systems Engineers (Acis) conducted a study through its online platforms to investigate the use of information technologies focused on Machine Learning Operations (MLOps) and to explore the impact of their penetration in the Colombian business sector.
Data analysis and machine learning are changing every aspect of our lives today. Activities that were once performed entirely by humans, such as disease detection, object detection, and speech recognition and synthesis, among others, are now being automated using learning algorithms...
This paper presents a novel anomaly detection method, called AD-DMKDE, based on the use of Kernel Density Estimation (KDE) along with density matrices (a powerful mathematical formalism from quantum mechanics) and Fourier features. The proposed method was systematically compared with eleven state-of-the-art anomaly detection methods on various data...
AD-DMKDE is a novel anomaly detection method that combines density matrices (a mathematical formalism from quantum mechanics) and Fourier features. The method can be seen as an efficient approximation of Kernel Density Estimation (KDE). AD-DMKDE was systematically compared against eleven state-of-the-art anomaly detection methods on a variety of ben...
Governments have to supervise and inspect social economy enterprises (SEEs). However, inspecting all SEEs is not possible due to the large number of SEEs and the low number of inspectors in general. We proposed a prediction model based on a machine learning approach. The method was trained with the random forest algorithm with historical data provi...
This paper presents an anomaly detection model that combines the strong statistical foundation of density-estimation-based anomaly detection methods with the representation-learning ability of deep-learning models. The method combines an autoencoder, for learning a low-dimensional representation of the data, with a density-estimation model based on...
This paper presents a novel density estimation method for anomaly detection using density matrices (a powerful mathematical formalism from quantum mechanics) and Fourier features. The method can be seen as an efficient approximation of Kernel Density Estimation (KDE). A systematic comparison of the proposed method with eleven state-of-the-art anoma...
Streaming anomaly detection refers to the problem of detecting anomalous data samples in streams of data. This problem poses challenges that classical and deep anomaly detection methods are not designed to cope with, such as conceptual drift and continuous learning. State-of-the-art flow anomaly detection methods rely on fixed memory using hash fun...
Governments have to supervise and inspect social economy enterprises (SEEs). However, inspecting all SEEs is not possible due to the large number of SEEs and the low number of inspectors in general. We proposed a prediction model based on a machine learning approach. The method was trained with the random forest algorithm with historical data provi...
A density matrix describes the statistical state of a quantum system. It is a powerful formalism to represent both the quantum and classical uncertainty of quantum systems and to express different statistical operations such as measurement, system combination and expectations as linear algebra operations. This paper explores how density matrices ca...
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. The fact that it is a memory-based method, i.e., it uses the entire training data set for prediction, makes it unsuitable for most current big data applications. Several strategies, such as tree-based or hashing-based estimators, have been propo...
Density estimation is a fundamental task in statistics and machine learning applications. Kernel density estimation is a powerful tool for non-parametric density estimation in low dimensions; however, its performance is poor in higher dimensions. Moreover, its prediction complexity scales linearly with the number of training data points. This paper presents...
Kernel density estimation (KDE) is one of the most widely used nonparametric density estimation methods. The fact that it is a memory-based method, i.e., it uses the entire training data set for prediction, makes it unsuitable for most current big data applications. Several strategies, such as tree-based or hashing-based estimators, have been propo...
Density estimation is a central task in statistics and machine learning. This problem aims to determine the underlying probability density function that best aligns with an observed data set. Some of its applications include statistical inference, unsupervised learning, and anomaly detection. Despite its relevance, few works have explored the appli...
The main goal of this PhD thesis proposal presentation is to develop efficient non-parametric density estimation methods that can be integrated with deep learning architectures, for instance, convolutional neural networks and transformers. Density estimation methods can be applied to different problems in statistics and machine learning. They may b...
A density matrix describes the statistical state of a quantum system. It is a powerful formalism to represent both the quantum and classical uncertainty of quantum systems and to express different statistical operations such as measurement, system combination and expectations as linear algebra operations. This paper explores how density matrices ca...
This paper shows that least-square estimation (mean calculation) in a reproducing kernel Hilbert space (RKHS) F corresponds to different M-estimators in the original space depending on the kernel function associated with F. In particular, we present a proof of the correspondence of mean estimation in an RKHS for the Gaussian kernel with robust esti...
This paper shows that least-square estimation (mean calculation) in a reproducing kernel Hilbert space (RKHS) $\mathcal{F}$ corresponds to different M-estimators in the original space depending on the kernel function associated with $\mathcal{F}$. In particular, we present a proof of the correspondence of mean estimation in an RKHS for the Gaussian...
Our work shows that estimating the mean in a feature space induced by certain kinds of kernels is the same as doing a robust mean estimation using an M-estimator in the original problem space. In particular, we show that calculating the average on a feature space induced by a Gaussian kernel is equivalent to performing robust mean estimation with the...
This presentation shows that least-square estimation (mean calculation) in a reproducing kernel Hilbert space (RKHS) $\mathcal{F}$ corresponds to different M-estimators in the original space depending on the kernel function associated with $\mathcal{F}$. In particular, we present a proof of the correspondence of mean estimation in an RKHS for the...