May 2025
Publications (250)
May 2025


Small Uncrewed Aerial Systems (sUAS) are increasingly deployed as autonomous swarms in search-and-rescue and other disaster-response scenarios. In these settings, they use computer vision (CV) to detect objects of interest and autonomously adapt their missions. However, traditional CV systems often struggle to recognize unfamiliar objects in open-world environments or to infer their relevance for mission planning. To address this, we incorporate large language models (LLMs) to reason about detected objects and their implications. While LLMs can offer valuable insights, they are also prone to hallucinations and may produce incorrect, misleading, or unsafe recommendations. To ensure safe and sensible decision-making under uncertainty, high-level decisions must be governed by cognitive guardrails. This article presents the design, simulation, and real-world integration of these guardrails for sUAS swarms in search-and-rescue missions.
April 2025
·
10 Reads
April 2025
·
20 Reads
April 2025
·
15 Reads
Meaningful progress has been made in open world learning (OWL), enhancing the ability of agents to detect, characterize, and incrementally learn novelty in dynamic environments. However, novelty remains a persistent challenge for agents relying on state‐of‐the‐art learning algorithms. This article considers the current state of OWL, drawing on insights from a recent DARPA research program on this topic. We identify open issues that impede further advancements spanning theory, design, and evaluation. In particular, we emphasize the challenges posed by dynamic scenarios, which must be understood to ensure the viability of agents designed for real‐world environments. The article provides suggestions for setting a new research agenda that effectively addresses these open issues.
February 2025
·
3 Reads
Over the past few decades, machine learning has made remarkable strides, owing largely to algorithmic advancements and the abundance of high-quality, large-scale datasets. However, an equally crucial aspect of achieving optimal model performance is the fine-tuning of hyperparameters. Despite its significance, hyperparameter optimization (HPO) remains challenging for several reasons. Many existing HPO techniques rely on simplistic search methods or assume smooth and continuous loss functions, an assumption that does not always hold. Traditional methods like grid search and Bayesian optimization often struggle to adapt swiftly and to navigate the loss landscape efficiently. Moreover, the search space for HPO is frequently high-dimensional and non-convex, which makes it hard to find a global minimum efficiently. Additionally, optimal hyperparameters can vary significantly with the dataset or task at hand, further complicating the optimization process. To address these challenges, this paper presents HomOpt, an advanced HPO methodology that integrates a surrogate model framework with homotopy optimization techniques. Unlike rigid methodologies, HomOpt offers flexibility by incorporating diverse surrogate models tailored to specific optimization tasks. Our initial investigation focuses on leveraging Generalized Additive Model (GAM) surrogates within the HomOpt framework to enhance the effectiveness of existing optimization methodologies. We highlight HomOpt's ability to expedite convergence towards optimal solutions across varied domain spaces, encompassing continuous, discrete, and categorical domains. We conduct a comparative analysis of HomOpt applied to multiple optimization techniques (e.g., Random Search, TPE, Bayes, and SMAC), demonstrating improved objective performance on numerous standardized machine learning benchmarks and challenging open-set recognition tasks.
We also integrate CatBoost within the HomOpt framework as a surrogate, showcasing its adaptability and effectiveness in handling more complex datasets. This integration facilitates an evaluation against state-of-the-art methods such as BOHB, particularly on challenging computer vision datasets like CIFAR-10 and ImageNet. Comparative analyses reveal HomOpt's competitive performance with reduced iterations and underscore potential optimizations in execution time. All experiment and method code is available at https://github.com/sabraha2/HOMOPT
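The general idea behind homotopy optimization can be illustrated with a small sketch. This is not the HomOpt implementation and uses no surrogate model; it only shows the generic continuation trick of minimizing H(x, lam) = (1 - lam) * easy(x) + lam * target(x) while lam moves from 0 to 1. The function names, the toy one-dimensional objective, and the step sizes are all assumptions chosen for illustration.

```python
def homotopy_minimize(easy_grad, target_grad, x0, steps=20, inner_iters=200, lr=0.05):
    """Gradient descent on H(x, lam) = (1 - lam) * easy(x) + lam * target(x).

    lam is raised from 1/steps to 1 in equal increments, and each stage is
    warm-started from the minimizer found at the previous stage.
    """
    x = x0
    for k in range(1, steps + 1):
        lam = k / steps
        for _ in range(inner_iters):
            grad = (1.0 - lam) * easy_grad(x) + lam * target_grad(x)
            x -= lr * grad
    return x


def easy_grad(x):
    # Gradient of the convex helper objective easy(x) = x**2 (hypothetical choice).
    return 2.0 * x


def target_grad(x):
    # Gradient of the toy non-convex target x**4 - 3*x**2 + x, which has a
    # local minimum at a positive x and its global minimum at a negative x.
    return 4.0 * x**3 - 6.0 * x + 1.0


# Continuation from the easy problem to the target.
x_homotopy = homotopy_minimize(easy_grad, target_grad, x0=1.0)

# Baseline: using the target as its own "easy" problem reduces to plain
# gradient descent on the target from the same starting point.
x_plain = homotopy_minimize(target_grad, target_grad, x0=1.0)

print(x_homotopy, x_plain)
```

On this toy problem the continuation path ends in the basin of the global minimum (negative x), while plain gradient descent from the same start settles in the local minimum (positive x). Real HPO search spaces are mixed and noisy, so this behavior should not be read as a guarantee.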
February 2025
·
2 Reads
February 2025
·
8 Reads
HCI is increasingly taking inspiration from philosophical and religious traditions as a basis for ethical technology designs. If these values are to be incorporated into real-world designs, there may be challenges when designers work with values unfamiliar to them. Therefore, we investigate the variance in interpretations when values are translated to technology designs. To do so, we identified social media designs that embodied the main principles of Catholic Social Teaching (CST). We then interviewed 24 technology experts with varying levels of familiarity with CST to assess their understanding of how those values would manifest in a technology design. We found that familiarity with CST did not impact participant responses: there were clear patterns in how all participant responses differed from the values we determined the designs embodied. We propose that value experts be included in the design process to more effectively create designs that embody particular values.
February 2025
·
6 Reads
In Natural Language Processing (NLP), semantic matching algorithms have traditionally relied on the feature of word co-occurrence to measure semantic similarity. While this feature approach has proven valuable in many contexts, its simplistic nature limits its analytical and explanatory power when used to understand literary texts. To address these limitations, we propose a more transparent approach that makes use of story structure and related elements. Using a BERT language model pipeline, we label prose and epic poetry with story element labels and perform semantic matching by only considering these labels as features. This new method, Story Grammar Semantic Matching, guides literary scholars to allusions and other semantic similarities across texts in a way that allows for characterizing patterns and literary technique.
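Matching on labels alone can be sketched as follows. This is a minimal illustration, not the published method: it assumes each passage has already been tagged with story element labels by some upstream model (the BERT pipeline is not reproduced here), and it uses cosine similarity over label counts as a stand-in scoring function. The label names and the function name are hypothetical.

```python
from collections import Counter
from math import sqrt


def label_similarity(labels_a, labels_b):
    """Cosine similarity between two passages, using only label counts as features."""
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    dot = sum(counts_a[label] * counts_b[label] for label in counts_a)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an unlabeled passage matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical label sequences for two passages.
passage_a = ["character", "conflict", "conflict"]
passage_b = ["character", "conflict"]
passage_c = ["setting"]

print(label_similarity(passage_a, passage_b))
print(label_similarity(passage_a, passage_c))
```

Because the features are the labels themselves, a high score can be traced back to the specific story elements two passages share, which is the kind of transparency the abstract describes.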
February 2025
·
10 Reads
Face parsing is a fundamental task in computer vision, enabling applications such as identity verification, facial editing, and controllable image synthesis. However, existing face parsing models often lack fairness and robustness, leading to biased segmentation across demographic groups and errors under occlusions, noise, and domain shifts. These limitations affect downstream face synthesis, where segmentation biases can degrade generative model outputs. We propose a multi-objective learning framework that optimizes accuracy, fairness, and robustness in face parsing. Our approach introduces a homotopy-based loss function that dynamically adjusts the importance of these objectives during training. To evaluate its impact, we compare multi-objective and single-objective U-Net models in a GAN-based face synthesis pipeline (Pix2PixHD). Our results show that fairness-aware and robust segmentation improves photorealism and consistency in face generation. Additionally, we conduct preliminary experiments using ControlNet, a structured conditioning model for diffusion-based synthesis, to explore how segmentation quality influences guided image generation. Our findings demonstrate that multi-objective face parsing improves demographic consistency and robustness, leading to higher-quality GAN-based synthesis.
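One way to picture a loss whose objective weights shift during training is a linear interpolation between a starting and an ending weight vector. The sketch below is an assumption made for illustration: the abstract does not specify the schedule, so the linear form, the weight values, and the function names are all hypothetical.

```python
def homotopy_weights(t, start, end):
    """Interpolate objective weights for training progress t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return {name: (1.0 - t) * start[name] + t * end[name] for name in start}


def combined_loss(losses, weights):
    """Weighted sum of per-objective loss values."""
    return sum(weights[name] * losses[name] for name in losses)


# Hypothetical schedule: begin with accuracy only, end with a mixed objective.
start = {"accuracy": 1.0, "fairness": 0.0, "robustness": 0.0}
end = {"accuracy": 0.5, "fairness": 0.25, "robustness": 0.25}

# Hypothetical per-objective loss values at the midpoint of training.
losses = {"accuracy": 0.4, "fairness": 0.8, "robustness": 1.6}

midpoint_weights = homotopy_weights(0.5, start, end)
print(midpoint_weights, combined_loss(losses, midpoint_weights))
```

In a training loop, t would typically be the current epoch divided by the total number of epochs, so early updates focus on segmentation accuracy and later updates trade it off against the other objectives.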
Citations (47)
... The activity domains consisted of Angry Birds, Monopoly (Kejriwal and Thomas 2021), CartPole, VizDoom (Wydmuch, Kempka, and Jaśkowski 2018), Polycraft (Goss et al. 2023), and CARLA (Dosovitskiy et al. 2017). The perceptual domains consisted of video activity recognition (Prijatelj et al. 2024), natural language text identification, and image recognition (Kumar et al. 2021). ...
Reference:
Open issues in open world learning
- Citing Article
December 2024
Journal of Artificial Intelligence Research
... Moving forward, researchers use knowledge related to brain functions and combine machine learning with high-throughput behavioral optogenetics to stimulate very precise brain areas. They have found that the nature and magnitude of hallucinations experienced by macaques highly depend on concurrent visual input, the location of brain stimulation, and the intensity of the stimulation (Shahbazi et al., 2024). Some would argue that current neural network architectures and their connection to the human brain are fleeting given that understanding of the dynamic processes in the brain has changed and the artificial intelligence community has understandably prioritized complex optimization-based architectures over ... • the use of convolution operations in CNNs is a consequence of our notion of how simple cells in V1 perform linear filtering, by "calculating" the weighted sum of their inputs, with weights defined by the receptive field profiles Wiesel, 1959, 1962), • filtering kernels in CNNs' early layers often converge (during CNN's training) to Gabor wavelets (Gabor, 1946), which were found as good models (in a least squares sense) of the receptive field profiles in simple cells in visual cortex (Daugman, 1985), ...
- Citing Article
- Full-text available
April 2024
... Human vision has shown significant robustness to occlusion [83,98] and outperformed CV models on aerial detection of persons during search and rescue scenarios [60]. With previous research having successfully incorporated human perception into CV models in a variety of manners [31,34,83], we addressed the challenge of person aerial ... Figure 1. Development of Psych-ER, our behavioral dataset for Emergency Response (ER) aerial search, and its derived psychophysical loss. ...
- Citing Article
February 2024
Proceedings of the IEEE
... To evaluate our theoretical framework, we conduct an empirical study using two influential prototype-based architectures: ProtoVAE [27] and PrototypeDNN [51] (hereinafter ProtoDNN, for simplicity). These models are particularly well-suited to our analysis, as they define class-level explanations in terms of entire prototypes, rather than relying on fine-grained, local features that explain only part of the input [17,35,16]. We trained both architectures on the MNIST and Fashion-MNIST datasets, and for an increasing number of prototypes: $|S| \in \{10, 20, 50, 100\}$. ...
Reference:
Fixed Point Explainability
- Citing Conference Paper
January 2024
... For example, in earthquake and flood scenarios, UAVs have been used to map damaged areas and locate survivors more rapidly compared to traditional ground-based methods [17]. The use of UAVs helps to overcome physical barriers that often hinder rescue teams, as demonstrated in various studies [18]. ...
- Citing Conference Paper
January 2024
... A more sophisticated approach considers hyperparameter tuning as an optimization problem, aiming to find the function that maximizes model performance or minimizes errors [13]. Examples of such techniques include Bayesian optimization [10], gradient-based optimization [9], genetic algorithms [5], and surrogate models [14], [15]. However, these techniques can be challenging for inexperienced modelers and often require a large number of runs or substantial memory resources. ...
- Citing Conference Paper
October 2023
... For example, novel methods for approximating Shapley values or integrating gradient-based techniques with other interpretability strategies could be explored. Another important direction is the refinement of post-hoc methods to capture higher-order interactions between input features [105]. Current methods typically focus on the individual importance of each feature, but many decision-making processes in large models involve complex interactions between multiple features. ...
- Citing Chapter
August 2023
Synthesis Lectures on Computer Vision
... HCR field has been studied extensively for more than five decades [1,7,25,28,31,52,53,56,65,74,85,93,94]. Earlier, the focus of research was mainly on developing feature extraction techniques and applying different classification techniques for recognition. ...
- Citing Chapter
August 2023
Synthesis Lectures on Computer Vision
... Approaches include approximating a complex, biased model with an ostensibly fair, interpretable one [Aivodji et al., 2019], designing an unfair model that switches to generating 'fair' predictions when being audited, akin to automotive manufacturers cheating emissions tests [Slack et al., 2020], or performing biassed sampling of the data points used to compute the SHAP values [Laberge et al., 2023a]. Recently, authors have proposed methods to detect and thwart such attacks [Carmichael and Scheirer, 2022]. However, the present paper focusses on a higher level threat: the risk of (misguided) AutoML. ...
- Citing Article
June 2023
Proceedings of the AAAI Conference on Artificial Intelligence
... In support of the thermal ATR task, Abraham et al. [33] improve the performance of YOLOv5 models on the DSIAC dataset via a novel homotopy-based hyperparameter optimization algorithm. Leveraging the visible imagery of the DSIAC dataset, VS et al. [34] propose a metalearning strategy for unsupervised domain adaptation in thermal ATR. ...
- Citing Conference Paper
June 2023