Mona Jalal

Boston University | BU · Department of Computer Science

Computer Science graduate student specializing in Computer Vision and Deep Learning, with an interest in Computer Graphics and Natural Language Processing. Lover of all things math and code!

About

17
Publications
5,546
Reads
54
Citations
Introduction
My research during my Computer Science graduate studies at Boston University has encompassed deep learning, computer vision (affect analysis, (2D/3D) pose estimation and tracking), and vision and language (visual question answering) as well as efficient deep learning for detecting cancer in pathology images. I did a 6-month computer vision research internship (2020-2021) at DawnLight on 3D human-object interaction reconstruction and 3D pose estimation for activity understanding.

Publications (17)
Conference Paper
Full-text available
The adoption of “human in the loop” paradigms in computer vision and machine learning is leading to various applications where the actual data acquisition (e.g., human supervision) and the underlying inference algorithms are closely intertwined. While classical work in active learning provides effective solutions when the learning module involves cl...
Conference Paper
Full-text available
We present a new, publicly available image dataset generated by the NVIDIA Deep Learning Data Synthesizer intended for use in object detection, pose estimation, and tracking applications. This dataset contains 144k stereo image pairs that synthetically combine 18 camera viewpoints of three photorealistic virtual environments with up to 10 objects (c...
Conference Paper
Full-text available
Advanced analytics is a booming area in the data management industry and a hot research topic. Almost all toolkits that implement machine learning (ML) algorithms assume that the input is a single table, but most relational datasets are not stored as single tables due to normalization. Thus, analysts often join tables to obtain a denormalized table...
Preprint
Full-text available
Visual complexity identifies the level of intricacy and details in an image or the level of difficulty to describe the image. It is an important concept in a variety of areas such as cognitive psychology, computer vision and visualization, and advertisement. Yet, efforts to create large, downloadable image datasets with diverse content and unbiased...
Preprint
Full-text available
Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts. We used it to collect 3.56...
Conference Paper
In this work, we propose a video-based transfer learning approach for predicting problem outcomes of students working with an intelligent tutoring system (ITS). By analyzing a student’s face and gestures, our method predicts the outcome of a student answering a problem in an ITS from a video feed. Our work is motivated by the reasoning th...
Conference Paper
Full-text available
When journalists cover a news story, they can cover the story from multiple angles or perspectives. These perspectives are called "frames", and usage of one frame or another may influence public perception and opinion of the issue at hand. We develop a web-based system for analyzing frames in multilingual text documents. We propose and guide users...
Conference Paper
Full-text available
News media structure their reporting of events or issues using certain perspectives. When describing an incident involving gun violence, for example, some journalists may focus on mental health or gun regulation, while others may emphasize the discussion of gun rights. Such perspectives are called "frames" in communication research. We study, for t...
Preprint
Full-text available
When journalists cover a news story, they can cover the story from multiple angles or perspectives. A news article written about COVID-19, for example, might focus on personal preventative actions such as mask-wearing, while another might focus on COVID-19's impact on the economy. These perspectives are called "frames," which when used may influence...
Preprint
Full-text available
The adoption of "human-in-the-loop" paradigms in computer vision and machine learning is leading to various applications where the actual data acquisition (e.g., human supervision) and the underlying inference algorithms are closely intertwined. While classical work in active learning provides effective solutions when the learning module involves cl...
Article
Full-text available
In this paper, we focus on visual complexity, an image attribute that humans can subjectively evaluate based on the level of details in the image. We explore unsupervised information extraction from intermediate convolutional layers of deep neural networks to measure visual complexity. We derive an activation energy metric that combines convolution...
Preprint
Full-text available
In the context of building an intelligent tutoring system (ITS), which improves student learning outcomes by intervention, we set out to improve prediction of student problem outcome. In essence, we want to predict the outcome of a student answering a problem in an ITS from a video feed by analyzing their face and gestures. For this, we present a n...
Preprint
Full-text available
We report results of a comparison of the accuracy of crowdworkers and seven Natural Language Processing (NLP) toolkits in solving two important NLP tasks, named-entity recognition (NER) and entity-level sentiment (ELS) analysis. We here focus on a challenging dataset, 1,000 political tweets that were collected during the U.S. presidential primary ele...
Article
Crowdcoding, a method that outsources “coding” tasks to numerous people on the internet, has emerged as a popular approach for annotating texts and visuals. However, the performance of this approach for analyzing social media data in the context of journalism and mass communication research has not been systematically assessed. This study evaluated...
Conference Paper
Full-text available
Monitoring population-level changes in diet could be useful for education and for implementing interventions to improve health. Research has shown that data from social media sources can be used for monitoring dietary behavior. We propose a scrape-by-location methodology to create food image datasets from Instagram posts. We used it to collect 3.56...
Conference Paper
Full-text available
Switches and communication links of Networks-on-Chip (NoCs) are highly vulnerable to transient faults due to the use of nano-scale VLSI technologies in the fabrication of NoCs. This paper proposes a reconfigurable switch architecture which is capable of operating in four configurations with different levels of reliability. This is done by the use of a...
Conference Paper
Full-text available
Mapping of tasks on the cores of a Network-on-Chip (NoC) has a direct impact on the efficiency of the network. This paper provides a comprehensive study regarding application mapping for NoCs to clarify their pros and cons. The study considers different aspects including performance, power consumption, and reliability of mappings. Four mappings named...

Questions (4)
Question
I have two synchronized videos of a moving deformable object taken from two slightly different views. What is the best available open-source code or GitHub repo for sparse or dense 3D reconstruction of the moving object from these two videos? Please share a link.
Question
I am looking for animal pose estimation code that, trained from scratch on 100-200 annotated frames, predicts the pose of an animal on a frame-by-frame basis using deep learning. Is there any such code? I am not looking for tools like DeepLabCut, DeepPoseKit, or LEAP/SLEAP.ai. I am looking for a simple baseline, preferably written in PyTorch, that could be easily modified.
Question
How can BERT also be used for entity-level sentiment analysis? Is there an open-source tool based on BERT that does so?
Question
Please let me know if you are aware of other tested tools that take text as input and return the detected entities along with their entity-level sentiment.

Projects (5)
Project
Estimating the pose of, and tracking, both deformable and non-deformable objects (such as solid objects, animals, and humans) in 2D and 3D, with applications in virtual reality/augmented reality, robotics, physical therapy, etc.
Project
Using computer vision and natural language processing methods for analyzing public communications such as news outlets and social media.