Sushant Gautam

Simula Research Laboratory · Department of Holistic Systems at SimulaMet

Master of Science
Interested in operationalizing AI/ML.

About

33
Publications
15,924
Reads
66
Citations
Introduction
PhD Student | Graduated from the Department of Electronics and Computer Engineering, Central Campus Pulchowk, IOE, Tribhuvan University.
Additional affiliations
March 2019 - present
UBL R&D Center Nepal
Position
  • Project Manager
Description
  • Experimenting with new technology in various domains, including education, environment, and healthcare. Project lead for the NSD-Al Project.
March 2019 - August 2019
Leapfrog Technology, Inc.
Position
  • AI Intern
Description
  • RNN models for time series modelling in TensorFlow for weather and pollution prediction. ARIMA and windowing-based time series predictive models. Model deployment and serving.
January 2018 - August 2019
UGC Research Project
Position
  • Research Assistant
Description
  • Research assistant to Dr Nanda Bikram Adhikari on his UGC Collaborative Research Grant Project, focused on finding ways to measure pollution and environmental levels in the Kathmandu Valley.
Education
September 2020 - August 2022
Tribhuvan University
Field of study
  • MSc in Informatics and Intelligent Systems Engineering
September 2015 - September 2019
Tribhuvan University
Field of study
  • Computer Engineering

Publications

Publications (33)
Thesis
Full-text available
Soccer dominates the global sports market, and viewers’ interest in watching videos of soccer matches is ramping up. Globally, there is a huge and constantly increasing amount of soccer game content being generated, including video footage, audio commentary, text metadata, goal and player statistics, scores, and rankings. As a large percentage of a...
Conference Paper
Full-text available
Soccer is one of the most popular sports globally, and the amount of soccer-related content worldwide, including video footage, audio commentary, team/player statistics, scores, and rankings, is enormous and rapidly growing. Consequently, the generation of multimodal summaries is of tremendous interest for broadcasters and fans alike, as a large pe...
Preprint
Full-text available
In association football, the development of multimodal summaries is of great importance to both broadcasters and spectators since a large number of viewers choose to follow just the soccer game highlights. The fundamental drive for the development of summarization systems is the requirement to manage huge amounts of data in different formats. By hi...
Conference Paper
Full-text available
Nepal, containing a rugged elevation ranging from less than 100 meters to over 8000 meters and having various climates varying from tropical to alpine and perpetual snow, has a great potential for the study of the highly varying environment and weather proxies. Fine spatio-temporal-scale measurements of such data using sufficiently distributed auto...
Preprint
This paper examines the integration of real-time talking-head generation for interviewer training, focusing on overcoming challenges in Audio Feature Extraction (AFE), which often introduces latency and limits responsiveness in real-time applications. To address these issues, we propose and implement a fully integrated system that replaces conventi...
Preprint
Full-text available
This paper demonstrates PlayerTV, an innovative framework which harnesses state-of-the-art Artificial Intelligence (AI) technologies for automatic player tracking and identification in soccer videos. By integrating object detection and tracking, Optical Character Recognition (OCR), and color analysis, PlayerTV facilitates the generation of player-...
Preprint
Full-text available
Extracting meaningful insights from large and complex datasets poses significant challenges, particularly in ensuring the accuracy and relevance of retrieved information. Traditional data retrieval methods such as sequential search and index-based retrieval often fail when handling intricate and interconnected data structures, resulting in incomple...
Preprint
Full-text available
We introduce Kvasir-VQA, an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations to facilitate advanced machine learning tasks in Gastrointestinal (GI) diagnostics. This dataset comprises 6,500 annotated images spanning various GI tract conditions and surgical instruments, and...
Preprint
Full-text available
In the rapidly evolving field of sports analytics, the automation of targeted video processing is a pivotal advancement. We propose PlayerTV, an innovative framework which harnesses state-of-the-art AI technologies for automatic player tracking and identification in soccer videos. By integrating object detection and tracking, Optical Character Reco...
Preprint
Full-text available
The rapid evolution of digital sports media necessitates sophisticated information retrieval systems that can efficiently parse extensive multimodal datasets. This paper introduces SoccerRAG, an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related infor...
Preprint
Full-text available
The rapid evolution of digital sports media necessitates sophisticated information retrieval systems that can efficiently parse extensive multimodal datasets. This paper demonstrates SoccerRAG, an innovative framework designed to harness the power of Retrieval Augmented Generation (RAG) and Large Language Models (LLMs) to extract soccer-related inf...
Preprint
Full-text available
Fact-checking is a crucial natural language processing (NLP) task that verifies the truthfulness of claims by considering reliable evidence. Traditional methods are often limited by labour-intensive data curation and rule-based approaches. In this paper, we present FactGenius, a novel method that enhances fact-checking by combining zero-shot prompt...
Conference Paper
Full-text available
The rapid advancement of technology has been revolutionizing the field of sports media, where there is a growing need for sophisticated data processing methods. Current methodologies for extracting information from soccer broadcast videos to generate game highlights and summaries for social media are predominantly manual and rely heavily on text-ba...
Conference Paper
Full-text available
This paper introduces SoccerSum, a novel dataset aimed at enhancing object detection and segmentation in video frames depicting the soccer pitch, using footage from the Norwegian Eliteserien league across 2021-2023. With the goal of detecting elements beyond common entities in existing datasets, such as the soccer ball, players and referees, this d...
Conference Paper
Full-text available
This paper introduces TACDEC, a dataset of tackle events in soccer game videos. Recognizing the gap in existing open datasets that predominantly focus on official soccer events such as goals and cards, TACDEC targets a comprehensive analysis of tackles, a critical aspect of soccer that combines technical skills, tactical decision-making, and phy...
Conference Paper
Full-text available
Social media plays a significant role for sports organizations with millions of active fans, but publishing highlights is often a tedious manual operation. With the development of AI, new tools are available for content generation and personalization to engage audiences. We propose an AI-based multimedia production framework for the automatic publi...
Preprint
Full-text available
In the era of digitalization, social media has become an integral part of our lives, serving as a significant hub for individuals and businesses to share information, communicate, and engage. This is also the case for professional sports, where leagues, clubs and players are using social media to reach out to their fans. In this respect, a huge amo...
Conference Paper
Full-text available
With the increasing availability of multimodal data, especially in the sports and medical domains, there is growing interest in developing Artificial Intelligence (AI) models capable of comprehending the world in a more holistic manner. Nevertheless, various challenges exist in multimodal understanding, including the integration of multiple modalit...
Book
Full-text available
Dystonia is a movement disorder that causes unusual movements and involuntary muscle contractions affecting some parts or the whole of the body. Selecting drugs and doses is a highly personalized process for dystonia, requiring frequent visits to the clinic, pointing toward the need for more systematic and objective methods of collecting patient data. A d...
Poster
Full-text available
This project, the first of its kind, introduces a new research paradigm of remote motion sensing for health monitoring of civil construction in the public safety domain in Nepal. Preliminary data from a pilot study from BRB encourages us to move forward with an aging analysis of such civil structures. Students from DoECE at IOE, Pulch...
Thesis
Full-text available
Nepal, containing a rugged elevation ranging from less than 100 meters to over 8,848 meters and having various climates varying from tropical to alpine and perpetual snow, has great potential for the study of highly varying environment and weather proxies. Fine spatio-temporal scale measurements of such data using sufficiently distributed automatic...
Conference Paper
Full-text available
One of the major challenges in searching on the internet has been that search engines and online forums have not been able to extract and pinpoint the exact answer to people's queries despite the information being available on the internet. Extraction of to-the-point answers from articles, posts and blogs tends to improve search accuracy. Sentence Ranking hel...
Preprint
Full-text available
One of the major challenges in searching on the internet has been that search engines and online forums have not been able to extract and pinpoint the exact answer to people's queries despite the information being available on the internet. Extraction of to-the-point answers from articles, posts and blogs tends to improve search accuracy. Sentence...
Conference Paper
Full-text available
The street lighting system is based upon an electronic controller that utilizes traffic density survey data. An Android mobile app was developed for this purpose. Data was collected and analyzed at different busy junctions of Kathmandu Valley. The app maintained a database record of each vehicle type that enters the system and simultaneousl...
Thesis
Full-text available
Facial Landmarks Detection is a neural network-based project built using a re-learning approach that uses histogram of oriented gradients (HOG) features of images to train a deep learning neural network in order to extract the facial features from the image. This report presents the methodology and algorithms for detecting facial landmarks through th...
Research Proposal
Full-text available
This project aims to design a system that takes a video as input, splits it into frames, and extracts features and landmarks from the resulting images using machine learning algorithms, thus providing a base for a number of possible systems. The main objectives of this project are: · To design a system that takes an image as an input and det...
Book
Full-text available
The Refugee Crisis is unequivocally the most burning global problem that has been haunting the modern world. With the causes and factors ranging from political instability, safety threat and mass discrimination to environmental hazards or simply a search for a quality life, the number of displaced people is on an ever-increasing curve. For the most...
Book
Full-text available
Shanti is a project that provides part-time vocational training and opportunities to women, especially victims of gender-based violence, so that they can become independent and earn a living on their own. Vocational training includes sewing clothes and preparing textile accessories like bags, cushion covers and purses.
