A. Debnath’s research while affiliated with Raksha Shakti University and other places


Publications (3)


Multi-Similarity Checking-Based Spoken Content Video Retrieval Using Enhanced Mayfly Optimization-Based Weighted Feature Selection
  • Article

August 2024 · 1 Read · International Journal of Image and Graphics

A. Debnath · K. Sreenivasa Rao · Partha P. Das

The general mechanism of “cascading automatic speech recognition (ASR)” with text retrieval has been used very successfully for spoken content retrieval. Since retrieval performance depends heavily on ASR accuracy, this approach works well when ASR accuracy is relatively high, but it is less applicable to difficult real-world scenarios. This difficulty motivates the development of spoken content retrieval methods that go beyond the fundamental strategy of “cascading ASR with text retrieval” to achieve higher retrieval performance. Therefore, this paper develops an efficient spoken term retrieval model for videos based on a multi-similarity function. The model operates in a training stage and a testing stage. In the training stage, experimental videos are collected from a real-time platform. The audio is then extracted from the videos, spectral features are computed from the audio, and these features are passed to an optimal weighted feature selection process. Here, the weight is tuned by the proposed Inertia Weight-Upgraded Mayfly Optimization Algorithm (IWU-MOA). The tuned weight value is multiplied by the extracted spectral features to generate a new set of features, which is stored in the feature database. In the testing stage, the query is given as spoken words, and spectral features are extracted from the spoken term. These features are matched against the trained feature database using the multi-similarity function to retrieve the appropriate videos according to the user's requirements. The proposed retrieval model is compared with conventional spoken word retrieval models to demonstrate its efficacy. The designed technique is applicable to real-time alerting and automatic human activity detection systems. The major limitations encountered during the development and testing of the model are class imbalance and restricted applicability to some datasets, which will be addressed in future work.
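
At retrieval time, the pipeline described above reduces to weighting the query's spectral features with the IWU-MOA-tuned weight vector and scoring the result against every stored feature vector. The sketch below is a minimal illustration of that matching step, assuming the tuned weights and the feature database already exist; the function names, the use of cosine and a distance-based similarity as the two component scores, and their equal averaging are hypothetical, since the paper's exact multi-similarity function is not reproduced here.

import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def euclidean_sim(a, b):
    # Euclidean distance mapped to (0, 1] so that larger means more similar.
    return 1.0 / (1.0 + float(np.linalg.norm(a - b)))

def multi_similarity(q, f):
    # Hypothetical fusion: plain average of the two similarity indices.
    return 0.5 * (cosine_sim(q, f) + euclidean_sim(q, f))

def retrieve_videos(query_spectral, feature_db, tuned_weights, top_k=5):
    """Rank stored videos against a spoken query.

    query_spectral : 1-D array of spectral features extracted from the spoken term
    feature_db     : dict mapping video_id -> stored weighted feature vector
    tuned_weights  : weight vector assumed to come from the IWU-MOA tuning step
    """
    weighted_query = np.asarray(query_spectral) * np.asarray(tuned_weights)
    scores = {vid: multi_similarity(weighted_query, np.asarray(feat))
              for vid, feat in feature_db.items()}
    # Return the top_k most similar videos, highest score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]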


A multi-modal lecture video indexing and retrieval framework with multi-scale residual attention network and multi-similarity computation
  • Article
  • Publisher preview available

December 2023 · 35 Reads · 3 Citations · Signal Image and Video Processing

Due to technological development, the mass production of videos and their storage on the Internet have increased, making a huge number of videos available on websites from various sources. Retrieving the relevant lecture videos from such multimedia collections is therefore difficult. This paper suggests an effective way of indexing and retrieving videos by considering various similarities among video features using a deep learning method. Lecture videos are obtained from a standardized dataset for training. Optimal keyframes are selected from the obtained videos using the Adaptive Anti-Coronavirus Optimization Algorithm. The video contents are then segmented and arranged on the basis of the optimized keyframes. Optical characters, such as semantic words and keywords, are recognized by means of Optical Character Recognition (OCR), and image features are extracted from the segmented frames with the help of a Multi-scale Residual Attention Network (MRAN). The generated pool of features is arranged and stored in the database according to the contents. Text and video queries are given as input for testing the trained model. In the testing phase, the features of the text query and the features of the optimized keyframes from the video query are obtained with the help of MRAN. The generated pool of features from the text and video queries is compared with the features stored in the database, and the similarities are analyzed using the Cosine, Jaccard, and Euclidean similarity indices. The resulting multi-similarity features are used to retrieve the videos relevant to the provided query. The experimental results show that the proposed system performs video indexing and retrieval more effectively and efficiently than existing video retrieval methods.
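
As a rough illustration of the multi-similarity computation described above, the sketch below combines the Cosine, Jaccard, and Euclidean indices into a single retrieval score. It is a minimal sketch under assumed inputs (an MRAN feature vector plus an OCR keyword set per video); the equal weighting of the three indices is a hypothetical choice, not the paper's exact fusion rule.

import numpy as np

def cosine_index(a, b):
    # Cosine similarity between two MRAN feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def jaccard_index(words_a, words_b):
    # Jaccard similarity over the OCR keyword/semantic-word sets.
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def euclidean_index(a, b):
    # Euclidean distance converted to a similarity in (0, 1].
    return 1.0 / (1.0 + float(np.linalg.norm(a - b)))

def multi_similarity(query, entry, weights=(1 / 3, 1 / 3, 1 / 3)):
    # query / entry: dicts with 'visual' (MRAN vector) and 'words' (OCR keyword set).
    # weights: hypothetical equal weighting of the three indices.
    q_vis, e_vis = np.asarray(query["visual"]), np.asarray(entry["visual"])
    return (weights[0] * cosine_index(q_vis, e_vis)
            + weights[1] * jaccard_index(query["words"], entry["words"])
            + weights[2] * euclidean_index(q_vis, e_vis))

def rank_videos(query, database, top_k=5):
    # database: dict mapping video_id -> {'visual': ..., 'words': ...}; best matches first.
    scores = {vid: multi_similarity(query, entry) for vid, entry in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]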


Citations (1)


... Therefore, the design of insulation systems in stator windings plays a critical role in determining the overall insulation level of electrical machinery. To ensure sufficient mechanical strength for motor insulation, the thickness of the main insulation must be increased [3,4]. However, to improve heat dissipation and reduce the size of the alternator, the main insulation thickness should be minimized, creating a design conflict [5]. ...

Citing article: Parametric Study and Improvement of Anti-Corona Structure in Stator Bar End Based on Finite Element Analysis

Cited work: A multi-modal lecture video indexing and retrieval framework with multi-scale residual attention network and multi-similarity computation · Signal Image and Video Processing