
Ankur Narang · Sigmoidstar · AI & Tech
Doctor of Philosophy
Research on Generative AI, Quantum
About
29 Publications
7,368 Reads
90 Citations
Additional affiliations
May 2022 - present
Analytics India Magazine
Position
- Member
Description
- AI applications and innovations in industry verticals
May 2021 - present
Bytelearn
Position
- Cofounder & Advisory Board Member
Description
- AI & Tech for Interactive Math for K12 education
November 2020 - May 2023
Hike
Position
- Head
Description
- Leading Innovative Applications and Research in Generative AI
Publications (29)
Stickers are popularly used in messaging apps such as Hike to visually express a nuanced range of thoughts and utterances to convey exaggerated emotions. However, discovering the right sticker from a large and ever expanding pool of stickers while chatting can be cumbersome. In this paper, we describe a system for recommending stickers in real time...
The Federated Learning setting has a central server coordinating the training of a model on a network of devices. One of the challenges is variable training performance when the dataset has a class imbalance. In this paper, we address this by introducing a new loss function called Fed-Focal Loss. We propose to address the class imbalance by reshapi...
Speech-driven facial video generation has been a complex problem due to its multi-modal aspects, namely the audio and video domains. The audio comprises many underlying features such as expression, pitch, loudness, and prosody (speaking style), and facial video has a lot of variability in terms of head movement, eye blinks, lip synchronization and movements...
Large pretrained models, such as Bert, GPT, and Wav2Vec, have demonstrated great potential for learning representations that are transferable to a wide variety of downstream tasks. It is difficult to obtain a large quantity of supervised data due to the limited availability of resources and time. In light of this, a significant amount of research...
Large pretrained models like Bert, GPT, and Wav2Vec have demonstrated their ability to learn transferable representations for various downstream tasks. However, obtaining a substantial amount of supervised data remains a challenge due to resource and time limitations. As a solution, researchers have turned their attention to using large pretrained...
Large pre-trained models, such as Bert, GPT, and Wav2Vec, have demonstrated great potential for learning representations that are transferable to a wide variety of downstream tasks. It is difficult to obtain a large quantity of supervised data due to the limited availability of resources and time. In light of this, a significant amount of research...
Stickers are popularly used while messaging to visually express nuanced thoughts. We describe a real-time sticker recommendation (SR) system. We decompose SR into two steps: predict the message that is likely to be sent, and substitute that message with an appropriate sticker. To address the challenges caused by transliteration of message from user...
Text-to-speech (TTS) systems are designed to synthesize natural and expressive speech, adapt to an unseen voice, and capture the speaking style of an unseen speaker by converting text into speech. The introduction of an unseen speaker’s speaking style into a TTS system offers a wide range of application scenarios, including personal assistant, news...
In this paper, we propose a novel normalization framework, multi-modal normalization (MultiNorm), that learns the multiple modalities through affine transformations involved in the normalization architecture. We have shown its effectiveness in speech-driven facial video generation and video emotion detection which are complex problems due to its mult...
Stickers are popularly used while messaging to visually express nuanced thoughts. We describe a real-time sticker recommendation (SR) system. We decompose SR into two steps: predict the message that is likely to be sent, and substitute that message with an appropriate sticker. To address the challenges caused by transliteration of message from user...
COVID-19 has made immersive experiences such as video conferencing and virtual reality/augmented reality the most important modes of exchanging information. Despite much advancement in network bandwidth and codec techniques, current systems still suffer from glitches, lags and poor video quality, especially under unreliable network condit...
We consider the challenging problem of audio to animated video generation. We propose a novel method OneShotAu2AV to generate an animated video of arbitrary length using an audio clip and a single unseen image of a person as an input. The proposed method consists of two stages. In the first stage, OneShotAu2AV generates the talking-head video in th...
Audio to Video generation is an interesting problem that has numerous applications across industry verticals including film making, multi-media, marketing, education and others. High-quality video generation with expressive facial movements is a challenging problem that involves complex learning steps for generative adversarial networks. Further, e...
The style of speech varies from person to person, and every person exhibits his or her own style of speaking that is determined by language, geography, culture and other factors. Style is best captured by the prosody of a signal. High-quality multi-speaker speech synthesis that considers prosody in a few-shot manner is an area of active r...
Federated learning has allowed the training of statistical models over remote devices without the transfer of raw client data. In practice, training in heterogeneous and large networks introduces novel challenges in various aspects like network load, quality of client data, security and privacy. Recent works in FL have worked on improving communicat...
Text classification is a primary task in Natural Language Processing (NLP). It has many real-life applications, such as web search, information retrieval, ranking and document classification. Information retrieval systems have become an indispensable tool in our day-to-day life. We use them often to retrieve documents, images, locate/search places, re...
The task of finding variance change points has been the focus of considerable research in sequential data analysis. In spite of the empirical success of many change point algorithms, there are several unresolved issues: (a) they rely on various probabilistic modeling assumptions in one form or another, (b) they fail when there are multiple change points, especially...
We present a novel offline variance change point detection algorithm based on dynamic mode decomposition (DMD). The developed algorithm, dynamic mode decomposition based variance change point detection (DVCPD), is completely data driven and doesn't require any knowledge of the underlying governing equation or any probabilistic model assumption for time seri...
A system, method and program product for managing hydrocarbon energy production. A hydrocarbon field modeler models physical characteristics of a hydrocarbon energy field. A load predictor predicts processing workload in modeling the hydrocarbon energy field, and identifying a balanced modeling unit distribution across multiple processors simulatin...
Modeling of big faults or weak planes of strong and weak discontinuities is of major importance to assess the geomechanical behaviour of mining/civil tunnels, reservoirs, etc. For modelling fractures in geomechanics, prior art has been limited to Interface Elements, which suffer from numerical instability and where faults are required to be aligned wi...
The stability of underground structures made (especially) in jointed rock mass is of the utmost importance to designers, engineers and operators. Rock bolting is generally practised to reinforce excavation walls and roofs by minimizing the movement of rock joints. This study proposes a new analytical model for the prediction of displacements,...
Modeling of discontinuities (fractures, joints, fault planes) is of major importance to assess the geomechanical behavior of oil and gas reservoirs. A methodology is developed in extended finite element method (XFEM) to analyze the behaviour of pre-existing multiple intersecting discontinuities or joints in rock material. This XFEM procedure allows...
Modeling of discontinuities (fractures, joints, fault planes) is of major importance to assess the geomechanical behavior of oil and gas reservoirs. A methodology is developed in extended finite element method (XFEM) to analyze the behaviour of pre-existing multiple intersecting discontinuities or joints in rock material. This XFEM procedure allows...