Sachin Kajareker's research while affiliated with Apple Inc. and other places
What is this page?
This page lists the scientific contributions of an author who either does not have a ResearchGate profile or has not yet added these contributions to their profile.
It was automatically created by ResearchGate to provide a record of this author's body of work. We create such pages to advance our goal of building and maintaining the most comprehensive scientific repository possible. In doing so, we process publicly available (personal) data relating to the author as a member of the scientific community.
If you're a ResearchGate member, you can follow this page to keep up with this author's work.
If you are this author, and you don't want us to display this page anymore, please let us know.
Publications (5)
We describe our novel deep learning approach for driving animated faces using both acoustic and visual information. In particular, speech-related facial movements are generated using audiovisual information, and non-speech facial movements are generated using only visual information. To ensure that our model exploits both modalities during training...
Speech-driven visual speech synthesis involves mapping acoustic speech features to the corresponding lip animation controls for a face model. This mapping can take many forms, but a powerful approach is to use deep neural networks (DNNs). The lack of synchronized audio, video, and depth data is a limitation to reliably train DNNs, especially for sp...
Speech-driven visual speech synthesis involves mapping features extracted from acoustic speech to the corresponding lip animation controls for a face model. This mapping can take many forms, but a powerful approach is to use deep neural networks (DNNs). However, a limitation is the lack of synchronized audio, video, and depth data required to reliably...
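The abstracts above share one general pipeline: per-frame acoustic (and optionally visual) features are mapped by a deep network to animation controls for a face model, and during training the model is prevented from relying exclusively on either modality. The following PyTorch sketch illustrates how such a model could look in principle; it is not the authors' published architecture, and every class name, feature dimension, and hyperparameter here is a hypothetical assumption.

    # Illustrative sketch only, NOT the published model: audio + visual
    # features -> blendshape coefficients, with a simple form of modality
    # dropout. All dimensions and names are assumptions.
    import torch
    import torch.nn as nn

    class AudioVisualToBlendshapes(nn.Module):
        """Maps per-frame acoustic and visual features to face-model controls."""

        def __init__(self, audio_dim=40, visual_dim=128, hidden_dim=256, n_blendshapes=51):
            super().__init__()
            self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden_dim), nn.ReLU())
            self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.ReLU())
            # A recurrent layer models temporal context across frames.
            self.rnn = nn.GRU(2 * hidden_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, n_blendshapes)

        def forward(self, audio, visual, p_drop=0.3):
            # audio: (batch, time, audio_dim), visual: (batch, time, visual_dim)
            a = self.audio_enc(audio)
            v = self.visual_enc(visual)
            if self.training:
                # Modality dropout (simplified): pick one modality for this
                # batch and zero it out for a random subset of samples, so the
                # model cannot rely exclusively on either input stream.
                mask = (torch.rand(a.size(0), 1, 1, device=a.device) > p_drop).float()
                if torch.rand(()) < 0.5:
                    a = a * mask
                else:
                    v = v * mask
            h, _ = self.rnn(torch.cat([a, v], dim=-1))
            return self.head(h)  # (batch, time, n_blendshapes)

    model = AudioVisualToBlendshapes()
    audio = torch.randn(2, 100, 40)    # e.g., 100 frames of MFCC-like features
    visual = torch.randn(2, 100, 128)  # e.g., per-frame face-image embeddings
    coeffs = model(audio, visual)      # predicted blendshape trajectories

Zeroing an entire modality per sample (rather than individual features) is one simple way to force the network to produce sensible outputs from either stream alone, which matches the stated goal of exploiting both modalities during training.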
Citations
... The applications of talking face generation can be broadly categorized into two groups, as depicted in Fig. 1. The first group involves generating talking faces based on text inputs, which can be used for video production or multimodal chatbots [2][3][4][5][6][7][8]. In most cases, this group also requires simultaneous generation of speech synchronized with talking faces. ...
... Overview of the method proposed by Abdelaziz et al. (2020). Source: Abdelaziz et al. (2020). ...
... 3D coefficient-based. Besides 2D facial coefficient models, 3D facial coefficients obtained via principal component analysis (PCA) are more commonly used in VSG [67,70,171,172,173,174,175]. Pham et al. [171,172,176] proposed using CNN + RNN-based backbone architectures to map audio signals to the blendshape coefficients [177] of a 3D face. ...
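As a companion to the CNN + RNN mappings cited in this snippet, here is a minimal, hedged PyTorch sketch of a generic convolutional-plus-recurrent audio-to-blendshape regressor. It is not a reproduction of Pham et al.'s architecture; the mel-spectrogram input, layer sizes, and blendshape count are all assumptions.

    # Hedged sketch of a generic CNN + RNN audio-to-blendshape backbone.
    import torch
    import torch.nn as nn

    class CNNRNNAudioToBlendshapes(nn.Module):
        def __init__(self, n_mels=80, hidden_dim=256, n_blendshapes=51):
            super().__init__()
            # 1-D convolutions over time extract local acoustic patterns
            # from a mel-spectrogram input of shape (batch, n_mels, time).
            self.cnn = nn.Sequential(
                nn.Conv1d(n_mels, hidden_dim, kernel_size=5, padding=2),
                nn.ReLU(),
                nn.Conv1d(hidden_dim, hidden_dim, kernel_size=5, padding=2),
                nn.ReLU(),
            )
            # The LSTM integrates longer-range temporal context.
            self.rnn = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, n_blendshapes)

        def forward(self, mel):
            # mel: (batch, n_mels, time)
            x = self.cnn(mel).transpose(1, 2)  # -> (batch, time, hidden_dim)
            h, _ = self.rnn(x)
            return self.head(h)  # per-frame blendshape coefficients

    model = CNNRNNAudioToBlendshapes()
    mel = torch.randn(2, 80, 200)  # two utterances, 200 spectrogram frames each
    out = model(mel)               # (2, 200, 51)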