Ivan Kukanov

University of Eastern Finland | UEF · School of Computing

PhD in Computer Science


Publications: 13
Reads: 17,269
Citations: 117
Citations since 2017: 112 (from 8 research items)

Publications (13)
Preprint
Full-text available
Embeddings play an important role in many recent end-to-end solutions for language processing problems involving more than one data modality. Although there has been some effort to understand the properties of single-modality embedding spaces, particularly that of text, their cross-modal counterparts are less understood. In this work, we study a jo...
Preprint
Full-text available
Manipulated videos have existed since the invention of cinema, but generating manipulated videos that can fool the viewer has been a time-consuming endeavor. With the dramatic improvements in deep generative modeling, generating believable-looking fake videos has become a reality. In the present work, we concentrate on the so-called deepfa...
Conference Paper
Full-text available
The paper is a work-in-progress report on the development of DigiMo, a chatbot with emotional intelligence. The chatbot development is based on the collection and annotation of real dialogues between local Singaporeans expressing genuine emotions. The models were trained with Cakechat, an open-source sequence-to-sequence deep neural network. Pe...
Article
Full-text available
Bottleneck features (BNFs) generated with a deep neural network (DNN) have proven to boost spoken language recognition accuracy over basic spectral features significantly. However, BNFs are commonly extracted using language-dependent tied-context phone states as learning targets. Moreover, BNFs are less phonetically expressive than the output layer...
Preprint
Full-text available
The I4U consortium was established to facilitate a joint entry to the NIST speaker recognition evaluations (SRE). The latest such joint submission was to SRE 2018, in which the I4U submission was among the best-performing systems. SRE'18 also marks the 10-year anniversary of the I4U consortium's participation in the NIST SRE series of evaluations. The primary obje...
Conference Paper
Full-text available
This work tackles the problem of domestic audio tagging, or environmental sound classification, where one audio recording can contain one or more acoustic events and a recognizer should output all of those tags. A baseline model for this task is a convolutional recurrent neural network (CRNN) with sigmoid output nodes optimized using the binary...
Conference Paper
Full-text available
This article describes the systems jointly submitted by the Institute for Infocomm Research (I²R), the Laboratoire d'Informatique de l'Université du Maine (LIUM), Nanyang Technological University (NTU) and the University of Eastern Finland (UEF) for the 2015 NIST Language Recognition Evaluation (LRE). The submitted system is a fusion of nine sub-systems based on...
Conference Paper
Full-text available
We have recently proposed a universal acoustic characterisation for foreign accent recognition, in which any spoken foreign accent was described in terms of a common set of fundamental speech attributes. Although experimental evidence demonstrated the feasibility of our approach, we believe that speech attributes, namely manner and place of articul...
