Gil-Jin Jang

  • Ulsan National Institute of Science and Technology

About

59
Publications
7,271
Reads
966
Citations
Current institution
Ulsan National Institute of Science and Technology


Publications (59)
Article
Background and Objectives: The purpose of this study is to evaluate the value of a diagnostic tool for vocal cord palsy that uses artificial intelligence without a laryngoscope. Materials and Method: The dataset consisted of recordings from patients with unilateral vocal cord paralysis (n=54) as well as normal individuals (n=163). The dataset included prolonged...
Article
Full-text available
Neural machine translation (NMT) methods based on various artificial neural network models have shown remarkable performance in diverse tasks and have become the current mainstream for machine translation. Despite the recent successes of NMT applications, a predefined vocabulary is still required, meaning that it cannot cope with out-of-vocabulary (O...
Article
Full-text available
Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Therefore, phoneme classification performance is a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is stil...
Article
Full-text available
In this paper, we propose speech/music pitch classification based on recurrent neural network (RNN) for monaural speech segregation from music interferences. The speech segregation methods in this paper exploit sub-band masking to construct segregation masks modulated by the estimated speech pitch. However, for speech signals mixed with music, spee...
Article
Full-text available
Single-channel singing voice separation has been considered a difficult task, as it requires predicting two different audio sources independently from mixed vocal and instrument sounds recorded by a single microphone. We propose a new singing voice separation approach based on the curriculum learning framework, in which learning is started with onl...
Article
A Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) has driven tremendous improvements on an acoustic model based on Gaussian Mixture Model (GMM). However, these models based on a hybrid method require a forced aligned Hidden Markov Model (HMM) state sequence obtained from the GMM-based acoustic model. Therefore, it requires a long compu...
Article
A novel single channel blind source separation method based on probabilistic matrix factorisation (PMF) is proposed. Compared to the conventional non-negative matrix factorisation (NMF) employing Euclidean distance or Kullback-Leibler divergence, PMF uses the log posterior probability as a cost function for optimising spectrum and activation matric...
Article
Face recognition is a well-known approach for identity recognition. Variation in head pose is a main factor that interferes with face recognition systems. This paper proposes an efficient head pose determination method and its application to face recognition on a multi-pose face DB in order to solve the pose variation-related problem. The first ste...
Article
Full-text available
The Internet of Things (IoT) operates solely on local interactions among its components, which include various devices with communications capabilities. Because the IoT is a fully distributed computing network, it is important to mitigate any negative effects resulting from faults occurring in its components and to provide sustainable services. Thi...
Chapter
This paper proposes a voice QR code for mobile devices. The QR code shows great performance for error correction and recovers decoding errors caused by skewed image angle or luminosity. In order to correct an image shot of the QR code symbol, a complex error code and data map need to be generated. Additionally, there is a need for an efficient QR c...
Chapter
In this paper, we analyze the performance of a feed-forward neural network (FFNN)-based language model in comparison with an n-gram model. The probability of the n-gram language model was estimated based on the statistics of word sequences. The FFNN-based language model was structured with three hidden layers, 500 hidden units per hidden layer, and 30 dimension...
Conference Paper
This paper introduces an efficient method for identifying various facial expressions from image inputs. To recognize the emotions of the facial expressions, a number of facial feature points were extracted. The extracted feature points are then transformed to 49-dimensional feature vectors which are robust to scale and translational variations, and the facial...
Article
Full-text available
There are several applications connected to IT health devices on the self-organizing software platform (SoSp) that allow patients or elderly users to be cared for remotely by their family doctors under normal circumstances or during emergencies. An evaluation of the SoSp applied through PAAR watch/self-organizing software platform router was conduc...
Article
This paper proposes an efficient method for automatically distinguishing various facial expressions. To recognize the emotions from facial expressions, the facial images are obtained by digital cameras, and a number of feature points were extracted. The extracted feature points are then transformed to 49-dimensional feature vectors which are robust...
Article
This paper proposes an efficient algorithm for detecting occlusions in video sequences of ground vehicles using color information. The proposed method uses a rectangular window to track a target vehicle, and the window is horizontally divided into several sub-regions of equal width. Each region is determined to be occluded or not based on the col...
Article
This study proposes an unsupervised noise reduction scheme that improves the performance of voice-based information retrieval tasks in mobile environments. Various types of noises could interfere with speech processing tasks, and noise reduction has become an essential technique in this field. In particular, noise reduction needs to be carefully pr...
Chapter
In thermal imaging, human object detection is difficult when the temperatures of surrounding objects are similar to or higher than that of the human body. In this paper, we propose a novel algorithm suited to such environments. The proposed method first computes the mean and variance of each pixel value from the initial several frames, assuming that there is...
Chapter
This paper addresses single-microphone noise estimation for speech recognition in noisy environments. Much research has been performed on environmental noise estimation; however, most methods require voice activity detection (VAD) for accurate estimation of noise characteristics. We propose an approach for effic...
Article
Full-text available
In this paper, we propose a human detection technique using a thermal imaging camera. The proposed method is useful at night or in rainy weather, where a visible-light imaging camera is not able to detect human activities. Under the observation that a human is usually brighter than the background in thermal images, we estimate the preliminary human...
Article
A novel method is proposed to improve the voice recognition performance by suppressing acoustic interferences that add nonlinear distortion to a target recording signal when received by the recognition device. The proposed method is expected to provide the best performance in smart TV environments, where a remote control collects command speech by...
Article
Full-text available
This study proposes a music-aided framework for affective interaction of service robots with humans. The framework consists of three systems, respectively, for perception, memory, and expression on the basis of the human brain mechanism. We propose a novel approach to identify human emotions in the perception system. The conventional approaches use...
Article
Full-text available
A novel method is proposed to improve the performance of independent vector analysis (IVA) for blind signal separation of acoustic mixtures. IVA is a frequency-domain approach that successfully resolves the well-known permutation problem by applying a spherical dependency model to all pairs of frequency bins. The dependency model of IVA is equivale...
Article
The work in this paper concerns the determination of a recognition unit in a small-footprint Chinese large-vocabulary word recognition system for embedded devices. This paper proposes a method of extending the conventional initial/final phonetic units of the Chinese language to be used as a recognition unit. The word recognition performance of the prop...
Article
This paper proposes a novel Language Model (LM) adaptation method based on Minimum Discrimination Information (MDI). In the proposed method, a background LM is viewed as a discrete distribution and an adapted LM is built to be as close as possible to the background LM, while satisfying unigram constraint. This is due to the fact that there is a lim...
Conference Paper
Full-text available
We propose a pitch track correction technique using a sigmoidal objective function in a particle filtering framework. The conventional method for pitch track correction simply considers the peak locations of the autocorrelation functions as the pitch values and depends only on the longest reliable pitch streak. This constraint may induce pitch corr...
Article
This paper proposes a multistage utterance verification method as a post-processing technique for online spoken content retrieval in portable electric devices. The online spoken content retrieval system analyzes spoken content in an online manner and searches speech segments of pre-defined keywords. To maintain stable performance, we propose a reli...
Article
Full-text available
A novel approach for pitch track correction and music-speech classification is proposed in order to improve the performance of the speech segregation system. The proposed pitch track correction method adjusts unreliable pitch estimates from adjacent reliable pitch streaks, in contrast to the previous approach using a single pitch streak...
Article
Full-text available
This paper presents a novel method for estimating reliable noise spectral magnitude for acoustic background noise suppression where only a single microphone recording is available. The proposed method finds noise estimates from spectral magnitudes measured at line spectral frequencies (LSFs), under the observation that adjacent LSFs are near the pe...
Article
Full-text available
This paper proposes a noise suppression technique for speech-centric interface of various smart devices. The proposed method estimates noise spectral magnitudes from line spectral frequencies (LSFs), using the observation that adjacent LSFs correspond to peak frequencies of spectrum, whereas isolated LSFs are close to flattened valley frequencies r...
Article
In a previous study we identified metallothionein (MT) as a candidate gene potentially influencing collaterogenesis. In this investigation, we determined the effect of MT on collaterogenesis and examined the mechanisms contributing to the effects we found. Collateral blood flow recovery was assessed using laser Doppler perfusion imaging, and angiog...
Article
Full-text available
To determine if the patterns uncovered with variational Bayesian-independent component analysis-mixture model (VIM) applied to a large set of normal and glaucomatous fields obtained with the Swedish Interactive Thresholding Algorithm (SITA) are distinct, recognizable, and useful for modeling the severity of the field loss. SITA fields were obtained...
Conference Paper
Full-text available
We propose a new blind source separation approach that models the inherent signal dependencies such as those observed in speech signals in order to solve the problem of separating convolved sources. The frequency domain methods for the convolved mixture problem require a solution to the well-known permutation problem. Our approach is based on assum...
Article
Full-text available
Despite numerous animal trials reporting that cell therapy promotes collateral flow, clinical trials have not convincingly shown benefit. Patient-related risk factors are often used to explain these discrepancies. However, during the course of our own angiogenesis studies using mice, we noted large anatomical variability in collateral vessels. The...
Chapter
This chapter discusses source separation methods when only single channel observation is available. The problem is underdeterministic, in that multiple source signals should be extracted from a single stream of observations. To overcome the mathematical intractability, prior information on the source characteristics is generally assumed and applied...
Article
Rapamycin has been shown to reduce neointimal thickening in the setting of balloon angioplasty and chronic graft vessel disease. This study was designed to test the effect of oral rapamycin on atherosclerotic plaque progression and the possible mechanism involved. Apolipoprotein E (apoE) knockout mice were fed either a diet supplemented with choles...
Article
Resistin, an adipocyte-derived cytokine linked to insulin resistance and obesity, has recently been shown to activate endothelial cells (ECs). Using microarrays, we found that along with numerous other pro-atherosclerotic genes, resistin expression levels are elevated in the aortas of C57BL/6J apoE-/- mice; these findings led us to further explore...
Conference Paper
Despite an abundance of research on blind source separation (BSS) in many types of simulated environments, its performance is still not satisfactory for real environments. The major obstacles appear to be the finite filter length of the assumed mixing model and nonlinear sensor noise. This paper presents a two-step speech en...
Article
Full-text available
An algorithm for single channel signal separation is presented. The algorithm projects the observed signal to given subspaces, and recovers the original sources by probabilistic weighting and recombining the subspace signals. The results of separating mixtures of two different natural sounds are reported.
Article
We review the sparse representation principle for processing speech signals. A transformation for encoding the speech signals is learned such that the resulting coefficients are as independent as possible. We use independent component analysis with an exponential prior to learn a statistical representation for speech signals. This representation le...
Conference Paper
Full-text available
Our goal is to extract multiple source signals when only a single observation channel is available. We propose a new signal separation algorithm based on a subspace decomposition. The observation is transformed into subspaces of interest with different sets of basis functions. A flexible model for density estimation allows an accurate modeling of t...
Article
Full-text available
This paper presents a new technique for achieving blind signal separation when given only a single channel recording. The main concept is based on exploiting sets of time-domain basis functions learned by independent component analysis (ICA) to the separation of mixed source signals observed in a single channel. The inherent time structure of sound sour...
Article
We apply independent component analysis for extracting an optimal basis to the problem of finding efficient features for representing speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions, thus the distribution of speech segments of a speaker is modeled by adapting the basi...
Article
Full-text available
A new technique has been developed to enable blind source separation given only a single channel recording. The proposed method infers source signals and their contribution factors at each time point by a number of adaptation steps maximizing log-likelihood of the estimated source parameters given the observed single channel data and sets of basis...
Conference Paper
Full-text available
We present a new technique for achieving source separation when given only a single channel recording. The main idea is based on exploiting the inherent time structure of sound sources by learning a priori sets of basis filters in time domain that encode the sources in a statistically efficient manner. We derive a learning algorithm using a maximum...
Article
Full-text available
The goal of this paper is to learn or adapt statistical features of gender specific speech signals. The adaptation is performed by finding basis functions that encode the speech signal such that the resulting coefficients are statistically independent and the information redundancy is minimized. We use a flexible independent component analysis (ICA...
Article
Full-text available
This paper presents a technique for extracting multiple source signals when only a single channel observation is available. The proposed separation algorithm is based on a subspace decomposition. The observation is projected onto subspaces of interest with different sets of basis functions, and the original sources are obtained by weighted sums of...
