Franz de Leon's research while affiliated with the University of the Philippines and other places

Publications (18)

Preprint
Full-text available
The grammatical analysis of texts in any human language typically involves a number of basic processing tasks, such as tokenization, morphological tagging, and dependency parsing. State-of-the-art systems can achieve high accuracy on these tasks for languages with large datasets, but yield poor results for languages such as Tagalog which have littl...
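As a rough illustration of what those basic processing tasks produce, the sketch below hand-builds a toy CoNLL-U-style analysis of one Tagalog sentence; the sentence, tags, and dependency relations are illustrative assumptions, not output from the preprint's system.

```python
# Illustrative only: a toy CoNLL-U-style record for one Tagalog sentence, showing the
# three processing layers mentioned in the abstract (tokenization, morphological
# tagging, dependency parsing).
from dataclasses import dataclass

@dataclass
class Token:
    idx: int      # 1-based token index
    form: str     # surface form from tokenization
    upos: str     # universal POS tag from morphological tagging
    feats: str    # morphological features
    head: int     # index of the syntactic head (0 = root)
    deprel: str   # dependency relation to the head

# "Kumain ang bata." ("The child ate.") -- the analysis here is illustrative.
sentence = [
    Token(1, "Kumain", "VERB",  "Aspect=Perf|Voice=Act", 0, "root"),
    Token(2, "ang",    "ADP",   "_",                     3, "case"),
    Token(3, "bata",   "NOUN",  "_",                     1, "nsubj"),
    Token(4, ".",      "PUNCT", "_",                     1, "punct"),
]

for t in sentence:
    print(f"{t.idx}\t{t.form}\t{t.upos}\t{t.feats}\t{t.head}\t{t.deprel}")
```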
Conference Paper
Music separation aims to extract the signals of individual sources from a given audio mixture. Recent studies have explored the use of deep learning algorithms for this problem. Although these algorithms achieve good performance, they are inefficient because they need to learn an independent model for each sound source. In this study, we demonstr...
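A generic sketch of the alternative hinted at here: a single network with a shared encoder and one mask head per source, so that several sources are estimated jointly instead of training an independent model per source. The architecture and layer sizes below are assumptions for illustration, not the authors' model.

```python
# Generic multi-source masking network (illustrative, not the paper's architecture):
# a shared encoder over spectrogram frames feeds one lightweight mask head per source.
import torch
import torch.nn as nn

class MultiSourceMasker(nn.Module):
    def __init__(self, n_bins=513, hidden=256, n_sources=4):
        super().__init__()
        self.shared = nn.Sequential(                  # shared feature extractor
            nn.Linear(n_bins, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleList(                   # one mask head per source
            nn.Linear(hidden, n_bins) for _ in range(n_sources)
        )

    def forward(self, mix_mag):                       # mix_mag: (batch, frames, n_bins)
        h = self.shared(mix_mag)
        masks = torch.stack([torch.sigmoid(head(h)) for head in self.heads], dim=1)
        return masks * mix_mag.unsqueeze(1)           # (batch, n_sources, frames, n_bins)

model = MultiSourceMasker()
estimates = model(torch.rand(2, 100, 513))            # dummy mixture magnitudes
print(estimates.shape)                                # torch.Size([2, 4, 100, 513])
```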
Conference Paper
Inspired by the success of image classification and speech recognition, deep learning algorithms have been explored to solve music source separation. Solving this problem would open the door to a wide range of applications such as automatic transcription and audio post-production. Most algorithms use the Short Time Fourier Transform (STFT)...
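For reference, a minimal version of the usual STFT-based pipeline: compute the mixture spectrogram, apply a soft mask to the magnitude, keep the mixture phase, and resynthesize with the inverse STFT. The window parameters and the placeholder mask below are illustrative assumptions, not values from the paper.

```python
# Illustrative STFT masking pipeline (generic parameters): mask the mixture magnitude,
# reuse the mixture phase, and reconstruct the separated source with the inverse STFT.
import numpy as np
from scipy.signal import stft, istft

fs = 44100
mixture = np.random.randn(fs * 2)                  # stand-in for a 2-second audio mixture

f, t, Z = stft(mixture, fs=fs, nperseg=1024, noverlap=768)
magnitude, phase = np.abs(Z), np.angle(Z)

mask = np.clip(magnitude / (magnitude.max() + 1e-8), 0.0, 1.0)   # placeholder for a learned mask
source_spec = mask * magnitude * np.exp(1j * phase)              # masked magnitude + mixture phase

_, source = istft(source_spec, fs=fs, nperseg=1024, noverlap=768)
print(source.shape)
```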
Conference Paper
Full-text available
The classical guitar is described as a "miniature orchestra" due to the various tone colors that it can produce. One parameter that can control the timbre on the guitar is the excitation point on the string. In this study, a machine learning model was built to determine the excitation point on the string given an audio signal. It was noted that inc...
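A minimal sketch of that kind of supervised setup, with randomly generated stand-ins for the spectral features and excitation-point labels; the feature dimensionality, number of classes, and classifier choice are assumptions, not details from the paper.

```python
# Hypothetical excitation-point classification setup: spectral features per recorded
# note, labeled by excitation-point region. Features and labels are random placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 40))        # e.g. 40 spectral features per note
y = rng.integers(0, 3, size=300)      # e.g. 3 regions: near bridge / middle / near fretboard

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```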
Conference Paper
In this study, we compared two methods for extracting the melody pitch from select Philippine indigenous music. Pitch is expressed as the fundamental frequency of the main melodic voice or lead instrument of a music sample. Our implementation of automatic melody extraction involves blind source separation and pitch detection. For blind source separ...
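To illustrate the pitch-detection half of such a pipeline, the toy autocorrelation estimator below recovers the fundamental frequency of a synthetic 220 Hz tone; the separation stage and the actual methods compared in the study are not shown.

```python
# Toy autocorrelation pitch detector, shown only to illustrate fundamental-frequency
# estimation on a single frame.
import numpy as np

def estimate_f0(frame, fs, fmin=60.0, fmax=1000.0):
    """Return the fundamental frequency of one frame via autocorrelation."""
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)   # restrict lags to the plausible pitch range
    lag = lo + np.argmax(corr[lo:hi])
    return fs / lag

fs = 22050
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 220.0 * t)          # synthetic 220 Hz "melody" frame
print(round(estimate_f0(tone, fs), 1))        # close to 220.0
```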

Citations

... The data sets used were the MIR-1K and DSD100 data sets. Multi-task learning of music sources was presented by Tan et al. [21]. This was a more optimized method: the sources were separated initially and then processed by parallel modules, followed by classification and recovery. ...
... The proposed system utilizes a Time-Frequency (T-F) representation of the audio mixture. Based on previous experiments, the best T-F representation for the architecture is the Short Time Fourier Transform (STFT) with the following parameters: window length set to 512 with 75% overlap and a Blackman window function [13]. ...
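That front end can be reproduced directly with scipy using the quoted parameters (512-sample Blackman window, 75% overlap, i.e. noverlap = 384); the input signal below is only a placeholder.

```python
# STFT front end with the parameters quoted above: 512-sample Blackman window, 75% overlap.
import numpy as np
from scipy.signal import stft

fs = 44100
mixture = np.random.randn(fs)                     # placeholder 1-second mixture
f, t, Z = stft(mixture, fs=fs, window="blackman", nperseg=512, noverlap=384)
print(Z.shape)                                    # 257 frequency bins x number of frames
```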
... Similar work was done in [6], where the estimate is obtained by analyzing the time lag between consecutive pulses arriving at the bridge of the guitar using an undersaddle pickup, yielding estimation errors that are usually less than 1 cm. In [7], the plucking point position is estimated using a supervised learning approach, which yields an average error of at least 0.07 cm. The authors also support the statement that this tonal parameter is affected by varying physiology, since the performance of the machine learning model dropped drastically when it was trained on one subject and tested on a different subject. ...
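For intuition on the time-lag idea: on an ideal string the transverse wave speed is c = 2·L·f0, and half the lag between consecutive pulse arrivals at the bridge, multiplied by c, gives the distance of the pluck from one end of the string. The numbers below are hypothetical and only illustrate the arithmetic, not the algorithm of [6].

```python
# Back-of-envelope version of the time-lag idea (not the actual method in [6]).
L = 0.65                 # scale length in metres (typical classical guitar)
f0 = 110.0               # open A-string fundamental in Hz
c = 2 * L * f0           # transverse wave speed on an ideal string: c = 2*L*f0

dt = 1.4e-3              # hypothetical measured lag between consecutive pulses (s)
d = c * dt / 2           # distance from one string end implied by that lag
print(f"wave speed: {c:.1f} m/s, implied distance from one end: {100 * d:.1f} cm")
```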
... The vibration modes of guitar primarily denote the actual deformation of the top-plate and back-plate of a guitar body [12]. Hence, only the sound box was considered for the modal study. ...
... However, the schemes proposed in [6], [14] can be fragile when the eavesdropper is equipped with a wideband receiver and is interested in all MUEs' communication, since only the intended MUE is protected in [6], [14]. [18] provides a way to address this issue by formulating an optimization problem, constrained by each user's secrecy capacity requirement, to optimize the precoding vectors of the MBS and four SBSs so that the power consumption is minimized. In principle, most studies on PLS in UDNs are built on predetermined frequency assignment schemes, on the basis of which either power allocation or beamforming is performed, with only the typical subchannel and the group of users on it considered. ...
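A generic form of the secrecy-constrained power minimization sketched in that excerpt (illustrative only, not necessarily the exact formulation in [18]) is:

```latex
% Generic secrecy-constrained power minimization (illustrative form only).
\[
\begin{aligned}
\min_{\{\mathbf{w}_k\}} \quad & \sum_{k} \|\mathbf{w}_k\|^2 \\
\text{s.t.} \quad & \Big[\log_2\!\big(1+\gamma_k^{\mathrm{user}}\big)
  - \log_2\!\big(1+\gamma_k^{\mathrm{eve}}\big)\Big]^{+} \ge R_k \quad \forall k,
\end{aligned}
\]
```

where the w_k are the precoding vectors, the gammas are the SINRs at the legitimate user and the eavesdropper, and R_k is each user's secrecy rate requirement.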
... There are three popular approaches in music separation. First, Harmonic Percussive Source Separation (HPSS) separates the audio mixture into harmonic and percussive components by determining the contours of the signal in the time-frequency domain [2]- [4]. The second approach is singing voice separation. ...
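A minimal HPSS example using librosa's median-filtering implementation (generic usage, not the exact systems cited in [2]-[4]); the synthetic mixture here is a sustained tone plus short clicks.

```python
# Minimal HPSS illustration: median filtering of the spectrogram along time vs.
# frequency splits the mixture into harmonic and percussive components.
import numpy as np
import librosa

fs = 22050
t = np.arange(fs) / fs
harmonic_part = 0.5 * np.sin(2 * np.pi * 440.0 * t)   # sustained tone
percussive_part = np.zeros_like(t)
percussive_part[::2205] = 1.0                          # short clicks every 0.1 s
mixture = harmonic_part + percussive_part

y_harm, y_perc = librosa.effects.hpss(mixture)
print(y_harm.shape, y_perc.shape)
```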