ABSTRACT: This paper investigates a face recognition system based on Scale Invariant Feature Transform (SIFT) features and their distribution in feature space. The system takes advantage of SIFT, which is strongly robust to variations in expression, accessories, pose, and illumination. Because each SIFT keypoint is used as a face feature and SIFT keypoints are distributed in a very complicated way in feature space, we partition the features with a Self Organizing Map (SOM) and adopt a local Multilayer Perceptron (MLP) for each node on the map to improve classification performance. Moreover, the distinctive features among all SIFT keypoints in each face class are defined and extracted based on the feature distribution on the SOM. Finally, the face is recognized through the proposed scoring method, which depends on the classification results of these distinctive features. In the experiments, the proposed method achieved a higher face recognition rate than other methods, including matching-based and holistic-feature-based methods, on three well-known databases.
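The routing idea in this abstract (partition keypoint descriptors by SOM node, classify each with that node's local classifier, then combine the per-keypoint results) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the scoring method, so plain majority voting is used as a stand-in, and the names `best_matching_unit`, `recognize`, `codebook`, and `node_classifiers` are hypothetical. Any callable can stand in for a trained local MLP.

```python
import numpy as np
from collections import Counter

def best_matching_unit(descriptor, codebook):
    """Index of the SOM node whose weight vector is closest to the descriptor."""
    distances = np.linalg.norm(codebook - descriptor, axis=1)
    return int(np.argmin(distances))

def recognize(descriptors, codebook, node_classifiers):
    """Route each descriptor to its node's classifier and majority-vote the labels.

    Majority voting is an assumed stand-in for the paper's scoring method.
    """
    votes = Counter()
    for d in descriptors:
        node = best_matching_unit(d, codebook)
        votes[node_classifiers[node](d)] += 1
    return votes.most_common(1)[0][0]

# Toy usage: two SOM nodes in a 2-D feature space, each with a dummy classifier.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
node_classifiers = [lambda d: "alice", lambda d: "bob"]
descriptors = np.array([[0.5, 0.2], [9.0, 9.5], [10.0, 11.0]])
print(recognize(descriptors, codebook, node_classifiers))
```

Real SIFT descriptors are 128-dimensional, and the SOM codebook would be trained rather than written by hand; the toy 2-D values only keep the example readable.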
ABSTRACT: In this paper, we propose an algorithm for frequency channel segmentation using peaks and valleys in the spectrogram. A frequency channel segment is a local group of channels in the frequency domain that could have arisen from the same sound source. The proposed algorithm is based on the smoothed spectrum of the input sound. Peaks and valleys in the smoothed spectrum are used to determine the centers and boundaries of segments, respectively. To evaluate the suitability of the proposed segmentation algorithm before the grouping stage is applied, we compare the results synthesized using an ideal mask with those of the proposed algorithm. Simulations are performed on speech signals mixed with narrow-band noise, wide-band noise, and other speech signals.
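A minimal sketch of the peak/valley idea described in this abstract, for a single spectral frame, might look like the following. It assumes a simple moving-average smoother and strict local extrema; the abstract does not state which smoothing or extremum rule the authors use, and the function names and the `width` parameter are hypothetical.

```python
import numpy as np

def smooth(spectrum, width=5):
    """Moving-average smoothing; the window width is an assumed parameter."""
    kernel = np.ones(width) / width
    return np.convolve(spectrum, kernel, mode="same")

def local_extrema(x):
    """Return indices of strict local maxima (peaks) and minima (valleys)."""
    interior = np.arange(1, len(x) - 1)
    peaks = interior[(x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])]
    valleys = interior[(x[1:-1] < x[:-2]) & (x[1:-1] < x[2:])]
    return peaks, valleys

def segment_channels(spectrum, width=5):
    """Split channels into (start, end, center) segments.

    Valleys of the smoothed spectrum act as segment boundaries, and the
    highest peak inside each segment acts as its center. `end` is exclusive.
    """
    s = smooth(np.asarray(spectrum, dtype=float), width)
    peaks, valleys = local_extrema(s)
    bounds = np.concatenate(([0], valleys, [len(s)]))
    segments = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        inside = peaks[(peaks >= lo) & (peaks < hi)]
        if inside.size:  # keep only regions that contain a peak
            center = int(inside[np.argmax(s[inside])])
            segments.append((int(lo), int(hi), center))
    return segments

# Toy usage: a 64-channel spectrum with two bumps, at channels 16 and 48.
x = np.arange(64)
spec = np.exp(-((x - 16) / 4.0) ** 2) + np.exp(-((x - 48) / 4.0) ** 2)
print(segment_channels(spec))
```

Regions between two valleys that contain no peak are dropped here; whether the original algorithm merges or discards such regions is not stated in the abstract.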
ABSTRACT: In this paper, we propose a new face detector that is less affected by the background. To reduce the effect of various backgrounds, we apply stronger constraints to the face. In previous work, the classifier in a face detector decides whether the input image is more like a face or more like a non-face, so the non-face training set has a greater effect on face detection. By applying stronger constraints to the face, however, the detector decides only whether the input image is like a face or not, i.e., the background has less effect on the decision process. The constraints used in this paper are how much the image looks like a face (image based) and whether the image satisfies the structural features of a face, especially its edges. In experiments, the proposed face/non-face classifier achieved a 95.8% classification rate for faces and a 96.5% classification rate for non-faces with a small quantity of face images as the training set.
Convergence Information Technology, 2007. International Conference on; 12/2007
ABSTRACT: In this research, we propose an efficient speech/music discrimination method that uses spectrum analysis and a neural network. The proposed method extracts a duration feature parameter (MSDF) from a spectral peak track by analyzing the spectrum and uses it, combined with the MFSC, as a feature for the speech/music discriminator. A neural network was used as the speech/music discriminator, and we performed various experiments to evaluate the proposed method with respect to training pattern selection, size, and neural network architecture. The discrimination results showed improved performance and stability, depending on the training pattern selection and model composition, in comparison with the previous method. When the MSDF and MFSC were used as feature parameters with more than 50 seconds of training patterns, the discrimination rate was 94.97% for speech and 92.38% for music. Finally, we achieved a performance improvement of 1.25% for speech and 1.69% for music compared with the use of the MFSC.
ABSTRACT: In this paper, we propose an algorithm for frequency channel segmentation using a neural oscillatory network. A frequency channel segment is a local group of channels in the frequency domain that could have arisen from the same sound source. The proposed algorithm is based on the smoothed spectrum of the input sound. Valleys in the smoothed spectrum are used to determine vertical weights and the continuity of segment boundaries is used to determine vertical weights in the oscillatory network. To evaluate the suitability of the proposed segmentation algorithm before the grouping stage is applied, we compare the synthesis results of an ideal mask with those of the proposed algorithm.