Conference Paper

Optimal filtering of dynamics in short-time features for music organization.

Conference: ISMIR 2006, 7th International Conference on Music Information Retrieval, Victoria, Canada, 8-12 October 2006, Proceedings
Source: DBLP

ABSTRACT There is an increasing interest in customizable methods for organizing music collections. Relevant music characterization can be obtained from short-time features, but it is not obvious how to combine them to get useful information. In this work, a novel method, denoted as the Positive Constrained Orthonormalized Partial Least Squares (POPLS), is proposed. Working on the periodograms of MFCCs time series, this supervised method finds optimal filters which pick up the most discriminative temporal information for any music organization task. Two examples are presented in the paper, the first being a simple proof-of-concept, where an altosax with and without vibrato is modelled. A more complex 11 music genre classification setup is also investigated to illustrate the robustness and validity of the proposed method on larger datasets. Both experiments showed the good properties of our method, as well as superior performance when compared to a fixed filter bank approach suggested previously in the MIR literature. We think that the proposed method is a natural step towards a customized MIR application that generalizes well to a wide range of different music organization tasks.
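The feature pipeline the abstract describes (periodograms of MFCC time series, weighted by non-negative filters) can be illustrated with a minimal sketch. This is not the POPLS algorithm: the filters below are fixed by hand rather than learned from labels, the "MFCC" tracks are synthetic, and the function names `periodogram` and `filter_periodograms` are hypothetical choices made for this example.

```python
import numpy as np

def periodogram(x):
    """Periodogram of a real 1-D series: |rFFT|^2 / N after mean removal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.abs(np.fft.rfft(x)) ** 2 / len(x)

def filter_periodograms(mfcc, filters):
    """Summarize the temporal dynamics of each coefficient track.

    mfcc:    array (n_frames, n_coeffs), one column per coefficient.
    filters: array (n_filters, n_bins) of non-negative weights,
             with n_bins = n_frames // 2 + 1.
    Returns an array (n_coeffs, n_filters) of filtered periodogram energies.
    """
    filters = np.asarray(filters, dtype=float)
    if np.any(filters < 0):
        raise ValueError("filters must be non-negative")
    pgrams = np.stack([periodogram(mfcc[:, k]) for k in range(mfcc.shape[1])])
    return pgrams @ filters.T

# Toy data (assumption): one track modulated at frequency bin 8, one noise track.
rng = np.random.default_rng(0)
n_frames = 64
t = np.arange(n_frames)
mfcc = np.column_stack([
    np.sin(2 * np.pi * 8 * t / n_frames),
    rng.normal(size=n_frames),
])

# Hand-picked filters (assumption): a supervised method would learn these.
n_bins = n_frames // 2 + 1
filters = np.zeros((2, n_bins))
filters[0, :5] = 1.0    # slow modulations
filters[1, 5:12] = 1.0  # band that contains bin 8

features = filter_periodograms(mfcc, filters)
```

In this toy setup the second filter captures nearly all of the energy of the modulated track, which is the kind of discriminative temporal information a learned filter would be expected to emphasize.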

  • Source
    ABSTRACT: The enormous growth of digital music databases has led to a comparable growth in the need for methods that help users organize and access such information. One area in particular that has seen much recent research activity is the use of automated techniques to describe audio content and to allow for its identification, browsing and retrieval. Conventional approaches to music content description rely on features characterizing the shape of the signal spectrum in relatively short-term frames. In the context of Automatic Speech Recognition (ASR), Hermansky \cite{Hermansky_1} described an interesting alternative to short-term spectrum features, the TRAP-TANDEM approach which uses long-term band-limited features trained in a supervised fashion. We adapt this idea to the specific case of music signals and propose a generic system for the description of temporal patterns. The same system with different settings is able to extract features describing either timbre or rhythmic content. The quality of the generated features is demonstrated in a set of music retrieval experiments and compared to other state-of-the-art models.
  • Source
    ABSTRACT: For some years now, classification in music research has been a very broad and quickly growing field. Most important for adequate classification is knowledge of suitable observable or deduced features on the basis of which meaningful groups or classes can be distinguished. Unsupervised classification additionally needs a suitable similarity or distance measure on which the grouping is to be based. Evaluation of supervised learning is typically based on the error rates of the classification rules. In this paper we first discuss typical problems and possible influential features derived from signal analysis, mental mechanisms or concepts, and compositional structure. Then, we present typical solutions of such tasks related to music research, namely for organization of music collections, transcription of music signals, cognitive psychology of music, and compositional structure analysis.
    Advances in Data Analysis and Classification 11/2007; 1:255-291. · 0.92 Impact Factor
  • Source
    ABSTRACT: The subject of music information retrieval (MIR) is the analysis and categorization of music pieces. In recent years many approaches have been designed to automatically extract music data from the digitized audio signal. This article presents a survey of state-of-the-art algorithms on the basis of a broad literature study and a tool analysis. It is intended to help readers navigate the different MIR techniques and tools. An overview of different music features that characterize timbre, harmony, melody and rhythmic information is given. The various time scales of feature extraction used to form meta-features from basic features are discussed. Task-specific pruning of features is presented as a way to reduce computational complexity. The article continues with a discussion of different classification techniques and how the results are evaluated. Finally, the properties of four state-of-the-art MIR tools are outlined.

