Conference Paper

Multi-modal music genre classification approach

Multimedia Lab. of Inf. Sch., Renmin Univ. of China, Beijing, China
DOI: 10.1109/ICCSIT.2010.5564489 Conference: Computer Science and Information Technology (ICCSIT), 2010 3rd IEEE International Conference on, Volume: 8
Source: IEEE Xplore

ABSTRACT As a fundamental and critical component of music information retrieval (MIR) systems, automatically classifying music by genre is a challenging problem. The traditional approaches which solely depending on low-level audio features may not be able to obtain satisfactory results. In recent years, the social tags have emerged as an important way to provide information about resources on the web. So, in this paper we propose a novel multi-modal music genre classification approach which uses the acoustic features and the social tags together for classifying music by genre. For the audio content-based classification, we design a new feature selection algorithm called IBFFS (Interaction Based Forward Feature Selection). This algorithm selects the features depending on the pre-computed rules which considering the interaction between the different features. In addition, we are interested in another aspect, that is how performing automatic music genre classification depending on the available tag data. Two classification methods based on the social tags (including music-tags and artist-tags) which crawled from website are developed in our work: (1) we use the generative probabilistic model Latent Dirichlet Allocation (LDA) to analyze the music-tags. Then, we can obtain the probability of every tag belonging to each music genre. (2) The starting point of the second method is that music's artist is often associated with music genres more closely. Therefore, we can compute the similarity between the artist-tag vectors to infer which genre the music belongs to. At last, our experimental results demonstrate the benefit of our multi-modal music genre classification approach.

  • [Show abstract] [Hide abstract]
    ABSTRACT: A novel framework for music genre classification, namely the joint sparse low-rank representation (JSLRR) is proposed in order to: 1) smooth the noise in the test samples, and 2) identify the subspaces that the test samples lie onto. An efficient algorithm is proposed for obtaining the JSLRR and a novel classifier is developed, which is referred to as the JSLRR-based classifier. Special cases of the JSLRR-based classifier are the joint sparse representation-based classifier and the low-rank representation-based one. The performance of the three aforementioned classifiers is compared against that of the sparse representation-based classifier, the nearest subspace classifier, the support vector machines, and the nearest neighbor classifier for music genre classification on six manually annotated benchmark datasets. The best classification results reported here are comparable with or slightly superior than those obtained by the state-of-the-art music genre classification methods.
    12/2014; 22(12):1905-1917. DOI:10.1109/TASLP.2014.2355774

Full-text (2 Sources)

Available from
Jun 2, 2014