Robust music identification based on low-order zernike moment in the compressed domain.
ABSTRACT In this paper, we devise a novel robust music identification algorithm utilizing compressed-domain audio Zernike moment adapted from image processing techniques as the pivotal feature. Audio fingerprint derived from this feature exhibits strong robustness against various audio signal distortions including the challenging pitch shifting and time-scale modification. Experiments show that in our test dataset composed of 1822 popular songs, a 5s music query example which might have been severely corrupted is still sufficient to identify its original near-duplicate copy, with more than 90% top five precision rate.
- [Show abstract] [Hide abstract]
ABSTRACT: In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree in which they achieve an actual increase in the overall speed—as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.New Review in Hypermedia and Multimedia 07/2014; 20(3). DOI:10.1080/13614568.2014.889223 · 0.30 Impact Factor