148 Publications
18,336 Reads
1,365 Citations
Additional affiliations
January 2006 - June 2016
Publications (148)
The mere availability of most musical heritage as image documents or audio recordings does not enable tasks such as indexing or editing unless they are transcribed into a structured digital format. Given the cost and time required for manual transcription, Optical Music Recognition (OMR) and Automatic Music Transcription (AMT) deal with the aforeme...
The recognition of patterns that have a time dependency is common in areas like speech recognition or natural language processing. The equivalent situation in image analysis is present in tasks like text or video recognition. Recently, Convolutional Recurrent Neural Networks (CRNN) have been broadly applied to solve these tasks in an end-to-end fas...
End-to-end solutions have brought about significant advances in the field of Optical Music Recognition. These approaches directly provide the symbolic representation of a given image of a musical score. Despite this, several documents, such as pianoform musical scores, cannot yet benefit from these solutions since their structural complexity does n...
Music transcription, which deals with the conversion of music sources into a structured digital format, is a key problem for Music Information Retrieval (MIR). When addressing this challenge in computational terms, the MIR community follows two lines of research: music documents, which is the case of Optical Music Recognition (OMR), or audio record...
Optical Music Recognition (OMR) systems typically consider workflows that include several steps, such as staff detection, symbol recognition, and semantic reconstruction. However, fine-tuning these systems is costly due to the specific data labeling process that has to be performed to train models for each of these steps. In this paper, we present...
Inspired by the Text Recognition field, end-to-end schemes based on Convolutional Recurrent Neural Networks (CRNN) trained with the Connectionist Temporal Classification (CTC) loss function are considered one of the current state-of-the-art techniques for staff-level Optical Music Recognition (OMR). Unlike text symbols, music-notation elements may...
State-of-the-art end-to-end Optical Music Recognition (OMR) systems use Recurrent Neural Networks to produce music transcriptions, as these models retrieve a sequence of symbols from an input staff image. However, recent advances in Deep Learning have led other research fields that process sequential data to use a new neural architecture: the Trans...
Optical music recognition is a research field whose efforts have been mainly focused, due to the difficulties involved in its processes, on document and image recognition. However, there is a final step after the recognition phase that has not been properly addressed or discussed, and which is relevant to obtaining a standard digital score from the...
Pattern Recognition tasks in the structural domain generally exhibit high accuracy results, but their time efficiency is quite low. Furthermore, this low performance is more pronounced when dealing with instance-based classifiers, since, for each query, the entire corpus must be evaluated to find the closest prototype. In this work we address this...
The digitization of the content within musical manuscripts makes it possible to preserve, disseminate, and exploit that cultural heritage. The automation of this process has long been an object of study in the field of Optical Music Recognition (OMR), with a wide variety of proposed solutions. Currently, there is a tendency to u...
Optical Music Recognition workflows perform several steps to retrieve the content in music score images, with symbol recognition being one of the key stages. State-of-the-art approaches for this stage currently address the coding of the output symbols as if they were plain text characters. However, music symbols have a two-dimensional nature that is ign...
Human supervision is necessary for the correct editing and publication of handwritten early music collections. The output of an optical music recognition system for that kind of document may contain a significant number of errors, making it tedious for a human expert to correct. An adequate strategy is needed to optimize the human feedback informati...
The recognition of patterns that have a time dependency is common in areas like speech recognition or natural language processing. The equivalent situation in image analysis is present in tasks like text or video recognition. Recently, Recurrent Neural Networks (RNN) have been broadly applied to solve these tasks with good results in an end-to-end f...
The transcription process from early and modern notation manuscripts to a structured digital encoding has been traditionally performed following a fully manual workflow. At most it has received some technological support in particular stages, like optical music recognition (OMR) of the source images, or transcription to modern notation with music e...
In the field of Automatic Music Transcription, note tracking systems constitute a key process in the overall success of the task as they compute the expected note-level abstraction out of a frame-based pitch activation representation. Despite its relevance, note tracking is most commonly performed using a set of hand-crafted rules adjusted in a man...
This work presents the elements needed to digitally encode music preserved in manuscripts from the 16th and 17th centuries. Solutions are proposed to overcome the difficulties posed by some aspects that make this notation different from modern Western notation. The problems addressed include, for example, the...
The transcription of music sources requires new ways of interacting with musical documents. Assuming that automatic technologies will never guarantee a perfect transcription, our intention is to develop an interactive system in which user and software collaborate to complete the task. Since the use of traditional software for score edition might be...
Prototype selection is one of the most popular approaches for addressing the low efficiency issue typically found in the well-known k-Nearest Neighbour classification rule. These techniques select a representative subset from an original collection of prototypes with the premise of maintaining the same classification accuracy. Most recently, rank m...
Onset detection still has room for improvement, especially when dealing with polyphonic music signals. For certain purposes in which the correctness of the result is a must, user intervention is hence required to correct the mistakes made by the detection algorithm. In such an interactive paradigm, the exactitude of the detection can be guarantee...
Prototype Selection methods aim at improving the efficiency of the Nearest Neighbour classifier by selecting a set of representative examples of the training set. These techniques have been studied in situations in which the classes at issue are balanced, which is not representative of real-world data. Since class imbalance affects the classificati...
Genetic-based composition algorithms are able to explore an immense space of possibilities, but the main difficulty has always been the implementation of the selection process. In this work, sets of melodies are utilized for training a machine learning approach to compute fitness, based on different metrics. The fitness of a candidate is provided b...
In a harmonic analysis task, melodic analysis determines the importance and role of each note in a particular harmonic context. Thus, a note is classified as a harmonic tone when it belongs to the underlying chord, and as a non-harmonic tone otherwise, with a number of categories in this latter case. Automatic systems for fully solving this task wi...
The research community related to the human-interaction framework is becoming increasingly interested in interactive pattern recognition, taking direct advantage of the feedback information provided by the user in each interaction step in order to improve raw performance. The application of this scheme requires learning techniques that are abl...
Automatic music transcription has usually been performed as an autonomous task and its evaluation has been made in terms of precision, recall, accuracy, etc. Nevertheless, in this work, assuming that the state of the art is far from being perfect, it is considered as an interactive one, where an expert user is assisted in their work by a transcriptio...
In this paper we present an application of language modeling using n-grams to model the style of different composers. For this, we repeated the experiments performed in previous works by other authors using a corpus of 5 composers from the Baroque and Classical periods. In these experiments we found some signs that the results could be influenced b...
This paper proposes two alternative methods of random projections and compares their performance for robust and efficient spam detection when trained using a small number of examples. Robustness refers to learning and adaptation leading to a high level ...
In this paper, a new approach to off-line signature verification is proposed based on two-class classifiers using an ensemble of expert decisions. Different methods to extract sets of local and global features from the target sample are detailed. In addition, a normalization by confidence voting method is used in order to decrease the final equal error...
Some new rank methods to select the best prototypes from a training set are proposed in this paper in order to establish its size according to an external parameter, while maintaining the classification accuracy. The traditional methods that filter the training set in a classification task like editing or condensing have some rules that apply to th...
This study presents efficient techniques for multiple fundamental frequency estimation in music signals. The proposed methodology can infer harmonic patterns from a mixture considering interactions with other sources and evaluate them in a joint estimation scheme. For this purpose, a set of fundamental frequency candidates are first selected at eac...
Standard MIDI files consist of a number of tracks containing information that can be considered as a symbolic representation of music. Usually each track represents an instrument or voice in a music piece. The goal for this work is to identify the track that contains the bass line. This information is very relevant for a number of tasks like rhythm...
Music comparison and retrieval tasks rely on the concept of music similarity, whatever this might be. No similarity measure performs the best for all tasks, genres, or music formats. It is not even easy to formalize human perception of music similarity. A number of papers in the literature deal with the development of appropriate similarity measure...
A novel approach for dominant point detection in chain-coded contours is presented. Classical operations, such as computing a measurement of the curvature from the (x, y) co-ordinates of the contour points, finding curvature maxima, etc., are substituted by a neural network that traverses the contour, and gives a measurement of the relevance of eve...
Music transcription consists of transforming an audio signal encoding a music performance in a symbolic representation such as a music score. In this paper, a multimodal and interactive prototype to perform music transcription is presented. The system is oriented to monotimbral transcription, its working domain is music played by a single instrumen...
In a number of practical situations, data have structure and the relations among its component parts need to be coded with suitable data models. Trees are usually utilized for representing data for which hierarchical relations can be defined. This is the case in a number of fields like image analysis, natural language processing, protein structure,...
Similarity computation is a difficult issue in music information retrieval, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The tree...
Similarity computation is a difficult issue in music information retrieval tasks, because it tries to emulate the special ability that humans show for pattern recognition in general, and particularly in the presence of noisy data. A number of works have addressed the problem of what is the best representation for symbolic music in this context. The...
In this paper we present an application of language modeling techniques using n-grams to an authorship attribution task. A stylometric study has been conducted on a pair of datasets of Baroque and Classical composers, on which other authors previously performed a similar study using a set of musicological features and pattern recognition techniq...
MML 2010, the International Workshop on Machine Learning and Music, continues a series of workshops related to artificial intelligence and machine learning in music. In this short article the Programme Chairs summarize the content of the workshop.
In this paper, we evaluate the impact of feature selection on the classification accuracy and the achieved dimensionality reduction, which benefits the time needed on training classification models. Our classification scheme therein is a Cartesian ensemble classification system, based on the principle of late fusion and feature subspaces. These fea...
This paper presents a musical genre classification system based on the combination of two kinds of information of very different nature: the instrumentation information contained in a MIDI file (metadata) and the chords that provide the harmonic structure of the musical score stored in that file (content). The fusion of these two information source...
The representation of symbolic music by means of trees has been shown to be suitable for melodic similarity computation. In order to compare trees, different tree edit distances have been previously used, their complexity being a main drawback. In this paper, the application of stochastic k-testable tree-models for computing the similarity between two m...
Professional musicians intuitively manipulate sound properties such as pitch, timing, amplitude and timbre in order to produce expressive performances of particular pieces. However, there is little explicit information about how and in which musical contexts this manipulation occurs. In this paper we describe a machine learning approach to modeling...
We present a genre classification framework for audio music based on a symbolic classification system. Audio signals are transformed to a symbolic representation of harmony using a chord transcription algorithm, by computing Harmonic Pitch Class Profiles. Then, language models built from a groundtruth of chord progressions for each genre are use...
Trees are a powerful data structure for representing data for which hierarchical relations can be defined. They have been applied in a number of fields like image analysis, natural language processing, protein structure, or music retrieval, to name a few. Procedures for comparing trees are very relevant in many tasks where tree representations are i...
We present a Cartesian ensemble classification system that is based on the principle of late fusion and feature subspaces. These feature subspaces describe different aspects of the same data set. The framework is built on the Weka machine learning toolkit and is able to combine arbitrary feature sets and learning schemes. In our scenario, we use it fo...
The presented onset detection approach is a very simple method described in [1]. An implementation in D2K was already submitted for MIREX 05 [2], yielding a relatively low success rate. However, there were probably some problems in the evaluation, as the mean distance between the detected and actual onsets was too high (about -22 ms, see [3]). There...
Music genre meta-data is of paramount importance for the organisation of music repositories. People use genre in a natural way when entering a music store or looking into music collections. Automatic genre classification has become a popular topic in music information retrieval research, both with digital audio and symbolic data. This work focuses...
Content-based music comparison is a task where no musical similarity measure can perform well in all possible cases. In this paper we show that a careful combination of different similarity measures in an ensemble measure behaves more robustly than any of the included individual measures applied stand-alone. For the experim...
In a music recognition task, the classification of a new melody is often achieved by looking for the closest piece in a set of already known prototypes. The definition of a relevant similarity measure then becomes a crucial point. So far, the edit distance approach with a priori fixed operation costs has been one of the most used to accomplish the...
Music genre meta-data is of paramount importance for the organization of music repositories. People use genre in a natural way when entering a music store or looking into music collections. Automatic genre classification has become a popular topic in music information retrieval research. This work brings to symbolic music recognition some technolog...
This work is supported by the Spanish national projects GV06/166 and CICyT TIN2006-14932-C02, partially supported by EU ERDF.
We present preliminary work on automatic human-readable melody characterization. In order to obtain such a characterization, we (1) extract a set of statistical descriptors from the tracks in a dataset of MIDI files, (2) apply a rule induction algorithm to obtain a set of (crisp) classification rules for melody track identification, and (3) automat...
Identifying copies or different versions of the same musical work is a focal problem in maintaining large music databases. In this paper we introduce novel ideas and methods that are applicable to metered, symbolically encoded polyphonic music. We show how to represent and compare polyphonic music using a tree structure. Moreover, we put for trial va...
The goal of a polyphonic music transcription system is to extract a score from an audio signal. A multiple fundamental frequency estimator is the main piece of these systems, whereas tempo detection and key estimation complement them to correctly extract the score. In this work, in order to detect the fundamental frequencies that are present in a s...
This work is an effort towards the development of a system for the automation of traditional tonal analysis of polyphonic scores in symbolic format. The system detects chords with their tonal functions, and key changes. All the possible tonal and key analyses are represented as a weighted directed acyclic graph. The best analysis is the path th...
Melodic similarity is an important concept to consider in music information retrieval. Among the possible applications, a number of content-based systems may be developed for copyright management, plagiarism detection, computer-aided composition, etc., and intervallic analysis is an essential tool for these applications. There exist several...
In the field of computer music, pattern recognition algorithms are very relevant for music information retrieval applications. One challenging task in this area is the automatic recognition of musical style, having a number of applications like indexing and selecting musical databases. From melodies symbolically represented as digital scores (stand...
Recent research in music genre classification hints at a glass ceiling being reached using timbral audio features. To overcome this, the combination of multiple different feature sets bearing diverse characteristics is needed. We propose a new approach to extend the scope of the features: we transcribe audio data into a symbolic form using a tran...
There is an increasing interest in music information retrieval for reference, motive, or thumbnail extraction from a piece in order to have a compact and representative representation of the information to be retrieved. One of the main references for music is its melody. In a practical environment of symbolic format collections the informatio...
Melodic similarity is an important concept to consider in music information retrieval. Some of the possible applications are content-based systems developed for copyright management, detection of plagiarism of ideas previously presented by an artist, composition assistance, etc. There are several...
In this work, a normalisation of the weights utilized for combining classifier decisions based on Euclidean distance similarity is presented. This normalisation is used by the confidence voting methods to decrease the final error rate in an OCR task. Different features from the characters are extracted. Each set of features is processed by a singl...
Music motive extraction is an important concept to consider in music information retrieval. Possible applications include the creation of music databases that need indexing tools and dynamic access, copyright management and plagiarism detection, computer-aided composition, etc. This paper presents an unsupervised method for autom...
This work presents a comparison of current research in the use of voting ensembles of classifiers in order to improve the accuracy of single classifiers and make the performance more robust against the difficulties that each individual classifier may have. Also, a number of combination rules are proposed. Different voting schemes are discussed and...
Digital contours in a binary image can be described as an ordered vector set. In this paper an extension of the string edit distance is defined for its computation between a pair of ordered sets of vectors. This way, the differences between shapes can be computed in terms of editing costs. In order to achieve efficiency a dominant point detection al...
Although telerobotic systems are becoming more complex, there are few actions they can perform on their own and, moreover, knowledge about the tasks they are being used for often relies only on their operator. In this paper, we present the design of a telerobotic system that features learning capabilities, can accept commands given in natural langu...
Standard MIDI files contain data that can be considered as a symbolic representation of music (a digital score), and most of them are structured as a number of tracks. One of them usually contains the melodic line of the piece, while the other tracks contain accompaniment music. The goal of this work is to identify the track that contains the melod...
Most Western tonal music is based on the concept of tonality or key. It is often desirable to know the tonality of a song stored in a symbolic format (digital scores), both for content-based management and musicological studies, to name just two applications. The majority of the freely available symbolic music is coded in MIDI format. But, un...