Antonio Ríos Vila

University of Alicante | UA · Department of Software and Computing Systems

Multimedia Engineering

About

27 Publications
3,276 Reads
118 Citations

Publications (27)
Preprint
Full-text available
The digitization of vocal music scores presents unique challenges that go beyond traditional Optical Music Recognition (OMR) and Optical Character Recognition (OCR), as it necessitates preserving the critical alignment between music notation and lyrics. This alignment is essential for proper interpretation and processing in practical applications....
Chapter
Full-text available
State-of-the-art end-to-end Optical Music Recognition (OMR) has, to date, primarily been carried out using monophonic transcription techniques to handle complex score layouts, such as polyphony, often by resorting to simplifications or specific adaptations. Despite their efficacy, these approaches imply challenges related to scalability and limitat...
Preprint
Full-text available
Optical Music Recognition (OMR) is a field that has progressed significantly, producing accurate systems that effectively transcribe music scores into digital formats. Despite this, several limitations still hinder OMR from achieving its full potential. Specifically, state-of-the-art OMR still depends on multi-stage pipelines for performing...
Preprint
Full-text available
State-of-the-art end-to-end Optical Music Recognition (OMR) has, to date, primarily been carried out using monophonic transcription techniques to handle complex score layouts, such as polyphony, often by resorting to simplifications or specific adaptations. Despite their efficacy, these approaches imply challenges related to scalability and limita...
Chapter
Full-text available
In this paper, we present the Aligned Music Notation and Lyrics Transcription (AMNLT) challenge, whose goal is to retrieve the content from document images of vocal music. This new research area arises from the need to automatically transcribe notes and lyrics from music scores and align both sources of information conveniently. Although existing m...
Chapter
Optical Music Recognition (OMR) is an interdisciplinary field that aims to automate the process of transcribing sheet music into a digital format. Over the past few years, significant progress has been made in developing OMR systems that can recognize musical symbols with high accuracy. However, completing the pipeline of OMR remains a challenging...
Chapter
Full-text available
The automatic labeling of music symbols in a score, namely music symbol classification, represents one of the main stages in Optical Music Recognition systems. Most commonly, this task is addressed by resorting to deep neural models trained in a supervised manner, which report competitive performance rates at the expense of large amounts of labeled...
Preprint
Full-text available
In this work, the novel Image Transformation Sequence Retrieval (ITSR) task is presented, in which a model must retrieve the sequence of transformations between two given images that act as source and target, respectively. Given certain characteristics of the challenge such as the multiplicity of a correct sequence or the correlation between consec...
Article
Full-text available
End-to-end solutions have brought about significant advances in the field of Optical Music Recognition. These approaches directly provide the symbolic representation of a given image of a musical score. Despite this, several documents, such as pianoform musical scores, cannot yet benefit from these solutions since their structural complexity does n...
Article
Full-text available
The recognition of symbols within document images is one of the most relevant steps involved in the Document Analysis field. While current state-of-the-art methods based on Deep Learning are capable of adequately performing this task, they generally require a vast amount of data that has to be manually labeled. In this paper, we propose a self-supe...
Preprint
Full-text available
The evaluation of Handwritten Text Recognition (HTR) systems has traditionally used metrics based on the edit distance between HTR and ground truth (GT) transcripts, at both the character and word levels. This is appropriate when the experimental protocol assumes that both GT and HTR text lines are the same, which allows edit distances to be inde...
Conference Paper
Full-text available
Optical Music Recognition (OMR) systems typically consider workflows that include several steps, such as staff detection, symbol recognition, and semantic reconstruction. However, fine-tuning these systems is costly due to the specific data labeling process that has to be performed to train models for each of these steps. In this paper, we present...
Article
Full-text available
A number of applications would benefit from neural approaches that are capable of generating graphs from images in an end-to-end fashion. One of these fields is optical music recognition (OMR), which focuses on the computational reading of music notation from document images. Given that music notation can be expressed as a graph, the aforementioned...
Article
Full-text available
The Layout Analysis (LA) stage is of vital importance to the correct performance of an Optical Music Recognition (OMR) system. It identifies the regions of interest, such as staves or lyrics, which must then be processed in order to transcribe their content. Despite the existence of modern approaches based on deep learning, an exhaustive study of L...
Chapter
Detecting the corresponding edits from just a pair of input-output images represents an interesting task for artificial intelligence. If the possible image transformations are known, the task can be easily solved by brute-force enumeration, yet this becomes infeasible for long sequences. There are several state-of-the-art approa...
Chapter
Full-text available
State-of-the-art end-to-end Optical Music Recognition (OMR) systems use Recurrent Neural Networks to produce music transcriptions, as these models retrieve a sequence of symbols from an input staff image. However, recent advances in Deep Learning have led other research fields that process sequential data to use a new neural architecture: the Trans...
Chapter
The structural richness of music notation has led to the development of specific approaches to the problem of Optical Music Recognition (OMR). Among them, it is becoming common to formulate the output of the system as a graph structure, where the primitives of music notation are the vertices and their syntactic relationships are modeled as edges. As an intermed...
Article
Inspired by the Text Recognition field, end-to-end schemes based on Convolutional Recurrent Neural Networks (CRNN) trained with the Connectionist Temporal Classification (CTC) loss function are considered one of the current state-of-the-art techniques for staff-level Optical Music Recognition (OMR). Unlike text symbols, music-notation elements may...
Preprint
Full-text available
The Layout Analysis (LA) stage is of vital importance to the correct performance of an Optical Music Recognition (OMR) system. It identifies the regions of interest, such as staves or lyrics, which must then be processed in order to transcribe their content. Despite the existence of modern approaches based on deep learning, an exhaustive study of L...
Chapter
Optical Music Recognition workflows currently involve several steps to retrieve information from music documents, focusing on image analysis and symbol recognition. However, despite many efforts, there is little research on how to bring these recognition results to practice, as there is still one step that has not been properly discussed: the encod...
Article
Full-text available
Optical music recognition is a research field whose efforts have been mainly focused, due to the difficulties involved in its processes, on document and image recognition. However, there is a final step after the recognition phase that has not been properly addressed or discussed, and which is relevant to obtaining a standard digital score from the...
Conference Paper
Full-text available
Most Optical Music Recognition workflows include several steps to retrieve the content from music score images. These steps typically comprise preprocessing, recognition, notation reconstruction and encoding. Currently, state-of-the-art models allow performing graphic recognition in an almost end-to-end fashion, performing the steps from preprocess...
Preprint
Full-text available
Optical Music Recognition workflows perform several steps to retrieve the content in music score images, with symbol recognition being one of the key stages. State-of-the-art approaches for this stage currently address the coding of the output symbols as if they were plain text characters. However, music symbols have a two-dimensional nature that is ign...
Conference Paper
Optical Music Recognition workflows perform several steps to retrieve the content in music score images, with symbol recognition being one of the key stages. State-of-the-art approaches for this stage currently address the coding of the output symbols as if they were plain text characters. However, music symbols have a two-dimensional nature that is ign...
