Left and right-hand guitar playing techniques detection
Loïc Reboursière
Numediart Institute, UMONS
Mons, Belgium
Otso Lähdeoja
Numediart Institute
Mons, Belgium
Thomas Drugman
Numediart Institute
Mons, Belgium
Stéphane Dupont
Numediart Institute
Mons, Belgium
Cécile Picard-Limpens
Numediart Institute
Mons, Belgium
Nicolas Riche
Numediart Institute
Mons, Belgium
In this paper we present a series of algorithms developed to detect the following guitar playing techniques: bend, hammer-on, pull-off, slide, palm muting and harmonic. Detection of playing techniques can be used to control external content (i.e. audio loops and effects, videos, light events, etc.), as well as to write a real-time score or to assist guitar novices in their learning process. The guitar used is a Godin Multiac with an under-saddle RMC hexaphonic piezo pickup (one pickup per string, i.e. six mono signals).
Guitar audio analysis, playing techniques, hexaphonic pickup,
controller, augmented guitar
Guitar has maintained a close relationship with technological innovation throughout its history, from acoustic to electric and now to virtual [3]. The term "augmented instrument" is generally used to refer to a traditional (acoustic) instrument with added sonic possibilities. The augmentation can be physical, like John Cage's sonic research on prepared pianos, but nowadays the term has acquired a more computational meaning: the use of digital audio to enhance the sonic possibilities of a given instrument, as well as the use of sensors and/or signal analysis algorithms to give the player extra, expressive controls. Guitar playing technique detection belongs to this last category, and several techniques have already been investigated. In [13] and [10] the focus has been put on estimating the point where the string has been plucked, i.e. the plucking point. In [5], the left-hand fingering of a guitar player has been analyzed and characterized offline. In [6], algorithms to detect plucking and expression styles for bass guitar have been investigated. In [9], automatic note transcription from a hexaphonic pickup has been achieved. On the other hand, several studies focus more on the artistic side, using added sensors and/or analysis algorithms to control synthetic sound parameters [7], [11], [12] or [4].
In this paper, we put our efforts on the audio signal anal-
ysis part, in order to detect guitar playing techniques. As
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
NIME’12, May 21 – 23, 2012, University of Michigan, Ann Arbor.
Copyright remains with the author(s).
they are closely linked to the guitarist it is a very natural
way for the instrumentalist to control any type of media or
effects and turn the guitar into a multi-media instrument /
Our system can be compared to existing pitch-to-MIDI technologies such as the Axon AX 50 or the Roland VG-99, but our approach is broader and more complete, as we include all major guitar playing techniques.
Articulations used to play the guitar can vary a lot from one
guitarist to another as well as from one style to another. We
worked on the most common guitar articulations (defined
in [8]) listed as follows: hammer-on, pull-off, slide, bend,
harmonic, palm muting.
Our algorithmic approach for articulation detection is de-
scribed in Figure 1. The diagram emphasizes the discrimi-
nation between right-hand and left-hand attack as the first
important element leading to the detection of all the articulations. We did not develop a pitch detection algorithm of our own, as pitch estimation is a fairly well-known problem in signal processing. The following algorithms have been used: YIN (temporal), the sigmund~ Max/MSP object (spectral) and the MIR Toolbox pitch detection algorithm (used with the autocorrelation method).
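Since the paper relies on existing pitch trackers, a minimal frame-level sketch of the autocorrelation family (as used in the MIR Toolbox) may help illustrate the idea; the function and parameter names below are ours, not the paper's:

```python
import numpy as np

def autocorr_pitch(frame, sr, fmin=60.0, fmax=1000.0):
    """Estimate the fundamental frequency of one frame by picking the
    strongest autocorrelation peak in the plausible lag range."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)                      # shortest period considered
    hi = min(int(sr / fmin), len(ac) - 1)    # longest period considered
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

sr = 44100
t = np.arange(int(0.04 * sr)) / sr           # one 40 ms analysis frame
g3 = np.sin(2 * np.pi * 196.0 * t)           # open G string, 196 Hz
f0 = autocorr_pitch(g3, sr)
```

A production tracker such as YIN refines this with a cumulative-mean normalization and parabolic interpolation, but the lag-peak principle is the same.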
Figure 1: Diagram of the articulations detection algorithm
A Godin Multiac guitar with an under-saddle RMC hexaphonic pickup and Alvarez Alliance HT classic nylon strings has been used to build the database. Two picking styles (finger and pick) have been recorded by two different guitarists, leading to two databases for which: the range of the recorded notes goes from the open string to the 16th fret; hammer-on, pull-off and slide notes range from a half tone to one and a half tones of variation; slide notes have been recorded in both directions; bent notes could hardly go above a half tone of variation, as we were using nylon strings; the first five harmonics have been recorded for each string; normal notes have been played at three different positions (bridge, soundhole and fretboard); all recordings have been hand-segmented with the Sonic Visualizer software.
Each picking-style database is made of 1416 samples (normal notes: 288, palm muted: 96, hammer-on: 234, pull-off: 234, slide: 468, bend: 66, harmonic: 30). The guitar database we recorded is available online.
As a general scheme for describing onset detection approaches, one can say that they consist of three steps [2]. First, the audio signal is pre-processed in order to accentuate certain aspects important to the detection task; then, the amount of data from the processed signal is reduced in order to obtain a lower sample rate; finally, thresholding and/or peak-picking can be applied to isolate the potential onsets.
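The three-step scheme (pre-processing, reduction, peak-picking) can be sketched with a toy spectral-flux detector; all names and parameter values here are illustrative, not those of the implementations evaluated below:

```python
import numpy as np

def spectral_flux_onsets(x, sr, n_fft=1024, hop=256, thresh=0.3):
    """Pre-process (windowed STFT magnitudes), reduce (half-wave
    rectified frame-to-frame flux), then peak-pick above a threshold."""
    win = np.hanning(n_fft)
    mags = np.array([np.abs(np.fft.rfft(win * x[i:i + n_fft]))
                     for i in range(0, len(x) - n_fft, hop)])
    flux = np.maximum(np.diff(mags, axis=0), 0.0).sum(axis=1)
    flux /= flux.max() + 1e-12
    onsets = []
    for i in range(1, len(flux) - 1):
        if flux[i] > thresh and flux[i] >= flux[i - 1] and flux[i] > flux[i + 1]:
            onsets.append((i + 1) * hop / sr)    # frame index -> seconds
    return onsets

sr = 22050
x = np.zeros(sr)
for t0, f in [(0.1, 196.0), (0.5, 247.0)]:       # two plucked-like tones
    n0 = int(t0 * sr)
    t = np.arange(sr - n0) / sr
    x[n0:] += np.sin(2 * np.pi * f * t) * np.exp(-4.0 * t)
onsets = spectral_flux_onsets(x, sr)
```

Half-wave rectification keeps only energy increases, so the decaying tail of a note produces no flux and only the attacks are picked.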
4.1 Evaluation
Here, we have compared a range of onset detection algorithms covering 18 variants of amplitude-based methods (including the energy and log-energy domains, and their time derivatives) and short-time Fourier transform (STFT) based methods (including spectral flux in different domains, phase-based methods and their variants using amplitude weighting, and complex-domain methods). Evaluation is performed using the approach proposed in [1]. A detected onset is considered valid within a tolerance of plus or minus 50 ms, to account for the limited accuracy of the hand-annotated reference files. We have not used methods requiring a pitch estimation. The monophonic recordings (sum of the six separate channels) of the guitar signals were used, and we optimized the detection threshold for peak F-measure for each detection method individually. Table 1 presents the results for the four best performing methods, giving the peak F-measure on the whole data set, as well as recall values for four categories of attacks: right-hand attack (i.e. normal notes), harmonic, hammer-on and pull-off. Bends have not been considered for this evaluation.
We observe that, while the detection of right-hand attacks (including harmonics) is generally not a problem, the detection of hammer-on and pull-off attacks is better achieved using STFT-based methods. Indeed, a significant part of those left-hand attacks do not show any amplitude increase at all. The best performing approach has an F-measure above 96%, with a recall close to 100% for right-hand attacks, 98% for hammer-ons, and a moderate 88% for pull-offs. Some further research would hence be necessary to understand how pull-offs can be better detected.
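The evaluation protocol above (a ±50 ms tolerance and peak F-measure) amounts to matching detections against annotations and scoring the matches; a minimal sketch, with illustrative names:

```python
def onset_scores(detected, reference, tol=0.05):
    """Greedy one-to-one matching of detected onsets to hand-annotated
    reference onsets within +/- tol seconds, then precision/recall/F."""
    reference = sorted(reference)
    used = [False] * len(reference)
    hits = 0
    for d in sorted(detected):
        for i, r in enumerate(reference):
            if not used[i] and abs(d - r) <= tol:
                used[i] = True
                hits += 1
                break
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Two of three detections fall within tolerance of an annotated onset.
p, r, f = onset_scores([0.10, 0.52, 0.90], [0.08, 0.50])
```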
4.2 Discrimination between left and right hand
When the string is plucked (by a finger or a pick), its vibratory regime is momentarily stopped, resulting in a trough just before the attack in the signal envelope. As opposed to a plucked note, a legato attack does not stop the string's vibration, leading to a shift in pitch without a significant change in amplitude. After several tests at different playing speeds, it appears that the gap can vary from 20 ms to 50 ms. It has to be noted that for faster playing, e.g. tremolo, the gap disappears, and a dedicated algorithm should be implemented to detect such audio events. Our left-hand / right-hand attack discrimination system is thus based on the observation of the very first milliseconds before the onset: a simple measure of the slope between the minimum energy point preceding the attack and the maximum point at the attack. Best results were obtained when computing the slope using two neighboring half-overlapping frames. This led to a 94.0% correct classification rate when using a properly optimized threshold. We observed that classification errors are often due to string noise preceding right-hand attacks, causing the energy slope to be smaller than it could be. We are hence looking into improving robustness to these playing artefacts.
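A minimal version of this trough-to-peak slope test, on a frame-level energy envelope (the threshold and frame counts below are illustrative, not the optimized values from our experiments):

```python
import numpy as np

def is_right_hand(env, onset, pre=8, post=8, thresh=0.05):
    """Right-hand (plucked) attacks show an energy trough just before
    the onset followed by a steep rise; legato attacks do not.
    `env` is a frame-level energy envelope, `onset` a frame index."""
    lo = max(0, onset - pre)
    trough_idx = lo + int(np.argmin(env[lo:onset + 1]))
    peak_idx = onset + int(np.argmax(env[onset:onset + post]))
    gap = max(peak_idx - trough_idx, 1)
    slope = (env[peak_idx] - env[trough_idx]) / gap
    return bool(slope > thresh)

# Synthetic envelopes: a pluck dips before the attack, legato does not.
plucked = np.array([0.5] * 10 + [0.3, 0.1, 0.05, 1.0, 0.9, 0.8, 0.7, 0.6])
legato = np.array([0.5] * 10 + [0.5, 0.52, 0.55, 0.6, 0.58, 0.56, 0.54, 0.52])
right = is_right_hand(plucked, 13)
left = is_right_hand(legato, 11)
```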
Once the distinction between right-hand and left-hand attack is performed, it becomes possible to categorize the left-hand articulations by inspecting the pitch profile of the note. The left hand works on the string tension and fretting, thus affecting the pitch. Our method of left-hand playing technique detection operates by measuring the evolution of the pitch over time.
Figure 2: Distribution of the number of transition half
tones with the maximal relative slopes for notes with left-
hand articulation: hammer-on (blue), pull-off (black), bend
(green) and slide (red).
Hammer-on (ascending legato) is characterized by an abrupt change in pitch, as is its counterpart, the pull-off (descending legato). The bend shows a slower evolution in pitch. The slide has a pitch evolution similar to the hammer-on or pull-off, but with a "staircase" profile corresponding to the frets over which the sliding finger passes.
As a first step, two parameters are investigated: the number of half tones of the note transition, and the maximal relative pitch slope, defined as the maximal difference of pitch between two consecutive frames spaced by 10 ms, divided by the open string pitch value. Figure 2 shows how these two parameters are distributed for notes articulated with a hammer-on, a pull-off, a bend or a slide. It can be noted from this figure that the great majority of bent notes (green points) are easily identified, as they present a very low pitch slope. Secondly, it can be observed that for transitions of more than one half tone, a perfect determination of slide versus hammer-on or pull-off is achieved. As a consequence, the only remaining ambiguity concerns the distinction between slide and hammer-on/pull-off for a transition of a half tone.

Method | F-measure | Right-hand (normal) recall | Harmonic recall | Hammer-on recall | Pull-off recall
Spectral Flux | 96.2% | 99.6% | 100% | 97.9% | 88.0%
Weighted Phase Divergence | 95.8% | 98.9% | 100% | 97.4% | 87.6%
Amplitude | 92.4% | 99.6% | 100% | 92.3% | 75.2%
Delta Amplitude | 91.8% | 99.6% | 100% | 91.9% | 74.4%
Table 1: Results of the onset detection with four different techniques
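The two transition parameters can be computed directly from a pitch track sampled every 10 ms; a sketch with illustrative names:

```python
import numpy as np

def transition_features(f0_track, open_string_hz):
    """Number of half tones of the transition (start vs. end pitch) and
    the maximal relative pitch slope: the largest pitch difference
    between consecutive 10 ms frames, divided by the open string pitch."""
    f0 = np.asarray(f0_track, dtype=float)
    half_tones = 12.0 * np.log2(f0[-1] / f0[0])
    max_rel_slope = float(np.max(np.abs(np.diff(f0)))) / open_string_hz
    return half_tones, max_rel_slope

# Abrupt hammer-on from G3 to A3 vs. a slow bend over the same range.
ht_hammer, slope_hammer = transition_features([196.0, 196.0, 220.0, 220.0], 196.0)
ht_bend, slope_bend = transition_features([196.0 + 2.4 * i for i in range(11)], 196.0)
```

Both toy transitions span two half tones, but the bend's gradual pitch change yields a much smaller maximal relative slope, matching the separation visible in Figure 2.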
To address this latter issue, two parameters are extracted: the energy ratio and the spectral center of gravity ratio. Both of them are computed on two 40 ms long frames: one ends at the transition middle, while the other starts at that point.
Based on the aforementioned approaches, a detection of left-hand articulated notes has been proposed, simply by using thresholding. The results of the ensuing classification are presented in Table 2. It can be noticed that all bend effects are correctly identified. Slides are determined with an accuracy of 97.61%. Finally, hammer-ons and pull-offs are detected in more than 93% of cases. The main source of errors for these latter effects is the remaining ambiguity with slides of one half tone.
 | Hammer | Pull-off | Bend | Slide
Hammer | 93.27% | 0.45% | 0% | 6.28%
Pull-off | 0% | 93.69% | 0% | 6.31%
Bend | 0% | 0% | 100% | 0%
Slide | 1.74% | 0.43% | 0.21% | 97.61%
Table 2: Confusion matrix for detection of left-hand artic-
ulated notes
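The two half-tone disambiguation parameters (energy ratio and spectral center of gravity ratio) can be sketched as follows, with frame placement per the description above; the function names are ours:

```python
import numpy as np

def transition_ratios(x, sr, t_mid, frame_s=0.04):
    """Energy ratio and spectral centroid ratio between a 40 ms frame
    ending at the transition middle and a 40 ms frame starting there."""
    n = int(frame_s * sr)
    mid = int(t_mid * sr)
    before, after = x[mid - n:mid], x[mid:mid + n]

    def energy(f):
        return float(np.sum(f ** 2)) + 1e-12

    def centroid(f):
        mag = np.abs(np.fft.rfft(np.hanning(len(f)) * f))
        freqs = np.fft.rfftfreq(len(f), 1.0 / sr)
        return float(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))

    return energy(after) / energy(before), centroid(after) / centroid(before)

# Toy transition: G3 for 100 ms, then G4 (an octave up) for 100 ms.
sr = 44100
t = np.arange(int(0.1 * sr)) / sr
x = np.concatenate([np.sin(2 * np.pi * 196.0 * t),
                    np.sin(2 * np.pi * 392.0 * t)])
e_ratio, c_ratio = transition_ratios(x, sr, 0.1)
```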
In this section, two right-hand articulations are studied: palm muting and harmonic notes.
6.1 Palm Muting
Palm muting is obtained by plucking the string while the palm of the right hand slightly touches the string. The produced sound is stifled: the sustain period of a palm-muted note is shorter and decreases faster than that of a normal note. Moreover, high frequencies decrease faster than the other parts of the spectrum when the note is palm muted. As a consequence, the 0-500 Hz band was filtered out of the spectrum. Figure 3 shows the slopes (starting at the attack) of the spectral envelopes of four notes: three notes are played with a normal attack (at the bridge, soundhole and fretboard) and one is palm muted. For the three normal notes, one can see the longer sustain period, as the curves stay rather flat compared to the palm-muted note, whose slope increases logarithmically.
Based on that behavior, our algorithmic approach calculates the value of the energy envelope slope at the attack and compares it to a defined threshold. The slope is computed as the energy ratio between two frames: one located at the energy peak following the note onset, and the one that follows. Table 3 shows the results of the algorithm, which has been run, for each string, on 48 normal notes (16 per position: bridge, soundhole and fretboard) and 16 palm-muted notes.
Figure 3: Slope of the energy of four notes played at the bridge (blue), at the soundhole (red), at the fretboard (green) and palm muted (black)
The misclassified notes are mostly due to imperfect playing: 97.91% means that one note has been misclassified, and 93.75% and 87.5% respectively correspond to one and two misclassified notes.
String (threshold) | Normal notes | Palm-muted notes
1 (-0.06) | 97.91% | 93.75%
2 (-0.06) | 100% | 100%
3 (-0.06) | 100% | 100%
4 (-0.04) | 100% | 100%
5 (-0.05) | 100% | 100%
6 (-0.05) | 97.91% | 87.5%
Table 3: Palm muting detection results for the six strings
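The decay-slope test can be sketched on a frame-level energy envelope of the high-passed (>500 Hz) signal; the threshold below is illustrative, whereas the algorithm above tunes one per string:

```python
import numpy as np

def is_palm_muted(env, peak, thresh=-0.1):
    """Palm-mute test: log-energy ratio between the frame following the
    attack peak and the peak frame; a steep drop means a fast decay,
    characteristic of a palm-muted note."""
    slope = float(np.log10(env[peak + 1] / env[peak]))
    return slope < thresh

# Synthetic envelopes: index 1 is the attack peak in both cases.
normal = [0.1, 1.0, 0.95, 0.90]   # slow decay after the attack peak
muted = [0.1, 1.0, 0.45, 0.20]    # fast decay after the attack peak
```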
6.2 Harmonics
Harmonics are obtained by lightly fretting a note at a node of the string with the left hand. Two techniques are investigated to achieve this detection: one operates in the time domain, while the other focuses on the spectral domain. Both approaches only consider the note attack, which allows for a real-time detection of harmonic notes. We observed that, after the attack, the differentiation between a harmonic and a normal note can become more difficult, especially for the 7th and 12th frets. The two methods are explained in the following.
Time-domain approach: Figure 4 shows the waveform at the attack for a normal note and a harmonic. Two parameters are proposed: the attack duration, defined as the timespan during which the attack waveform remains positive; and the relative discontinuity during the attack, defined as Amin/Amax (right side of Figure 4).
Figure 4: Attack during the production of a normal note (left) and a harmonic note (right).
Frequency-domain approach: Figure 5 shows the magnitude spectrum of the attack for a normal note (left) and a harmonic note (right), using a 40 ms Hanning window. On these spectra, we extract a single parameter: the harmonic-to-subharmonic ratio, defined as the difference in dB between the amplitude of the first harmonic (at F0) and the amplitude of the subharmonic at 1.5·F0.
Figure 5: Magnitude spectrum during the attack of a normal note (left) and a harmonic note (right).
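The harmonic-to-subharmonic ratio can be computed directly from the attack spectrum; a sketch under the 40 ms Hann-window setup described above (the function name is ours):

```python
import numpy as np

def harm_to_subharm_db(x, sr, f0):
    """Difference in dB between the spectrum magnitude at F0 and at the
    subharmonic 1.5*F0, over a 40 ms Hann-windowed attack frame."""
    n = int(0.04 * sr)
    mag = np.abs(np.fft.rfft(np.hanning(n) * x[:n]))
    freqs = np.fft.rfftfreq(n, 1.0 / sr)

    def amp(f):
        return float(mag[int(np.argmin(np.abs(freqs - f)))])

    return 20.0 * np.log10(amp(f0) / (amp(1.5 * f0) + 1e-12))

# Toy attacks: one with energy only at F0, one with added 1.5*F0 content.
sr = 44100
t = np.arange(int(0.05 * sr)) / sr
normal = np.sin(2 * np.pi * 200.0 * t)
with_sub = normal + 0.5 * np.sin(2 * np.pi * 300.0 * t)
r_normal = harm_to_subharm_db(normal, sr, 200.0)
r_sub = harm_to_subharm_db(with_sub, sr, 200.0)
```

The signal with subharmonic content yields a much smaller ratio, which is the cue thresholded in the detection described next.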
Based on the two previous approaches, a detection of harmonic notes has been proposed, simply by using a threshold. The results of the ensuing classification are presented in Table 4. It can be noticed that all harmonic notes are correctly identified, while for normal notes the best results are achieved for notes played on the fretboard (with only 0.52% misclassification) and the worst ones are obtained for notes played at the bridge (with slightly under 95% correct detection).
 | Harmonic detection | Normal detection
Harmonic | 100% | 0%
Fretboard | 0.52% | 99.48%
Soundhole | 2.78% | 97.22%
Bridge | 5.38% | 94.62%
Table 4: Confusion matrix for harmonics detection.
This paper focused on feature extraction from a hexaphonic guitar signal, in order to detect and recognize the playing techniques commonly used on the guitar. The methodology was built on the successive detection and classification of 1) attacks/note onsets, 2) left-hand versus right-hand attacks, and 3) articulation types: normal, mute, bend, slide, hammer-on, pull-off, harmonic and palm muting.
The playing technique algorithms have been tested and implemented separately in Matlab and Max/MSP, with positive feedback from informal live evaluations. However, a global algorithm gathering all playing technique detections still needs to be implemented. This implementation should address the processing power issues encountered when running at least three detections in real time. In addition, the defined algorithms have to be tested by different guitarists, as well as with different types of guitar, in order to build a relevant user study. Finally, a machine-learning method should be considered to enhance the adaptation of the global algorithm to these different users.
Thomas Drugman is supported by FNRS and Nicolas Riche by FNRS/FRIA Belgium. Loïc Reboursière, Otso Lähdeoja, Stéphane Dupont and Cécile Picard-Limpens are supported by numediart, a long-term research program centered on Digital Media Arts, funded by Région Wallonne, Belgium (grant N716631).
[1] A. Holzapfel, Y. Stylianou, A. C. Gedik, and B. Bozkurt. Three dimensions of pitched instrument onset detection. IEEE Transactions on Audio, Speech and Language Processing, 18(6), August 2010.
[2] J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, and M. B. Sandler. A tutorial on onset detection in music signals. IEEE Transactions on Speech and Audio Processing, 13(5), September 2005.
[3] G. Carfoot. Acoustic, electric and virtual noise: The
cultural identity of the guitar. Leonardo Music
Journal, 16:35–39, 2006.
[4] R. Graham. A live performance system in pure data:
Pitch contour as figurative gesture. In Proceedings of
the annual pure data convention, 2001.
[5] E. Guaus and J. L. Arcos. Analyzing left hand
fingering in guitar playing. In Proc. of SMC, 2010.
[6] J. Abesser, H. Lukashevich, and G. Schuller. Feature-based extraction of plucking and expression styles of the electric bass. In Proceedings of the ICASSP Conference, 2012.
[7] O. Lähdeoja. Une approche de l'instrument augmenté : la guitare électrique. PhD thesis, École Doctorale Esthétique, Sciences et Technologies des Arts, Paris 8.
[8] J. C. Norton. Motion capture to build a foundation for
a computer-controlled instrument by study of classical
guitar performance. PhD thesis, CCRMA, Dept. of
Music, Stanford University, 2008.
[9] P. D. O’Grady and S. T. Rickard. Automatic
hexaphonic guitar transcription using non-negative
constraints. In Proceedings of the Irish Signal and
Systems Conference, 2009.
[10] H. Penttinen and V. Välimäki. A time-domain approach to estimating the plucking point of guitar tones obtained with an under-saddle pickup. Applied Acoustics, 65:1207-1220, 2004.
[11] M. Puckette. Patch for guitar. In Pd-convention, 2007.
[12] L. Reboursière, C. Frisson, O. Lähdeoja, J. I. Mills, C. Picard, and T. Todoroff. Multimodal guitar: A toolbox for augmented guitar performances. In Proc. of NIME, 2010.
[13] C. Traube and P. Depalle. Extraction of the excitation
point location on a string using weighted least-square
estimation of comb filter delay. In Proceedings of the
Conference on Digital Audio Effects (DAFx), 2003.
... A performancelevel detection indicates the exact areas of playing techniques rather than note-level intervals, the parameters have a major impact on a listener's perception of the music [9], but are not completely reflected by the music notations. Audioonly playing technique analyses on guitars [10]- [15], bowed strings [16]- [18], guqin [19] and Chinese bamboo flutes [20], [21] enrich the functionality in a single angle. CompMusic [22] project covers a corpus of Indian, Arabian, Chinese and Spanish music and a software was developed for automatic description of these folk music. ...
... Peak picking on energy envelope generally has sufficient precision for monophonic onset detection. Involved with simultaneous amplitude and pitch shift, the coarse onset detectors, such as SpecFlux [10], SuperFlux [69] and ComplexFlux [70], are always followed with a energybased time refinement step [71]. Specially, the fake nail attack on the string brings a pair of adjacent peaks for a single tone, namely the attack peak and natural transient, which respectively serve to note segmentation and velocity estimation. ...
Full-text available
Music information retrieval (MIR) is developing these years rapidly. As the fundamental MIR tasks, automatic music transcription (AMT) and expressive analysis (EA) are gaining momentum in both Western and non-European music. However, the annotated datasets for non-Eurogenic instruments remain scarce in terms of quantity and feature diversity so that general evaluations and data-driven models on various tasks cannot be well explored. As one of the most popular traditional plucked string instruments in Asia, which is barely studied in the MIR community, pipa has lots of distinctive national and local characteristics including the fake nails, intrinsic pitch shift, rubato, as well as a high diversity of sophisticated playing techniques that greatly enhance the music expressiveness. Our work aims to systematically clarify a creation procedure of a pipa dataset with audio, musical notation and multiview video modalities for traditional Chinese solos. The use of 4-track string vibration signals captured by optical sensors paves a path for high quality annotations. Furthermore, a transcription and Expressiveness Annotation System (TEAS) was transparently implemented to ensure the scalability of dataset. Three expressive analysis approaches in this system were newly proposed and evaluated in this paper. Finally, two AMT models were investigated and a series of the existing and emerging MIR tasks enabled by this dataset were enumerated for the future exploration.
... Zenodo returned 51 papers as the result of this research. Eight papers were excluded from the analysis, of which five were excluded because in these papers the term score is simply mentioned, but scores are not central in the interaction: [125,190,229,242,264]. Of the remaining three, one paper used the word score to indicate point counting in a game [69]. ...
This thesis’s primary goal is to investigate performance ecologies, that is the compound of humans, artifacts and environmental elements that contribute to the result of a performance. In particular, this thesis focuses on designing new interactive technologies for sound and music. The goal of this thesis leads to the following Research Questions (RQs): • RQ1 How can the design of interactive sonic artifacts support a joint expression across different actors (composers, choreographers, and performers, musicians, and dancers) in a given performance ecology? • RQ2 How does each different actor influence the design of different artifacts, and what impact does this have on the overall artwork? • RQ3 How do the different actors in the same ecology interact, and appropriate an interactive artifact? To reply to these questions, a new framework named ARCAA has been created. In this framework, all the Actors of a given ecology are connected to all the Artifacts throughout three layers: Role, Context and Activity. This framework is then applied to one systematic literature review, two case studies on music performance and one case study in dance performance. The studies help to better understand the shaded roles of composers, performers, instrumentalists, dancers, and choreographers, which is relevant to better design interactive technologies for performances. Finally, this thesis proposes a new reflection on the blurred distinction between composing and designing a new instrument in a context that involves a multitude of actors. 
Overall, this work introduces the following contributions to the field of interaction design applied to music technology: 1) ARCAA, a framework to analyse the set of interconnected relationship in interactive (music) performances, validated through 2 music studies, 1 dance study and 1 systematic literature analysis; 2) Recommendations for designing music interactive system for performance (music or dance), accounting for the needs of the various actors and for the overlapping on music composition and design of interactive technology; 3) A taxonomy of how scores have shaped performance ecologies in NIME, based on a systematic analysis of the literature on score in the NIME proceedings; 4) Proposal of a methodological approach combining autobiographical and idiographical design approaches in interactive performances.
... Our implementation of an expressive technique classifier can be compared to the work of Reboursiere et al. [12], which consisted of a series of algorithms to track in real-time a similar set of expressive guitar techniques. However, despite the high accuracy obtained by the authors on polyphonic performances, the different techniques were tracked with separate algorithms and tested separately because of processing power issues. ...
Conference Paper
Full-text available
Real-time applications of Music Information Retrieval (MIR) have been gaining interest as of recently. However, as deep learning becomes more and more ubiquitous for music analysis tasks, several challenges and limitations need to be overcome to deliver accurate and quick real-time MIR systems. In addition, modern embedded computers offer great potential for compact systems that use MIR algorithms, such as digital musical instruments. However, embedded computing hardware is generally resource-constrained, posing additional limitations. In this paper, we identify and discuss the challenges and limitations of embedded real-time MIR. Furthermore, we discuss potential solutions to these challenges, and demonstrate their validity by presenting an embedded real-time classifier of expressive acoustic guitar techniques. The classifier achieved 99.2% accuracy in distinguishing pitched and percussive techniques and a 99.1% average accuracy in distinguishing four distinct percussive techniques with a fifth class for pitched sounds. The full classification task is a considerably more complex learning problem, with our preliminary results reaching only 56.5% accuracy. The results were produced with an average latency of 30.7 ms.
... Our implementation of an expressive technique classifier can be compared to the work of Reboursiere et al. [12], which consisted of a series of algorithms to track in real-time a similar set of expressive guitar techniques. However, despite the high accuracy obtained by the authors on polyphonic performances, the different techniques were tracked with separate algorithms and tested separately because of processing power issues. ...
|| THIS IS THE PREPRINT VERSION. Published version on Researchate || Real-time applications of Music Information Retrieval (MIR) have been gaining interest as of recently. However, as deep learning becomes more and more ubiquitous for music analysis tasks, several challenges and limitations need to be overcome to deliver accurate and quick real-time MIR systems. In addition, modern embedded computers offer great potential for compact systems that use MIR algorithms, such as digital musical instruments. However , embedded computing hardware is generally resource constrained , posing additional limitations. In this paper, we identify and discuss the challenges and limitations of embedded real-time MIR. Furthermore, we discuss potential solutions to these challenges , and demonstrate their validity by presenting an embedded real-time classifier of expressive acoustic guitar techniques. The classifier achieved 99.2% accuracy in distinguishing pitched and percussive techniques and a 99.1% average accuracy in distinguishing four distinct percussive techniques with a fifth class for pitched sounds. The full classification task is a considerably more complex learning problem, with our preliminary results reaching only 56.5% accuracy. The results were produced with an average latency of 30.7 ms.
... Les ensembles de données (dataset) présents dans l'état de l'art et intégrants des enregistrements de guitare sont essentiellement constitués pour l'extraction de données musicales (Music (ou Media) Information Retrieval ou MIR) (Xi et al., 2018;Su et al., 2014;Reboursière et al., 2012;Kehling et al., 2014;Stein et al., 2010). Leur but est essentiellement de permettre la détection et la classification des notes , des accords (Nadar et al., 2019), ...
The hexaphonic guitar is an electric guitar with a hexaphonic microphone. The hexaphonic microphone is a set of six microphones, each of which picks up the sound of a particular string. This prosthesis can be used to facilitate instrumental gestural control, as is the case with synthesizer guitars, for example. It also allows the application of independent sound effects for each string through hexaphonic sound processing pedals. These two use cases constitute two very different approaches to the electric guitar and have developed, over the years, very different sized communities of practice. This difference in the development of these two types of use is hardly understandable in our eyes, as both approaches seem to us to be fertile in terms of sonic and compositional potentials.This research work attempts to understand why the practice of hexaphonic sound processing does not integrate into a community of guitarists and why is there a difference between the two uses of the hexaphonic microphone? In order to answer these questions, this thesis is structured around a double organological and practical approach. First of all, a base of knowledge on the evolution of the guitar is created through the descriptive analysis of 18 mutations that have marked and continue to mark the development of the instrument. This base allows us to develop a technical and gestural genealogy of the guitar, bringing to light its filiations. These are at the same time the reflection of the great technical evolutions but also the development of transversal concepts updated in the course of the evolution. These developments help to position the double object of our study (hexaphonic sound processing and instrumental gestural control) both in technical terms (in relation to what has existed) and in terms of its anchorage in one or several transversal practices. The practical part of this thesis is twofold and is nourished by the realized organology. 
The hardware and software elements allowing to use the potentials of the hexaphonic guitar have been developed. The different elements are developed from the current rapid prototyping platforms and demonstrate that these two uses can be integrated in the instrumentarium of the electric guitarist. Among the different elements developed, a hexaphonic multi-effects and a note detection and playing techniques software were used in two types of practices: the first is the creation of a musical and video performance (Puzzle, Ivann Cruz and Lionel Palun) and the second is a series of experiments with five guitarists.This last practice made it possible to constitute a dataset of about ten hours of annotated musical improvisations.The analysis of these different practices brought to light several characteristics of the hexaphonic device created: this one positions the guitar as a multi-track instrument whose diversity of timbres that it can deploy is close to that offered by the MAO systems. Hexpahonic sound processing finally appears as an extension of the electric guitar that allows the musician to keep his instrumental practice and to extend the finesse acquired over the years without loss in moving to sound processing. These considerations combined with the current evolution of sound processing pedals (which share similarities) tend to point to a favourable period for the development of a perennial hexaphonic practice community. The tools developed and the gathered dataset are made available to the community in order to support these potential developments.
... Guitar transcription differs from piano transcription in many aspects, including the diversity of guitar tones, the intensive use of guitar playing techniques [24][25][26][27][28][29] such as bending and sliding, as well as the need for string (fingering) detection, i.e., detecting which string a note is played on [30,31]. We differentiate below two types of guitar transcription tasks: tab transcription, which involves detecting not only the pitches but also the string playing each pitch, and sheet transcription, which does not deal with string detection. ...
In this paper, we propose a new dataset named EGDB that contains transcriptions of the electric guitar performance of 240 tablatures rendered with different tones. Moreover, we benchmark the performance of two well-known transcription models proposed originally for the piano on this dataset, along with a multi-loss Transformer model that we newly propose. Our evaluation on this dataset and a separate set of real-world recordings demonstrates the influence of timbre on the accuracy of guitar sheet transcription, the potential of using multiple losses for Transformers, as well as the room for further improvement on this task.
Modern cochlear implants (CIs) generate pulsatile electric current stimuli from real-time incoming sound to stimulate the residual auditory nerves of deaf ears. In this unique way, deaf people can (re)gain a sense of hearing and the consequent speech communication abilities. Electric hearing mimics normal acoustic hearing (NH), but with a different physical interface to the neural system, which limits the performance of CI devices. Simulating the electric hearing process of CI users with NH listeners is an important step in CI research and development. Many acoustic modelling methods have been developed for simulation purposes, e.g., to predict the performance of a novel sound coding strategy; channel vocoders with noise or sine-wave carriers are the most popular among them. Such simulation work has accelerated the re-engineering and understanding of electric hearing. This paper presents an overview of the literature on channel-vocoder simulation methods. Strengths, limitations, applications, and future directions of acoustic vocoder simulation methods are introduced and discussed. Keywords: Cochlear implant, Auditory prosthesis, Speech perception, Hearing research, Pitch, Vocoder
Characterised by rapid intensity modulation, the tremolo in plucked string instruments, particularly the Chinese traditional instrument pipa, greatly enriches musical perception and local styles. In this paper, we propose to detect pipa tremolo onsets from a new angle and find that the spectral slope curve, as a single-parameter framewise feature, is effective in dealing with the spurious amplitude peaks produced by the attack noise from fake nails. Evaluated on our toy dataset, the STFT-based spectral slope curve demonstrates its value for tremolo onset detection in monophonic pipa clips. The tremolo onsets analysed here can feed more detailed parameter estimation afterwards. Keywords: Tremolo analysis, Onset detection, Fake nails, Attack noise, Spectral slope, Monophonic pipa recordings
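As a rough illustration of a framewise spectral-slope curve of the kind described above, the sketch below fits a least-squares line to each frame's STFT magnitude; the frame length, hop size, and the use of linear (rather than dB) magnitude are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def spectral_slope_curve(x, frame_len=1024, hop=256):
    """Framewise spectral slope: per frame, the least-squares slope of the
    STFT magnitude as a function of frequency bin. Broadband attack noise
    flattens the spectrum, pulling the slope towards zero, while steady
    pitched frames keep it strongly negative."""
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    bins = np.arange(frame_len // 2 + 1, dtype=float)
    fc = bins - bins.mean()                      # centred regressor
    slopes = np.empty(n_frames)
    for i in range(n_frames):
        mag = np.abs(np.fft.rfft(win * x[i * hop : i * hop + frame_len]))
        slopes[i] = np.dot(fc, mag - mag.mean()) / np.dot(fc, fc)
    return slopes
```

Onset candidates would then be picked as local maxima of the returned curve.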
Playing techniques contain distinctive information about musical expressivity and interpretation. Yet, current research in music signal analysis suffers from a scarcity of computational models for playing techniques, especially in the context of live performance. To address this problem, our paper develops a general framework for playing technique recognition. We propose the adaptive scattering transform, which refers to any scattering transform that includes a stage of data-driven dimensionality reduction over at least one of its wavelet variables, for representing playing techniques. Two adaptive scattering features are presented: frequency-adaptive scattering and direction-adaptive scattering. We analyse seven playing techniques: vibrato, tremolo, trill, flutter-tongue, acciaccatura, portamento, and glissando. To evaluate the proposed methodology, we create a new dataset containing full-length Chinese bamboo flute performances (CBFdataset) with expert playing technique annotations. Once trained on the proposed scattering representations, a support vector classifier achieves state-of-the-art results. We provide explanatory visualisations of scattering coefficients for each technique and verify the system over three additional datasets with various instrumental and vocal techniques: VPset, SOL, and VocalSet.
This project aims at studying how recent interactive and interaction technologies can help extend how we play the guitar, thus defining the “multimodal guitar”. Our contributions target three main axes: audio analysis, gestural control and audio synthesis. For this purpose, we designed and developed a freely available toolbox for augmented guitar performances, compliant with the PureData and Max/MSP environments, gathering tools for: polyphonic pitch estimation, fretboard visualization and grouping, pressure sensing, modal synthesis, infinite sustain, rearranging looping and “smart” harmonizing.
This paper focuses on the extraction of the excitation point location on a guitar string by an iterative estimation of the structural parameters of the spectral envelope. We propose a general method to estimate the plucking point location, working in two stages: starting from a measure related to the autocorrelation of the signal as a first approximation, a weighted least-squares estimation is used to refine a FIR comb filter delay value to better fit the measured spectral envelope. This method is based on the fact that, in a simple digital physical model of a plucked-string instrument, the resonant modes translate into an all-pole structure while the initial conditions (a triangular shape for the string and zero velocity at all points) result in a FIR comb filter structure.
Automatic music transcription is a widely studied problem. Typically, the recordings used for transcription are taken from standard instruments; in the case of electric stringed instruments such as the electric guitar, the recordings are captured from a standard pickup, which unwantedly mixes the signals from each string and complicates subsequent analysis. We propose an approach to electric guitar transcription where the signal generated by each string at the guitar pickup is captured and analysed separately, thus providing six separate signals as opposed to one mixed signal and enabling finger positions to be identified. Such an instrument is known as a hexaphonic guitar and is a popular instrument for spatial music performances. We build the equipment necessary to modify a standard electric guitar into a hexaphonic guitar, and present an application of Non-Negative Matrix Factorisation to the task of transcription: a basis for each note on the fretboard is learned and fitted to a magnitude spectrogram of the hexaphonic recording, which then undergoes a nonlinearity generating a piano-roll representation of the music performance.
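The factorisation step described above can be sketched with a generic Euclidean-distance NMF using multiplicative updates against a fixed per-note dictionary W; the update rule, iteration count, and the thresholding nonlinearity below are illustrative assumptions, not the authors' exact pipeline (which also learns W from recordings of each fretboard note).

```python
import numpy as np

def nmf_activations(V, W, n_iter=300, eps=1e-9):
    """Fit nonnegative activations H so that V ~ W @ H, where W holds one
    magnitude-spectrum basis column per fretboard note and V is the
    magnitude spectrogram. Multiplicative updates for Euclidean distance;
    W is kept fixed, so the problem is convex in H."""
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def piano_roll(H, threshold=0.5):
    """Simple nonlinearity: a note is marked active in a frame when its
    activation exceeds `threshold` times that note's own maximum."""
    m = H.max(axis=1, keepdims=True) + 1e-12
    return (H / m) > threshold
```

With W full column rank, the multiplicative updates drive H towards the unique nonnegative least-squares solution, and the thresholding turns the activation matrix into a binary piano roll.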
This paper describes an ongoing PhD research project on augmented instruments, focusing especially on a digital enhancement of the electric guitar. An analysis of the gesture-object relationship as a series of “contact points” creates the theoretical frame within which experimentation with sensors and signal analysis takes place, producing the prototype of an augmented guitar described here.
A method for estimating the plucking point of guitar tones is proposed. The algorithm is based on investigating the time lag between two consecutive pulses arriving at the bridge of the guitar. The signal is detected with an under-saddle pickup attached to the bridge. The method determines the minimum of the autocorrelation function for one period of the signal. The time lag of the minimum can be converted into the distance from the bridge where the string was plucked. The results obtained with the method are good, the error remains smaller than one centimetre, except for a few outliers. The algorithm is easy to implement and can be used to analyse playing styles. The efficiency of the method gives the potential to also use it in real-time computer music applications.
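The time-lag idea in this abstract can be sketched as follows, under a deliberately simplified model: the bridge signal is treated as two pulses per period, the minimum of the autocorrelation over one period gives their lag, and the relative lag is mapped linearly to a relative plucking distance. The synthetic signal model and the direction of the mapping (the lag cannot by itself distinguish a distance d from L - d) are assumptions for illustration, not the authors' exact algorithm.

```python
import numpy as np

def estimate_plucking_distance(frame, period, string_length):
    """Estimate the plucking-point distance from the bridge (in the same
    units as string_length) from an under-saddle pickup signal.

    `frame` must contain at least two periods; `period` is the fundamental
    period in samples. The autocorrelation over one period dips at the lag
    separating the two pulses travelling along the string; that lag, as a
    fraction of the period, is mapped to a relative position. Note the
    inherent d vs. (L - d) ambiguity of this mapping."""
    period = int(period)
    x = frame[: 2 * period] - np.mean(frame[: 2 * period])
    # autocorrelation for lags 0 .. period-1
    acf = np.array([np.dot(x[:period], x[lag : lag + period])
                    for lag in range(period)])
    lag_min = int(np.argmin(acf[1:])) + 1        # skip the trivial lag 0
    return string_length * lag_min / period
```

For a synthetic signal with two opposite-sign pulses per 100-sample period, 20 samples apart, the estimate on a 65 cm string comes out at 13 cm, consistent with the lag-to-distance mapping assumed here.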
In this paper, we present our research on left-hand gesture acquisition and analysis in guitar performances. The main goal of our research is the study of expressiveness. Here, we focus on a detection model for left-hand fingering based on gesture information. We use a capacitive sensor to capture fingering positions and we look for a prototypical description of the most common fingering positions in guitar playing. We report the performed experiments and study the obtained results, proposing the use of classification techniques to automatically determine the finger positions.
In this paper, we present a feature-based approach to the classification of different playing techniques in bass guitar recordings. The applied audio features are chosen to capture typical instrument sounds induced by 10 different playing techniques. A novel database consisting of approximately 4300 isolated bass notes was assembled for the purpose of evaluation. The use of domain-specific features combined with feature selection and feature space transformation techniques improved the classification accuracy by over 27 percentage points compared to a state-of-the-art baseline system. Classification accuracy reached 93.25% and 95.61% for the recognition of plucking and expression styles, respectively.
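A generic feature-plus-classifier pipeline of the kind this abstract describes might look like the sketch below; the three descriptors (attack time, spectral centroid, zero-crossing rate) and the nearest-centroid classifier are illustrative stand-ins, not the paper's actual feature set, selection chain, or baseline system.

```python
import numpy as np

def note_features(x, sr=44100):
    """Three simple per-note descriptors of the kind used for playing
    technique classification: time to reach 90% of peak amplitude, the
    spectral centroid of the first ~46 ms, and the zero-crossing rate."""
    env = np.abs(x)
    attack = int(np.argmax(env >= 0.9 * env.max())) / sr
    n = min(len(x), 2048)
    mag = np.abs(np.fft.rfft(np.hanning(n) * x[:n]))
    freqs = np.fft.rfftfreq(n, 1.0 / sr)
    centroid = float(np.dot(freqs, mag) / (mag.sum() + 1e-12))
    zcr = float(np.mean(np.abs(np.diff(np.sign(x))) > 0))
    return np.array([attack, centroid, zcr])

class NearestCentroid:
    """Minimal classifier stand-in: predict the class whose mean feature
    vector is closest in Euclidean distance."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0)
                                    for c in self.classes_])
        return self
    def predict(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None], axis=2)
        return self.classes_[np.argmin(d, axis=1)]
```

In a real system the features would be standardised and a stronger classifier (e.g. an SVM) used, but the fit/predict structure is the same.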
In this paper, we suggest a novel group delay based method for the onset detection of pitched instruments. It is proposed to approach the problem of onset detection by examining three dimensions separately: phase (i.e., group delay), magnitude and pitch. The evaluation of the suggested onset detectors for phase, pitch and magnitude is performed using a new publicly available and fully onset annotated database of monophonic recordings which is balanced in terms of included instruments and onset samples per instrument, while it contains different performance styles. Results show that the accuracy of onset detection depends on the type of instruments as well as on the style of performance. Combining the information contained in the three dimensions by means of a fusion at decision level leads to an improvement of onset detection by about 8% in terms of F-measure, compared to the best single dimension.
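Decision-level fusion of several onset detectors, as evaluated in this paper, can be sketched as voting among the detectors' onset times; the tolerance window, the greedy grouping, and the majority rule below are assumptions for illustration, not the authors' exact fusion scheme.

```python
def fuse_onsets(detections, tolerance=0.05, min_votes=2):
    """Decision-level fusion: each detector contributes a list of onset
    times in seconds; a candidate is kept when at least `min_votes`
    distinct detectors report an onset within `tolerance` seconds of it.
    The earliest time in each supporting group is returned."""
    events = sorted((t, i) for i, ts in enumerate(detections) for t in ts)
    fused, used = [], [False] * len(events)
    for k, (t, i) in enumerate(events):
        if used[k]:
            continue
        used[k] = True
        group, voters = [t], {i}
        for m in range(k + 1, len(events)):
            tm, im = events[m]
            if tm - t > tolerance:
                break
            if not used[m]:
                used[m] = True
                group.append(tm)
                voters.add(im)
        if len(voters) >= min_votes:
            fused.append(min(group))
    return fused
```

For example, with phase, magnitude, and pitch detectors reporting [0.10, 0.52, 1.00], [0.11, 1.01], and [0.50, 1.02, 2.0] respectively, the fused output keeps the three onsets confirmed by at least two detectors and drops the unsupported 2.0 s candidate.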
Motion capture to build a foundation for a computer-controlled instrument by study of classical guitar performance
Guitar technology underwent significant changes in the 20th century in the move from acoustic to electric instruments. In the first part of the 21st century, the guitar continues to develop through its interaction with digital technologies. Such changes in guitar technology are usually grounded in what we might call the “cultural identity” of the instrument: that is, the various ways that the guitar is used to enact, influence and challenge sociocultural and musical discourses. Often, these different uses of the guitar can be seen to reflect a conflict between the changing concepts of “noise” and “musical sound.”