Means of perceived valence ratings across pitch register for each instrument family and each musical training group.

Source publication
Article
Full-text available
Composers often pick specific instruments to convey a given emotional tone in their music, partly due to their expressive possibilities, but also due to their timbres in specific registers and at given dynamic markings. Of interest to both music psychology and music informatics from a computational point of view is the relation between the acoustic...

Context in source publication

Context 1
... contrasts on valence, tension arousal, and energy arousal were computed over octaves 2-6 (in which all instrument families are present) with the lsmeans package in R, separately for musicians and nonmusicians (see Table S2). For valence ratings (Figure 2), register was highly significant and globally presented a concave (inverted-U-shaped) increasing form with a peak around octave 5 or 6. The polynomial contrasts revealed significant increasing linear and concave quadratic trends for brass, woodwinds, and strings for nonmusicians, and for brass and strings for musicians. ...
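For readers unfamiliar with the lsmeans workflow named in the snippet, a minimal sketch of such a polynomial-contrast analysis follows. The data frame ratings and the variables valence, octave, family, and group are hypothetical placeholders, and a plain lm stands in for whatever model the authors actually fit:

```r
library(lsmeans)  # the package named in the snippet; now superseded by 'emmeans'

# Hypothetical model: valence ratings as a function of octave (factor) and
# instrument family, fit separately for each musical training group.
fit_nonmus <- lm(valence ~ octave * family,
                 data = subset(ratings, group == "nonmusician"))

# Estimated marginal means per octave within each instrument family
lsm <- lsmeans(fit_nonmus, ~ octave | family)

# Orthogonal polynomial contrasts (linear, quadratic, ...) across octaves 2-6
contrast(lsm, method = "poly")
```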

Citations

... One criticism of the dimensional approach is that it may not be able to distinguish between, e.g., anger and fear, as they are both negatively valenced and high in arousal. Some dimensional approaches include a third dimension, for example distinguishing between two forms of arousal: tension and energy (e.g., Eerola et al., 2012; McAdams et al., 2017). ...
... For the dimensional model, valence and tension arousal were strongly correlated. Whereas some previous studies also find that valence and tension arousal are strongly correlated (Eerola et al., 2012; Eerola & Vuoskoski, 2011; Vuoskoski & Eerola, 2011), curiously a study that used a highly similar stimulus selection to the current Exp 1 did not find that the two scales correlated strongly (McAdams et al., 2017; r(135) = -.46). One major difference between the current experiment and McAdams et al. was that the former was conducted online and the latter was conducted in a laboratory environment. ...
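A note on the statistic quoted above: r(135) reports Pearson's r with 135 degrees of freedom, which implies 137 stimuli (df = n - 2). A computation of that kind reduces to a single call; the rating vectors here are hypothetical:

```r
# Hypothetical per-stimulus mean ratings on the two dimensional scales
cor.test(mean_valence, mean_tension_arousal, method = "pearson")
```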
... Musical sophistication was also frequently correlated with the affect ratings. Although some studies have found no difference in affective response between musicians and nonmusicians (e.g., Bigand et al., 2005), other studies have found such differences (e.g., McAdams et al., 2017). This discrepancy may be partly explained by the ambiguity in differentiating musicians from nonmusicians, which the Gold-MSI, a continuous measure of musical sophistication, tries to circumvent. ...
Conference Paper
Full-text available
One source of academic discourse is what model or method best represents the affective response to music. Here, we compare the two main affect models, dimensional (valence, tension arousal, and energy arousal) and discrete (anger, fear, sadness, happiness, and tenderness), as a method for capturing the self-reported affective response to affectively ambiguous, short, musical sounds. Stimulus length (single notes vs. chromatic scales) and locus of affect (perceived or induced) are compared, and individual differences are measured, as these may all influence the applicability of either model. We find that consistency and rater agreement are high overall, but slightly higher for the dimensional model. Correlation and principal component analyses show that two dimensions or two discrete affect categories capture most of the variation in the affective response. Furthermore, energy arousal varies in a manner that is not captured by the discrete affect model. All sources of individual differences are moderately correlated with the affect scales, in particular pre-existing mood. The dimensional model is also slightly more influenced by individual differences than is the discrete model. Thus, our results suggest that for both affect loci and stimulus lengths two dimensions of valence and energy arousal best capture the affective response. Future studies should take the role of individual differences into account.
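The correlation and principal component analyses mentioned in this abstract are standard; a minimal sketch, assuming a hypothetical data frame affect_ratings with one column per affect scale, might look like this:

```r
# Hypothetical data: one row per rating, one column per scale (valence,
# tension arousal, energy arousal, anger, fear, sadness, happiness, tenderness)
cor(affect_ratings)                    # pairwise scale correlations

pc <- prcomp(affect_ratings, center = TRUE, scale. = TRUE)
summary(pc)                            # proportion of variance per component
round(pc$rotation[, 1:2], 2)           # loadings of the scales on the first two PCs
```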
... Vibrations caused by interactions with the mouthpiece set up resonances in the tube, creating the note, but because tubes resonate at different frequencies depending on the energy of the vibration, a single pattern of finger interactions can activate several different notes (Woldendorp et al. 2016). The set of notes available with a certain resonance energy is referred to as a register within the range of an instrument (McAdams et al. 2017), and each register contains a set of unique key or valve selections corresponding to a range of notes. A trumpet, for example, with three valves, has a total of 8 possible valve combinations in each register, but in practice, most registers contain fewer than 8 notes; in the extreme case, a bugle is a brass horn with no valves, and all note selection is done by altering the mouthpiece interactions to activate one of the available resonance frequencies of the horn. ...
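The valve combinatorics in the excerpt above (three two-state valves giving 2^3 = 8 combinations per register) can be enumerated directly; the variable names below are illustrative only:

```r
# Each valve is either up (0) or pressed (1): 2^3 = 8 combinations per register
valve_combos <- expand.grid(valve1 = 0:1, valve2 = 0:1, valve3 = 0:1)
nrow(valve_combos)  # 8
```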
Article
Full-text available
Despite the benefits of learning an instrument, many students drop out early because it can be frustrating for the student, expensive for the caregiver, and loud for the household. Virtual Reality (VR) and Extended Reality (XR) offer the potential to address these challenges by simulating multiple instruments in an engaging and motivating environment through headphones. To assess the potential for commercial VR to augment musical experiences, we used standard VR implementation processes to design four virtual trumpet interfaces: camera-tracking with tracked register selection (two ways), camera-tracking with voice activation, and a controller plus a force-feedback haptic glove. To evaluate these implementations, we created a virtual music classroom that produces audio, notes, and finger pattern guides loaded from a selected Musical Instrument Digital Interface (MIDI) file. We analytically compared these implementations against physical trumpets (both acoustic and MIDI), considering features of ease of use, familiarity, playability, noise, and versatility. The physical trumpets produced the most reliable and familiar experience, and some XR benefits were considered. The camera-based methods were easy to use but lacked tactile feedback. The haptic glove provided improved tracking accuracy and haptic feedback over camera-based methods. Each method was also considered as a proof-of-concept for other instruments, real or imaginary.
... In the present experiment, participants rated instrument tones from semitone clusters of different F0-registers and three different dynamic levels on the 20 scales. Thus, the approach is similar to work by McAdams et al. (2017), who measured affective qualities of instrument tones across a wide range of F0s. The resulting rating profiles were interpreted qualitatively. ...
Article
Full-text available
Traditional approaches in timbre research have often equalized sounds according to pitch, loudness, and duration in order to study timbral differences across instruments. In a compact case study of the semantic qualities of the oboe and French horn, Reymore (2021) takes a different approach and considers timbral differences within musical instruments, which arise due to the covariation of timbre with the musical parameters of fundamental frequency (pitch) and playing effort (dynamic level). The study constitutes a timely contribution to a growing body of work on the covariation between timbre, pitch, and loudness. After providing a background and summary of important aspects of the target article, I elaborate on results from a recent complementary study that analyzed acoustical signal properties regarding that matter. Finally, I address three important issues in this context that appear to be worthy of future research.
... Musical training was also reported to have an impact on emotion perception. In addition, lower frequencies were rated with lower valence by musicians in a study by [38]. This finding may be due to the impact of musical training on one's perception of musical cues and their relation to conveyed emotion [39]. ...
Article
Full-text available
Music is capable of conveying many emotions. The level and type of emotion of the music perceived by a listener, however, is highly subjective. In this study, we present the Music Emotion Recognition with Profile information dataset (MERP). This database was collected through Amazon Mechanical Turk (MTurk) and features dynamical valence and arousal ratings of 54 selected full-length songs. The dataset contains music features, as well as user profile information of the annotators. The songs were selected from the Free Music Archive using an innovative method (a Triple Neural Network with the OpenSmile toolkit) to identify 50 songs with the most distinctive emotions. Specifically, the songs were chosen to fully cover the four quadrants of the valence-arousal space. Four additional songs were selected from the DEAM dataset to act as a benchmark in this study and filter out low quality ratings. A total of 452 participants participated in annotating the dataset, with 277 participants remaining after thoroughly cleaning the dataset. Their demographic information, listening preferences, and musical background were recorded. We offer an extensive analysis of the resulting dataset, together with a baseline emotion prediction model based on a fully connected model and an LSTM model, for our newly proposed MERP dataset.
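The baseline LSTM mentioned at the end of the abstract is not specified in detail here; as a rough sketch under assumed shapes (60 time steps of 20 audio features per sequence, predicting valence and arousal), an R keras version might be:

```r
library(keras)

# Hypothetical input shape: 60 time steps x 20 audio features per sequence;
# two continuous outputs (valence, arousal).
model <- keras_model_sequential() %>%
  layer_lstm(units = 64, input_shape = c(60, 20)) %>%
  layer_dense(units = 2, activation = "linear")

model %>% compile(optimizer = "adam", loss = "mse")
# model %>% fit(x_train, y_train, epochs = 30, batch_size = 32)
```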
... Questions about emotional impression are based on the valence-arousal model of emotions [38]. The dimension tired/awake was added based on a previous study by [39] showing that calm/exciting and tired/awake appear to be two different emotional properties that are not correlated. Furthermore, participants were asked to provide free associations for each image and to choose a keyword out of five (inspiring, annoying, boring, peaceful, pretty) to describe the image. ...
... In contrast, the overall average rating on the "Tired/Awake" axis was 4.6, indicating that the images led participants to feel more "awake" than "tired". This result demonstrates that these two dimensions are independent [39]. ...
Article
Full-text available
People have an ancient and strong bond to flowers, which are known to have a positive effect on mood. During the COVID-19 pandemic, sales of ornamental plants increased, and many turned to gardening, possibly as a way to cope with ubiquitous increases in negative mood following lockdowns and social isolation. The nature of the special bond between humans and flowers requires additional elucidation. To this end, we conducted a comprehensive online mixed methods study, surveying 253 individuals (ages 18–83) from diverse ethnic backgrounds and continents, regarding their thoughts and feelings towards photos of flowers, nature scenes and flower drawings. We found that looking at pictures and drawings of flowers, as well as nature scenes, induced positive emotions, and participants reported a variety of positive responses to the images. More specifically, we found associations of flowers with femininity, and connotations to particular flowers that were affected by geographical location. While nature scene photos induced positive reactions, flower photos were preferred, ruling out a mere substitution of nature by flowers and vice versa. Drawings of flowers elicited less positive emotions than photos, as people related more to the art than to the flower itself. Our study reveals the importance of ornamental flowers and nature in our life and well-being, and as such their cultivation and promotion are essential.
... that MDS-generated timbre spaces are often described as being dependent on acoustic features like attack time and spectral centroid. In the same vein, researchers have also sought to characterize the semantic aspects of musical timbre acoustically (Alluri and Toiviainen, 2010; Disley and Howard, 2004; McAdams et al., 2017; Zacharakis et al., 2014). ...
... Nowadays, we can make the most of machine learning and feature extraction tools to characterize these metaphorical terms acoustically (Bogdanov et al., 2013; McFee et al., 2015; Peeters et al., 2011). There have already been some attempts to model semantic dimensions or concepts using machine learning tools (Jiang et al., 2020; McAdams et al., 2017; Pearce et al., 2019). ...
... For example, McAdams et al. (2017), examining the importance of fundamental frequency in the evaluation of subjective timbre qualities, used a neural network that was shown to outperform PLSR in accuracy. The challenge for such paradigms is to collect enough judgments on sounds to allow the learning model to converge. ...
Thesis
Full-text available
The mysteries of the auditory world prompt us all to describe, as best as we can, what we hear. In some professional environments, the ability to verbally convey one’s perception of sound qualities is crucial, whether you are a sound engineer, a musician, a sound designer, or a composer. Sometimes, talking about a sensation of any kind leads us to use metaphorical vocabulary. Thus, communication in the world of sound and music often depends on terms extracted from other sensory modalities like vision or touch. This is the case for four well-known attributes at the heart of this study: brightness, warmth, roundness, and roughness. But do we all have the same auditory sensation associated with such "extrasonic" concepts? To what extent are we able to faithfully describe a sensation expressed by these metaphors? Brightness, warmth, roundness, roughness. The meaning of these terms used as sound attributes has been studied within the general framework of the semantic dimensions of sounds. However, the specific origins of such metaphorical terms and their mutual connections remain to be discovered. The aim of this study is to explore and expose the connection between these attributes and their projection in the sound domain. In other words, we aim to align their semantic definitions with mental representations expressed by their acoustic portraits. For each of the four attributes, we have reported on different layers of semantic descriptions that can be acoustic, metaphorical, or source-related. Through interviews and an online survey, we were able to develop definitions for each of the attributes based on the most relevant information from a population of sound professionals. However, the four terms depended on a lot of metaphorical elements that were still difficult to elucidate. To disambiguate these metaphorical descriptions, we asked three different expert populations (sound engineers, conductors, and non-experts) to evaluate brightness, warmth, roundness, and roughness in a corpus of orchestral sounds. We chose to use the new method of Best-Worst Scaling to fulfill that goal. This method allowed us to show that while some concepts transcend sound expertise, others can be specific to it. Gathering the data from the sound professionals brought forth a musical composition called Quadrangulation – by Bertrand Plé – whose objective was to illustrate and transmit the meaning of the four concepts. Through this interdisciplinary approach, we shed light on connections between our ability to understand a sound attribute’s meaning and the mental representations associated with them. In addition, we uncovered potential incongruities between the perceptual projection of a metaphorical sound concept and the clarity of its definition. Finally, based on our results, we proposed a semantic explanation of the relations between the four concepts, thus inviting a better understanding of their use in professional conversations.
... The instruments displayed variations in dynamics (from piano to forte) and pitch (octaves of C). Similarly to McAdams et al. (2017), and to avoid any potential bias created by intervals, we only presented octaves of C (except for multiphonics). Besides, some studies have observed an influence of pitch on the appreciation of timbre (Allen & Oxenham, 2014; Alluri & Toiviainen, 2010; Marozeau et al., 2003; McAdams et al., 2017; Siedenburg et al., 2021). ...
... For comfort reasons and to exclude the loudness as a main factor, we normalized the loudness of each sound sample (-23 LUFS) following the EBU norm on loudness (R-128). ...
Article
Full-text available
Music or sound professionals use specific terminology to communicate about timbre. Some key terms do not come from the sound domain and do not have a clear definition due to their metaphorical nature. This work aims to reveal shared meanings of four well-used timbre attributes: bright, warm, round, and rough. We conducted two complementary studies with French sound and music experts (e.g., composers, sound engineers, sound designers, musicians, etc.). First, we conducted interviews to gather definitions and instrumental sound examples for the four attributes (N = 32). Second, using an online survey, we tested the relevance and consensus on multiple descriptions most frequently evoked during the interviews (N = 51). The analysis of the rich corpus of verbalizations from the interviews yielded the main description strategies used by the experts, namely acoustic, metaphorical, and source-related. We also derived definitions for the attributes based on significantly relevant and consensual descriptions according to the survey results. Importantly, the definitions rely heavily on metaphorical descriptions. In sum, this study presents an overview of the shared meaning and perception of four metaphorical timbre attributes in the French language.
... For each of the five models, the random structure was established first by comparing two models containing all three variables of interest and all interactions; the first model included only random intercepts for participant and stimulus, whereas the second model used the maximal random effects structure (random intercepts for participant with random slopes for register, instrument family, and technique and random intercepts for the stimuli). This maximal random effects structure was derived from that of similar linear mixed models described in McAdams et al. (2017; see also Barr et al., 2013). Next, these pairs of models (effects only vs. effects and slopes) were compared using a log likelihood ratio test via the anova function. ...
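A skeletal version of that model-building step, under hypothetical variable names (rating, register, family, technique, participant, stimulus in a data frame ratings), could look as follows with lme4:

```r
library(lme4)

# Random-intercepts-only model (intercepts for participant and stimulus)
m_intercepts <- lmer(
  rating ~ register * family * technique +
    (1 | participant) + (1 | stimulus),
  data = ratings, REML = FALSE
)

# Maximal random-effects structure: random slopes for the within-participant
# factors, plus random intercepts for the stimuli
m_maximal <- lmer(
  rating ~ register * family * technique +
    (1 + register + family + technique | participant) + (1 | stimulus),
  data = ratings, REML = FALSE
)

# Log-likelihood ratio test between the nested models
anova(m_intercepts, m_maximal)
```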
... Our observed increasing linear trend of valence and register represents a different finding from that of McAdams et al. (2017), who observed a non-linear, concave (inverted-U) relationship of valence and register (except in the percussion family), with the sixth octave having lower predicted mean valence (preference) ratings across instruments than the fifth. This difference in findings may in part be due to the distribution of extended-technique sounds across registers in our stimulus set: overall, our set included relatively fewer examples in the 6th register and fewer examples within this register that used extended techniques. ...
... Prior to building the models, we performed an analysis of collinearity among descriptors across the stimulus set. As in previous literature (e.g., Peeters et al., 2011; McAdams et al., 2017), descriptors were multicollinear. A hierarchical cluster analysis was performed using Ward linkage with Euclidean distance; the dendrogram, which demonstrates the ...
[Figure 2 caption: Estimated marginal means for register (octave) in models of exertion, valence, raspy/grainy/rough, and harsh/noisy; vertical bars represent 95% confidence intervals.]
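The collinearity check and clustering described above map onto base R directly; a minimal sketch, assuming a hypothetical stimulus-by-descriptor matrix features:

```r
# Hypothetical matrix: one row per stimulus, one column per audio descriptor
cor(features)                          # pairwise correlations among descriptors

d  <- dist(t(scale(features)))         # Euclidean distances between descriptor profiles
hc <- hclust(d, method = "ward.D2")    # Ward linkage
plot(hc)                               # dendrogram of descriptor clusters
```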
Article
Full-text available
Audio features such as inharmonicity, noisiness, and spectral roll-off have been identified as correlates of “noisy” sounds. However, such features are likely involved in the experience of multiple semantic timbre categories of varied meaning and valence. This paper examines the relationships of stimulus properties and audio features with the semantic timbre categories raspy/grainy/rough, harsh/noisy, and airy/breathy. Participants (n = 153) rated a random subset of 52 stimuli from a set of 156 approximately 2-s orchestral instrument sounds representing varied instrument families (woodwinds, brass, strings, percussion), registers (octaves 2 through 6, where middle C is in octave 4), and both traditional and extended playing techniques (e.g., flutter-tonguing, bowing at the bridge). Stimuli were rated on the three semantic categories of interest, as well as on perceived playing exertion and emotional valence. Correlational analyses demonstrated a strong negative relationship between positive valence and perceived physical exertion. Exploratory linear mixed models revealed significant effects of extended technique and pitch register on valence, the perception of physical exertion, raspy/grainy/rough, and harsh/noisy. Instrument family was significantly related to ratings of airy/breathy. With an updated version of the Timbre Toolbox (R-2021 A), we used 44 summary audio features, extracted from the stimuli using spectral and harmonic representations, as input for various models built to predict mean semantic ratings for each sound on the three semantic categories, on perceived exertion, and on valence. Random Forest models predicting semantic ratings from audio features outperformed Partial Least-Squares Regression models, consistent with previous results suggesting that non-linear methods are advantageous in timbre semantic predictions using audio features. Relative Variable Importance measures from the models among the three semantic categories demonstrate that although these related semantic categories are associated in part with overlapping features, they can be differentiated through individual patterns of audio feature relationships.
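The Random Forest vs. PLSR comparison reported in this abstract can be reproduced in outline with the randomForest and pls packages; the data frame feats and the outcome column harsh_noisy below are hypothetical stand-ins for the 44 summary features and one mean semantic rating:

```r
library(randomForest)
library(pls)

# Hypothetical data frame: 44 audio-feature columns plus one mean semantic
# rating per sound (here 'harsh_noisy').
rf_fit   <- randomForest(harsh_noisy ~ ., data = feats, importance = TRUE)
plsr_fit <- plsr(harsh_noisy ~ ., data = feats, validation = "CV")

importance(rf_fit)   # relative variable importance from the Random Forest
RMSEP(plsr_fit)      # cross-validated prediction error of the PLSR model
```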
... Audio features have been widely used in timbre research for explaining quantitatively the dimensions of timbre spaces (Grey and Gordon, 1978;Iverson and Krumhansl, 1993;McAdams et al., 1995;Lakatos, 2000), affective ratings (Laurier et al., 2009;Farbood and Price, 2017;McAdams et al., 2017), and the perceptual similarity of short music clips (Siedenburg and Müllensiefen, 2017). Most often, the spectral features are derived from statistical computations on a spectrogram, whereas the temporal features are usually extracted from the raw waveform. ...
Article
Full-text available
Two experiments were conducted for the derivation of psychophysical scales of the following audio descriptors: spectral centroid, spectral spread, spectral skewness, odd-to-even harmonic ratio, spectral deviation, and spectral slope. The stimulus sets of each audio descriptor were synthesized and (wherever possible) independently controlled through appropriate synthesis techniques. Partition scaling methods were used in both experiments, and the scales were constructed by fitting well-behaved functions to the listeners' ratings. In the first experiment, the listeners' task was the estimation of the relative differences between successive levels of a particular audio descriptor. The median values of listeners' ratings increased with increasing feature values, which confirmed listeners' abilities to estimate intervals. However, there was a large variability in the reliability of the derived interval scales depending on the stimulus spacing in each trial. In the second experiment, listeners had control over the stimulus values and were asked to divide the presented range of values into perceptually equal intervals, which provides a ratio scale. For every descriptor, the reliability of the derived ratio scales was excellent. The unit of a particular ratio scale was assigned empirically so as to facilitate qualitative comparisons between the scales of all audio descriptors. The construction of psychophysical scales based on univariate stimuli allowed for the establishment of cause-and-effect relations between audio descriptors and perceptual dimensions, contrary to past research that has relied on multivariate stimuli and has only examined the correlations between the two. Most importantly, this study provides an understanding of the ways in which the sensation magnitudes of several audio descriptors are apprehended.
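Of the descriptors scaled in this study, the spectral centroid has the simplest closed form: the amplitude-weighted mean frequency of the magnitude spectrum. A minimal single-frame implementation in base R (the function name and framing choices are illustrative, not the authors'):

```r
# Spectral centroid of one analysis frame: amplitude-weighted mean frequency.
# 'x' is a mono signal vector; 'sr' is the sampling rate in Hz.
spectral_centroid <- function(x, sr) {
  n    <- length(x)
  mag  <- Mod(fft(x))[1:(n %/% 2)]        # magnitude spectrum, positive frequencies
  freq <- (0:(n %/% 2 - 1)) * sr / n      # frequency of each bin in Hz
  sum(freq * mag) / sum(mag)
}

# Example: a 440 Hz sinusoid aligned to an FFT bin has its centroid at 440 Hz
spectral_centroid(sin(2 * pi * 440 * (0:4409) / 44100), 44100)
```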