Figure 1. Characteristics of happy, sad, activating and relaxing music (uploaded by Amílcar Cardoso)

Source publication
Article
We are developing a computational system that produces music expressing desired emotions. This paper is focused on the automatic transformation of 2 emotional dimensions of music (valence and arousal) by changing musical features: tempo, pitch register, musical scales, instruments and articulation. Transformation is supported by 2 regression models...

Context in source publication

Context 1
We are developing a computational system that produces music expressing desired emotions. This paper is focused on the automatic transformation of 2 emotional dimensions of music (valence and arousal) by changing musical features: tempo, pitch register, musical scales, instruments and articulation. Transformation is supported by 2 regression models, each with weighted mappings between an emotional dimension and music features. We also present 2 algorithms used to sequence segments. We made an experiment with 37 listeners who were asked to label online 2 emotional dimensions of 132 musical segments. Data from this experiment was used to test the effectiveness of the transformation algorithms and to update the weights of the features of the regression models. Tempo and pitch register proved to be relevant for both valence and arousal. Musical scales and instruments were also relevant for both emotional dimensions, but with a lower impact. Staccato articulation influenced only valence.

The automatic production of music that expresses desired emotions is a problem with ample room for improvement. The importance of developing systems with such a capability is evident to society: every context with a particular emotional need can use systems of this kind to accomplish its objectives. However, only recently has there been significant progress in this area. Scientists have tried to quantify and explain how music expresses certain emotions [3, 4, 11]. Engineers have developed systems capable of producing music conveying specific emotions [7, 16, 17] by using the knowledge acquired by scientists. We are developing a computational system to produce music expressing desired emotions (section 3), grounded in research from Music Psychology and Music Computing (section 2). In this work we focus on the transformation of music and on improving the weights of the features of the regression models used to control the emotional content of music; we also present sequencing algorithms (section 4). We made an experiment with 37 listeners who emotionally labeled 132 musical segments: 63 of transformed and 69 of non-transformed music (section 5). The analysis of the data obtained from this experiment and the update of the regression models are presented in section 6. Section 7 makes some final remarks.

Our work involves research done in Music Psychology and Music Computing. Understanding the influence of musical features on emotional states has contributed to bridging the semantic gap between emotions and music. We are interested in the effect of structural and performance features on the experienced emotion [10]. We analyzed several works [2, 3, 4, 6, 7, 11, 15] and systematized the characteristics relevant to this work that are common to four types of music: happy, sad, activating and relaxing (Figure 1).

Winter [17] built a real-time application to control structural factors of a composition. Pre-composed scores were manipulated through the application of rules with control values for different features: mode, instrumentation, rhythm and harmony. REMUPP [16] is a system that allows real-time manipulation of features like tonality, mode, tempo and instrumentation. Pre-composed music is given to a music player, and specific music features are used to control the sequencer (e.g., tempo), to apply filters and effects (e.g., rhythmic complexity) and to control synthesizers (e.g., instrumentation). Livingstone and Brown [7] implemented a rule-based system to affect perceived emotions by modifying the musical structure.
This system is grounded in a list of performance and structural features and their emotional effects. The KTH rule-based system for music performance [2] relates performance features to emotional expression and is grounded in studies of music psychology.

The work presented in this paper is part of a project that intends to develop a system that produces music expressing a desired emotion. This objective is accomplished in 3 main stages: segmentation, selection and transformation; and 3 secondary stages: feature extraction, sequencing and synthesis. We use 2 auxiliary structures: a music base and a knowledge base. The music base holds pre-composed MIDI music tagged with music features. The knowledge base is implemented as 2 regression models that consist of relations between each emotional dimension and music features. Aided by Figure 2, we describe each of these stages in more detail.

Pre-composed music from the music base is input to a segmentation module that produces fragments. As much as possible, these fragments should make musical sense on their own and express a single emotion. Segmentation is a process of discovering fragments: starting from the beginning of the piece, it looks for the note onsets with the highest weights. An adaptation of LBDM [1] is used to attribute these weights according to the importance and degree of variation of five features: pitch, rhythm, silence, loudness and instrumentation. The resulting fragments are input to the feature extraction module, which obtains the music features used to label these fragments; the labeled fragments are then stored in the music base.

Selection and transformation are supported by the same knowledge base. The selection module aims to obtain musical pieces with an emotional content similar to the desired emotion. These pieces are retrieved from the music base according to similarity metrics between the desired emotion and the music's emotional content. This emotional content is calculated through a weighted sum of the music features, using a vector of weights defined in the knowledge base for each emotional dimension. Selected pieces can then be transformed to come even closer to the desired emotion. Transformation is applied to 6 features (section 4). The knowledge base has weights that control the degree of transformation for each feature.

Pieces produced by the transformation module are ordered by the sequencing module. This module changes musical features with the objective of obtaining a smooth sequence of segments with similar emotional content. The sequence is given to a synthesis module, which uses information about the MIDI instruments and timbral features to guide the selection of sounds from a sound library.

This section presents the methods used to transform music, sequence music and improve the regression models. The music transformation algorithms aim to bring the emotional content of the selected music closer to the desired emotion. Drawing on the characteristics common to different types of music (section 2), we developed six algorithms that transform different features: tempo, pitch register, musical scale, instruments, articulation and contrast of note durations. These algorithms start by calculating the emotional distance between the emotional content of the selected music and the desired emotion. The value of this distance is divided by the weight of the feature being transformed.
The value that results from this division corresponds to the amount by which the feature must be increased or decreased to bring the music's emotional content closer to the desired emotion. The next paragraphs explain how each algorithm applies this increase/decrease to the MIDI file. The algorithm that transforms tempo obtains the original tempo of the music (in beats per minute) and then scales the note onsets and note durations accordingly. The algorithm that transforms pitch register transposes the music up/down by a specific number of octaves to increase/decrease valence/arousal. We chose octaves because they are the most consonant intervallic transformation [14] with an audible repercussion in the frequency spectrum. This is done by adding positive/negative multiples of 12 to the pitch of every note. The algorithm that transforms musical scales ...
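To make this transformation step concrete, here is a minimal Python sketch of the pitch-register algorithm as described in the excerpt: the distance along one emotional dimension is divided by the feature's weight from the regression model, and the resulting (rounded) number of octaves is applied as multiples of 12 to every MIDI pitch. The note representation, the example weight value and the rounding to whole octaves are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the transformation step: distance / feature weight
# gives the amount of change, here interpreted as a number of octaves.
def octave_shift_amount(current_value, desired_value, feature_weight):
    """Emotional distance along one dimension divided by the feature's weight,
    rounded to a whole number of octaves (rounding policy is an assumption)."""
    distance = desired_value - current_value
    return round(distance / feature_weight)

def transpose_octaves(notes, n_octaves):
    """Add positive/negative multiples of 12 to every MIDI pitch,
    clamped to the valid MIDI range 0-127."""
    shifted = []
    for note in notes:
        pitch = note["pitch"] + 12 * n_octaves
        shifted.append({**note, "pitch": max(0, min(127, pitch))})
    return shifted

# Example: raise arousal from 0.2 to 0.6 with an assumed pitch-register weight of 0.25.
segment = [{"pitch": 60, "onset": 0.0, "duration": 0.5},
           {"pitch": 64, "onset": 0.5, "duration": 0.5}]
octaves = octave_shift_amount(0.2, 0.6, 0.25)   # (0.6 - 0.2) / 0.25 = 1.6 -> 2 octaves up
segment = transpose_octaves(segment, octaves)
```

The tempo algorithm would follow the same pattern, scaling note onsets and durations by the required tempo ratio instead of shifting pitches.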

Similar publications

Research
Computer-composed music is becoming a key instrument for measuring the overall capabilities of software and is potentially hard to distinguish from human-composed music. ANTON 2.0 is such a computer system, able to generate music using mathematical formulas that take melody, harmony and rhythm into account. The focus of this research li...

Citations

... The subjective component of music evaluation was a perceptual self-assessment by each user, that is, a rating (on a 10-point scale) of their own valence and arousal (emotional) [6] response to each musical piece. For valence, the extremes ranged from negative to positive; for arousal, from calm to excited. ...
Conference Paper
Computer-based music generation (synthesis) has a rich history spanning several decades [1]. Current music evolution methods are interactive (periodic user evaluation drives evolutionary selection) or feature-based, where specific musical feature metrics are incorporated into the fitness function used to evaluate and select synthesized music [2]. In the former case, various musical styles and compositions have been evolved to suit user preferences, though the diversity and complexity of evolved compositions are limited by user fatigue and by the fitness function (for example, which musical features the user evaluates). In the latter case, the diversity and complexity of evolved music are similarly limited by the fitness function metrics. Thus, metrics conforming to specific musical styles or genres will only result in the artificial evolution of musical compositions that resemble such styles or genres [1], [2].
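As a rough illustration of the feature-based fitness idea mentioned in this abstract, the sketch below scores a candidate piece by how close a few extracted feature metrics are to a target profile; the feature names, target values and weights are invented for illustration and are not taken from the cited work.

```python
# Hypothetical feature-based fitness function for evolutionary music generation:
# candidates whose extracted feature metrics are closer to a target profile score
# higher. Feature names, targets and weights are illustrative assumptions.
def fitness(candidate_features, target_features, weights):
    """Weighted negative distance between candidate and target feature values."""
    error = 0.0
    for name, target in target_features.items():
        error += weights.get(name, 1.0) * abs(candidate_features.get(name, 0.0) - target)
    return -error  # higher (closer to 0) is better

candidate = {"tempo_bpm": 132.0, "pitch_range": 19.0}
target = {"tempo_bpm": 120.0, "pitch_range": 24.0}
print(fitness(candidate, target, {"tempo_bpm": 0.5, "pitch_range": 1.0}))  # -11.0
```

As the abstract notes, metrics of this kind constrain the evolved music to whatever styles the chosen features describe.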
... Oliveira and Cardoso [13] also perform affective transformations on MIDI music, and utilize the valence-arousal approach to affective specification. These are to be mapped on to musical features: tempo, pitch register, musical scales, and instrumentation. ...
Conference Paper
We propose to significantly extend our work in EEG-based emotion detection for automated expressive performances of algorithmically composed music for affective communication and induction. This new system involves music composed and expressively performed in real time to induce specific affective states, based on the detection of the affective state of a human listener. Machine learning algorithms will learn: (1) how to use biosensors such as EEG to detect the user's current emotional state; and (2) how to use algorithmic performance and composition to induce certain trajectories through affective states. In other words, the system will attempt to adapt so that it can, in real time, turn a certain user from depressed to happy, or from stressed to relaxed, or (if they like horror movies!) from relaxed to fearful. Expressive performance is key to this process, as it has been shown to increase the emotional impact of affectively-based algorithmic composition. In other words, if a piece is composed by computer rules to communicate an emotion of happiness, applying expressive performance rules to humanize the piece will increase the likelihood that it is perceived as happy. As well as giving a project overview, a first step of this research is presented here: a machine learning system using case-based reasoning that attempts to learn from a user how themes of different affective types combine sequentially to communicate emotions.
... Oliveira and Cardoso [13] also perform affective transformations on MIDI music, and utilize the valence-arousal approach to affective specification. These are to be mapped on to musical features: tempo, pitch register, musical scales, and instrumentation. ...
Article
We propose to significantly extend our work in EEG-based emotion detection for computational affective expressive performances of algorithmically composed music. This new system involves affective music composed and expressively performed in real time, based on the detection of the affective state of a human listener. Machine learning algorithms will learn: (1) how to use EEG and other biosensors to detect the user's current emotional state; and (2) how to use algorithmic performance and composition to induce certain affective trajectories. In other words, the system will attempt to adapt so that it can, in real time, turn a certain user from depressed to happy, or from stressed to relaxed, or (if they like horror movies!) from relaxed to fearful. Expressive performance is key to this process, as it has been shown to increase the emotional impact of affectively-based algorithmic composition. In other words, if a piece is composed by computer rules to communicate an emotion of happiness, applying expressive performance rules to humanize the piece will increase the likelihood that it is perceived as happy. As well as giving a project overview, a first step of this research is presented here: a machine learning system using case-based reasoning that attempts to learn from a user which particular expressively performed musical features communicate which affective states. It is also designed to learn how themes of different affective types combine sequentially to communicate emotions.
... Positive/negative weights were defined for each feature according to their influence on the emotional dimensions (Fig. 1). Then, three experiments [31][32][33][35] were conducted to build regression models and to successively refine their set of features and corresponding weights with data obtained from web-based questionnaires. The third experiment also aimed to verify the effectiveness of the regression models in supporting the transformation module. ...
... The third experiment [35] was devoted to verifying the effectiveness of the knowledge base in supporting the transformation algorithms and to a subsequent update of the regression models. The test involved 132 pieces, 37 listeners and 337 features. ...
... We applied a training/test split (66%/34%) and 10-fold cross-validation to evaluate the performance of several classifiers with their default parameters [55]. We used data from the first three experiments [31, 32, 35] and considered three metrics: correlation coefficient, mean absolute error and root mean square error. The classification of valence and arousal (Figs. 10 and 11) considered the best features of each experiment (Tables 1, 2 and 4). ...
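For readers who want to reproduce this kind of evaluation protocol, the sketch below shows one possible setup (scikit-learn is used here only as an assumed tool): a 66%/34% train/test split and 10-fold cross-validation, scored with the three metrics named in the excerpt. The regressor and the feature/target arrays are placeholders, not the study's data or models.

```python
# Illustrative evaluation protocol: 66%/34% split plus 10-fold cross-validation,
# scored with correlation coefficient, MAE and RMSE. Data and model are placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(132, 5))   # e.g. one row of musical features per segment
y = rng.normal(size=132)        # e.g. valence (or arousal) labels

# 66%/34% train/test split
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.34, random_state=0)
pred_split = LinearRegression().fit(X_tr, y_tr).predict(X_te)

# 10-fold cross-validated predictions over the whole set
pred_cv = cross_val_predict(LinearRegression(), X, y, cv=10)

for name, truth, pred in [("66/34 split", y_te, pred_split), ("10-fold CV", y, pred_cv)]:
    r = np.corrcoef(truth, pred)[0, 1]
    mae = mean_absolute_error(truth, pred)
    rmse = mean_squared_error(truth, pred) ** 0.5
    print(f"{name}: r={r:.2f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```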
Article
The automatic control of emotional expression in music is a challenge that is far from being solved. This paper describes research conducted with the aim of developing a system with such capabilities. The system works with standard MIDI files and operates in two stages: the first offline, the second online. In the first stage, MIDI files are partitioned into segments with uniform emotional content. These are subjected to a process of feature extraction, then classified according to emotional values of valence and arousal and stored in a music base. In the second stage, segments are selected and transformed according to the desired emotion and then arranged in song-like structures. The system uses a knowledge base, grounded in empirical results from works of Music Psychology, which was refined with data obtained from questionnaires; we also plan to use data obtained with other methods of emotion recognition in the near future. For the experimental setups, we prepared web-based questionnaires with musical segments of different emotional content. Each subject classified each segment after listening to it, with values for valence and arousal. The modularity, adaptability and flexibility of our system's architecture make it applicable in various contexts such as video games, theater, films and healthcare.
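To picture the offline/online split described in this abstract, here is a minimal assumed sketch of the online selection step: segments labeled with valence and arousal during the offline stage are retrieved by their distance to the desired emotion. The record fields and the Euclidean similarity metric are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical music-base record and nearest-emotion selection, illustrating the
# offline (label and store) / online (select by desired emotion) stages described
# above. Field names and the Euclidean metric are assumptions.
from dataclasses import dataclass

@dataclass
class Segment:
    midi_path: str   # pre-composed MIDI fragment stored during the offline stage
    valence: float   # emotional labels attached after feature extraction
    arousal: float

def select(music_base, desired_valence, desired_arousal, k=3):
    """Return the k stored segments whose emotional labels are closest to the target."""
    def distance(seg):
        return ((seg.valence - desired_valence) ** 2 +
                (seg.arousal - desired_arousal) ** 2) ** 0.5
    return sorted(music_base, key=distance)[:k]

music_base = [Segment("a.mid", 0.7, 0.4), Segment("b.mid", -0.3, 0.8), Segment("c.mid", 0.6, 0.6)]
chosen = select(music_base, desired_valence=0.8, desired_arousal=0.5, k=1)  # -> a.mid
```

The selected segments would then feed the transformation and sequencing steps described in the abstract.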
... As the score plays a central role in Western music in determining musical emotion (Thompson and Robitaille 1992; Gabrielsson and Lindstrom 2001), this focus on performance limits the utility of these two systems for research on emotion. More recently, Oliveira and Cardoso (2009) began investigating modifications to the score. ...
Article
Composers and performers communicate emotional intentions through the control of basic musical features such as pitch, loudness, and articulation. The extent to which emotion can be controlled by software through the systematic manipulation of these features has not been fully examined. To address this, we present CMERS, a Computational Music Emotion Rule System for the real-time control of musical emotion that modifies features at both the score level and the performance level. In Experiment 1, 20 participants continuously rated the perceived emotion of works, each modified to express happy, sad, angry, tender, and normal. Intended emotion was identified correctly at 78%, with valence and arousal significantly shifted regardless of the works' original emotions. Existing systems developed for expressive performance, such as Director Musices (DM), focus on modifying features of performance. To study emotion more broadly, CMERS modifies features of both score and performance. In Experiment 2, 18 participants rated music works modified by CMERS and DM to express five emotions. CMERS's intended emotion was correctly identified at 71%, DM's at 49%. CMERS achieved significant shifts in valence and arousal, DM in arousal only. These results suggest that features of the score are important for controlling valence. The effects of musical training on emotional identification accuracy are also discussed.