Article

A Generative Theory of Tonal Music

... Music is a high-level cognitive capacity that exists universally across human cultures. Similar to the structural underpinnings of language (e.g., linguistic syntax), music has an abstract, rhythmic-harmonic structure that is thought to convey socioaffective meaning (Lerdahl and Jackendoff, 1983; Patel, 2010). At a fundamental level of organization, musical events (e.g., notes, chords) are arranged, accented, and sustained across time through the organizing principles of rhythm and meter. ...
... Musical rhythm refers to patterns of stress and timing of individual acoustic events, while musical meter reflects the organization of musical events on multiple, hierarchically nested timescales. At the principal level in the hierarchical organization, music has a basic beat, the tactus, often described as an underlying pulse of a musical work (Cooper and Meyer, 1963; Lerdahl and Jackendoff, 1983; London, 2004; Large et al., 2015). The beat, while not necessarily the slowest or fastest rhythmic component of a musical work, is often the most perceptually salient level of metrical organization, the level at which listeners and dancers behaviorally entrain to music, such as tapping their feet or nodding their heads. ...
... Additionally, musical meter reflects alternating patterns of strong and weak beats, such that some beats are physically or perceptually accented relative to others (Cooper and Meyer, 1963; Lerdahl and Jackendoff, 1983; London, 2004; Large et al., 2015). Finally, music also contains faster events (London, 2004; Large et al., 2015), collectively called "rhythmic patterns" or "rhythmic groups," that reflect the relative durations between auditory events and, importantly, the perceptual grouping of these events (Cooper and Meyer, 1963). ...
Article
Full-text available
Musical rhythm abilities—the perception of and coordinated action to the rhythmic structure of music—undergo remarkable change over human development. In the current paper, we introduce a theoretical framework for modeling the development of musical rhythm. The framework, based on Neural Resonance Theory (NRT), explains rhythm development in terms of resonance and attunement, which are formalized using a general theory that includes non-linear resonance and Hebbian plasticity. First, we review the developmental literature on musical rhythm, highlighting several developmental processes related to rhythm perception and action. Next, we offer an exposition of Neural Resonance Theory and argue that elements of the theory are consistent with dynamical, radically embodied (i.e., non-representational) and ecological approaches to cognition and development. We then discuss how dynamical models, implemented as self-organizing networks of neural oscillations with Hebbian plasticity, predict key features of music development. We conclude by illustrating how the notions of dynamical embodiment, resonance, and attunement provide a conceptual language for characterizing musical rhythm development, and, when formalized in physiologically informed dynamical models, provide a theoretical framework for generating testable empirical predictions about musical rhythm development, such as the kinds of native and non-native rhythmic structures infants and children can learn, steady-state evoked potentials to native and non-native musical rhythms, and the effects of short-term (e.g., infant bouncing, infant music classes), long-term (e.g., perceptual narrowing to musical rhythm), and very-long term (e.g., music enculturation, musical training) learning on music perception-action.
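The resonance dynamics invoked in this abstract can be illustrated with a toy model (not from the paper; the frequencies, coupling strength, and integration settings are invented for the example): a single phase oscillator with its own natural frequency entrains to a periodic stimulus when the coupling is strong enough, settling at a constant phase lag.

```python
import math

def entrain(stim_freq=2.0, osc_freq=2.2, coupling=1.5, dt=0.001, t_end=20.0):
    """Euler-integrate a phase oscillator driven by a periodic stimulus:

        dphi/dt = 2*pi*osc_freq + coupling * sin(stim_phase - phi)

    Returns the final stimulus-oscillator phase difference in radians,
    which settles near a constant value once the oscillator locks.
    """
    phi, t = 0.0, 0.0
    while t < t_end:
        stim_phase = 2 * math.pi * stim_freq * t
        phi += dt * (2 * math.pi * osc_freq + coupling * math.sin(stim_phase - phi))
        t += dt
    # wrap the final phase difference into (-pi, pi]
    return (2 * math.pi * stim_freq * t - phi + math.pi) % (2 * math.pi) - math.pi
```

At the stable fixed point the lag satisfies sin(psi) = 2*pi*(stim_freq - osc_freq)/coupling, so with these illustrative values the faster oscillator locks roughly 0.99 rad ahead of the stimulus; with coupling below 2*pi*|stim_freq - osc_freq| it would drift instead of locking.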
... [2.2] Our work, moreover, is less directly related to generative linguistics than authors like Lerdahl and Jackendoff (1983) or Marsden (2005, 2010). We do, however, incorporate a concept from computational linguistics: skipgrams, or patterns among nonadjacent elements. ...
... The features listed above were selected for transparency of operationalization. The features bear a significant resemblance to Lerdahl and Jackendoff's (1983) time-span reduction preference rules (TSRPRs). In particular, features 1 and 2 above are related to TSRPRs 1 and 3. Lerdahl and Jackendoff also draw on Gestalt principles and pitch proximity, to which our contiguity feature corresponds. ...
... [7.4] Despite our reservations about Lerdahl and Jackendoff's (1983) formalization, the features we identify correspond to some of their time-span reduction preference rules. The primary difference is that the method presented here does not require segmentation before applying a reduction method. ...
Article
Full-text available
Onset (metric position) and contiguity (pitch adjacency and time proximity) are two melodic features that contribute to the salience of individual notes (core tones) in a monophonic voice or polyphonic texture. Our approach to reductions prioritizes contextual features like onset and contiguity. By awarding points to notes with such features, our process selects core tones from melodic surfaces to produce a reduction. Through this reduction, a new form of musical pattern discovery is possible that has similarities to Gjerdingen's (2007) galant schemata. Recurring n-grams (scale-degree skeletons) are matched in an algorithmic approach that we have tested manually (with a printed score and pen and paper) and implemented computationally (with symbolic data and scripted algorithms in MATLAB). A relatively simple method successfully identifies the location of all statements of the subject in Bach's Fugue in C Minor (BWV 847) identified by Bruhn (1993) and the location of all instances of the Prinner and Meyer schemata in Mozart's Sonata in C Major (K. 545/i) identified by Gjerdingen (2007). We also apply the method to an excerpt by Kirnberger analyzed in Rabinovitch (2019). Analysts may use this flexible method for pattern discovery in reduced textures through software freely accessible at https://www.atavizm.org.
While our case studies in the present article are from eighteenth-century European music, we believe our approach to reduction and pattern discovery is extensible to a variety of musics.
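The point-awarding reduction described above can be sketched in miniature (a rough illustration, not the authors' MATLAB implementation; the note encoding, feature weights, and the keep-half heuristic are all invented for the example): each note earns points for metric strength and for pitch contiguity with its neighbours, the top scorers become core tones, and recurring scale-degree n-grams are then collected from the reduction.

```python
def reduce_and_find_ngrams(notes, n=3, keep_ratio=0.5):
    """notes: list of (scale_degree, beat_strength, pitch) tuples.

    Score each note: its beat strength plus a contiguity bonus for each
    neighbour within a step (<= 2 semitones).  Keep the highest-scoring
    half as 'core tones', then collect the scale-degree n-grams that
    recur in the reduction.
    """
    scores = []
    for i, (deg, strength, pitch) in enumerate(notes):
        s = strength
        for j in (i - 1, i + 1):
            if 0 <= j < len(notes) and abs(notes[j][2] - pitch) <= 2:
                s += 1
        scores.append(s)
    k = max(1, int(len(notes) * keep_ratio))
    keep = sorted(sorted(range(len(notes)), key=lambda i: -scores[i])[:k])
    core = [notes[i][0] for i in keep]
    grams = {}
    for i in range(len(core) - n + 1):
        g = tuple(core[i:i + n])
        grams[g] = grams.get(g, 0) + 1
    return core, {g: c for g, c in grams.items() if c > 1}
```

Running this on a melody that states the same figure twice yields a reduced scale-degree skeleton in which the repeated 3-grams surface with count 2.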
... The regularity and repetition of the subjective accentuation of beats (strong and weak) would allow the listener to perceive the metrical structure of the music's rhythm (Cooper & Meyer, 1960; Lerdahl & Jackendoff, 1983). It is this regularity of beat accentuation that would organize the elements of this rhythmic structure relative to one another. ...
... According to Lerdahl & Jackendoff's (1983) metrical theory, the beat is a perceptual construct that does not always have a physical reality. Indeed, the perception of beat and meter rests on an interpretation that is influenced by external stimuli but also by endogenous factors (Iversen, Repp, & Patel, 2009). ...
... Consequently, the listener would perceive beats in a complex rhythmic sequence whose periodicity and regularity are extracted both from the physical and temporal characteristics of the acoustic events and from the listener's subjective perception of those characteristics (Lerdahl & Jackendoff, 1983; Palmer & Krumhansl, 1990; London, 2012). ...
Thesis
In musical interventions with people with Alzheimer's disease or related disorders, participants are frequently asked to move to the rhythm of the music. Synchronizing with a musical rhythm, particularly in a group, involves responses at several levels (motor, rhythmic, social, and emotional) and may provide pleasure as well as strengthen the social bonds of patients and those around them. However, little is known in Alzheimer's disease about synchronization to musical rhythm and about the links that may exist between these different levels of response to this activity. The aim of this thesis is to examine the different aspects of the behavior of people with Alzheimer's disease (or related disorders) and of participants aging physiologically ('normally') during a task of synchronization to a musical rhythm performed in joint action with a musician. The approach adopted in this work is based on a multidisciplinary method drawing on movement science, social psychology, and neuropsychology. First, we studied the effect of the social context and of the music (and its temporal characteristics) on synchronization performance and on the social, emotional, rhythmic, and motor engagement of people with Alzheimer's disease in this activity (Study 1, Chapters 4 and 5). The results showed that the physical presence of a singer performing the synchronization task with the participant modulated synchronization performance and the quality of the social and emotional relationship differently than an audio-visual recording of the same singer. This effect of social context was stronger in response to music than to a metronome and was modulated by tempo and meter.
Moreover, we found that music increased participants' rhythmic engagement compared with a metronome. We then compared responses to the synchronization task in pathological and physiological aging (Study 2, Chapters 6 and 7). The results revealed that synchronization performance did not differ between the two groups, suggesting that audio-motor coupling is preserved in Alzheimer's disease for this task. Although the disease reduced motor, social, and emotional engagement in response to music compared with physiological aging, an effect of social context on behavior was observed in both groups. Finally, we compared the groups of participants with Alzheimer's disease across the two studies, showing that disease severity can impair synchronization and engagement in the activity (Chapter 8). In conclusion, this thesis showed that audio-motor coupling is partly preserved in people with Alzheimer's disease and that joint action with a partner modulates the quality of the social relationship as well as engagement with the music. The theoretical knowledge gained through this work improves our understanding of how behavior in response to music evolves in Alzheimer's disease. The method developed in this thesis thus offers an opportunity to assess, at several levels, the therapeutic benefits of musical interventions on the behavior of people with Alzheimer's disease. Such prospects would improve the care of these people and their caregivers.
... Note that transpositions are only applied to the left child, because Western classical music is thought to be fundamentally goal directed [2,8,62]. This means that the character of a section is largely determined by how it ends (the right child), which should also be reflected in the value of the parent node. ...
... with non-terminal variables x^(1), x^(2), ... ∈ X and terminal variables y^(1), y^(2), ... ∈ Y, we introduce new non-terminal variables x_{y^(1)}, x_{y^(2)}, ... ...
Preprint
Full-text available
Probabilistic context-free grammars (PCFGs) and dynamic Bayesian networks (DBNs) are widely used sequence models with complementary strengths and limitations. While PCFGs allow for nested hierarchical dependencies (tree structures), their latent variables (non-terminal symbols) have to be discrete. In contrast, DBNs allow for continuous latent variables, but the dependencies are strictly sequential (chain structure). Therefore, neither can be applied if the latent variables are assumed to be continuous and also to have a nested hierarchical dependency structure. In this paper, we present Recursive Bayesian Networks (RBNs), which generalise and unify PCFGs and DBNs, combining their strengths and containing both as special cases. RBNs define a joint distribution over tree-structured Bayesian networks with discrete or continuous latent variables. The main challenge lies in performing joint inference over the exponential number of possible structures and the continuous variables. We provide two solutions: 1) For arbitrary RBNs, we generalise inside and outside probabilities from PCFGs to the mixed discrete-continuous case, which allows for maximum posterior estimates of the continuous latent variables via gradient descent, while marginalising over network structures. 2) For Gaussian RBNs, we additionally derive an analytic approximation, allowing for robust parameter optimisation and Bayesian inference. The capacity and diverse applications of RBNs are illustrated on two examples: In a quantitative evaluation on synthetic data, we demonstrate and discuss the advantage of RBNs for segmentation and tree induction from noisy sequences, compared to change point detection and hierarchical clustering. In an application to musical data, we approach the unsolved problem of hierarchical music analysis from the raw note level and compare our results to expert annotations.
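For intuition about the discrete special case that RBNs generalise, the inside probabilities mentioned in this abstract can be computed for an ordinary PCFG with the standard CYK-style recursion; a minimal sketch (the toy grammar in the usage example is invented, not from the paper):

```python
from collections import defaultdict

def inside_probability(sequence, binary, lexical, start="S"):
    """Inside algorithm for a PCFG in Chomsky normal form.

    binary:  {(A, B, C): P(A -> B C)}
    lexical: {(A, w): P(A -> w)}
    Returns P(sequence | start symbol), summing over all parse trees.
    """
    n = len(sequence)
    # inside[(i, j)][A] = probability that A derives sequence[i:j]
    inside = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(sequence):
        for (A, word), p in lexical.items():
            if word == w:
                inside[(i, i + 1)][A] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the children
                for (A, B, C), p in binary.items():
                    inside[(i, j)][A] += p * inside[(i, k)][B] * inside[(k, j)][C]
    return inside[(0, n)][start]
```

For the grammar S -> S S (0.3) | a (0.7), a three-symbol sequence has two binary parse trees, so the inside probability is 2 * 0.3^2 * 0.7^3; RBNs replace the discrete sum over non-terminals with inference over continuous latent variables.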
... Perceptually, this happens in a frequency range below human hearing and has been considered an "interaction between meter and grouping" (Clarke, 1999). Recent research by Rohrmeier (2020) has advanced the formalization of musical rhythm since the seminal work of Lerdahl and Jackendoff (1996). Regarding the task of automatic rhythm estimation, Weihs et al. (2019) split musical rhythm into five components: beat, tempo, meter, timing, and grouping. ...
... The quantization level is an inaudible isochronous sequence of time-points, an equidistant time-span (Lerdahl & Jackendoff, 1996) or period, commonly referred to as pulses (Snyder, 2001), which provides a framework against which durations and patterns are heard. The quantization resolution should accommodate the fastest rhythmic gesture, or at least the tactus (Lerdahl & Jackendoff, 1996) or pulse salience (Parncutt, 1994), i.e., the metrical level at which a listener naturally taps a foot and where the perception of regularities is strongest. ...
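The quantization grid described in this excerpt amounts to snapping event onsets to the nearest time-point of an isochronous pulse train; a minimal sketch (the period, phase, and onset values in the usage example are illustrative assumptions):

```python
def quantize(onsets, period, phase=0.0):
    """Snap onset times (in seconds) to the nearest point of an
    isochronous grid with the given period and phase offset,
    returning (grid_index, quantized_time) pairs."""
    out = []
    for t in onsets:
        idx = round((t - phase) / period)  # nearest pulse index
        out.append((idx, phase + idx * period))
    return out
```

For instance, with a 0.5 s pulse (120 bpm tactus), slightly early or late onsets such as 0.02, 0.51, 0.98, and 1.6 s all snap to the grid points 0.0, 0.5, 1.0, and 1.5 s.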
Chapter
In this paper, we review computational methods for the representation and similarity computation of musical rhythms in both symbolic and sub-symbolic (e.g., audio) domains. Both tasks are fundamental to multiple application scenarios, from indexing, browsing, and retrieving music to navigating musical archives at scale. From the literature review, we identified three main rhythmic representations: string (a sequence of alpha-numeric symbols denoting the temporal organization of events), geometric (a spatio-temporal pictorial representation of events), and feature lists (a transformation of audio into a temporal series of features or descriptors), together with two categories of distance metrics for similarity computation: feature-based and transformation-based. Furthermore, we address the gap between explicit (symbolic) and implicit (musical audio) rhythmic representations, stressing that greater interaction across modalities would promote a holistic view of temporal musical phenomena. We conclude the article by unveiling avenues for future work on (1) hierarchical, (2) multi-attribute, and (3) rhythmic layering models grounded in methodologies across disciplines, such as perception, cognition, mathematics, signal processing, and music.
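As a small illustration of the string representation paired with a transformation-based metric (the box-notation symbols and the two example patterns are assumptions for the sketch, not taken from the chapter): encode each rhythm as a binary onset string over a 16-pulse cycle and compare two patterns with Levenshtein edit distance.

```python
def onsets_to_string(onsets, cycle=16):
    """Box notation: 'x' at pulses carrying an onset, '.' elsewhere."""
    s = set(onsets)
    return "".join("x" if i in s else "." for i in range(cycle))

def edit_distance(a, b):
    """Levenshtein distance between two rhythm strings (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[-1] + 1,          # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]
```

With the commonly cited pulse positions, the 3-2 son clave [0, 3, 6, 10, 12] and the rumba clave [0, 3, 7, 10, 12] differ by an edit distance of 2, reflecting one onset shifted by a single pulse.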
... Table 1: Types of accents (Lerdahl and Jackendoff, 1983). ...
... The general principles of grouping have been described consistently by Gestalt psychology and later applied to audition (Bregman, 1990; Lerdahl and Jackendoff, 1983). With regard to temporal structures, there is abundant evidence confirming the importance of grouping as a facilitator of musical rhythmic processing (Drake, 1998; Drake and Bertrand, 2001; Purwins et al., 2008; Ravignani et al., 2017). ...
Article
Full-text available
To make the most of the limited time allotted to music class and to respond to students' musical learning needs, one path is to consider theories and research findings on cognitive processing. Regarding rhythmic-musical training, it is known to be fundamental from the first years of life, since rhythm is an integral component of human development and learning. This paper presents a theoretical overview of the cognitive processing of rhythmic-musical information and proposes possible paths for achieving perceptual-productive rhythmic learning by primary school students. The literature reviewed indicates, on the one hand, that there are fundamental elements for rhythmic-musical learning: pulse, specific tempo ranges, meter, accentuation (Álamos and Tejada, 2021), grouping, and rhythmic patterns (Álamos and Tejada, 2020a). On the other hand, there is a strong association between rhythm and movement from a perceptual-motor point of view (Álamos and Tejada, 2020b), and the use of elements related to verbal language would facilitate the acquisition of rhythmic skills. These findings lead to the conclusion that both rhythmic-corporal practice and the association between verbal language and rhythm are especially efficient mechanisms in the rhythmic training of primary school students. This theoretical evidence must be treated with caution, since it should be verified in the future through empirical work in the Chilean educational context.
... EVS was measured in each song at three points of measurement: notes occurring either at the beginning, middle, or end of a musical phrase. The phrases were interpretatively defined with reference to both the structure of the lyrics and grouping criteria based on temporal and pitch intervals (Lerdahl & Jackendoff, 1983). For each point of measurement, a small note-AOI was drawn. ...
... Over and above these effects of phrase structure, local interval sizes were not found to affect the EVS. In this view, previous results showing that span effects can be driven by large melodic intervals (Huovinen et al., 2018;Penttinen & Huovinen, 2011) might also best be understood in terms of phrase boundaries arising from salient melodic groupings due to pitch distance (Lerdahl & Jackendoff, 1983). ...
Article
Full-text available
In comparison with instrumental sight reading of musical notation, sight singing is typically characterized by the presence of lyrics. The purpose of this study was to explore how skilled sight singers divide their visual attention between written music and lyrics and how their eye-movement behavior is influenced by musical stimulus complexity. Fourteen competent musicians performed 10 newly composed songs in a restricted temporal condition (60 bpm). Eye movements and vocal performances were recorded and complemented with posttask complexity ratings and interviews. In the interviews, the singers emphasized the priority of focusing on the melody instead of the lyrics. Accordingly, eye-movement analyses indicated not only more total fixation time on music than lyrics but also longer fixation durations, longer durations of visits (i.e., sequences of fixations), and a larger number of fixations per visit on music than on lyrics. The singers also more typically arrived at a bar by glancing first at the music instead of lyrics. Generalized linear mixed-model analyses showed that the number of notes and accidentals in a bar influenced the fixation time and that pupil dilation was increased by a larger number of accidentals. Measurements of eye–voice span, that is, the temporal distance between fixating and singing a note, were best predicted by phrase structure and the note density of previous melodic material. According to the interviews, the best sight singers’ approaches were characterized by a flexibility of moving between different sight-singing strategies. The study offers a comprehensive overview regarding the bottom-up and top-down aspects affecting sight-singing performance.
... As early as the 1960s, music theorists used computational approaches to propose analytical notions, such as the Set Theory developed by M. Babbitt, G. Perle, A. Forte, and E. Carter [50]. An important step for computational music analysis was the 1983 publication of A Generative Theory of Tonal Music (GTTM), written jointly by the musicologist F. Lerdahl and the linguist R. Jackendoff [79], who were influenced by early twentieth-century Schenkerian analysis [120]. The authors propose 56 rules allowing not only the segmentation of tonal music in two ways (according to structure and according to the alternation of strong and weak beats) but also the reduction of a musical score according to these two segmentations. ...
... Ideally, one would take the granularity implied by the tactus [10] (which can be roughly considered the natural pulse of a musical work), but determining this level, which is not explicitly indicated on a musical score, is complicated and constitutes a MIR task in itself. This question is, moreover, central to formalizing Lerdahl and Jackendoff's theory [79]. ...
Thesis
This thesis falls within the field of computer music, and more specifically computational music analysis. Such studies aim to generate more or less high-level musical annotations on a score, in particular to understand its genesis, the compositional gesture, or its place within a composer's overall output. This thesis proposes new approaches based on modeling, algorithms, and machine learning to model tonality, a musical system that organizes and contextualizes notes hierarchically, as well as cadences, the closing processes of musical phrases. We thus aim to assist the analysis of works in sonata form. We present three corpora established during the thesis (Mendelssohn string quartets, Mozart string quartets, and examples of modulation) and discuss the steps and challenges of such work. We design an algorithm that estimates the key at every point of the score, using a new model of the tonal system to identify modulation points. At each beat it estimates three musical signals: the anchoring in the key, the compatibility of the notes with a given key, and the proximity between keys. The algorithm is evaluated on the Mozart and modulation corpora. We develop a cadence-detection algorithm based on extracting high-level features characteristic of the presence of a cadential arrival point in the musical score. We study the significance of each of these features, which are then used to train a learning algorithm that classifies each beat of the score as a cadential arrival point or not.
This algorithm is evaluated on a corpus of Bach fugues and Haydn string quartets and is adapted to the detection of a particular cadence significant for sonata form, the medial caesura. This thesis thus contributes to the computational modeling of high-level musical concepts such as tonality and musical form.
... Third, music shows evidence for complex design, including grammar-like structures analogous to those of language (Lerdahl & Jackendoff, 1983), some of which may be universal. Moreover, music perception is computationally complex, such that artificial intelligence is currently at pains to emulate it (Benetos, Dixon, Giannoulis, Kirchhoff, & Klapuri, 2013). ...
... These "building blocks" appear universally in music (Nettl, 2015), like the "building blocks" of language (e.g., Baker, 2001). They provide a grammar-like, combinatorially generative interface through which musical content can be created, improvised, and elaborated upon, through hierarchical organization of meter and tonality (Krumhansl, 2001; Lerdahl & Jackendoff, 1983), in fashions that themselves have universal signatures (Jacoby & McDermott, 2017). ...
Article
Savage et al. argue for musicality as having evolved for the overarching purpose of social bonding. By way of contrast, we highlight contemporary predictive processing models of human cognitive functioning in which the production and enjoyment of music follows directly from the principle of prediction error minimization.
Article
We propose that not social bonding, but rather a different mechanism underlies the development of musicality: being unable to survive alone. The evolutionary constraint of being dependent on other humans for survival provides the ultimate driving force for acquiring human faculties such as sociality and musicality, through mechanisms of learning and neural plasticity. This evolutionary mechanism maximizes adaptation to a dynamic environment.
... Exposure to a particular music system may also change brain structures and representations; indeed, a number of processing abilities have been argued not to depend on the extent of a listener's musical training but rather on becoming an "experienced listener" (Bigand & Poulin-Charronnat, 2006; Lerdahl & Jackendoff, 1983). For example, meter induction (Honing, 2012), formation of musical expectancies (Koelsch, Gunter, Friederici, & Schröger, 2000), and emotional responses to music (Bigand, Vieillard, Madurell, Marozeau, & Dacquet, 2005) follow similar patterns between musicians and untrained listeners. ...
... Our results are thus an example of the importance of context-specificity; only participants who self-reported that they were highly familiar with the 3-2 and 2-3 son clave patterns showed the requisite proficiency necessary to match the clave patterns with their appropriate musical context. Although being an "experienced listener" has advantages in some settings and provides some types of expertise (Bigand & Poulin-Charronnat, 2006; Lerdahl & Jackendoff, 1983), being able to access numerous previously heard salsa exemplars (cf. Creel, 2011) did not lead to better detection of the correct salsa-clave pairings. It was only through knowledge of context-specific vocabulary (whether gained from general music or dance training, genre-specific instrumental training, or a lifetime of informal listening exposure) that listeners gained the requisite advantages. ...
Article
Introduction Previous research has shown ways in which both formal training and informal exposure affect perceptual experience and the development of musical abilities. Here we asked what types of training and exposure are necessary to acquire the context-specific knowledge associated with expertise. We specifically focused on the perception of salsa music: a genre that is rich in rhythmic complexity, but has received relatively little attention in experimental settings. Methods We examined specific groups within the exposure and training populations: those with musical training in the production of salsa rhythms (Study 1) and “native listeners” who grew up listening to salsa music without formal training (Study 2). Using two clave patterns (3–2 and 2–3 son clave) and three constructed alternatives, we asked participants to choose the correct clave pattern for a variety of music excerpts. Results We found that informal listening exposure was not enough to detect the salsa–clave pairings. Instead, proficiency was only developed when training and exposure were both domain-specific. Discussion Our results show the importance of deliberate training and the degree to which expertise comes to fruition through context-specific focus, thus helping to illuminate the complex relationship between the local and the universal in musical-cultural experience.
... Theories have been proposed to define and model harmonic tension (Lerdahl and Jackendoff 1996; Lerdahl 2001; Lerdahl and Krumhansl 2007); however, metric tension has not been well modeled or empirically studied, perhaps because the temporal structures of music are complex and not as easy to quantify as the tonal structures (Farbood 2012). Therefore, three simple meters are adopted in our experiment: 2/4, 3/4, and 4/4, which reflect the basic temporal features of metrical structures. ...
Article
Full-text available
As the basis of musical emotions, dynamic tension experience is felt by listeners as music unfolds over time. The effects of musical harmonic and melodic structures on tension have been widely investigated; however, the potential roles of metrical structures in tension perception remain largely unexplored. This experiment examined how different metrical structures affect tension experience and explored the underlying neural activities. The electroencephalogram (EEG) was recorded and subjective tension was rated simultaneously while participants listened to musical meter sequences. On the large time scale of whole meter sequences, different overall tension and low-frequency (1-4 Hz) steady-state evoked potentials were elicited by metrical structures with different periods of strong beats, and higher overall tension was associated with metrical structures with shorter intervals between strong beats. On the small time scale of measures, dynamic tension fluctuations within measures were found to be associated with periodic modulations of high-frequency (10-25 Hz) neural activities. Comparisons between the same beats within measures and across different meters, on both small and large time scales, verified the contextual effects of meter on tension induced by beats. Our findings suggest that overall tension is determined by the temporal intervals between strong beats, and that the dynamic tension experience may arise from cognitive processing of hierarchical temporal expectation and attention, which are discussed under the theoretical frameworks of metrical hierarchy, musical expectation, and dynamic attention.
... Chomsky has argued that the operation carries no computational/cognitive cost (since it is indispensable for the generation and manipulation of discrete symbols in a formal system, and this extends to cognition if one assumes that a formal system can have a biological basis), because it derives from conceptual necessity. Moreover, it is apparently not exclusive to natural language, but is also found in other symbolic systems, for example, the mathematical and musical capacities (Jackendoff and Lerdahl, 1983). ...
Article
Full-text available
Formal syntax and pragmatics have frequently been viewed as independent, even opposed, disciplines since the development of transformational generative grammar and of philosophically oriented pragmatics. However, the emergence of cognitively oriented relevance-theoretic pragmatics, beginning with the work of Sperber and Wilson (1995, 2003), opens up the possibility of finding points of contact between the two approaches to the study of language. Both theories are situated within the cognitive sciences and seek explanations for the phenomena they study at a subpersonal level. We attempt to make explicit some points of contact between the theories, such that Relevance Theory can be formalized as a theory of the generativist Conceptual-Intentional component, incorporating semantic-pragmatic notions within the general framework of a semantically driven syntax.
... The hierarchical nature of music has been studied for a long time [20][21][22][23]. Recently, we see some efforts on learning long-term music representations using hierarchical modeling [12,24,25]. ...
Preprint
Full-text available
Learning symbolic music representations, especially disentangled representations with probabilistic interpretations, has been shown to benefit both music understanding and generation. However, most models are only applicable to short-term music, while learning long-term music representations remains a challenging task. We have seen several studies attempting to learn hierarchical representations directly in an end-to-end manner, but these models have not been able to achieve the desired results and the training process is not stable. In this paper, we propose a novel approach to learn long-term symbolic music representations through contextual constraints. First, we use contrastive learning to pre-train a long-term representation by constraining its difference from the short-term representation (extracted by an off-the-shelf model). Then, we fine-tune the long-term representation by a hierarchical prediction model such that a good long-term representation (e.g., an 8-bar representation) can reconstruct the corresponding short-term ones (e.g., the 2-bar representations within the 8-bar range). Experiments show that our method stabilizes the training and the fine-tuning steps. In addition, the designed contextual constraints benefit both reconstruction and disentanglement, significantly outperforming the baselines.
... Across different phonological analyses, one constant is that the rhythm rule has been assumed to be an online process motivated by domain-general preferences for eurhythmy. It has been argued that such preferences are shared by linguistic and musical behavior, and that both activities share an underlying abstract rhythmic structure (Lerdahl & Jackendoff, 1996;Liberman, 1975;Zec, 2006). In support of this view, researchers have proposed that auditorily prominent events are produced (Quené & Port, 2002) or are perceived (Hayes, 1995, pp. ...
Article
A fundamental question about speech is whether it is governed by rhythmic constraints. One phenomenon that may support the existence of such constraints is the rhythm rule, a phonological pattern hypothesized to resolve prominence clashes and enforce alternations of prominent and non-prominent syllables via shift/deletion of stress and/or pitch accents. We evaluated evidence for the rhythm rule by studying the acoustic correlates of clash in two experiments with speakers of Italian. We found that the first prominent syllable in a clash displays a durational increase and more extreme formant values, when compared to no clash. Thus, a clash is manifested as a localized decrease in speech rate, not as a change to the prominence profile of a word. Since durational increases have been reported for other languages, we argue that they are an online acoustic correlate of clash. We compare two dynamical models of the durational effects, rooted in the framework of Articulatory Phonology: a π-gesture model and a feedback modulation model. Based on our findings, we argue that the rhythm rule is best conceptualized as the result of contextual biases on lexical selection of prominence patterns.
... In music analysis, Schenkerian theory aims at revealing hierarchical relations between musical elements in order to derive a complete reduction of a piece (Cadwallader & Gagné, 1998). Such relations can be studied using frameworks from formal language theory (Manning & Schütze, 2003), and there have been many attempts to formalize Schenker's intuitions (e.g., Lerdahl & Jackendoff, 1983), in particular by employing formal grammars that represent hierarchical relations within sequences of notes or chords (Abdallah et al., 2016;Keiler, 1978;Kirlin & Jensen, 2011;Rohrmeier, 2011;Steedman, 1984). Such mathematical models bridge the gap between music theory and psychology by describing human cognition at the computational level (Harasim, 2020;Marr, 1982). ...
Article
Full-text available
Many structural aspects of music, such as tonality, can be expressed using hierarchical representations. In music analysis, so-called keyscapes can be used to map a key estimate (e.g., C major, F minor) to each subsection of a piece of music, thus providing an intuitive visual representation of its tonality, in particular of the hierarchical organization of local and global keys. However, that approach is limited in that the mapping relies on assumptions that are specific to common-practice tonality, such as the existence of 24 major and minor keys. This limitation can be circumvented by applying the discrete Fourier transform (DFT) to the tonal space. The DFT does not rely on style-specific theoretical assumptions but only presupposes an encoding of the music as pitch classes in 12-tone equal temperament. We introduce wavescapes, a novel visualization method for tonal hierarchies that combines the visual representation of keyscapes with music analysis based on the DFT. Since wavescapes produce visual analyses deterministically, a number of potential subjective biases are removed. By concentrating on one or more Fourier coefficients, the role of the analyst is thus focused on the interpretation and contextualization of the results. We illustrate the usefulness of this method for computational music theory by analyzing eight compositions from different historical epochs and composers (Josquin, Bach, Liszt, Chopin, Scriabin, Webern, Coltrane, Ligeti) in terms of the phase and magnitude of several Fourier coefficients. We also provide a Python library that allows such visualizations to be easily generated for any piece of music for which a symbolic score or audio recording is available.
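The core computation behind the wavescape method described above can be sketched briefly: a musical segment is reduced to a 12-element pitch-class distribution, and the magnitudes and phases of its discrete Fourier coefficients summarize its tonal content. The sketch below is a minimal illustration, not the authors' implementation; the pitch-class counts are invented, and the interpretation of the fifth coefficient as indexing diatonicity follows the DFT pitch-class literature the paper builds on.

```python
import numpy as np

# Hypothetical pitch-class distribution (C, C#, ..., B) for one segment.
pc_counts = np.array([4, 0, 2, 0, 3, 2, 0, 4, 0, 2, 0, 1], dtype=float)

# Discrete Fourier transform of the 12-dimensional pitch-class vector.
coeffs = np.fft.fft(pc_counts)

# Coefficients 1..6 are the musically interpretable ones; e.g. the
# magnitude of coefficient 5 is often read as a measure of diatonicity.
for k in range(1, 7):
    print(f"coefficient {k}: magnitude={np.abs(coeffs[k]):.2f}, "
          f"phase={np.angle(coeffs[k]):.2f}")
```

A wavescape then computes such coefficients for every subsegment of a piece and maps phase and magnitude to color, but that visualization layer is omitted here.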
... successive 1:1 ratios) is not confined to theories of metre in Euro-American music (e.g. Lerdahl and Jackendoff 1983) but is also prominent in comparative (Hood 1971;Savage et al. 2015) and Africanist ethnomusicology (Agawu 2006;Arom 1984;Burns 2010;Kubik 1988;Locke 1982;Nketia 1974;Waterman 1952). The psychological theory of dynamic attending (Jones and Boltz 1989;Large and Jones 1999) suggests that humans tend to entrain to isochronous periodicities which they perceive as simple and as underlying other, more complex rhythms in their environment. ...
Chapter
Full-text available
The basic building blocks for rhythmic structure in music are widely believed to be universally constrained to small-integer ratios. In particular, basic metric processes such as pulse perception are assumed to depend on the recognition and anticipation of even, categorically equivalent durations or inter-onset-intervals, which are related by the ratio of 1:1 (isochrony). Correspondingly, uneven (non-isochronous) beat subdivisions are theorized as instances of expressive microtiming variation, that is, as performance deviations from some underlying, categorically isochronous temporal structure. By contrast, ethnographic experience suggests that the periodic patterns of uneven beat subdivision timing in various styles of music from Mali rather themselves constitute rhythmic and metric structures. The present chapter elaborates this hypothesis and surveys a series of empirical research projects that found evidence for it. These findings have implications for metric theory as well as for our broader understanding of how human perception relates to cultural environments. They suggest that the bias towards isochrony, which according to many accounts of rhythm and meter is underlying pulse perception, is culturally specific rather than universal. Claims on cultural diversity in the study of music typically concern styles and meanings of performance practices. In this chapter, I will claim that basic structures of perception can vary across cultural groups, too.
... The perceptive biases that constrained musical creativity in the domain of rhythm and pitch structure are not restricted solely to the active search for the distinctive features of musical discrete units. Music, like language, is a complex communicative tool whose structure is governed by syntactic rules (Lerdahl and Jackendoff, 1983). These rules mean that musical structure is perceived in terms of two types of hierarchy: a rhythm hierarchy based on meter (London, 2012) and a pitch hierarchy based on pitch centricity (Krumhansl and Cuddy, 2010). ...
... There are tremendously many possible combinations, particularly when increasing the number of simultaneous notes. If onset and duration are added, we can describe chord progressions and more complex harmonic patterns that require theory knowledge to understand [LJ83a]. Chord progressions influence the perceived tonality or modulation. ...
Article
Full-text available
Music analysis tasks, such as structure identification and modulation detection, are tedious when performed manually due to the complexity of the common music notation (CMN). Fully automated analysis instead misses human intuition about relevance. Existing approaches use abstract data‐driven visualizations to assist music analysis but lack a suitable connection to the CMN. Therefore, music analysts often prefer to remain in their familiar context. Our approach enhances the traditional analysis workflow by complementing CMN with interactive visualization entities as minimally intrusive augmentations. Gradual step‐wise transitions empower analysts to retrace and comprehend the relationship between the CMN and abstract data representations. We leverage glyph‐based visualizations for harmony, rhythm and melody to demonstrate our technique's applicability. Design‐driven visual query filters enable analysts to investigate statistical and semantic patterns on various abstraction levels. We conducted pair analytics sessions with 16 participants of different proficiency levels to gather qualitative feedback about the intuitiveness, traceability and understandability of our approach. The results show that MusicVis supports music analysts in getting new insights about feature characteristics while increasing their engagement and willingness to explore.
... [4.1] Musical accents necessitate a relativity between events that are "marked for consciousness" (Cooper and Meyer 1960, 8) and their unaccented surroundings. Lerdahl and Jackendoff (1983) distinguish between metric, structural, and phenomenal accents. The following observations will mainly concern phenomenal accents, since notions of metric and structural accents are of little use in Rebonds. ...
Article
This paper presents a comparative recording analysis of the seminal work for solo percussion Rebonds (Iannis Xenakis, 1989), in order to demonstrate how performances of a musical work can reveal—or even create—aspects of musical structure that score-centered analysis cannot illuminate. In doing so I engage with the following questions. What does a pluralistic, dynamic conception of structure look like for Rebonds ? How do interpretive decisions recast performers as agents of musical structure? When performances diverge from the score in the omission of notes, the softening of accents, the insertion of dramatic tempo changes, or the altering of entire passages, do conventions that arise out of those performance practices become part of the structural fabric of the work? Are these conventions thus part of the Rebonds “text”?
... Music is built on combinatorial structural rules, for example, the rules of harmony. They govern the arrangement of a limited set of musical elements (e.g., notes or chords) into virtually infinite varieties of musical sequences (Lerdahl and Jackendoff 1983;Swain 1995;Rohrmeier 2011). Similar to linguistic grammatical rules that define sentence structure, musical rules define which musical elements are likely to follow in a given context depending on local and temporally extended structural dependencies ( Figure 1A) (Patel 2003;Pearce 2018;Koelsch et al. 2019). ...
Article
Full-text available
Complex sequential behaviors, such as speaking or playing music, entail flexible rule-based chaining of single acts. However, it remains unclear how the brain translates abstract structural rules into movements. We combined music production with multimodal neuroimaging to dissociate high-level structural and low-level motor planning. Pianists played novel musical chord sequences on a muted MR-compatible piano by imitating a model hand on screen. Chord sequences were manipulated in terms of musical harmony and context length to assess structural planning, and in terms of fingers used for playing to assess motor planning. A model of probabilistic sequence processing confirmed temporally extended dependencies between chords, as opposed to local dependencies between movements. Violations of structural plans activated the left inferior frontal and middle temporal gyrus, and the fractional anisotropy of the ventral pathway connecting these two regions positively predicted behavioral measures of structural planning. A bilateral frontoparietal network was instead activated by violations of motor plans. Both structural and motor networks converged in lateral prefrontal cortex, with anterior regions contributing to musical structure building, and posterior areas to movement planning. These results establish a promising approach to study sequence production at different levels of action representation.
... One common trait between these perspectives is the emphasis on establishing a stable tonal center to derive sentiments of musical tension (Lerdahl and Jackendoff, 1983;Bigand, 1993;Bigand, Parncutt, & Lerdahl, 1996). These studies and observations have undoubtedly stemmed from the analysis of tonal repertoire, and although some work has been extended to atonal repertoire (Dibben, 1994;Lerdahl, 1989;Krumhansl, Sandel and Sargeant 1987;Dibben 1999), understanding musical tension in repertoire that embodies both tonal and atonal elements, such as neoclassical repertoire, is an area of research that remains to be explored. ...
... Musical expectation can lead to feelings of relaxation or tension in listeners, with tension followed by relaxation in Western music. Often, there is no dissonance at the end of a piece because of the tendency toward what Lerdahl and Jackendoff (1983) refer to as a "happy end". Previous studies have demonstrated that musical expectations can contribute to our physiological and emotional responses, as measured through heart rate, electrodermal activity, and electroencephalogram (EEG) activity (Steinbeis et. ...
... In Dowling's work, melodic contour is coded as a series of +'s and -'s, representing relative pitch differences between successive notes; this general form of contour coding (as well as its equivalent of 1's and -1's) has been employed by multiple authors (Friedmann, 1985;Marvin & Laprade, 1987;Quinn, 1999). Unfortunately, little work exists on how to characterize duration and/or rhythmic contours, with the majority of work analyzing rhythm more focused on patterns of stress and intonation, as opposed to durations, likely due to the emphasis on rhythm and prosody in speech and language (e.g., Aiello, 1994;Cooper & Meyer, 1960;Lerdahl & Jackendoff, 1983;Patel et al., 2006;Thaut, 2008). Cooper and Meyer's (1960) classic text on the rhythmic structure of music, for instance, explicitly relates musical rhythmic structure to accented and unaccented groupings, using terminology drawn from work in prosody. ...
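The +/- (equivalently, 1/-1) contour coding attributed to Dowling and later authors in the passage above is simple to state as code. The sketch below is an illustration of the general scheme, not any specific author's implementation; in particular, coding a repeated pitch as 0 is an assumption, since the cited schemes describe only rises and falls.

```python
def contour(pitches):
    """Code a melody as +1 (rise), -1 (fall), or 0 (repeat, an assumed
    convention) between successive notes, in the style of Dowling's
    +/- contour coding."""
    signs = []
    for prev, curr in zip(pitches, pitches[1:]):
        if curr > prev:
            signs.append(1)
        elif curr < prev:
            signs.append(-1)
        else:
            signs.append(0)
    return signs

# MIDI pitches of a made-up melodic fragment.
print(contour([60, 62, 62, 59, 64]))  # [1, 0, -1, 1]
```

A duration contour could be coded analogously over successive note lengths, which is precisely the kind of extension the passage notes is underexplored.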
... While chord recognition focuses on local and key finding on global levels, music theory suggests the existence of several intermediate levels that are frequently conceptualized as being hierarchically nested (e.g., Hauptmann, 1853;Schenker, 1935;Salzer, 1952;Lerdahl and Jackendoff, 1983;Lerdahl, 2001;Rohrmeier, 2011, 2020;Rohrmeier and Moss, 2021). A number of psychological studies provide evidence for the perceptual reality of hierarchical organization in music (Krumhansl, 2004;Tillmann and Bigand, 2004;Koelsch et al., 2013;Farbood, 2016;Herff et al., 2021) but the exact relation between theoretically postulated and perceived hierarchies is not yet fully understood. ...
Article
Full-text available
Music analysis, in particular harmonic analysis, is concerned with the way pitches are organized in pieces of music, and a range of empirical applications have been developed, for example, for chord recognition or key finding. Naturally, these approaches rely on some operationalization of the concepts they aim to investigate. In this study, we take a complementary approach and discover latent tonal structures in an unsupervised manner. We use the topic model Latent Dirichlet Allocation and apply it to a large historical corpus of musical pieces from the Western classical tradition. This method conceives topics as distributions of pitch classes without assuming a priori that they correspond to either chords, keys, or other harmonic phenomena. To illustrate the generative process assumed by the model, we create an artificial corpus with arbitrary parameter settings and compare the sampled pieces to real compositions. The results we obtain by applying the topic model to the musical corpus show that the inferred topics have music-theoretically meaningful interpretations. In particular, topics cover contiguous segments on the line of fifths and mostly correspond to diatonic sets. Moreover, tracing the prominence of topics across ~600 years of music history reflects changes in the ways pitch classes are employed in musical compositions and reveals particularly strong changes at the transition from common-practice to extended tonality in the 19th century.
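The generative process the abstract refers to, with pieces as "documents" and pitch classes as "words", can be sketched in a few lines. The hyperparameter values and corpus sizes below are arbitrary illustrations, not those used in the study, and this sketch covers only sampling from the model, not the inference step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative settings: 3 latent topics over 12 pitch classes.
n_topics, n_pitch_classes, n_pieces, notes_per_piece = 3, 12, 5, 200
alpha = 0.5  # assumed Dirichlet prior over a piece's topic mixture
beta = 0.1   # assumed Dirichlet prior over a topic's pitch-class distribution

# Each topic is a distribution over the 12 pitch classes.
topics = rng.dirichlet(np.full(n_pitch_classes, beta), size=n_topics)

pieces = []
for _ in range(n_pieces):
    theta = rng.dirichlet(np.full(n_topics, alpha))          # topic mixture
    zs = rng.choice(n_topics, size=notes_per_piece, p=theta)  # topic per note
    notes = [rng.choice(n_pitch_classes, p=topics[z]) for z in zs]
    pieces.append(notes)

print(len(pieces), len(pieces[0]))  # 5 200
```

Fitting the model to a real corpus then inverts this process, inferring `topics` and the per-piece mixtures from observed pitch-class counts.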
... Rhythm can also be perceived visually and through the tactile sense, but the reaction time of the human auditory system is shorter by 20–50 ms compared to visual stimuli [46]. Therefore, rhythm influences the kinetic system (through synchronization and adjustment of muscles to auditory stimuli) and auditory cues [47][48][49]. ...
... Langer (1953) mentioned that music, like language, has discrete parts that can be combined in a variety of ways to make new expressive wholes. Lerdahl and Jackendoff (1983) published their generative theory of tonal music, and Povel (2010) designed algorithms for generating melodies. All these approaches focus on melodies as hierarchically organized units and deal with tonality and with musical time in terms of metrically based durations with proportional values. ...
Article
Full-text available
From a biological point of view, the singing of songs is based on the human vocal learning capacity. It is universally widespread in all cultures. The transmission of songs is an elementary cultural practice, by which members of the older generations introduce both musico-linguistic rules and affect-regulative means to the younger ones. Traditionally, informal singing in familiar settings primarily subserves affect-regulation goals, whereas formal song transmission is embedded in various normative claims and interests, such as preserving cultural heritage and representing collective and national identity. Songs are vocal acts and abstract models that are densely structured and conform to cultural rules. Songs mirror each generations’ wishes, desires, values, hopes, humor, and stories and rest on unfathomable traditions of our cultural and human history. Framed in the emerging scientific field of didactics, I argue that research on formal song transmission needs to make explicit the norms and rules that govern the relationships between song, teacher, and pupils. I investigate these three didactic components, first, by conceptualizing song as rule-governed in terms of a grammar, with songs for children representing the most elementary musico-linguistic genre. The Children’s Song Grammar presented here is based on syllables as elements and on syntactic rules concerning timing, tonality, and poetic language. It makes it possible to examine and evaluate songs in terms of correctness and well-formedness. Second, the pupils’ learning of a target song is exemplified by an acoustical micro-genetic study that shows how vocalization is gradually adapted to the song model. Third, I address the teachers’ role in song transmission with normative accounts and provide exemplary insights into how we study song teaching empirically. With each new song, a teacher teaches the musico-linguistic rules that constitute the respective genre and conveys related cultural feelings. 
Formal teaching includes self-evaluation and judgments with respect to educational duties and aesthetic norms. This study of the three-fold didactic process shows song transmission as experiencing shared rule-following that induces feelings of well-formedness. I argue that making the inherent normativity of this process more explicit – here systematically at a descriptive and conceptual level – enhances the scientificity of this research domain.
... Tonality is, in important ways, a musical syntax: a set of rules and practices, implicitly understood by listeners, which governs closure and continuity, stability and tension, in melodic and harmonic sequences 12,13 . It thus serves to shape listeners' melodic and harmonic expectancies and governs their evaluation of how well a given tone or chord fits a musical context, as well as their evaluation of the musical tension such a tone evokes (for reviews of tonal cognition research see [14][15][16] ). ...
Article
Full-text available
Increasing evidence has uncovered associations between the cognition of abstract schemas and spatial perception. Here we examine such associations for Western musical syntax, tonality. Spatial metaphors are ubiquitous when describing tonality: stable, closural tones are considered to be spatially central and, as gravitational foci, spatially lower. We investigated whether listeners, musicians and nonmusicians, indeed associate tonal relationships with visuospatial dimensions, including spatial height, centrality, laterality, and size, implicitly or explicitly, and whether such mappings are consistent with established metaphors. In the explicit paradigm, participants heard a tonality-establishing prime followed by a probe tone and coupled each probe with a subjectively appropriate location (Exp.1) or size (Exp.4). The implicit paradigm used a version of the Implicit Association Test to examine associations of tonal stability with vertical position (Exp.2), lateral position (Exp.3) and size (Exp.5). Tonal stability was indeed associated with perceived physical space: the spatial distances between the locations associated with different scale-degrees significantly correlated with the tonal stability differences between these scale-degrees. However, inconsistently with musical discourse, stable tones were associated with leftward (instead of central) and higher (instead of lower) spatial positions. We speculate that these mappings are influenced by emotion, embodying the "good is up" metaphor, and by the spatial structure of music keyboards. Taken together, the results demonstrate a new type of cross-modal correspondence and a hitherto under-researched connotative function of musical structure. Importantly, the results suggest that the spatial mappings of an abstract domain may be independent of the spatial metaphors used to describe that domain.
... While there are many ways to choose the location p for content insertion, in our work, we utilize the VLI model to expand short-length musical segments at musical boundaries. A musical boundary is viewed as the edge of two musical groups [3], which is often accompanied by changes in rhythm, metre and pitch patterns. It should be expected in general that there is still a musical boundary after expansion. ...
Preprint
Full-text available
In this paper, we investigate using the variable-length infilling (VLI) model, which is originally proposed to infill missing segments, to "prolong" existing musical segments at musical boundaries. Specifically, as a case study, we expand 20 musical segments from 12 bars to 16 bars, and examine the degree to which the VLI model preserves musical boundaries in the expanded results using a few objective metrics, including the Register Histogram Similarity we newly propose. The results show that the VLI model has the potential to address the expansion task.
... An assumption that appears justified insofar as there is some evidence that these are in part musical universals, that is, phylogenetically fixed perceptual characteristics (cf. Lerdahl and Jackendoff 1983). Still, against the background of harmonic modulations that are complex yet always the same, or of the polyrhythms of African music, one could also regard the minimalism of a uniform sound as a contribution to the diversity of music. ...
... The handling of rubato in some repetitions of motif a would reflect that hierarchy. Although this point is undoubtedly highly speculative and requires much more research, the principle of hierarchical organization of groupings in music could also be considered a culture-specific feature (although many acknowledge that it may be the most universal among the hierarchical components of tonal music (Lerdahl and Jackendoff 1983;Imberty 1997)). Note also how D uses temporal regulation in relation to the structural elements of the motif (in one case highlighting a structurally important note, in another delimiting the groupings with the device of a "phrase-final ritardando"). This feature is interesting given that one would expect D to make standardized use of the temporal variable (as other studies with autistic individuals point out, reporting performances that lack expressive features). ...
... The first algorithm takes a segment of a lead sheet as input, and outputs a Schenkerian tree (Sk_tree) that represents a set of iterated reductions of the given piece, inspired by what is traditionally done in Schenkerian analysis [15] or in the Generative Theory of Tonal Music [10]. Though the details of this algorithm are described in previous related works [5,12,16], it is worth quickly describing how the algorithm functions at a high level. ...
Conference Paper
Full-text available
Computational models of music, while providing good descriptions of melodic development, still cannot fully grasp the general structure comprised of repetitions, transpositions, and reuse of melodic material. We present a corpus of strongly structured baroque allemandes, and describe a top-down approach to abstract the shared structure of their musical content using tree representations produced from pairwise differences between the Schenkerian-inspired analyses of each piece, thereby providing a rich hierarchical description of the corpus.
... Pearce et al. (2008) investigated the delineation of phrases within melodies, and provided a concise review of existing approaches. The Generative Theory of Tonal Music (Lerdahl & Jackendoff, 1983) outlines a set of grouping preference rules, and an implementation based on clustering and statistical learning (Kanamori et al., 2014) was shown to outperform existing methods when detecting 'local grouping boundaries' as defined by the GTTM. Grouper (Temperley, 2004) was presented as part of the Melisma Music Analyzer, and was a dynamic programming melody partitioning algorithm based on three Phrase Structure Preference Rule definitions. ...
Article
Full-text available
Many studies have presented computational models of musical structure, as an important aspect of musicological analysis. However, the use of grammar-based compressors to automatically recover such information is a relatively new and promising technique. We investigate their performance extensively using a collection of nearly 8000 scores, on tasks including error detection, classification, and segmentation, and compare this with a range of more traditional compressors. Further, we detail a novel method for locating transcription errors based on grammar compression. Despite its lack of domain knowledge, we conclude that grammar-based compression offers competitive performance when solving a variety of musicological tasks.
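Grammar-based compression of the kind surveyed above replaces repeated patterns in a symbol sequence with grammar rules; repetition structure then shows up as a small grammar. The toy sketch below uses a Re-Pair-style digram replacement, which is one simple member of this family and not the specific compressors evaluated in the paper.

```python
from collections import Counter

def repair(seq):
    """Toy Re-Pair-style grammar compressor: repeatedly replace the most
    frequent adjacent symbol pair with a new nonterminal until no pair
    occurs at least twice. Illustrative only."""
    rules = {}
    next_id = 0
    seq = list(seq)
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair
        out, i = [], 0
        while i < len(seq):  # rewrite the sequence, left to right
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

compressed, rules = repair(list("abababab"))
print(compressed, rules)
# ['R1', 'R1'] {'R0': ('a', 'b'), 'R1': ('R0', 'R0')}
```

Applied to note or interval sequences, the nesting of such rules (here, `R1` built from `R0`) is what yields the hierarchical repetition structure that the segmentation and error-detection tasks exploit.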
... Previous studies have suggested that meter perception consists not only of bottom-up processing of acoustic cues but also of top-down attentional processes. We perceive beat strength (also called "accent") from acoustic features of a sound event (such as pitch, duration, and loudness) and recognize the strongest beat as the first of its meter cycle [7][8][9][10]; i.e., higher, longer, or louder sound events tend to be perceived as accented beats. Furthermore, we can perceive meter by paying attention to isochronous beats without any physical accent [11][12][13][14][15][16]. ...
Article
Full-text available
Meter is one of the core features of music perception. It is the cognitive grouping of regular sound sequences, typically for every 2, 3, or 4 beats. Previous studies have suggested that one can not only passively perceive the meter from acoustic cues such as loudness, pitch, and duration of sound elements, but also actively perceive it by paying attention to isochronous sound events without any acoustic cues. Studying the interaction of top-down and bottom-up processing in meter perception leads to understanding the cognitive system's ability to perceive the entire structure of music. The present study aimed to demonstrate that meter perception requires the top-down process (which maintains and switches attention between cues) as well as the bottom-up process for discriminating acoustic cues. We created a "biphasic" sound stimulus, which consists of successive tone sequences designed to provide cues for both the triple and quadruple meters in different sound attributes, frequency, and duration. Participants were asked to focus on either the frequency or duration of the stimulus, and to answer how they perceived meters on a five-point scale (ranging from "strongly triple" to "strongly quadruple"). As a result, we found that participants perceived different meters by switching their attention to specific cues. This result adds evidence to the idea that meter perception involves the interaction between top-down and bottom-up processes.
... The generative theory of Lerdahl and Jackendoff (1983) is by far the most widely accepted syntactic theory in the field of music. One of its postulates is that the levels of the syntactic hierarchy are consecutively derived from the "surface level" of the musical tones. ...
Preprint
Full-text available
This paper puts forward a proposal for a major update of the well-established semiotic theory of Lerdahl and Jackendoff. Two modes of comprehension of music, from top to bottom and from bottom to top, are cross-examined in a number of musical works. The paper argues that the primary way of making sense of a musical work is hierarchically driven from bottom to top and is based on parsing the musical flow into conventional idiomatic units. Nine aspects of expression in music are involved in the categorization of musical idioms, and five of these expressive aspects follow their own autonomous syntactic rules. Existing theories of musical semiotics grossly simplify the syntactic organization of musical text. The idiomatic basis of musical syntax is discussed in relation to the conventions of expressive exaggeration of the notated score in the Western classical music tradition. The semi-conscious, spontaneous manner in which performers use such exaggerations, and their unconscious perception by listeners, make musical syntax appear poorly formed and highly susceptible to the impact of the affective states of performers and their audiences. This issue poses a fundamental obstacle to the generative grammar approach as postulated by Lerdahl and Jackendoff. Musical syntax should not be studied in isolation from the semantic contribution of musical emotions, which affect the parsing strategies of performers and listeners. The affective influence of musical emotions on all nine aspects of musical expression must be taken into consideration in any attempt to analyze musical syntax.
... Pitch accents, which are aligned with metrically strong syllables, are generally signaled by a local increase in pitch (Breen et al., 2010). Music cognition makes similar claims about how metric structure is realized in Western music; the Generative Theory of Tonal Music predicts that metrically strong positions are more likely to correspond with pitch excursions (Lerdahl & Jackendoff, 1983). Moreover, analyses of Western music reveal that pitch accents frequently occur at metrically strong positions (Huron & Royal, 1996). ...
Conference Paper
Full-text available
We investigated whether speakers use pitch to signal hierarchical metric structure in productions of Dr. Seuss’s The Cat in the Hat, by modeling fundamental frequency (F0) of monosyllabic words as a function of metric strength and a set of control parameters. We modeled maximum F0 of ~25000 words in a corpus of book productions from 17 speakers, comparing a 3-level musical metric model and a 5-level linguistic metric model. Results demonstrate that speakers consistently realized two levels of musical metric strength, as words corresponding with downbeats were produced with higher maximum F0 than all other beats. In addition, speakers simultaneously realized three levels of linguistic metric strength, as maximum F0 decreased linearly across the three highest linguistic metric levels. These results are consistent with previous work in both prose speech production and Western music composition, demonstrating that poetic speech uses pitch variation in ways that are consistent with both music and speech, and they complement prior demonstrations that duration and intensity variation signal musical and linguistic metric structure in the same corpus.
Article
Full-text available
The concept of musicology has been defined differently according to various views. In general, musicology is "the discipline that investigates and examines music in all its aspects using scientific methods, that is, the discipline concerned with all subjects related to music" (Uslu, 2006, p. 1). In the 20th century, the scientific methods of musicology were established by drawing on methods from the social sciences, philology, and philosophy. Historical studies lie at the center of the field of musicology. Nevertheless, different scholars have drawn different frameworks for musicology. Today, Guido Adler's classification retains its currency despite some later developments. In his article "The Scope, Method and Aim of Musicology" (Umfang, Methode und Ziel der Musikwissenschaft), published in 1885 in the first volume of the Vierteljahrsschrift für Musikwissenschaft, Adler examined musicology under two main headings: "historical" and "systematic". The methods of musical analysis that emerged from the research of systematic musicology made considerable progress in the 20th century. Musical analysis is the explication of musical structure, the resolution of music into its simple constituent elements, and the examination of the functions of these components. Musical "structure" may represent a section of a work, a work in its entirety, a group of works, or a repertoire within a written or oral tradition. Musical analysis is carried out in a technical sense, set apart from external factors such as sociological, philosophical, and religious ones. The scope of musical analysis covers musical aesthetics and the theory of composition. Not only composed music but also recorded musical material can be examined in musical analysis. The approaches of musical analysis are "description" and "comparison". Description investigates questions such as what a work contains and how it came into being; comparison examines similarities and differences.
Methods of musical analysis include form analysis, functional analysis, harmonic analysis, Schenkerian analysis, serial music analysis, style analysis, analysis from the perspective of Gestalt psychology, hermeneutics, information-theory analysis, pitch-class set analysis, proportional analysis, musical semiotics, A Generative Theory of Tonal Music, and the comparative method. In this study, we attempt to provide an account of the principal methods of musical analysis applied in the 20th century.
Thesis
Full-text available
How can live-performed chamber operas be conceptualized as immersive games with interactive features? This artistic study has resulted in a system model through which degrees of immersion may be generated and analyzed from physical, social, and psychical stimuli. A differentiation of immersive modes has been made possible by the framing of opera-making as game design. The findings indicate that so-called ludo-immersive opera could be developed into operatic chamber opera play for self-reliant participants, constituting an intimate and alternate practice in which dynamic game-masters may replace supervising directors. However, this practice is entangled with the question of future training for operatic practitioners outside the mainstream opera format, and beyond both Wagnerian and Brechtian spectatorship. The shift from the traditional audience/performer relationship to a novel form of immersive interaction requires a new mind-set and training for opera practitioners, to encourage autonomy and active participation by individual visitors. Theoretically, the study connects recent innovations in opera to the aesthetic principles of the Apollonian and the Dionysian and positions ludo-immersive opera in relation to these. The principles bridge immersion, opera, and game-playing, articulated by a reinterpretation of Roger Caillois’ taxonomy of play. The issue of immersion as an artistic aim in opera is highlighted. Moreover, artists’ and visitors’ reciprocal participation in ludo-immersive opera is discussed in regard to its historical context of operatic event-making and forms of presentation. The project explores the detailed consequences of perception and performance in chamber opera with ludic and immersive features, primarily inspired by live-action role playing.
The main objective has been to investigate how operatic events can be presented as immersive adventures rather than spectacles, and the consequences that the integration of playing visitors into professional opera implies for artistic practice. In four operas created during the period 2016–2020, interventions and encounters between artists and visitors in musically driven situations framed by fictional settings were staged and studied. The artistic researcher was iteratively engaged in action as opera singer, librettist, dramaturge, and director. Data from the research cycles include field recordings from the productions and reports from the participants in the form of interviews and surveys.
Article
The organizational structure of music is similar to that found in language, involving a large number of complicated hierarchical and embedded structures. The factors that induce complexity and difficulty in the processing of embedded structures are important subjects of inquiry in areas of cognitive neuroscience such as the music and language domains. Informed by relevant linguistic theories, this study investigated the influence of dependency lengthening and structural shift on the processing of musical embedded sequences. Results showed that, compared with sequences with short dependence, final chords in sequences with long dependence elicited a larger ERAN and N5 under near-key shift conditions, and a larger ERAN and LPC under far-key shift conditions. Further, compared with sequences with near-key shift, final chords in sequences with far-key shift elicited a larger N5 under short dependence conditions and a larger LPC under long dependence conditions. These results indicate that both dependency lengthening and structural shift can induce complexity and difficulty in the processing of musical embedded structures, and that there may be common mechanisms underlying the processing of center-embedded structures across the music and language domains.
Thesis
At the end of the 20th century, Western art music overturned the hegemony of the pitch parameter. The "post-Darmstadt" aesthetic extended its thinking to the scale of timbre. But this mutation took place in a roundabout way, without renouncing the correlate of the note and classical instrument-making, to the point of establishing, since the 1980s, a genuine instrumental writing of complex sound. What, then, becomes of pitch as a parameter for the rationalization of sound, and how does modernism evolve in its relation to material? The aim here is to undertake a genealogy of the structuring functions acquired by pitch over the course of a cumulative process of sound rationalization. The techniques of the instrumental writing of complex sound are then defined by the timbral and harmonic densities that subvert classical pitch. This pseudomorphosis of pitch finally raises an aesthetic and philosophical question: that of moving beyond the historical criterion of material.
Article
Full-text available
This article aims to scrutinize a theoretical assumption of tonal practice: that harmonic tension can be increased globally by changing the chord that exercises the Tonic function (a process called modulation) and decreased when the original chord resumes the Tonic function. This implies that a listener should be able to recognize the local Tonic and the main Tonic simultaneously after a modulation. The results of experimental research diverge, with some studies supporting the existence of this cognitive ability (Lerdahl & Krumhansl, 2007) and others questioning it (Cook, 1987; Bigand & Parncutt, 1999; Marvin & Brinkman, 1999; Farbood, 2016). In this article, we hypothesize that the retention of the main Tonic after modulations in tonal music is possible only for individuals capable of retaining in memory some form of absolute pitch information, and that this ability is used jointly with relative pitch memory.
Article
Full-text available
Music is a cognitively demanding task. New tones override the previous tones in quick succession, with only a short window in which to process them. Language places similar constraints on the brain. The cognitive constraints associated with language processing have been argued to support the Chunk-and-Pass processing hypothesis and may influence the statistical regularities associated with word and phoneme presentation that have been identified in language and are thought to allow optimal communication. If this hypothesis were true, then similar statistical properties should be identified in music as in language. Drawing on real-life musical corpora, rather than relying on the artificial generation of musical stimuli, a novel approach to melodic fragmentation was developed specifically for a corpus of improvisation transcriptions that represent a popular performance-practice tradition of the 16th century. These improvisations were created by following a very detailed technique, which was disseminated through music tutorials and treatises across Europe during the 16th century. These music tutorials present a very precise methodology for improvisation, using a pre-defined vocabulary of melodic fragments (similar to modern jazz licks). I have found that these corpora follow two paramount characteristics of quantitative linguistics: (1) Zipf’s rank-frequency law and (2) Zipf’s law of abbreviation. According to the working hypothesis, adherence to these laws ensures the optimal coding of the examined music corpora, which facilitates improved cognitive processing for both the listener and the improviser. Although these statistical characteristics are not consciously implemented by the improviser, they might play a critical role in music processing for both the listener and the improviser.
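Zipf's rank-frequency law, invoked in the abstract above, states that an item's frequency is roughly inversely proportional to its frequency rank, so rank × frequency is approximately constant across ranks. A minimal sketch of that check, using hypothetical melodic-fragment labels and counts (not data from the study):

```python
from collections import Counter

# Hypothetical corpus: fragment labels repeated with roughly Zipfian counts.
corpus = (["frag_a"] * 24 + ["frag_b"] * 12 + ["frag_c"] * 8 +
          ["frag_d"] * 6 + ["frag_e"] * 5)

# Count each fragment type and sort frequencies from most to least common.
counts = Counter(corpus)
freqs = sorted(counts.values(), reverse=True)

# Under an ideal Zipf distribution, rank * frequency is roughly constant.
products = [rank * f for rank, f in enumerate(freqs, start=1)]
print(products)  # -> [24, 24, 24, 24, 25]
```

A real corpus study would instead fit a power law to log-log rank-frequency data; this toy example only illustrates the near-constancy of the rank × frequency product.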
Article
Full-text available
As part of a recent attempt to extend the methods of formal semantics beyond language (‘Super Semantics’), it has been claimed that music has an abstract truth-conditional semantics, albeit one that has more in common with iconic semantics than with standard compositional semantics (Schlenker 2017, 2019a, b). After summarizing this approach and addressing a common objection (here due to Leonard Bernstein), we argue that music semantics should be enriched in three directions by incorporating insights of other areas of Super Semantics. First, it has been claimed by Abusch 2013 that visual narratives make use of discourse referents akin to those we find in language. We argue that a similar conclusion extends to music, and we highlight it by investigating ways in which orchestration and dance may make cross-referential dependencies more explicit. Second, we show that by bringing music semantics closer to the semantics of visual narratives, we can give an account of the semantics of mixed visual and musical sequences. Third, it has been claimed that co-speech gestures trigger characteristic conditionalized presuppositions, called ‘cosuppositions’, and that their semantic status derives from their parasitic character relative to words (Schlenker 2018a, b). We argue that the same conclusion extends to some instances of film and cartoon music: it may trigger cosuppositions that can be revealed by embedding film excerpts or gifs in sentences so as to test presupposition projection. We further argue that under special discourse conditions (pertaining to certain Questions under Discussion), pro-speech gestures and pro-speech music alike can trigger cosuppositions as well. These results suggest that new insights can be gained not just by extending the methods of semantics to new objects, but also by drawing new connections among them.
Article
This paper covers the methods for measuring rhythm and the main paradigms used to study rhythm perception. An overview of ideas about speech rhythm is provided, starting with the traditional view of isochrony and rhythm classes. Production and perception methods used to establish rhythm-class differences are presented and critically reviewed, as are a number of research practices associated with them. Recent developments leading to an alternative view of rhythm are discussed, and suggestions for pedagogical practice and future research are provided.
Book
Full-text available
Acousmatic music, as a mode of production (the Concrete-Acousmatic mode of production), presents certain particularities that can be observed both in its production and in its consumption. Although some of these particular aspects are shared with other contemporary and/or experimental musics, singularities remain that distinguish it from the latter and that, it seems, become evident during the acousmatic musical experience. We refer specifically to the role played by the spatiality/spatialization of the acousmatic work in the meaning listeners make of it when evaluating and describing their listening experience. The study and analysis of the musical experience calls for tools complementary to those set out by theories of music, whose approaches focus mainly on aspects related to the production of the work, assuming that these will be recognized as such during the listener's musical experience. The cognitive sciences offer theoretical, methodological, and epistemological approaches that make it possible to address the complexity of this experience in a way distinct from the musicological one, and to develop explanations pertinent to its characteristics. To that end, this research applied a set of experimental studies that yielded empirical information on the role of spatiality/spatialization in various aspects of the acousmatic musical experience. This body of information also strongly influenced the development of concepts, ideas, and working methodologies for the composition of three works produced during the period of this thesis, establishing close relationships between empirical or scientific research and musical creation. These relationships are described in the final section of this work and are intended as a contribution to knowledge of the function of spatiality in the processes by which listeners make meaning of the acousmatic work.
Chapter
Full-text available
This chapter focuses on the study of the relationship between the reading of music and of verbal texts, and it seeks to define an ecological music reading task that allows comparison of the musical and verbal domains. Participants were preservice music students who performed different music reading tasks correlated with a verbal text comprehension test. A Principal Component Analysis (PCA) was performed, explaining 91.5% of the variance. Two axes were defined: one related to reading comprehension and the other to music performance variables. The relationships among the selected variables in the factorial plane, particularly the strong association between sight-reading and literal comprehension, suggest that sight-reading is a relevant factor in the study of the musical and verbal domains.
Chapter
This paper examines the individuation of concepts in the music faculty (MF) based on their intrinsic congruent structure (extending the theory of congruence in Rawbone (2017) and Rawbone and Jan (2020)), and explores the aggregation and chaining of concepts in a language of musical thought (LMT). It is proposed that music perception is enacted through ‘input modules’, which are components of the MF that ground basic uniparametric concepts through congruence in the realms of rhythm and pitch; these systems are innate, domain-specific, automatic, bottom-up, and informationally encapsulated. More intricate modules of the MF, here described as sub-central systems, build complex multiparametric concepts from basic concepts, generally preserving congruence and comprising a compositional syntax—constraining the LMT. The LMT can be characterised as a sequencing of causal–functional tokens of congruent conceptual representations. While the LMT is located inside the MF, it is suggested that the sub-central systems that assemble it are mediated partly by ‘central’, domain-general systems of thought situated outside the MF. Central systems are needed for thinking and reasoning about information that is ambiguous or noncongruent and also integrating various sources of information, such as consolidating the representations of perception and memory. There are two key considerations for the music modularity and LMT hypotheses. Firstly, determining the extent to which the grounding of multiparametrically congruent concepts is automatic, bottom-up, innate, and encapsulated and secondly, establishing why noncongruent terms are significant when there is no perceptual imperative for coining such concepts.
Chapter
The study of verbal and musical structures shows that the unity of language and music is not limited to the hierarchically organized sequence (Lerdahl‒Jackendoff) or the ‘hierarchy of events’ (Bharucha). Rather, it depends on the types of organization of the subject-predicate complex that represent hierarchical and non-hierarchical structures in both language and music. This idea is elaborated on the basis of Near Eastern professional music of oral traditions, including Israel (the art of maqām), on the one hand, and classical Arabic language theory, which does not employ a concept adequate to the ‘proposition’ in the Western sense, on the other. It is concluded that there are two parallel formulas of process in music, depending on the quality of the subject-predicate linkage: (1) i → m → t (B. Asafyev, 1926) and (2) i ↔ t = m, temporality (G. Shamilli, 2017). The possibility of a non-hierarchical organization of the subject-predicate linkage in music is shown. This conclusion has profound strategic implications and points to a type of rationality that manifests itself in various segments of musical culture.