Peter M. C. Harrison’s research while affiliated with University of Cambridge and other places


Publications (59)


Figures: Testbed configuration · Experimental conditions · Beat extraction procedure · Baseline results · Univariate and bivariate distributions · (+4 more)

Trade-offs in Coordination Strategies for Duet Jazz Performances Subject to Network Delay and Jitter
  • Article

September 2024 · 24 Reads · 1 Citation

Huw Cheston · [...] · Peter M. C. Harrison

Coordination between participants is a necessary foundation for successful human interaction. This is especially true in group musical performances, where action must often be temporally coordinated between the members of an ensemble for their performance to be effective. Networked mediation can disrupt this coordination process by introducing a delay between when a musical sound is produced and when it is received, resulting in significant deteriorations in synchrony and stability between performers. Here we show that five duos of professional jazz musicians adopt diverse strategies when confronted by the difficulties of coordinating performances over a network—difficulties that are not exclusive to networked performance but are also present in other situations (such as when coordinating performances over large physical spaces). The two apparent alternatives involve either 1) one musician following the other, tracking the timings of the leader’s performance, or 2) both musicians accommodating each other, mutually adapting their timing. During networked performance, these two strategies favor different sides of the trade-off between, respectively, tempo synchrony and stability; in the absence of delay, both achieve similar outcomes. Our research highlights how remoteness presents new complexities and challenges to successful interaction.
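
The distinction between these strategies can be made concrete with a toy linear phase-correction model. The sketch below is illustrative only, not the authors' analysis: two simulated performers hear each other through a fixed one-way delay, and the hypothetical coupling constants alpha_a and alpha_b determine whether one player leads (alpha_a = 0) or both mutually adapt.

```python
# Minimal sketch, assuming a simple linear phase-correction rule (not the
# paper's actual model). Each performer hears the partner's latest onset
# `delay` seconds late and corrects the next inter-onset interval accordingly.
import numpy as np

def simulate_duo(alpha_a, alpha_b, delay=0.18, ioi=0.5, n_beats=200, noise=0.005):
    rng = np.random.default_rng(0)
    t_a, t_b = [0.0], [0.0]
    for _ in range(n_beats):
        async_a = t_a[-1] - (t_b[-1] + delay)  # A's perceived asynchrony to B
        async_b = t_b[-1] - (t_a[-1] + delay)  # B's perceived asynchrony to A
        t_a.append(t_a[-1] + ioi - alpha_a * async_a + rng.normal(0, noise))
        t_b.append(t_b[-1] + ioi - alpha_b * async_b + rng.normal(0, noise))
    return np.array(t_a), np.array(t_b)

for label, (ka, kb) in {"leader-follower": (0.0, 0.5),
                        "mutual adaptation": (0.25, 0.25)}.items():
    a, b = simulate_duo(ka, kb)
    ibis = np.diff(a)
    drift = np.polyfit(np.arange(ibis.size), ibis, 1)[0]  # tempo-slope proxy
    print(f"{label}: mean |asynchrony| = {np.abs(a - b).mean():.3f} s, "
          f"IOI drift = {drift:.2e} s/beat")
```

In this toy version the leader-follower pair holds tempo but sits roughly one delay apart, whereas the mutually adapting pair stays aligned while both decelerate, which is the shape of the trade-off described in the abstract.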


Jazz Trio Database: Automated Annotation of Jazz Piano Trio Recordings Processed Using Audio Source Separation

August 2024 · 19 Reads · 1 Citation

Transactions of the International Society for Music Information Retrieval

Recent advances in automatic music transcription have facilitated the creation of large databases of symbolic transcriptions of improvised music forms, including jazz, where traditional notated scores are not normally available. In conjunction with music source separation models that enable audio to be “demixed” into separate signals for multiple instrument classes, these algorithms can also be applied to generate annotations for every musician in a performance. This enables the analysis of performer-level and ensemble-level features that have often been difficult to explore. To this end, we introduce the Jazz Trio Database (JTD), a dataset of 44.5 h of jazz piano solos accompanied by bass and drums, with automatically generated annotations for each performer. These annotations consist of onset, beat, and downbeat timestamps, alongside MIDI for the piano soloist. Suitable recordings, broadly representative of the “straight-ahead” jazz style, were identified by scraping user-based listening and discographic data; source separation models were applied to isolate audio for each performer in the trio; and annotations were generated by applying appropriate algorithms to both the separated and the mixed audio sources. Onset annotations generated by the pipeline achieved a mean F-measure of 0.94 when compared with ground-truth annotations. We conduct several analyses of JTD, including in relation to swing and inter-performer synchronization. We anticipate that JTD will be useful in a variety of music information retrieval tasks, including artist identification and expressive performance modeling. We have made JTD, including the annotations and associated source code, available at https://github.com/HuwCheston/Jazz-Trio-Database.
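
The reported F-measure of 0.94 is the standard onset-evaluation metric: an automatically detected onset counts as correct if it lands within a small tolerance window of an unmatched ground-truth onset. A minimal sketch using the open-source mir_eval library; the file names are hypothetical, and JTD's own evaluation code in the linked repository may differ in detail.

```python
# Sketch of onset evaluation with mir_eval; input files are hypothetical.
import numpy as np
import mir_eval

reference = np.loadtxt("ground_truth_onsets.txt")  # manually verified onsets (s)
estimated = np.loadtxt("pipeline_onsets.txt")      # onsets from the pipeline (s)

# mir_eval matches each estimate to at most one reference onset within
# +/- 50 ms (its default window) and returns F-measure, precision and recall.
f, precision, recall = mir_eval.onset.f_measure(reference, estimated, window=0.05)
print(f"F = {f:.2f}, P = {precision:.2f}, R = {recall:.2f}")
```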


Figures: Artificial music grammar (AMG) of Rohrmeier, Rebuschat, and Cross (2011) and example melodic stimuli · Experimental procedure and trial structure · Pleasantness ratings for notes with varying probability · ERPs in response to notes of low, medium and high probability · Associations between pleasantness judgments, note probability, creativity and learning
The association between liking, learning and creativity in music

August 2024 · 132 Reads · 1 Citation

Peter M. C. Harrison · [...] · Caroline Di Bernardi Luft

Aesthetic preference is intricately linked to learning and creativity. Previous studies have largely examined the perception of novelty in terms of pleasantness and the generation of novelty via creativity separately. The current study examines the connection between perception and generation of novelty in music; specifically, we investigated how pleasantness judgements and brain responses to musical notes of varying probability (estimated by a computational model of auditory expectation) are linked to learning and creativity. To facilitate learning de novo, 40 non-musicians were trained on an unfamiliar artificial music grammar. After learning, participants evaluated the pleasantness of the final notes of melodies, which varied in probability, while their EEG was recorded. They also composed their own musical pieces using the learned grammar, which were subsequently assessed by experts. As expected, there was an inverted U-shaped relationship between liking and probability: participants were more likely to rate the notes with intermediate probabilities as pleasant. Further, intermediate probability notes elicited larger N100 and P200 at posterior and frontal sites, respectively, associated with prediction error processing. Crucially, individuals who produced less creative compositions preferred higher probability notes, whereas individuals who composed more creative pieces preferred notes with intermediate probability. Finally, evoked brain responses to note probability were relatively independent of learning and creativity, suggesting that these higher-level processes are not mediated by brain responses related to performance monitoring. Overall, our findings shed light on the relationship between perception and generation of novelty, offering new insights into aesthetic preference and its neural correlates.
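
The inverted-U claim is, in essence, that a quadratic regression of liking on note probability has a negative second-order coefficient and a vertex at intermediate probability. A minimal sketch on simulated data; the study's own analyses are more involved.

```python
# Minimal sketch: fit liking ~ b0 + b1*p + b2*p^2 on simulated ratings and
# check for an inverted U (b2 < 0) peaking at intermediate probability.
import numpy as np

rng = np.random.default_rng(1)
p = rng.uniform(0, 1, 500)                                    # note probability
liking = 1 - 4 * (p - 0.5) ** 2 + rng.normal(0, 0.2, p.size)  # toy inverted U

X = np.column_stack([np.ones_like(p), p, p ** 2])
b0, b1, b2 = np.linalg.lstsq(X, liking, rcond=None)[0]
print(f"quadratic coefficient: {b2:.2f}")              # negative => inverted U
print(f"preferred probability: {-b1 / (2 * b2):.2f}")  # vertex of the parabola
```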


Consonance in the carillon

August 2024 · 13 Reads

The Journal of the Acoustical Society of America

Previous psychological studies have shown that musical consonance is determined not only by the frequency ratios between tones but also by the frequency spectra of those tones. However, these prior studies used artificial tones, specifically tones built from a small number of pure tones, which do not match the acoustic complexity of real musical instruments. The present experiment therefore investigates tones recorded from a real musical instrument, the Westerkerk Carillon, through a “dense rating” experiment in which participants (N = 113) rated musical intervals drawn from the continuous range 0–15 semitones. Results show that the traditional consonances of the major third and the minor sixth become dissonances in the carillon and that small intervals (in particular 0.5–2.5 semitones) also become particularly dissonant. Computational modelling shows that these effects are primarily caused by interference between partials (e.g., beating), but that preference for harmonicity is also necessary to produce an accurate overall account of participants' preferences. The results support musicians' writings about the carillon and contribute to ongoing debates about the psychological mechanisms underpinning consonance perception, in particular disputing the recent claim that interference is largely irrelevant to consonance perception.
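
The "interference between partials" mechanism can be illustrated with a classic roughness model such as Sethares (1993), which sums pairwise beating between the partials of two complex tones. The bell partial ratios below are the textbook minor-third bell profile and the amplitudes are hypothetical; the study itself used measured Westerkerk spectra and more elaborate models.

```python
# Sketch of a Sethares-style interference (roughness) model for bell-like
# tones. Partial ratios follow the textbook minor-third bell (hum, prime,
# tierce, quint, nominal); amplitudes are hypothetical.
import numpy as np

def dyad_roughness(f1, f2, ratios, amps):
    """Sum pairwise roughness over all partials of two complex tones."""
    freqs = np.concatenate([f1 * ratios, f2 * ratios])
    amp = np.concatenate([amps, amps])
    total = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            lo, hi = sorted((freqs[i], freqs[j]))
            s = 0.24 / (0.0207 * lo + 18.96)  # critical-bandwidth scaling
            d = s * (hi - lo)
            total += amp[i] * amp[j] * (np.exp(-3.5 * d) - np.exp(-5.75 * d))
    return total

bell_ratios = np.array([0.5, 1.0, 1.2, 1.5, 2.0])
bell_amps = np.array([0.6, 1.0, 0.8, 0.5, 0.7])

for st in (1.0, 4.0, 8.0):  # minor 2nd, major 3rd, minor 6th
    r = dyad_roughness(440.0, 440.0 * 2 ** (st / 12), bell_ratios, bell_amps)
    print(f"{st:4.1f} semitones: roughness = {r:.3f}")
```

The major third and minor sixth are included in the loop because those are exactly the intervals the abstract reports as becoming dissonant for bell spectra.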


Perception of Chord Sequences Modeled with Prediction by Partial Matching, Voice-Leading Distance, and Spectral Pitch-Class Similarity: A New Approach for Testing Individual Differences in Harmony Perception

August 2024 · 18 Reads

Music & Science

The perception of harmony has been the subject of many studies in the research literature, though little is known regarding how individuals vary in their ability to discriminate between different chord sequences. The aim of the current study was to construct an individual-differences test for the processing of harmonic information. A stimulus database of 5076 harmonic sequences was constructed and several harmonic features were computed from these stimulus items. Participants were tasked with selecting which chord differed between two similar four-chord sequences, and their response data were modeled with explanatory item response models using the computational harmonic features as predictors. The final model suggests that participants’ responses can be modeled using transitional probabilities between chords, voice-leading distance, and spectral pitch-class distance cues, with participant ability correlated with three subscales of the Goldsmiths Musical Sophistication Index. The item response model was used to create an adaptive test of harmonic progression discrimination ability (HPT) and validated in a second study showing substantial correlations with other tests of musical perception ability, self-reported musical abilities, and a working memory task. The HPT is a new free and open-source tool for assessing individual differences in harmonic sequence discrimination. Initial data suggest this harmonic discrimination ability relies heavily on transitional probabilities within harmonic progressions.
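
An explanatory item response model of this kind treats the probability of a correct response as a logistic function of person ability minus an item difficulty that is itself a weighted sum of computed item features (a linear logistic test model). A minimal sketch with illustrative feature names and weights; the published model's predictors and estimates differ.

```python
# Minimal sketch of the explanatory IRT idea: item difficulty is decomposed
# into computational features of the chord sequence. Weights are illustrative.
import numpy as np

def p_correct(ability, features, weights, intercept=-0.5):
    """P(correct) under a linear logistic test model."""
    difficulty = intercept + float(np.dot(weights, features))
    return 1.0 / (1.0 + np.exp(-(ability - difficulty)))

# Hypothetical item: transitional-probability surprisal, voice-leading
# distance, and spectral pitch-class distance of the changed chord.
item = np.array([1.2, 0.3, 0.5])
weights = np.array([0.8, -0.6, -0.4])  # signs and magnitudes are made up
for theta in (-1.0, 0.0, 1.0):         # low, average, high ability
    print(f"ability {theta:+.1f}: P(correct) = {p_correct(theta, item, weights):.2f}")
```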


Figures: Mapping of selected emotion words to the valence–arousal (VA) space · Levels of cues for Heinichen's examples (E1, playful, set in the minor mode; E2, furious, and E4, love, in the major)
Exploring the variability of musical-emotional expression over historical time

June 2024 · 26 Reads

Empirical Musicology Review

A listening experiment was designed to test whether modern listeners perceive the same affective content in Baroque music as the composer intended to portray. Listeners rated three musical examples from Johann David Heinichen’s 1728 treatise Der General-Bass in der Composition for valence and arousal. The examples were chosen based on descriptions in which the composer outlined their intended affective content. Results showed a significant mismatch between the original descriptions and listener ratings, indicating a change in the perceived affective content of the music. The historical variability of musical-emotional expression in general is discussed, with a focus on the role of structural emotion cues (particularly mode), closing with suggestions for future research in the area of historical musical emotion.


Consonance in the carillon

March 2024

Recent research has confirmed that musical consonance is determined not only by the frequency ratios between tones but also by the frequency spectra of the underlying tones (Marjieh et al., 2024). However, this prior research was limited to artificial tones, specifically tones built from a small number of pure tones, producing sounds that do not match the acoustic complexity of real musical instruments. Here we therefore investigate tones recorded from a real musical instrument, the Westerkerk Carillon, conducting a ‘dense rating’ experiment where participants (N = 113) rated musical intervals drawn from the continuous range 0–15 semitones. We show that the traditional consonances of the major third and the minor sixth become dissonances in the carillon, and that small intervals (in particular 0.5–2.5 semitones) also become particularly dissonant. Through computational modelling we show that these effects are primarily caused by interference between partials (e.g., beating), but that a preference for harmonicity is also necessary to produce an accurate overall account of participants’ preferences. The results support musicians’ writings about the carillon and contribute to ongoing debates about the psychological mechanisms underpinning consonance perception.


Commonality and variation in mental representations of music revealed by a cross-cultural comparison of rhythm priors in 15 countries

March 2024 · 300 Reads · 13 Citations

Nature Human Behaviour

Music is present in every known society but varies from place to place. What, if anything, is universal to music cognition? We measured a signature of mental representations of rhythm in 39 participant groups in 15 countries, spanning urban societies and Indigenous populations. Listeners reproduced random ‘seed’ rhythms; their reproductions were fed back as the stimulus (as in the game of ‘telephone’), such that their biases (the prior) could be estimated from the distribution of reproductions. Every tested group showed a sparse prior with peaks at integer-ratio rhythms. However, the importance of different integer ratios varied across groups, often reflecting local musical practices. Our results suggest a common feature of music cognition: discrete rhythm ‘categories’ at small-integer ratios. These discrete representations plausibly stabilize musical systems in the face of cultural transmission but interact with culture-specific traditions to yield the diversity that is evident when mental representations are probed across many cultures.
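
The logic of the iterated-reproduction ('telephone') method is that each generation's reproduction is nudged toward the listener's internal categories, so after several generations the distribution of rhythms approximates the prior. Below is a much-simplified simulation with a single interval ratio and a toy three-category prior; the actual experiments used three-interval rhythms and nonparametric estimation of the prior.

```python
# Toy serial-reproduction simulation: reproductions drift toward the nearest
# integer-ratio category, so iterating concentrates mass at those ratios.
import numpy as np

rng = np.random.default_rng(2)
categories = np.array([1 / 3, 1 / 2, 2 / 3])  # e.g. 1:2, 1:1 and 2:1 rhythms

def reproduce(ratio, pull=0.4, noise=0.03):
    """One listener's reproduction: pull toward the nearest category plus noise."""
    nearest = categories[np.argmin(np.abs(categories - ratio))]
    return float(np.clip(ratio + pull * (nearest - ratio) + rng.normal(0, noise),
                         0.05, 0.95))

state = rng.uniform(0.1, 0.9, 1000)   # random 'seed' rhythms
for _ in range(5):                    # five generations of 'telephone'
    state = np.array([reproduce(r) for r in state])

hist, _ = np.histogram(state, bins=20, range=(0, 1))
print(np.round(hist / hist.sum(), 2))  # mass peaks near 1/3, 1/2 and 2/3
```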


Timbral effects on consonance disentangle psychoacoustic mechanisms and suggest perceptual origins for musical scales

February 2024 · 213 Reads · 9 Citations

The phenomenon of musical consonance is an essential feature in diverse musical styles. The traditional belief, supported by centuries of Western music theory and psychological studies, is that consonance derives from simple (harmonic) frequency ratios between tones and is insensitive to timbre. Here we show through five large-scale behavioral studies, comprising 235,440 human judgments from US and South Korean populations, that harmonic consonance preferences can be reshaped by timbral manipulations, even as far as to induce preferences for inharmonic intervals. We show how such effects may suggest perceptual origins for diverse scale systems ranging from the gamelan’s slendro scale to the tuning of Western mean-tone and equal-tempered scales. Through computational modeling we show that these timbral manipulations dissociate competing psychoacoustic mechanisms underlying consonance, and we derive an updated computational model combining liking of harmonicity, disliking of fast beats (roughness), and liking of slow beats. Altogether, this work showcases how large-scale behavioral experiments can inform classical questions in auditory perception.
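
One family of timbral manipulations in this line of work 'stretches' a tone's partials so that they align at a pseudo-octave wider than 2:1, which is the kind of change that can move consonance peaks off the familiar intervals. A sketch of synthesizing such a tone; the stretch factor, amplitude roll-off and frequencies here are arbitrary demo choices rather than the study's stimuli.

```python
# Sketch of a 'stretched' timbre: partial n sits at f0 * stretch**log2(n)
# instead of f0 * n, so the partials of two tones coincide at the ratio
# `stretch` (a pseudo-octave of 2.1) rather than at the true 2:1 octave.
import numpy as np

def stretched_tone(f0, stretch=2.1, n_partials=10, sr=44100, dur=1.0):
    t = np.linspace(0, dur, int(sr * dur), endpoint=False)
    tone = np.zeros_like(t)
    for n in range(1, n_partials + 1):
        freq = f0 * stretch ** np.log2(n)   # stretched partial position
        tone += 0.88 ** n * np.sin(2 * np.pi * freq * t)
    return tone / np.max(np.abs(tone))

# Dyad at the stretched pseudo-octave: interference models predict it should
# now beat less, and so sound smoother, than a true octave in this timbre.
dyad = stretched_tone(220.0) + stretched_tone(220.0 * 2.1)
```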


Jazz Trio Database: Automated Annotation of Jazz Piano Trio Recordings Processed Using Audio Source Separation

January 2024



Citations (39)


... Supplementary material is available online [56]. ...

Reference:

Rhythmic Qualities of Jazz Improvisation Predict Performer Identity and Style in Source-Separated Audio Recordings
  • Citing Preprint
  • January 2024

... In an earlier study, we demonstrated good results from using linear phase correction to model the interaction between a jazz pianist and drummer in an experiment [44], which led us to apply the same model to commercial recordings here. As in this previous study, to control for any global drift in performance tempo we expressed every quarter note inter-beat interval inputted into the model in terms of its difference from the preceding interval (figure 2d). ...
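
The differencing step described in this excerpt, followed by a regression of a performer's interval changes on their asynchrony to the partner, can be sketched as follows; the array names are hypothetical and the published model includes further terms.

```python
# Sketch of linear phase-correction estimation on detrended intervals:
# difference successive inter-beat intervals to remove global tempo drift,
# then regress those changes on the asynchrony to the partner.
import numpy as np

def phase_correction_coef(own_beats, partner_beats):
    own_ibi = np.diff(own_beats)              # inter-beat intervals
    d_ibi = np.diff(own_ibi)                  # change from the preceding interval
    asyn = (own_beats - partner_beats)[1:-1]  # asynchrony at matched beats
    X = np.column_stack([np.ones_like(asyn), asyn])
    beta = np.linalg.lstsq(X, d_ibi, rcond=None)[0]
    return beta[1]                            # negative => corrective coupling

rng = np.random.default_rng(3)
own = np.cumsum(0.5 + rng.normal(0, 0.01, 100))  # ~120 BPM quarter-note beats
partner = own + rng.normal(0, 0.02, 100)         # synthetic noisy partner track
print(f"phase-correction coefficient: {phase_correction_coef(own, partner):.2f}")
```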

Trade-offs in Coordination Strategies for Duet Jazz Performances Subject to Network Delay and Jitter

... The rhythmic features we extracted were selected both due to their prevalence in the existing quantitative literature on jazz timing, as well as a sense arising from prior qualitative and ethnographic work that jazz performers commonly used these terms when evaluating their and others' improvisation styles [3,22]. We have released this dataset under a permissive, open-access license to facilitate further research [23]. ...

Jazz Trio Database: Automated Annotation of Jazz Piano Trio Recordings Processed Using Audio Source Separation

Transactions of the International Society for Music Information Retrieval

... Our second research question pertains to the genetic and environmental basis of interindividual variability in the musical sensibility dimensions: what is the relative contribution of genes and the environment? Psychological research shows that the perception and cognition of basic musical features and emotional responses to music are highly dependent on enculturation [22-25] and personal preferences [26], and that familiar music induces the strongest emotional and physiological responses [27-30]. Thus, the wide variability in music-induced emotional experiences appears to be strongly mediated by experiential and cultural factors. ...

Commonality and variation in mental representations of music revealed by a cross-cultural comparison of rhythm priors in 15 countries

Nature Human Behaviour

... This inability suggests that their deficit prevents them from expressing a preference for either consonant or dissonant chords. Notably, the harmonic makeup of individual periodic tones remains constant, along with the likely inherited initial neural wiring system of the pitch processor (though see Marjieh et al., 2024). ...

Timbral effects on consonance disentangle psychoacoustic mechanisms and suggest perceptual origins for musical scales

... Clearly, judgements of the beautiful and the reception of art works are complex neurodynamical and mental processes that require multivariate and multidimensional methodology (Consoli, 2020), which was substantially recognized in the experiments by Cheung et al. (2019, 2024) on the appreciation of musical beauty. Specifically, Cheung et al. (2019) demonstrated that pleasure varies nonlinearly as a function of two independent variables: context uncertainty and stimulus surprisal. ...

Cognitive and sensory expectations independently shape musical expectancy and pleasure

... Hypothesis 1: Left-handers are more creative (trait Openness) (Anstee et al., 2022; Newland, 1981). Hypothesis 2: Left-handers have a big advantage at competitive sports for being more competitive (trait Agreeableness, reversed) (Coren, 1994; Hadžić, 2023). Hypothesis 3: Left-handers are more fearful (trait Neuroticism) (Ocklenburg, 2023; Orme, 1970). Hypothesis 4: Left-handers are more likely to become leaders (trait Conscientiousness) ...

Handedness and Musicality in Secondary School Students

... Neuroscience techniques: [37], [38], [41], [53], [54], [56], [60], [70]. Optimization algorithms: [62], [67], [68]. Data analysis and Bayesian models: [44], [50]-[52], [59]. Computational algorithms: [31], [32], [47], [61], [66]. Music processing and audio analysis: [39], [40], [44]-[46], [55], [58], [63], [65]. (3. RESULTS AND DISCUSSION) In this part, a bibliometric review and a thorough analysis of previous research is presented. In the first section, the connections between the concepts of musical generation and understanding, as well as the visualization of density, are revealed. ...

Large-scale iterated singing experiments reveal oral transmission mechanisms underlying music evolution

Current Biology

... It should be low-dimensional to keep the parameter space manageable for users to explore, while remaining sufficiently expressive to ensure that a good approximation of the target exists. Previous studies have explored using speaker embeddings derived from a pretrained speaker verification (SV) or text-to-speech (TTS) model as the parametric voice representation [10], [11]. However, these representations are often high-dimensional, requiring truncation to facilitate user exploration at the cost of reduced expressiveness. ...

VoiceMe: Personalized voice generation in TTS
  • Citing Conference Paper
  • September 2022

... Even though more recent work such as that of Sims (2018) has considered somewhat richer stimuli such as synthesized instrument timbres and vibrotactile patterns, these were still limited to small data sets on the scale of 10-20 stimuli. These limitations make it hard to draw conclusions about the status of the universal law of generalization in the high-dimensional regime of real-world stimuli, especially as fundamental problems in psychology continue to be reshaped by large-scale behavioral studies (see, e.g., Awad et al., 2018; Battleday et al., 2020; Marjieh et al., 2022; Peterson et al., 2021). ...

Reshaping musical consonance with timbral manipulations and massive online experiments
  • Citing Preprint
  • June 2022