Development of Simultaneous Pitch Encoding: Infants Show a High Voice Superiority Effect.

Department of Psychology, Neuroscience & Behaviour, McMaster University, Hamilton, Ontario L8S 4K1, Canada.
Cerebral Cortex (Impact Factor: 8.67). 01/2013; 23(3):660-669. DOI: 10.1093/cercor/bhs050
Source: PubMed

ABSTRACT Infants must learn to make sense of real-world auditory environments containing simultaneous and overlapping sounds. In adults, event-related potential (ERP) studies have demonstrated the existence of separate preattentive memory traces for concurrent note sequences and revealed perceptual dominance in encoding of the voice with the higher fundamental frequency when 2 tones or melodies sound simultaneously. Here, we presented 2 simultaneous streams of notes (15 semitones apart) to 7-month-old infants. On 50% of trials, either the higher or the lower note was modified by one semitone, up or down, leaving 50% standard trials. Infants showed mismatch negativity (MMN) to changes in both voices, indicating separate memory traces for each voice. Furthermore, MMN was earlier and larger for the higher voice, as in adults. When presented in the context of a second voice, the representation of the lower voice was weakened and that of the higher voice strengthened compared with when each voice was presented alone. Additionally, correlations between MMN amplitude and the amount of weekly music listening suggest that experience affects the development of auditory memory. In sum, the ability to process simultaneous pitches and the dominance of the highest voice emerge early during infancy and are likely important for the perceptual organization of sound in realistic environments.
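The oddball design described in the abstract — two concurrent voices 15 equal-tempered semitones apart, with 50% standard trials and 50% trials in which one voice is shifted by a semitone up or down — can be sketched as a stimulus-sequence generator. The base frequency, trial count, and random seed below are illustrative assumptions, not parameters reported in the paper:

```python
import random

SEMITONE = 2 ** (1 / 12)  # equal-tempered semitone frequency ratio

def make_trials(f_low=440.0, n_trials=100, seed=0):
    """Build an oddball sequence of (low, high) frequency pairs.

    The two voices are 15 semitones apart; on roughly half the trials
    (deviants) one voice is shifted by one semitone, up or down.
    """
    rng = random.Random(seed)
    f_high = f_low * SEMITONE ** 15  # higher voice, 15 semitones up
    trials = []
    for _ in range(n_trials):
        low, high = f_low, f_high
        if rng.random() < 0.5:  # 50% deviant trials
            shift = SEMITONE ** rng.choice([-1, 1])  # one semitone up or down
            if rng.random() < 0.5:
                high *= shift  # deviant in the higher voice
            else:
                low *= shift  # deviant in the lower voice
        trials.append((low, high))
    return trials
```

Because a deviant shifts only one voice, the high/low frequency ratio of each trial is 15 semitones for standards and 14 or 16 semitones for deviants, which is how the two deviant types could be identified when tagging epochs for MMN averaging.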

    • "However, as noted by recent investigators (e.g., Fujioka et al., 2005, 2008; Marie and Trainor, 2013), given the asymmetric shape of the auditory filters (i.e., peripheral tuning curves) and the well-known upward spread of masking (Egan and Hake, 1950; Delgutte, 1990a,b), these explanations would, on the contrary, predict a low voice superiority. As such, more recent theories have largely dismissed these cochlear explanations as they are inadequate to account for the high voice prominence reported in both perceptual (Palmer and Holleran, 1994; Crawley et al., 2002) and ERP data (Fujioka et al., 2008; Marie and Trainor, 2013). "
    ABSTRACT: Natural auditory environments contain multiple simultaneously-sounding objects, and the auditory system must parse the incoming complex sound wave they collectively create into parts that represent each of these individual objects. Music often similarly requires processing of more than one voice or stream at the same time, and behavioral studies demonstrate that human listeners show a systematic perceptual bias toward the highest voice in multi-voiced music. Here, we review studies utilizing event-related brain potentials (ERPs), which support the notions that (1) separate memory traces are formed for two simultaneous voices (even without conscious awareness) in auditory cortex and (2) adults show more robust encoding (i.e., larger ERP responses) to deviant pitches in the higher than in the lower voice, indicating better encoding of the former. Furthermore, infants also show this high-voice superiority effect, suggesting that the perceptual dominance observed across studies might result from neurophysiological characteristics of the peripheral auditory system. Although musically untrained adults show smaller responses in general than musically trained adults, both groups similarly show a more robust cortical representation of the higher than of the lower voice. Finally, years of experience playing a bass-range instrument reduces but does not reverse the high voice superiority effect, indicating that although it can be modified, it is not highly neuroplastic. New modeling experiments examined the possibility that characteristics of middle-ear filtering and cochlear dynamics (e.g., suppression) reflected in auditory nerve (AN) firing patterns might account for the high-voice superiority effect. Simulations show that both place and temporal AN coding schemes predict a high-voice superiority across a wide range of interval spacings and registers. Collectively, we infer an innate, peripheral origin for the high-voice superiority observed in human ERP and psychophysical music listening studies.
    Hearing Research 02/2014; 308:60-70. DOI: 10.1016/j.heares.2013.07.014 · 2.85 Impact Factor
    • "To visualize the waveforms, 72 electrodes were selected and divided into four groupings in each hemisphere, and averaged within each grouping to represent brain responses recorded at the frontal (16 electrodes), central (20 electrodes), parietal (18 electrodes), and occipital (18 electrodes) regions (Figure 3). This virtual electrode montage has been used successfully in previous EEG studies to illustrate the average responses observed across scalp regions (e.g., He and Trainor, 2009; Trainor et al., 2011; Marie and Trainor, 2012). "
    ABSTRACT: Cues to pitch include spectral cues that arise from tonotopic organization and temporal cues that arise from firing patterns of auditory neurons. fMRI studies suggest a common pitch center is located just beyond primary auditory cortex along the lateral aspect of Heschl's gyrus, but little work has examined the stages of processing for the integration of pitch cues. Using electroencephalography, we recorded cortical responses to high-pass filtered iterated rippled noise (IRN) and high-pass filtered complex harmonic stimuli, which differ in temporal and spectral content. The two stimulus types were matched for pitch saliency, and a mismatch negativity (MMN) response was elicited by infrequent pitch changes. The P1 and N1 components of event-related potentials (ERPs) are thought to arise from primary and secondary auditory areas, respectively, and to result from simple feature extraction. MMN is generated in secondary auditory cortex and is thought to act on feature-integrated auditory objects. We found that peak latencies of both P1 and N1 occur later in response to IRN stimuli than to complex harmonic stimuli, but found no latency differences between stimulus types for MMN. The location of each ERP component was estimated based on iterative fitting of regional sources in the auditory cortices. The sources of both the P1 and N1 components elicited by IRN stimuli were located dorsal to those elicited by complex harmonic stimuli, whereas no differences were observed for MMN sources across stimuli. Furthermore, the MMN component was located between the P1 and N1 components, consistent with fMRI studies indicating a common pitch region in lateral Heschl's gyrus. These results suggest that while the spectral and temporal processing of different pitch-evoking stimuli involves different cortical areas during early processing, by the time the object-related MMN response is formed, these cues have been integrated into a common representation of pitch.
    Frontiers in Psychology 06/2012; 3:180. DOI:10.3389/fpsyg.2012.00180 · 2.80 Impact Factor
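The iterated rippled noise (IRN) stimuli discussed in the abstract above are conventionally generated by a delay-and-add loop: a delayed copy of the noise is repeatedly added back to itself, building up a temporal regularity (and thus a pitch cue) at 1/delay. A minimal sketch of that algorithm follows; the delay, gain, and iteration count are hypothetical values, and the high-pass filtering used in the actual study is omitted:

```python
import random

def iterated_rippled_noise(n_samples=8820, fs=44100, delay_ms=4.0,
                           gain=1.0, iterations=8, seed=0):
    """Delay-and-add IRN: each pass adds a copy of the signal delayed by
    delay_ms, reinforcing periodicity at 1/delay (here ~250 Hz)."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    d = int(round(fs * delay_ms / 1000.0))  # delay in samples
    for _ in range(iterations):
        # add the delayed copy; samples before the delay get no addition
        x = [x[i] + (gain * x[i - d] if i >= d else 0.0)
             for i in range(n_samples)]
    # normalize to unit peak to avoid clipping after repeated additions
    peak = max(abs(v) for v in x)
    return [v / peak for v in x]
```

With more iterations the autocorrelation peak at the delay lag grows and the pitch becomes more salient, which is one common way pitch saliency is equated across IRN and complex harmonic stimuli.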
    ABSTRACT: A theory of listening to music is proposed. It suggests that, for listeners, the process of prediction is the starting point of experiencing music. This implies that perception of music begins with both a predisposed and an experience-based extrapolation into the future (labeled a priori listening). Indications for this proposal are discussed and defined using perspectives from the cognitive sciences, neuroscience, philosophy, and experimental psychology. Central to this theory is the claim that listening to music constantly interacts with creative processes.
    Creativity Research Journal 08/2013; 25(3):259-265. DOI:10.1080/10400419.2013.813759 · 0.75 Impact Factor