Article

On cortical coding of vocal communication sounds in primates


Abstract

Understanding how the brain processes vocal communication sounds is one of the most challenging problems in neuroscience. Our understanding of how the cortex accomplishes this unique task should greatly facilitate our understanding of cortical mechanisms in general. Perception of species-specific communication sounds is an important aspect of the auditory behavior of many animal species and is crucial for their social interactions, reproductive success, and survival. The principles of neural representations of these behaviorally important sounds in the cerebral cortex have direct implications for the neural mechanisms underlying human speech perception. Our progress in this area has been relatively slow, compared with our understanding of other auditory functions such as echolocation and sound localization. This article discusses previous and current studies in this field, with emphasis on nonhuman primates, and proposes a conceptual platform to further our exploration of this frontier. It is argued that the prerequisite condition for understanding cortical mechanisms underlying communication sound perception and production is an appropriate animal model. Three issues are central to this work: (i) neural encoding of statistical structure of communication sounds, (ii) the role of behavioral relevance in shaping cortical representations, and (iii) sensory-motor interactions between vocal production and perception systems.


... The marmoset is an increasingly popular biomedical non-human primate model [20][21][22][23] . It is a highly vocal and social primate species with a sophisticated vocal repertoire for vocal communication [24][25] . It has become an ideal animal model to fill the translational gap between rodent models and humans [26][27] . ...
... The seven categories of sounds are marmoset vocalizations (MarV), macaque vocalizations (MacV), human speech (HumS), other animal vocalizations (AmV), natural sounds (NS), artificial sounds (AS), and scrambled marmoset vocalizations (SMarV). Marmosets have a rich vocal repertoire for vocal communication [24][25]. Several distinct classes of vocalizations have been described for this species 32 . ...
Preprint
Full-text available
Species-specific vocalizations are behaviorally critical sounds. Similar to faces, species-specific vocalizations are important for the survival and social interactions of both humans and vocal animals. Face patches have been found in the brains of both human and non-human primates. In humans, a voice patch system has been identified on the lateral superior temporal gyrus (STG) that is selective for human voices over other sounds. In non-human primates, while vocalization-selective regions have been found on the rostral portion of the temporal lobe outside of the auditory cortex in both macaques and marmosets using functional magnetic resonance imaging (fMRI), it is not yet clear whether vocalization-selective regions are present in the auditory cortex. Using wide-field calcium imaging, a technique with both high temporal and high spatial resolution, we discovered two voice patches in the marmoset auditory cortex that preferentially respond to marmoset vocalizations over other sounds and carry call-type and identity information. One patch is located on the posterior primary auditory cortex (A1), and the other is located on the anterior non-core region of the auditory cortex. These voice patches are functionally connected and hierarchically organized, as shown by latency and selectivity analyses. Our findings reveal the existence of voice patches in the auditory cortex of marmosets and support the notion that similar cortical architectures are adapted for recognizing communication signals, both vocalizations and faces, in different primate species.
... These calls and sounds may refer to objects or events in the environment and may convey information about food, predators, social relationships, and caller identity, apart from the emotional state of the caller (Ghazanfar and Hauser, 1999). Thus the auditory system must segregate and analyse the spectrotemporal structure of auditory objects to extract the invariant acoustic cues that convey meaning (Wang, 2000; Zoloth and Green, 1979; Beecher et al., 1979; May et al., 1989). In this work, I will focus on two aspects of auditory object processing in monkeys, namely auditory segregation and timbral analysis. ...
... However, macaques are more suited as an animal model than any other mammal given their shared evolutionary lineage with humans, with the exception of great apes, in which invasive experiments are not permitted for ethical reasons. Thus, there have been anatomical, neurophysiological and behavioural studies to identify structure-function relationships in many nonhuman primates, including Old World primates such as macaques and New World monkeys such as marmosets (Hackett et al., 2001; Kaas and Hackett, 2000b; Wang, 2000; Rauschecker, 1998). The anatomical homology of the human auditory cortex with that of macaques is more evident (Hackett et al., 2001) than with other mammals; however, the exact functional homology is still under investigation (Baumann et al., 2013; Brewer and Barton, 2016), and disagreement on the extent of this homology continues. ...
Thesis
Full-text available
The anatomical organization of the auditory cortex in old world monkeys is similar to that in humans. But how good are monkeys as a model of human cortical analysis of auditory objects? To address this question I explore two aspects of auditory object processing: segregation and timbre. Auditory segregation concerns the ability of animals to extract an auditory object of relevance from a background of competing sounds. Timbre is an aspect of object identity distinct from pitch. In this work, I study these phenomena in rhesus macaques using behaviour and functional magnetic resonance imaging (fMRI). I specifically manipulate one dimension of timbre, spectral flux: the rate of change of spectral energy. In summary, I show that there is a functional homology between macaques and humans in the cortical processing of auditory figure-ground segregation. However, there is no clear functional homology in the processing of spectral flux between these species. So I conclude that, despite clear similarities in the organization of the auditory cortex and processing of auditory object segregation, there are important differences in how complex cues associated with auditory object identity are processed in the macaque and human auditory brains.
... Frequency integration can be effectively assessed with band-passed noise (BPN) stimuli, which can be well defined by a center frequency (CF) and bandwidth. BPN bursts are fundamental elements of many natural sounds, including those used for animal communications (Hauser 1996;Rauschecker and Tian 2000;Wang 2000; Wang and Kadia 2001;Akimov et al. 2017). ...
... BPN bursts are essential elements of many natural sounds, including those used for communication by many species (Rauschecker and Tian 2000;Wang 2000;Wang and Kadia 2001;Akimov et al. 2017). For processing these complex sounds, bandwidth-selective neurons with different center frequencies may play important roles at the first step of processing. ...
Article
Full-text available
Spatial size tuning in the visual cortex has been considered as an important neuronal functional property for sensory perception. However, an analogous mechanism in the auditory system has remained controversial. In the present study, cell-attached recordings in the primary auditory cortex (A1) of awake mice revealed that excitatory neurons can be categorized into three types according to their bandwidth tuning profiles in response to band-passed noise (BPN) stimuli: nonmonotonic (NM), flat, and monotonic, with the latter two considered as non-tuned for bandwidth. The prevalence of bandwidth-tuned (i.e., NM) neurons increases significantly from layer 4 to layer 2/3. With sequential cell-attached and whole-cell voltage-clamp recordings from the same neurons, we found that the bandwidth preference of excitatory neurons is largely determined by the excitatory synaptic input they receive, and that the bandwidth selectivity is further enhanced by flatly tuned inhibition observed in all cells. The latter can be attributed at least partially to the flat tuning of parvalbumin inhibitory neurons. The tuning of auditory cortical neurons for bandwidth of BPN may contribute to the processing of complex sounds.
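The stimulus class used here is easy to make concrete. Below is a minimal Python/NumPy sketch of a band-passed noise (BPN) burst defined by a center frequency (CF) and a bandwidth, as discussed above; the sampling rate, duration, octave-based bandwidth, and idealized brick-wall band edges are illustrative assumptions, not the study's actual stimulus code.

```python
import numpy as np

def bandpassed_noise(cf_hz, bw_oct, dur_s=0.2, fs=96_000, seed=0):
    """Band-passed noise (BPN) burst defined by a center frequency (CF)
    and a bandwidth (here in octaves). Implemented by zeroing FFT bins
    outside the band -- an idealized brick-wall filter for illustration."""
    rng = np.random.default_rng(seed)
    n = int(dur_s * fs)
    x = rng.standard_normal(n)                 # broadband Gaussian noise
    lo = cf_hz * 2 ** (-bw_oct / 2)            # lower band edge
    hi = cf_hz * 2 ** (bw_oct / 2)             # upper band edge
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(n, 1 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0    # keep only the passband
    y = np.fft.irfft(spec, n)
    return y / np.max(np.abs(y))               # normalize peak amplitude

burst = bandpassed_noise(cf_hz=8000, bw_oct=1.0)   # a 1-octave BPN at 8 kHz
```

Varying `bw_oct` at a fixed `cf_hz` produces the bandwidth series used to probe the nonmonotonic, flat, and monotonic tuning profiles described in the abstract.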
... (Deutsch 1986; Povel & Essens 1985; Hung 2011; Klyn et al. 2015). The articulatory loop is a working-memory mechanism that prevents pronounceable memory content from decaying through repeated articulation (Wang 2000). Finally, research on sensory-motor integration in the auditory system (Pa & Hickok 2008; Wang 2000) suggests another contributing factor. Vocal and instrumental rhythms are produced via different motor effectors in the body: larynx, tongue, lips and jaws (mouth cavity) for vocal rhythms, and upper (and sometimes lower) limbs for instrumental rhythms. ...
... Second, regularities also increase when speech is aligned with periodic motor movements as in gesturing and music making, in chant, song or rap (Cummins and Port, 1998). Even though speech, considered on its own, is not based on regular interval sequences (Dauer, 1983; Cummins, 2012), in the context of human communication it is frequently aligned with body movements, such as in gesturing and musicking (Pa & Hickok 2008; Wang 2000; Klyn et al. 2015, experiments 2 and 3; Will et al. 2015), and this alignment aids communication: temporally regular utterances of a speaker permit a listener's attentional periodicity to become aligned to them. For an additional argument from the perspective of system dynamics that underlines the relative independence of spoken language from other bodily functions, see Moore (2012). ...
... The answer to this question is critical for pinpointing the evolutionary origin of the dorsal-ventral dual streams in the primate brain. The marmoset is a highly vocal and social primate species with a sophisticated vocal repertoire for vocal communication (29,30). In recent years, marmosets have gained increasing interest in neuroscience and preclinical research (31)(32)(33)(34). ...
Article
Full-text available
Auditory dorsal and ventral pathways in the human brain play important roles in supporting speech and language processing. However, the evolutionary root of the dual auditory pathways in the primate brain is unclear. By parcellating the auditory cortex of marmosets (a New World monkey species), macaques (an Old World monkey species), and humans using the same individual-based analysis method and tracking the pathways from the auditory cortex based on multi-shell diffusion-weighted MRI (dMRI), homologous auditory dorsal and ventral fiber tracts were identified in these primate species. The ventral pathway was found to be well conserved in all three primate species analyzed but extends to more anterior temporal regions in humans. In contrast, the dorsal pathway showed a divergence between monkey and human brains. First, frontal regions in the human brain have stronger connections to the higher-level auditory regions than to the lower-level auditory regions along the dorsal pathway, while frontal regions in the monkey brain show the opposite connection pattern along the dorsal pathway. Second, the left lateralization of the dorsal pathway is found only in humans. Moreover, the connectivity strength of the dorsal pathway in marmosets is more similar to that of humans than to that of macaques. These results demonstrate the continuity and divergence of the dual auditory pathways in primate brains along the evolutionary path, suggesting that the putative neural networks supporting human speech and language processing might have emerged early in primate evolution.
... One advantage of such stimulus sensitivity could be for governing the perception and mediating the salience of specific natural sounds, like vocalizations, which have complex spectrotemporal trajectories, often with meanings that can be deciphered only after sound cessation (Wang, 2000;Bar-Yosef et al., 2002;Lewicki, 2002;Seyfarth and Robert, 2003). Additionally, sound tuned OFF neural activity could be a mechanism for the sensory system to maintain a brief echoic memory of the specific preceding stimulus, especially when the OFF response is more tonic or longer lasting (Nees, 2016;Kinukawa et al., 2019). ...
Article
Full-text available
In studying how neural populations in sensory cortex code dynamically varying stimuli to guide behavior, the role of spiking after stimuli have ended has been underappreciated. This is despite growing evidence that such activity can be tuned, experience- and context-dependent, and necessary for sensory decisions that play out on a slower timescale. Here we review recent studies, focusing on the auditory modality, demonstrating that this so-called OFF activity can have a more complex temporal structure than the purely phasic firing that has often been interpreted as just marking the end of stimuli. While diverse and still incompletely understood mechanisms are likely involved in generating phasic and tonic OFF firing, more studies point to the continuing post-stimulus activity serving a short-term, stimulus-specific mnemonic function that is enhanced when the stimuli are particularly salient. We summarize these results with a conceptual model highlighting how more neurons within the auditory cortical population fire for longer duration after a sound’s termination during an active behavior and can continue to do so even while passively listening to behaviorally salient stimuli. Overall, these studies increasingly suggest that tonic auditory cortical OFF activity holds an echoic memory of specific, salient sounds to guide behavioral decisions.
... However, call selectivity of neurons in the FAF did not change when calls were presented within sequences at high repetition rate. It is possible that the mostly linear nonselective responses of the A1 provide a general and flexible mechanism for auditory stream analyses, whereas the very nonlinear call selectivity observed in FAF serves a more behaviorally specific discrimination of species-specific vocalizations, as proposed for the auditory cortex of the marmoset monkey (50). Physiological identification of a downstream auditory processing region in the bat frontal cortical areas may provide a path toward deciphering the cellular and circuit mechanisms that underlie vocal communication in the mammalian brain. ...
Article
Full-text available
In this study, we examined the auditory responses of a prefrontal area, the frontal auditory field (FAF), of an echolocating bat (Tadarida brasiliensis) and present a comparative analysis of neuronal response properties between the FAF and the primary auditory cortex (A1). We compared single-unit responses from the A1 and the FAF elicited by pure tones, downward frequency-modulated (dFM) sweeps, and species-specific vocalizations. Unlike A1 neurons, FAF neurons were not frequency tuned. However, progressive increases in downward FM sweep rate elicited a systematic increase in response precision, a phenomenon that does not take place in the A1. Call selectivity was higher in the FAF than in the A1. We calculated neuronal spectrotemporal receptive fields (STRFs) and spike-triggered averages (STAs) to predict responses to the communication calls and to explain the differences in call selectivity between the FAF and A1. In the A1, we found a high correlation between predicted and evoked responses. However, we could not generate reasonable STRFs in the FAF, and predictions based on the STAs showed a lower correlation coefficient than in the A1. This suggests that nonlinear response properties in the FAF are stronger than the largely linear response properties of the A1. Stimulating with a call sequence increased call selectivity in the A1, but it remained unchanged in the FAF. These data are consistent with a role for the FAF in assessing distinctive acoustic features downstream of A1, similar to the role proposed for primate ventrolateral prefrontal cortex.
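The spike-triggered average (STA) named in this abstract is a standard linear characterization: average the stimulus history immediately preceding each spike. A toy Python sketch with a synthetic stimulus and a simulated neuron (not the authors' analysis pipeline):

```python
import numpy as np

def spike_triggered_average(stim, spike_idx, window):
    """Mean of the stimulus segments preceding each spike. `stim` is a 1-D
    stimulus waveform sampled on the same clock as the spike indices;
    `window` is the number of samples of stimulus history to average."""
    segs = [stim[i - window:i] for i in spike_idx if i >= window]
    return np.mean(segs, axis=0)

# Simulated neuron: fires 5 samples after any large stimulus excursion
rng = np.random.default_rng(1)
stim = rng.standard_normal(10_000)
spikes = [i for i in range(50, 10_000) if stim[i - 5] > 2.0]

sta = spike_triggered_average(stim, spikes, window=20)
# The STA peaks 5 samples before the spike, i.e. at index 15 of the
# 20-sample window (lags run from -20 to -1 relative to spike time).
```

For a linear neuron the STA approximates its temporal receptive field; the abstract's point is that this prediction works well in A1 but fails in the FAF, indicating nonlinear processing.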
... For example, modulation of visual responses in layer VI corticothalamic neurons can alter the ratio of burst to tonic firing in the LGN (Godwin et al. 1996; Wang et al. 2006). Although sustained responses are crucial for temporal processing and especially important for speech and music processing (Rosen 1992; Wang 2000), the effect of corticothalamic modulation on the sustained response remains unknown. Due to the relatively delayed effect of corticofugal modulation (in comparison to the direct sensory response), the effects of corticothalamic modulation may be especially evident in the long-lasting sustained response, raising the possibility that there are temporal effects that have not yet been examined. ...
Article
Full-text available
Cortical feedback has long been considered crucial for the modulation of sensory perception and recognition. However, previous studies have shown varying modulatory effects of the primary auditory cortex (A1) on the auditory responses of subcortical neurons, which complicates interpretations regarding the function of A1 in sound perception and recognition. This has been further complicated by studies conducted under different brain states. In the current study, we used cryo-inactivation in A1 to examine the role of corticothalamic feedback on medial geniculate body (MGB) neurons in awake marmosets. The primary effects of A1 inactivation were a frequency-specific decrease in the auditory response of most MGB neurons coupled with an increased spontaneous firing rate, which together resulted in a decrease in the signal-to-noise ratio. In addition, we report for the first time that A1 robustly modulated the long-lasting sustained response of MGB neurons, which changed their frequency tuning after A1 inactivation: some neurons showed sharper tuning with corticofugal feedback, while others became more broadly tuned. Taken together, our results demonstrate that corticothalamic modulation in awake marmosets serves to enhance sensory processing in a manner similar to the center-surround models proposed in the visual and somatosensory systems, a finding which supports common principles of corticothalamic processing across sensory systems.
... For aperiodic stimuli, onset units report the timing of discrete events. The communication sounds of primates, bats, and other species are rich in temporal structure [89,90], typically at low to moderate temporal modulation rates [91]. In the zebra finch caudomedial nidopallium (NCM), a higher-level auditory area important for recognition of songs, phasic neurons responded preferentially to rapid temporal features and were coherent with frequencies up to 20 to 30 Hz, whereas tonic neurons followed low-frequency modulations [92]. ...
Article
Full-text available
Studies of the encoding of sensory stimuli by the brain often consider recorded neurons as a pool of identical units. Here, we report divergence in stimulus-encoding properties between subpopulations of cortical neurons that are classified based on spike timing and waveform features. Neurons in auditory cortex of the awake marmoset (Callithrix jacchus) encode temporal information with either stimulus-synchronized or nonsynchronized responses. When we classified single-unit recordings using either a criteria-based or an unsupervised classification method into regular-spiking, fast-spiking, and bursting units, a subset of intrinsically bursting neurons formed the most highly synchronized group, with strong phase-locking to sinusoidal amplitude modulation (SAM) that extended well above 20 Hz. In contrast with other unit types, these bursting neurons fired primarily on the rising phase of SAM or the onset of unmodulated stimuli, and preferred rapid stimulus onset rates. Such differentiating behavior has been previously reported in bursting neuron models and may reflect specializations for detection of acoustic edges. These units responded to natural stimuli (vocalizations) with brief and precise spiking at particular time points that could be decoded with high temporal stringency. Regular-spiking units better reflected the shape of slow modulations and responded more selectively to vocalizations with overall firing rate increases. Population decoding using time-binned neural activity found that decoding behavior differed substantially between regular-spiking and bursting units. A relatively small pool of bursting units was sufficient to identify the stimulus with high accuracy in a manner that relied on the temporal pattern of responses. These unit type differences may contribute to parallel and complementary neural codes.
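Phase-locking to sinusoidal amplitude modulation (SAM), as reported for the bursting units above, is conventionally quantified by vector strength (Goldberg and Brown's measure). A minimal sketch with synthetic spike times rather than recorded data:

```python
import numpy as np

def vector_strength(spike_times_s, mod_freq_hz):
    """Vector strength: each spike is a unit vector at its phase within the
    modulation cycle; the result is the length of the mean vector
    (1 = perfect phase-locking, ~0 = no locking)."""
    phases = 2 * np.pi * mod_freq_hz * np.asarray(spike_times_s)
    return float(np.hypot(np.cos(phases).mean(), np.sin(phases).mean()))

# Perfectly locked: one spike per 16-Hz SAM cycle, always at the same phase
locked = np.arange(100) / 16.0
print(vector_strength(locked, 16.0))           # -> 1.0

# Unlocked: uniformly random spike times give vector strength near 0
rng = np.random.default_rng(2)
print(vector_strength(rng.uniform(0, 10, 2000), 16.0))
```

In practice the measure is paired with a Rayleigh test for significance; high vector strength well above 20 Hz is what distinguishes the bursting subpopulation in this study.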
... Brain areas for processing vocalizations parallel those for face processing, running dorsally through the temporal lobe in humans (Belin et al. 2000; Binder et al. 2000; von Kriegstein et al. 2003), chimpanzees (Taglialatela et al. 2008), macaques (Gil-da-Costa et al. 2004; Poremba et al. 2004; Petkov et al. 2008) and marmosets (Sadagopan et al. 2015). Vocalization-sensitive neurons have been recorded in marmosets (Wang et al. 1995; Wang 2000; Wang & Kadia 2001) and in the macaque monkey (Rauschecker et al. 1995; Tian et al. 2001; Recanzone 2008; Russ et al. 2008; Kikuchi et al. 2010), principally clustered within the superior temporal gyrus (Perrodin et al. 2011) but also in the insula, prefrontal and orbitofrontal cortex (Romanski & Goldman-Rakic 2002; Cohen et al. 2004; Rolls et al. 2006; Remedios et al. 2009). These areas also contribute to the perception of non-vocal auditory communicative signals: macaque drumming (an auditory attention-getter) activates both the amygdala and cortical 'voice' areas in macaques (Remedios et al. 2009). ...
Article
Full-text available
Primates present a rich range of communication strategies in different modalities that evolved as signaling, perceiving and signaling back behaviors. This diversity benefits from specialized dedicated neural pathways for signaling and for perceiving communication signals. The brain areas for perceiving and producing communicative signals can be described separately, but form integrated neural loops, which coordinate perception and action in the signaler and receiver. Moreover, the different sensory modalities are initially processed separately by the brain but eventually share neural pathways for communication: a redundancy that might ensure proper signal transfer. Only a few primate species have been studied so far, including rhesus, long-tailed macaques, squirrel monkeys, marmosets, and humans. Yet, the evidence suggests that all primates possess specialized neural pathways coordinating a diverse range of communication systems for organizing their complex kin and friendship bonds.
... Many previous studies of auditory processing in non-human primates were performed in anesthetized preparations [13][14][15]. In this study, we adopted a fentanyl cocktail for light anesthesia [16], under which pure-tone responses were still robustly evoked in A1 [6], and the overall level of Cal-520AM fluorescence remained largely unchanged (Fig 5a). ...
Article
Full-text available
Marmosets are highly social non-human primates living in families. They exhibit rich vocalization, but the neural basis underlying complex vocal communication is largely unknown. Here we report the existence of specific neuron populations in marmoset A1 that respond selectively to distinct simple or compound calls made by conspecific marmosets. These neurons were spatially dispersed within A1 but distinct from those responsive to pure tones. Call-selective responses were markedly diminished when individual domains of the call were deleted or the domain sequence was altered, indicating the importance of global rather than local spectral-temporal properties of the sound. Compound call-selective responses also disappeared when the sequence of the two simple-call components was reversed or their interval was extended beyond 1 second. Light anesthesia largely abolished call-selective responses. Our findings demonstrate extensive inhibitory and facilitatory interactions among call-evoked responses, and provide the basis for further study of circuit mechanisms underlying vocal communication in awake non-human primates.
... Modulation of signal amplitude is a fundamental acoustic cue that is present in speech, nonhuman vocalizations, and many other natural sounds (Shannon et al. 1995;Wang 2000;Singh and Theunissen 2003;Zeng et al. 2005;Elliott and Theunissen 2009). Although neural responses to amplitude modulated (AM) sounds are well characterized (Joris et al. 2004;Malone et al. 2010), their relationship to perceptual judgments is less certain. ...
Article
Core auditory cortex (AC) neurons encode slow fluctuations of acoustic stimuli with temporally patterned activity. However, whether temporal encoding is necessary to explain auditory perceptual skills remains uncertain. Here, we recorded from gerbil AC neurons while they discriminated between a 4-Hz amplitude modulation (AM) broadband noise and AM rates >4 Hz. We found a proportion of neurons possessed neural thresholds based on spike pattern or spike count that were better than the recorded session's behavioral threshold, suggesting that spike count could provide sufficient information for this perceptual task. A population decoder that relied on temporal information outperformed a decoder that relied on spike count alone, but the spike count decoder still remained sufficient to explain average behavioral performance. This leaves open the possibility that more demanding perceptual judgments require temporal information. Thus, we asked whether accurate classification of different AM rates between 4 and 12 Hz required the information contained in AC temporal discharge patterns. Indeed, accurate classification of these AM stimuli depended on the inclusion of temporal information rather than spike count alone. Overall, our results compare two different representations of time-varying acoustic features that can be accessed by downstream circuits required for perceptual judgments.
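The comparison in this abstract, decoding from temporal spike patterns versus spike count alone, can be illustrated with a toy simulation. In the sketch below (all parameters invented for illustration), two AM rates evoke equal total spike counts but different temporal patterns, so a leave-one-out nearest-neighbor decoder succeeds on time-binned patterns while a count-only decoder stays near chance:

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins, n_trials = 100, 20                  # 1-s trials in 10-ms bins

def trial(rate_hz):
    """Toy response: spikes locked to AM cycle onsets, with the number of
    spikes per cycle chosen so both rates give 8 spikes/s, plus jitter."""
    r = np.zeros(n_bins)
    spikes_per_cycle = 8 // rate_hz          # equalizes total spike count
    for c in range(rate_hz):
        start = int(c * n_bins / rate_hz)
        r[start:start + spikes_per_cycle] = 1
    return r + 0.1 * rng.standard_normal(n_bins)

X = np.array([trial(r) for r in (4, 8) for _ in range(n_trials)])
y = np.array([0] * n_trials + [1] * n_trials)

def nn_accuracy(features):
    """Leave-one-out nearest-neighbor decoding accuracy."""
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # exclude each trial itself
    return (y[d.argmin(1)] == y).mean()

acc_pattern = nn_accuracy(X)                       # full temporal pattern
acc_count = nn_accuracy(X.sum(1, keepdims=True))   # spike count only
print(acc_pattern, acc_count)   # pattern high, count near chance (0.5)
```

This is the logic of the study's harder task: when counts are matched across stimuli, only the temporal structure of the response carries the AM-rate information.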
... Marmosets are an ideal species for investigating vocal processing because they exhibit rich vocal behaviors (Epple, 1968; Stevenson and Poole, 1976; Snowdon 2001) and possess a large, well-characterized vocal repertoire that is retained in captivity (Wang, 2000; DiMattina and Wang, 2006; Pistorio et al., 2006). The marmoset auditory system is well studied: at the cortical level, the anatomy and connectivity of primary and higher auditory cortices are well characterized (e.g., Aitkin et al., 1993; de la Mothe et al., 2006, 2012), basic neural response properties are known (e.g., Aitkin et al., 1986; Kajikawa et al., 2005; Philibert et al., 2005; Wang et al., 2005; Bendor and Wang, 2008), the neural basis for responses to more complex stimuli has been studied (Kadia and Wang, 2003; Sadagopan and Wang, 2009), and the neural representation of conspecific vocalizations by neurons in the primary and belt auditory cortex has been explored (Wang et al., 1995; Wang and Kadia, 2001; Nagarajan et al., 2002; Kajikawa et al., 2008). ...
Preprint
Vocalizations are behaviorally critical sounds, and this behavioral importance is reflected in the ascending auditory system, where conspecific vocalizations are increasingly over-represented at higher processing stages. Recent evidence suggests that, in macaques, this increasing selectivity for vocalizations might culminate in a cortical region that is densely populated by vocalization-preferring neurons. Such a region might be a critical node in the representation of vocal communication sounds, underlying the recognition of vocalization type, caller and social context. These results raise the questions of whether cortical specializations for vocalization processing exist in other species, their cortical location, and their relationship to the auditory processing hierarchy. To explore cortical specializations for vocalizations in another species, we performed high-field fMRI of the auditory cortex of a vocal New World primate, the common marmoset (Callithrix jacchus). Using a sparse imaging paradigm, we discovered a caudal-rostral gradient for the processing of conspecific vocalizations in marmoset auditory cortex, with regions of the anterior temporal lobe close to the temporal pole exhibiting the highest preference for vocalizations. These results demonstrate similar cortical specializations for vocalization processing in macaques and marmosets, suggesting that cortical specializations for vocal processing might have evolved before the lineages of these species diverged.
... -Both species have complex, albeit fairly different, social behaviours that they regulate using very different sets of complex vocalizations that are well characterized acoustically (macaque: Fukushima et al., 2015; Green, 1975; Hauser, 1991; Kalin et al., 1992; marmoset: Agamaite et al., 2015; Miller et al., 2010a; Pistorio et al., 2006; Turesson et al., 2016). -Both models are widely studied in neuroscience, particularly in the auditory domain, providing large amounts of physiological, anatomical and neuroimaging data for reference (e.g., macaque: Gil-da-Costa et al., 2006; Hackett, 2011; Kaas et al., 1999; Petkov et al., 2015; Poremba et al., 2004; Rauschecker and Tian, 2000; Recanzone, 2008; Tian et al., 2001; marmoset: Bendor and Wang, 2005; Wang, 2008, 2013; Newman et al., 2009; Nummela et al., 2017; Roy et al., 2016; Wang, 2000; Wang and Kadia, 2001; Wang et al., 1995). In particular, the marmoset is highly promising for the application of gene-editing techniques in a primate model (Marx, 2016; Miller et al., 2016; Okano et al., 2016). ...
... In this case, individuals must have the capacities to recognise, modify, and assess the stored representation of a voice as it changes. The neural mechanisms that would allow animals to develop such capacities, as well as the ability to memorise different versions of individual calls, have been shown to be present in mammals (Wang 2000; Charrier et al. 2003). ...
... This challenge is not uniquely human. Animals produce species-specific vocalizations (calls) with large within-and between-caller variability 3 , and must classify these calls into distinct categories to produce appropriate behaviors. For example, in common marmosets (Callithrix jacchus), a highly vocal New World primate species, critical behaviors such as finding other marmosets when isolated depend on accurate extraction of call-type and caller information [4][5][6][7][8] . ...
Article
Full-text available
Humans and vocal animals use vocalizations to communicate with members of their species. A necessary function of auditory perception is to generalize across the high variability inherent in vocalization production and classify them into behaviorally distinct categories (‘words’ or ‘call types’). Here, we demonstrate that detecting mid-level features in calls achieves production-invariant classification. Starting from randomly chosen marmoset call features, we use a greedy search algorithm to determine the most informative and least redundant features necessary for call classification. High classification performance is achieved using only 10–20 features per call type. Predictions of tuning properties of putative feature-selective neurons accurately match some observed auditory cortical responses. This feature-based approach also succeeds for call categorization in other species, and for other complex classification tasks such as caller identification. Our results suggest that high-level neural representations of sounds are based on task-dependent features optimized for specific computational goals.
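A greedy forward search of the kind described, iteratively adding the feature that most improves classification, can be sketched as follows. This toy version uses a nearest-centroid classifier on synthetic Gaussian features; the actual call features, classifier, and informativeness/redundancy criteria in the study differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2 call types, 60 candidate features; only features 0-4 carry
# label information, and feature 5 is a near-duplicate (redundant) of 0.
n, d = 200, 60
y = rng.integers(0, 2, n)
X = rng.standard_normal((n, d))
X[:, :5] += 2.0 * y[:, None]
X[:, 5] = X[:, 0] + 0.01 * rng.standard_normal(n)

def accuracy(cols):
    """Nearest-centroid training accuracy using the selected columns."""
    Z = X[:, cols]
    mu = np.array([Z[y == k].mean(0) for k in (0, 1)])
    pred = np.argmin(((Z[:, None] - mu[None]) ** 2).sum(-1), axis=1)
    return (pred == y).mean()

selected = []
for _ in range(3):
    # Add whichever remaining feature yields the largest accuracy gain;
    # a redundant feature adds ~no gain, so it is naturally skipped.
    gains = [(accuracy(selected + [j]), j)
             for j in range(d) if j not in selected]
    selected.append(max(gains)[1])

print(selected, accuracy(selected))   # selected indices, training accuracy
```

As in the paper's result, a small number of well-chosen features suffices: the greedy criterion favors informative features and implicitly penalizes redundant ones, since a duplicate of an already-selected feature improves the classifier by almost nothing.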
... This network has been identified in humans (DéMonet et al., 1992; Belin et al., 2000; Binder et al., 2000; Belin et al., 2002), chimpanzees (Taglialatela et al., 2009), macaques (Gil-da-Costa et al., 2004; Poremba et al., 2004; Gil-da-Costa et al., 2006; Ghazanfar and Rendall, 2008; Petkov et al., 2008) and marmosets (Sadagopan et al., 2014). Voice-selective neurons, also called "call detector" neurons, were first recorded in squirrel monkeys and marmosets (Wang et al., 1995; Wang, 2000; Wang and Kadia, 2001). Voice-selective neurons were subsequently described in macaques (Rauschecker et al., 1995; Tian et al., 2001; Recanzone, 2008; Russ et al., 2008; Kikuchi et al., 2010), mainly localized in cortical regions identified by fMRI (Perrodin et al., 2011), but also in the insula (Remedios et al., 2009a) and in prefrontal and orbitofrontal cortex (Romanski and Goldman-Rakic, 2002; Cohen et al., 2004; Rolls et al., 2006). ...
Article
Full-text available
Primates, like all animals live in an environment that includes others. They can be detected by others and can influence the likelihood (and consequences) of this detection by sending signals. Signals are bodily features or behaviors of the signaler that trigger specific behaviors in the receiver. The receiver, signaler, signal and medium are the four basic building blocks of any communication cycle. Each component can be considered separately, but in the service of communication they are interdependent and defined only in relation to one other. Cycles of reciprocal signal exchange mediate social interactions, but even “asocial” species coordinate reproduction, manage conflict over territory, and may anticipate and influence potential predators and prey. Communication arose long before the evolution of primates, animals and even neurons, yet is a crucial aspect of primate behavior and of their nervous system evolution. In this review, we examine how exchanges take place among primates and how neural systems act to mediate them.
... In conclusion, we propose a hierarchical model for solving a central problem in auditory perception - the goal-oriented categorization of sounds that show high within-category variability such as speech 1, 2 or animal vocalizations 3 . Our work has broad implications as to where in the auditory pathway categorization begins to emerge, and what features are optimal to learn in categorization tasks. ...
Preprint
Humans and vocal animals use vocalizations (human speech or animal "calls") to communicate with members of their species. A necessary function of auditory perception is to generalize across the high variability inherent in the production of these sounds and classify them into perceptually distinct categories ("words" or "call types"). Here, we demonstrate using an information-theoretic approach that production-invariant classification of calls can be achieved by detecting mid-level acoustic features. Starting from randomly chosen marmoset call features, we used a greedy search algorithm to determine the most informative and least redundant set of features necessary for call classification. Call classification at >95% accuracy could be accomplished using only ~10 features per call type. Most importantly, predictions of the tuning properties of putative neurons selective for such features accurately matched previously observed responses of superficial layer neurons in primary auditory cortex. Such a feature-based approach could also solve other complex classification tasks such as caller identification. Our results suggest that high-level neural representations of sounds are based on task-dependent features optimized for specific computational goals.
Article
Full-text available
We review behavioural and neural evidence for the processing of information contained in conspecific vocalizations (CVs) in three primate species: humans, macaques and marmosets. We focus on abilities that are present and ecologically relevant in all three species: the detection and sensitivity to CVs; and the processing of identity cues in CVs. Current evidence, although fragmentary, supports the notion of a "voice patch system" in the primate brain analogous to the face patch system of visual cortex: a series of discrete, interconnected cortical areas supporting increasingly abstract representations of the vocal input. A central question concerns the degree to which the voice patch system is conserved in evolution. We outline challenges that arise and suggesting potential avenues for comparing the organization of the voice patch system across primate brains.
... These results obtained from A1 of awake marmosets provide valuable insights into cortical processing of acoustic information at the cellular level in non-human primates. The intracellular recording technique developed in our study opens the door for further studies of cellular mechanisms underlying complex and natural sound processing in populations of neurons in the auditory cortex of marmosets or other animal models (Wang 2000; Mizrahi et al. 2014). ...
Article
Extracellular recording studies have revealed diverse and selective neural responses in the primary auditory cortex (A1) of awake animals. However, we have limited knowledge of the subthreshold events that give rise to these responses, especially in non-human primates, as intracellular recordings in awake animals pose substantial technical challenges. We developed a novel intracellular recording technique in awake marmosets to systematically study the subthreshold activity of A1 neurons that underlies their diverse and selective spiking responses. Our findings showed that, in contrast to the predominantly transient depolarization observed in A1 of anesthetized animals, both transient and sustained depolarization (during or beyond the stimulus period) were observed. Compared with spiking responses, subthreshold responses were often longer lasting in duration and more broadly tuned in frequency, and showed narrower intensity tuning in non-monotonic neurons and lower response thresholds in monotonic neurons. These observations demonstrate the enhancement of stimulus selectivity from subthreshold to spiking responses in individual A1 neurons. Furthermore, A1 neurons classified as regular- or fast-spiking subpopulations based on their spike shapes exhibited distinct response properties in the frequency and intensity domains. These findings provide valuable insights into cortical integration and transformation of auditory information at the cellular level in the auditory cortex of awake non-human primates.
... The processing of species-specific communication signals has been the focus of neuroethological studies in both invertebrates and vertebrates for more than three decades. Neuroethological studies on vertebrates are typically conducted on anurans (Fuzessery and Feng 1983; Mudry and Capranica 1987), birds (Bonke 1979; Koppl et al. 2000; Marler and Doupe 2000; Scheich 1977b; Solis and Doupe 1999; Theunissen and Doupe 1998), primates (Glass and Wollberg 1983; Rauschecker and Tian 2000; Wang 2000; Winter and Funckenstein 1973), and bats (Esser et al. 1997; Kanwal et al. 1994; Ohlemiller et al. 1994, 1996), because these animals are highly vocal and their communication calls have been cataloged and studied behaviorally. With few exceptions, these studies have focused on forebrain structures and have largely left lower auditory nuclei unexplored. ...
Article
Here we show that inhibition shapes diverse responses to species-specific calls in the inferior colliculus (IC) of Mexican free-tailed bats. We presented 10 calls to each neuron of which 8 were social communication and 2 were echolocation calls. We also measured excitatory response regions: the range of tone burst frequencies that evoked discharges at a fixed intensity. The calls evoked highly selective responses in that IC neurons responded to some calls but not others even though those calls swept through their excitatory response regions. By convolving activity in the response regions with the spectrogram of each call, we evaluated whether responses to tone bursts could predict discharge patterns evoked by species-specific calls. The convolutions often predicted responses to calls that evoked no responses and thus were inaccurate. Blocking inhibition at the IC reduced or eliminated selectivity and greatly improved the predictive accuracy of the convolutions. By comparing the responses evoked by two calls with similar spectra, we show that each call evoked a unique spatiotemporal pattern of activity distributed across and within isofrequency contours and that the disparity in the population response was greatly reduced by blocking inhibition. Thus the inhibition evoked by each call can shape a unique pattern of activity in the IC population and that pattern may be important for both the identification of a particular call and for discriminating it from other calls and other signals.
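The prediction step described in this abstract, convolving tone-derived response regions with each call's spectrogram, can be sketched in a simplified linear form: weight each spectrogram frequency channel by the neuron's tone-evoked response at that frequency, then sum across frequency at each time bin. All numbers below are made-up toy values, not data from the study:

```python
def predict_psth(tuning, spectrogram):
    """Linear prediction of a time-varying response.

    tuning: tone-evoked response strength per frequency channel.
    spectrogram: call power, indexed as [frequency][time].
    Returns the predicted response at each time bin, obtained by
    weighting each frequency channel by the tuning and summing.
    """
    n_time = len(spectrogram[0])
    return [sum(w * channel[t] for w, channel in zip(tuning, spectrogram))
            for t in range(n_time)]

# Toy example: a neuron excited only by the middle frequency channel.
tuning = [0.0, 1.0, 0.0]
spectrogram = [
    [1.0, 2.0, 1.0],  # low-frequency channel
    [0.0, 3.0, 5.0],  # middle-frequency channel
    [2.0, 0.0, 1.0],  # high-frequency channel
]
predicted = predict_psth(tuning, spectrogram)  # → [0.0, 3.0, 5.0]
```

A purely excitatory model like this predicts a response whenever call energy crosses the excitatory response region, which is exactly why the authors' finding is notable: with inhibition intact, many calls that sweep through the response region evoke no spikes, so this linear prediction fails until inhibition is blocked.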
... In mammals, tonotopy is established in the receptor organ, the cochlea, and maintained as a systematic spatial separation of different frequencies in different areas of the ascending auditory pathway, and in the auditory cortex. While temporal modulation is recognised as an essential perceptual component of communication sounds such as human speech and animal vocalisations (Rosen, 1992;Drullman et al., 1994;Shannon et al., 1995;Wang, 2000;Chi et al., 2005;Elliott and Theunissen, 2009), its representation in the auditory system is poorly understood. In contrast to sound frequency, amplitude modulation rate is not spatially organised in the cochlea but represented in the temporal dynamics of neuronal firing patterns. ...
Article
Full-text available
eLife digest The arrival of sound waves at the ear causes the fluid inside a part of the ear known as the cochlea to vibrate. These vibrations are detected by tiny hair cells and transformed into electrical signals, which travel along the auditory nerve to the brain. After processing in the brainstem and other regions deep within the brain, the signals reach a region called the auditory cortex, where they undergo further processing. The cells in the cochlea that respond to sounds of similar frequencies are grouped together, forming what is known as a tonotopic map. This also happens in the auditory cortex. However, the temporal properties of sounds—such as how quickly the volume of a sound changes over time—are represented differently. In the cochlea these properties are instead encoded by the rate at which the cochlear nerve fibres ‘fire’ (that is, the rate at which they generate electrical signals). However, it is not clear how the temporal properties of sound waves are represented in auditory cortex. Baumann et al. have now addressed this question by scanning the brains of three awake macaque monkeys as the animals listened to bursts of white noise with varying properties. This revealed that just as neurons that respond to sounds of similar frequencies are grouped together within auditory cortex, so too are neurons that respond to sounds with similar temporal properties. When these temporal preferences are plotted on a map of auditory cortex, they form a series of concentric rings lying at right angles to the frequency map in certain areas. Recent brain imaging studies in humans have also suggested the existence of a ‘temporal map’. Further experiments are now required to determine exactly how neurons within the auditory cortex encode the temporal characteristics of sounds. DOI: http://dx.doi.org/10.7554/eLife.03256.002
... The voice processing pathway parallels that for face processing, running dorsally through the temporal lobe in humans (Belin, Zatorre, & Ahad, 2002; Belin, Zatorre, Lafaille, Ahad, & Pike, 2000; Binder et al., 2000; DéMonet, Jiang, Shuman, & Kanwisher, 1992), chimpanzees (Taglialatela, Russell, Schaeffer, & Hopkins, 2009), macaques (Ghazanfar & Rendall, 2008; Gil-da-Costa et al., 2004; Petkov et al., 2008; Poremba et al., 2004) and marmosets (Sadagopan, Temiz-Karayol, & Voss, 2015). Voice-selective "call detector" neurons were first recorded in squirrel and marmoset monkeys (Wang, 2000; Wang & Kadia, 2001; Wang, Merzenich, Beitel, & Schreiner, 1995). Voice-selective neurons were next described in the macaque monkey (Kikuchi, Horwitz, & Mishkin, 2010; Rauschecker, Tian, & Hauser, 1995; Recanzone, 2008; Russ, Ackelson, Baker, & Cohen, 2008; Tian, Reser, Durham, Kustov, & Rauschecker, 2001), principally clustered within fMRI-identified voice-selective areas (Perrodin, Kayser, Logothetis, & Petkov, 2011) but also in the insula (Remedios, Logothetis, & Kayser, 2009a) and in prefrontal and orbitofrontal cortex (Cohen, Russ, Gifford, Kiringoda, & MacLean, 2004; Rolls, Critchley, Browning, & Inoue, 2006; Romanski & Goldman-Rakic, 2002). ...
Chapter
Primates present a rich range of communication strategies in different modalities that evolved as signaling, perceiving, and signaling back behaviors. This chapter reviews general principles of signaling exchange, and then details how these exchanges take place among nonhuman primates and how neural systems act to mediate them. Signals can be sent through diverse media: Tactile signals are primarily short range, instantaneous, and reciprocal. The chapter discusses neural mechanisms by which two specific, relatively well-studied primate signals are produced: facial expressions and vocalizations. The chapter discusses how the mechanisms of primate signaling generalize, first across vertebrates and then across all species. There are relatively few points of contact between systems neuroscience descriptions of primate social interactions and those of invertebrates, despite many interesting examples of sophisticated communicative behavior, such as the dances of bees. The chapter outlines commonalities between primate and nonprimate communication systems, and discusses broader implications for the study of social neuroscience.
... Some of the reasons for this popularity are that marmosets have a similar disease susceptibility profile to humans, are relatively easy to handle, have a high reproductive rate, and that important genetic and neuroscience research tools already exist [7]. In particular, marmosets are an excellent model for the neurophysiological study of vocal communication [8]. A PubMed search on "Callithrix jacchus" shows between 113 and 164 publications per year during the last 10 years, with most of the publications in biomedicine. ...
Article
Full-text available
Automatic classification of vocalization type could potentially become a useful tool for the acoustic monitoring of captive colonies of highly vocal primates. However, for classification to be useful in practice, a reliable algorithm that can be successfully trained on small datasets is necessary. In this work, we consider seven different classification algorithms with the goal of finding a robust classifier that can be successfully trained on small datasets. We found good classification performance (accuracy > 0.83 and F1-score > 0.84) using the Optimum Path Forest classifier. Dataset and algorithms are made publicly available.
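For reference, the two evaluation metrics quoted here can be computed from true and predicted call labels in a one-vs-rest fashion. A minimal sketch with made-up toy labels (not the paper's dataset, and not the Optimum Path Forest classifier itself):

```python
def accuracy_f1(y_true, y_pred, positive):
    """Accuracy over all labels, plus F1 for one call type ('positive')
    treated one-vs-rest: F1 is the harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return acc, f1

# Toy labels using two marmoset call-type names for illustration.
y_true = ["phee", "phee", "twitter", "twitter", "phee"]
y_pred = ["phee", "twitter", "twitter", "twitter", "phee"]
acc, f1 = accuracy_f1(y_true, y_pred, positive="phee")
# acc = 0.8 (4 of 5 correct); f1 = 0.8 (precision 1.0, recall 2/3)
```

Multi-class results such as those in the paper are typically summarized by averaging the per-class F1 over call types; accuracy alone can be misleading when call types are unevenly represented, which is why both metrics are reported.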
... The common marmoset (Callithrix jacchus) is a small-bodied New World primate species that has emerged in recent years as a promising model system for studies of auditory and vocal processing (Wang, 2000, 2007; Miller et al., 2016), including several recent studies examining the behavioral and neural mechanisms of pitch perception for complex sounds (Bendor and Wang, 2005; Bendor et al., 2012; Osmanski et al., 2013; Song et al., 2016). However, no previous psychoacoustic studies have examined frequency discrimination for pure tones across a broad frequency range in the common marmoset, and such data are critical for future work exploring the underlying mechanisms of complex sound processing, including vocal perception, in this species. ...
Article
The common marmoset (Callithrix jacchus) is a highly vocal New World primate species that has emerged in recent years as a promising model system for studies of auditory and vocal processing. Our recent studies have examined perceptual mechanisms related to the pitch of harmonic complex tones in this species. However, no previous psychoacoustic work has measured marmosets' frequency discrimination abilities for pure tones across a broad frequency range. Here we systematically examined frequency difference limens (FDLs), which measure the minimum discriminable frequency difference between two pure tones, in marmosets across most of their hearing range. Results show that marmosets' FDLs are comparable to other New World primates, with lowest values in the frequency range of ∼3.5-14 kHz. This region of lowest FDLs corresponds with the region of lowest hearing thresholds in this species measured in our previous study and also with the greatest concentration of spectral energy in the major types of marmoset vocalizations. These data suggest that frequency discrimination in the common marmoset may have evolved to match the hearing sensitivity and spectral characteristics of this species' vocalizations.
... They have a hearing range (Osmanski and Wang, 2011) and pitch perception behaviors (Song et al., 2015) similar to humans, and a cortical organization similar to other non-human primates (Aitkin et al., 1986; de la Mothe et al., 2006; Bendor and Wang, 2008). They have a relatively flat and easily accessible auditory cortex (Aitkin et al., 1986; Wang, 2000) that is well suited for intracellular recordings. Using this technique, we have investigated the cellular mechanisms underlying rate-coding by the two distinct populations of neurons in marmoset auditory cortex. ...
Article
Full-text available
A key computational principle for encoding time-varying signals in auditory and somatosensory cortices of monkeys is the opponent model of rate coding by two distinct populations of neurons. However, the subthreshold mechanisms that give rise to this computation have not been revealed. Because the rate-coding neurons are only observed in awake conditions, it is especially challenging to probe their underlying cellular mechanisms. Using a novel intracellular recording technique that we developed in awake marmosets, we found that the two types of rate-coding neurons in auditory cortex exhibited distinct subthreshold responses. While the positive-monotonic neurons (monotonically increasing firing rate with increasing stimulus repetition frequency) displayed sustained depolarization at high repetition frequency, the negative-monotonic neurons (opposite trend) instead exhibited hyperpolarization at high repetition frequency but sustained depolarization at low repetition frequency. The combination of excitatory and inhibitory subthreshold events allows the cortex to represent time-varying signals through these two opponent neuronal populations.
... Antiphonal calling is identified by the emergence of two different patterns, indicating communication between two anurans, as shown in Figure 3. Statistical analysis of the physical variables used to differentiate the vocalizations of various animal species has focused only on describing the behavior of these variables through summary measures, graphs, and non-parametric tests [3], [10], [16]. As Figures 2 and 3 show, detailed acoustic analysis is impossible by visual inspection of these spectrograms alone. ...
... The TITW is a basic psychological concept in sound perception in humans (Bregman, 1990; Moore, 2003; Grondin, 2010; Grahn, 2012) and in animals, mostly in monkeys (Kojima, 1985; Lu and Wang, 2000; Wang, 2000; Mustovic et al., 2003; Fritz et al., 2005). We take the TITW to be the time during which multiple events are integrated to form a single percept of the time interval. ...
Article
Full-text available
The guinea pig (GP) is an often-used species in hearing research. However, behavioral studies are rare, especially in the context of sound recognition, because of difficulties in training these animals. We examined sound recognition in a social competitive setting in order to examine whether this setting could be used as an easy model. Two starved GPs were placed in the same training arena and compelled to compete for food after hearing a conditioning sound (CS), which was a repeat of almost identical sound segments. Through a two-week intensive training, animals were trained to demonstrate a set of distinct behaviors solely to the CS. Then, each of them was subjected to generalization tests for recognition of sounds that had been modified from the CS in spectral, fine temporal and tempo (i.e., intersegment interval, ISI) dimensions. Results showed that they discriminated between the CS and band-rejected test sounds but had no preference for a particular frequency range for the recognition. In contrast, sounds modified in the fine temporal domain were largely perceived to be in the same category as the CS, except for the test sound generated by fully reversing the CS in time. Animals also discriminated sounds played at different tempos. Test sounds with ISIs shorter than that of the multi-segment CS were discriminated from the CS, while test sounds with ISIs longer than that of the CS segments were not. For the shorter ISIs, most animals initiated apparently positive food-access behavior as they did in response to the CS, but discontinued it during the sound-on period probably because of later recognition of tempo. Interestingly, the population range and mean of the delay time before animals initiated the food-access behavior were very similar among different ISI test sounds. This study, for the first time, demonstrates a wide aspect of sound discrimination abilities of the GP and will provide a way to examine tempo perception mechanisms using this animal species.
... More recently, these areas have been viewed as a more general-purpose acoustic problem solver, with neural tuning to complex sound patterns such as speech (Belin et al. 2002, 2004; Fecteau et al. 2004; Mesgarani et al. 2014). The LTL/STG show selective responses to species-specific vocalizations in humans and other mammals, such as the marmoset (Wang 2000; Belin et al. 2002). Recent studies utilizing fMRI and implanted recording electrodes suggest that phonemes, words, and phrases elicit localized responses within the human LTL/STG (DeWitt & Rauschecker 2012; Grodzinsky & Nelken 2014; Mesgarani et al. 2014). ...
Article
Full-text available
Objectives: Cochlear implants are a standard therapy for deafness, yet the ability of implanted patients to understand speech varies widely. To better understand this variability in outcomes, the authors used functional near-infrared spectroscopy to image activity within regions of the auditory cortex and compare the results to behavioral measures of speech perception. Design: The authors studied 32 deaf adults hearing through cochlear implants and 35 normal-hearing controls. The authors used functional near-infrared spectroscopy to measure responses within the lateral temporal lobe and the superior temporal gyrus to speech stimuli of varying intelligibility. The speech stimuli included normal speech, channelized speech (vocoded into 20 frequency bands), and scrambled speech (the 20 frequency bands were shuffled in random order). The authors also used environmental sounds as a control stimulus. Behavioral measures consisted of the speech reception threshold, consonant-nucleus-consonant words, and AzBio sentence tests measured in quiet. Results: Both control and implanted participants with good speech perception exhibited greater cortical activations to natural speech than to unintelligible speech. In contrast, implanted participants with poor speech perception had large, indistinguishable cortical activations to all stimuli. The ratio of cortical activation to normal speech to that of scrambled speech directly correlated with the consonant-nucleus-consonant words and AzBio sentences scores. This pattern of cortical activation was not correlated with auditory threshold, age, side of implantation, or time after implantation. Turning off the implant reduced the cortical activations in all implanted participants. 
Conclusions: Together, these data indicate that the responses the authors measured within the lateral temporal lobe and the superior temporal gyrus correlate with behavioral measures of speech perception, demonstrating a neural basis for the variability in speech understanding outcomes after cochlear implantation.
... Neurons with selectivity for the center frequency and bandwidth of BPN bursts could participate in the decoding of communication sounds, which contain many instances of BPN bursts in a variety of species (Wang, 2000), including humans. ...
Article
Auditory cortex in the superior temporal lobe consists of three major subdivisions, which are apparent cytoarchitectonically and histochemically: core, belt (which surrounds the core), and parabelt (PB), all of which contain several subfields. The three subdivisions can also be distinguished functionally: Neurons in the core show primary-like responses with narrow tuning to tone frequency, belt neurons respond best to band-passed noise of a specific frequency and bandwidth, and PB neurons respond to increasingly complex sounds. The belt areas give rise to two major pathways, one that is anteroventrally directed and projects to ventrolateral prefrontal cortex and another that is posterodorsally directed and projects to dorsolateral prefrontal cortex. The ventral stream underlies auditory pattern and object recognition, including the decoding of speech sounds at the level of phonemes, words, and short phrases. The dorsal stream is involved in the processing of auditory space and motion and is generally considered an audiomotor pathway for sensorimotor integration and control. As such, it is involved in functions of sentence comprehension, silent speech, and processing of musical sequences. Inferior parietal and premotor cortices are all part of this dorsal stream network.
... The neural representation of time-varying signals in auditory cortex is of special interest to our understanding of mechanisms underlying speech processing. Time-varying signals are fundamental components of communication sounds such as human speech and animal vocalizations, as well as musical sounds (Rosen, 1992;Wang, 2000). Low-frequency modulations are important for speech perception and melody recognition, while higher-frequency modulations produce other types of sensations such as pitch and roughness (Houtgast and Steeneken, 1973;Rosen, 1992). ...
Article
How the brain processes temporal information embedded in sounds is a core question in auditory research. This article synthesizes recent studies from our laboratory regarding neural representations of time-varying signals in the auditory cortex and thalamus of awake marmoset monkeys. Findings from these studies show that 1) the primary auditory cortex (A1) uses a temporal representation to encode slowly varying acoustic signals and a firing rate-based representation to encode rapidly changing acoustic signals, 2) the dual temporal-rate representations in A1 represent a progressive transformation from the auditory thalamus, 3) firing rate-based representations in the form of a monotonic rate-code are also found to encode slow temporal repetitions in the range of acoustic flutter in A1 and more prevalently in the cortical fields rostral to A1 in the core region of the marmoset auditory cortex, suggesting further temporal-to-rate transformations in higher cortical areas. These findings indicate that the auditory cortex forms internal representations of the temporal characteristics of sounds that are no longer faithful replicas of their acoustic structures. We suggest that such transformations are necessary for the auditory cortex to perform a wide range of functions including sound segmentation, object processing and multi-sensory integration.
... Marmosets are an ideal species for investigating vocal processing because they exhibit rich vocal behaviors [15][16][17] and possess a large, well-characterized vocal repertoire that is retained in captivity [18][19][20] . The marmoset auditory system is well studied: at the cortical level, the anatomy and connectivity of primary and higher auditory cortices are well characterized [21][22][23] , basic neural response properties are known [24][25][26][27][28] , the neural basis of responses to more complex stimuli has been studied 6,29 and the neural representation of conspecific vocalizations by neurons in the primary and belt auditory cortex has been explored 5,[30][31][32] . ...
Article
Full-text available
Vocalizations are behaviorally critical sounds, and this behavioral importance is reflected in the ascending auditory system, where conspecific vocalizations are increasingly over-represented at higher processing stages. Recent evidence suggests that, in macaques, this increasing selectivity for vocalizations might culminate in a cortical region that is densely populated by vocalization-preferring neurons. Such a region might be a critical node in the representation of vocal communication sounds, underlying the recognition of vocalization type, caller and social context. These results raise the questions of whether cortical specializations for vocalization processing exist in other species, their cortical location, and their relationship to the auditory processing hierarchy. To explore cortical specializations for vocalizations in another species, we performed high-field fMRI of the auditory cortex of a vocal New World primate, the common marmoset (Callithrix jacchus). Using a sparse imaging paradigm, we discovered a caudal-rostral gradient for the processing of conspecific vocalizations in marmoset auditory cortex, with regions of the anterior temporal lobe close to the temporal pole exhibiting the highest preference for vocalizations. These results demonstrate similar cortical specializations for vocalization processing in macaques and marmosets, suggesting that cortical specializations for vocal processing might have evolved before the lineages of these species diverged.
... The first studies describing auditory neurons selective for vocalizations were conducted in New World monkey species (parvorder Platyrrhini): the squirrel monkey (genus Saimiri) and the marmoset (genus Callithrix). As early as the 1970s, researchers set out to find neurons selective for vocalizations, termed call detectors, and found a few examples that were not clearly defined (Wang, 2000). Subsequent recordings from marmoset auditory neurons demonstrated the existence of two classes of neurons: one responding selectively to vocalization types or to callers, the other responding to a broad range of sounds including vocalizations and other noises (Wang et al., 1995; Wang and Kadia, 2001). ...
Thesis
Humans can individually recognize some hundreds of persons and therefore operate within a rich and complex society. Individual recognition can be achieved by identifying distinct elements, such as the face or voice, as belonging to one individual. In humans, these different cues are linked into one conceptual representation of individual identity. I demonstrated that rhesus monkeys, like humans, individually recognize familiar peers as well as familiar humans, and that they match each voice to the corresponding memorized face. This shows that fine individual recognition is a skill shared across a range of primate species, which may serve as the basis of a sophisticated social network. It also suggests that animals' brains flexibly adapt to recognize individuals of other species when this is socio-ecologically relevant. At the neuronal level, this project revealed that social knowledge about other individuals is represented by hippocampal as well as inferotemporal neurons. For instance, I observed face-preferring neurons not only in the inferotemporal cortex, as previously described, but also in the hippocampus. Comparison of their properties across the two structures suggests that they could play complementary roles in the recognition of individuals. Finally, because the hippocampus is a structure that has evolved to various degrees to support autobiographical memory and spatial information in different mammals, I characterized the different subtypes of neurons and their network connectivity in the monkey hippocampus to provide a common anatomical framework for discussing hippocampal functions across species.
Preprint
Cortical feedback has long been considered crucial for the modulation of sensory processing. In the mammalian auditory system, studies have suggested that corticofugal feedback can have excitatory, inhibitory, or both effects on the responses of subcortical neurons, leading to controversies regarding the role of corticothalamic influence. This has been further complicated by studies conducted under different brain states. In the current study, we used cryo-inactivation of the primary auditory cortex (A1) to examine the role of corticothalamic feedback on medial geniculate body (MGB) neurons in awake marmosets. The primary effects of A1 inactivation were a frequency-specific decrease in the auditory response of MGB neurons coupled with an increased spontaneous firing rate, which together resulted in a decrease in the signal-to-noise ratio. In addition, we report for the first time that A1 robustly modulated the long-lasting sustained response of MGB neurons, which changed their frequency tuning after A1 inactivation: neurons with sharp tuning increased their tuning bandwidth, whereas those with broad tuning decreased it. Taken together, our results demonstrate that corticothalamic modulation in awake marmosets serves to enhance sensory processing in a way similar to the center-surround models proposed for the visual and somatosensory systems, a finding which supports common principles of corticothalamic processing across sensory systems.
Chapter
Animal vocal communication serves various purposes. It can mediate both intraspecific and interspecific, or intrasexual and intersexual communication in several different contexts (e.g., in antipredator, reproductive, and cohesion events). Increasing our knowledge of the physiological mechanisms and acoustic principles underlying sound production and perception in nonhuman primates is important for analyzing primate vocalizations at the light of their commonalities with other species, allowing broader comparative studies with other vertebrates. This chapter reviews recent progress in the study of auditory processing, from vocal production to vocal perception in nonhuman primates, with particular focus on vocal plasticity in different behavioral contexts. The chapter also reviews new advances in laying a framework for understanding the interplay of hormones, experience, perception, and learning on vocal production.
Chapter
Synopsis The auditory cortex of primates, including humans, consists of several functionally specialized fields that are organized into a core with primary-like response properties and surrounding belt and parabelt regions. The anterior belt region gives rise to a ventral pathway that projects all the way to ventrolateral prefrontal cortex and is specialized for the decoding of complex sounds, including species-specific communication sounds. Posterior belt gives rise to a dorsal pathway that includes regions specialized for the processing of auditory space and motion and for auditory-motor processing more generally. In humans, these auditory-motor regions evolved into regions for speech production.
Article
Bats use a large repertoire of calls for social communication. In the bat Phyllostomus discolor, social communication calls are often characterized by sinusoidal amplitude and frequency modulations with modulation frequencies in the range of 100-130 Hz. However, peaks in mammalian auditory cortical modulation transfer functions are typically limited to modulation frequencies below 100 Hz. We investigated the coding of sinusoidally amplitude modulated sounds in auditory cortical neurons in P. discolor by constructing rate and temporal modulation transfer functions. Neuronal responses to playbacks of various communication calls were additionally recorded and compared with the neurons' responses to sinusoidally amplitude-modulated sounds. Cortical neurons in the posterior dorsal field of the auditory cortex were tuned to unusually high modulation frequencies: rate modulation transfer functions often peaked around 130 Hz (median: 87 Hz), and the median of the highest modulation frequency that evoked significant phase-locking was also 130 Hz. Both values are much higher than reported from the auditory cortex of other mammals, with more than 51% of the units preferring modulation frequencies exceeding 100 Hz. Conspicuously, the fast modulations preferred by the neurons match the fast amplitude and frequency modulations of prosocial, and mostly of aggressive, communication calls in P. discolor. We suggest that the preference for fast amplitude modulations in the P. discolor dorsal auditory cortex serves to reliably encode the fast modulations seen in their communication calls. NEW & NOTEWORTHY Neural processing of temporal sound features is crucial for the analysis of communication calls. In bats, these calls are often characterized by fast temporal envelope modulations. Because auditory cortex neurons typically encode only low modulation frequencies, it is unclear how species-specific vocalizations are cortically processed. 
We show that auditory cortex neurons in the bat Phyllostomus discolor encode fast temporal envelope modulations. This property improves response specificity to communication calls and thus might support species-specific communication.
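The rate modulation transfer functions described above are built by counting spikes at each tested modulation frequency and locating the peak, the rate-based best modulation frequency (rBMF). A minimal sketch of that step, using a hypothetical band-pass rMTF rather than the paper's data:

```python
import numpy as np

def best_modulation_frequency(mod_freqs, spike_counts):
    """Return the modulation frequency evoking the highest mean spike count (rBMF)."""
    mean_rates = [np.mean(counts) for counts in spike_counts]
    return mod_freqs[int(np.argmax(mean_rates))]

# Hypothetical rMTF: band-pass shaped, peaking near 128 Hz as in the bat data.
mod_freqs = [8, 16, 32, 64, 128, 256]                      # Hz
spike_counts = [[2, 3], [5, 4], [9, 8], [14, 15], [22, 21], [7, 6]]  # per-trial counts
print(best_modulation_frequency(mod_freqs, spike_counts))  # → 128
```

In practice the same per-trial counts would also feed a significance test against the unmodulated-tone response before an rBMF is assigned.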
Article
Background: Large animal models, such as the transgenic (tg) Huntington disease (HD) minipig, have been proposed to improve translational reliability and assessment of safety, efficacy and tolerability in preclinical studies. Minipigs are characterised by high genetic homology and comparable brain structures to humans. In addition, behavioural assessments successfully applied in humans could be explored in minipigs to establish similar endpoints in preclinical and clinical studies. Recently, analysis of voice and speech production was established to characterise HD patients. Objective: The aim of this study was to investigate whether vocalisation could also serve as a viable marker for phenotyping minipigs transgenic for Huntington's disease (tgHD) and whether tgHD minipigs reveal changes in this domain compared to wildtype (wt) minipigs. Methods: While conducting behavioural testing, incidence of vocalisation was assessed for a cohort of 14 tgHD and 18 wt minipigs. Statistical analyses were performed using Fisher's Exact Test for group comparisons and McNemar's Test for intra-visit differences between tgHD and wt minipigs. Results: Vocalisation can easily be documented during phenotyping assessments of minipigs. Differences in vocalisation incidences across behavioural conditions were detected between tgHD and wt minipigs. Influence of the genotype on vocalisation was detectable during a period of 1.5 years. Conclusion: Vocalisation may be a viable marker for phenotyping minipigs transgenic for the Huntington gene. Documentation of vocalisation provides a non-invasive opportunity to capture potential disease signs and explore phenotypic development including the age of disease manifestation.
Article
Full-text available
Frequency modulation (FM) is a common acoustic feature of natural sounds and is known to play a role in robust sound source recognition. Auditory neurons show precise stimulus-synchronized discharge patterns that may be used for the representation of low-rate FM. However, it remains unclear whether this representation is based on synchronization to slow temporal envelope (ENV) cues resulting from cochlear filtering or phase locking to faster temporal fine structure (TFS) cues. To investigate the plausibility of those encoding schemes, single units of the ventral cochlear nucleus of guinea pigs of either sex were recorded in response to sine FM tones centered at the unit’s best frequency (BF). The results show that, in contrast to high-BF units, for modulation depths within the receptive field, low-BF units (<4 kHz) demonstrate good phase locking to TFS. For modulation depths extending beyond the receptive field, the discharge patterns follow the ENV and fluctuate at the modulation rate. The receptive field proved to be a good predictor of the ENV responses for most primary-like and chopper units. The current in vivo data also reveal a high level of diversity in responses across unit types. TFS cues are mainly conveyed by low-frequency and primary-like units and ENV cues by chopper and onset units. The diversity of responses exhibited by cochlear nucleus neurons provides a neural basis for a dual-coding scheme of FM in the brainstem based on both ENV and TFS cues.
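Phase locking of the kind discussed here, whether to the slow envelope or to the temporal fine structure, is conventionally quantified with the vector strength of Goldberg and Brown (1 = all spikes at one phase of the period, 0 = spikes uniformly spread). A minimal sketch with illustrative spike trains, not the study's recordings:

```python
import numpy as np

def vector_strength(spike_times, period):
    """Vector strength: magnitude of the mean phase vector of spike times.

    spike_times : spike times in seconds
    period      : period of the modulation (or fine-structure) cycle in seconds
    """
    phases = 2 * np.pi * (np.asarray(spike_times) % period) / period
    return np.abs(np.mean(np.exp(1j * phases)))

# Spikes locked to a 100 Hz cycle (10 ms period): one spike per cycle, same phase.
locked = np.arange(0, 1, 0.01)
# Spikes scattered uniformly over the same second.
rng = np.random.default_rng(0)
scattered = rng.uniform(0, 1, 100)

print(vector_strength(locked, 0.01))     # near 1
print(vector_strength(scattered, 0.01))  # near 0
```

Significance of the locking is then usually assessed with the Rayleigh statistic, roughly 2n times the squared vector strength for n spikes.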
Article
Hearing research has long been facilitated by rodent models, although in some diseases human symptoms cannot be recapitulated. The common marmoset (Callithrix jacchus) is a small, easy-to-handle New World monkey whose temporal bone anatomy, including the middle-ear ossicular chain and the inner ear, is more similar to that of humans than is the rodent's. Here, we report a reproducible, safe, and rational surgical approach to the cochlear round window niche for drug delivery to the inner ear of the common marmoset. We adopted posterior tympanotomy, a procedure used clinically in human surgery, to avoid manipulation of the tympanic membrane that may cause conductive hearing loss. This surgical procedure did not lead to any significant hearing loss. This approach was possible because of the large bulla of the common marmoset, although the lateral semicircular canal and the vertical portion of the facial nerve must be treated with care. This surgical method allows safe and accurate administration of drugs without hearing loss, which is of great importance in obtaining preclinical proof of concept for translational research.
Article
Full-text available
We investigated neural coding of sinusoidally modulated tones (sAM and sFM) in the primary auditory cortex (A1) of awake marmoset monkeys, demonstrating that there are systematic cortical representations of embedded temporal features that are based on both average discharge rate and stimulus-synchronized discharge patterns. The rate-representation appears to be coded alongside the stimulus-synchronized discharges, such that the auditory cortex has access to both rate and temporal representations of the stimulus at high and low frequencies, respectively. Furthermore, we showed that individual auditory cortical neurons, as well as populations of neurons, have common features in their responses to both sAM and sFM stimuli. These results may explain the similarities in the perception of sAM and sFM stimuli as well as the different perceptual qualities effected by different modulation frequencies. The main findings include the following. 1) Responses of cortical neurons to sAM and sFM stimuli in awake marmosets were generally much stronger than responses to unmodulated tones. Some neurons responded to sAM or sFM stimuli but not to pure tones. 2) The discharge rate-based modulation transfer function typically had a band-pass shape and was centered at a preferred modulation frequency (rBMF). Population-averaged mean firing rate peaked at 16- to 32-Hz modulation frequency, indicating that the A1 was maximally excited by this frequency range of temporal modulations. 3) Only approximately 60% of recorded units showed statistically significant discharge synchrony to the modulation waveform of sAM or sFM stimuli. The discharge synchrony-based best modulation frequency (tBMF) was typically lower than the rBMF measured from the same neuron. The distribution of rBMF over the population of neurons was approximately one octave higher than the distribution of tBMF. 
4) There was a high degree of similarity between cortical responses to sAM and sFM stimuli that was reflected in both discharge rate- or synchrony-based response measures. 5) Inhibition appeared to be a contributing factor in limiting responses at modulation frequencies above the rBMF of a neuron. And 6) neurons with shorter response latencies tended to have higher tBMF and maximum discharge synchrony frequency than those with longer response latencies. rBMF was not significantly correlated with the minimum response latency.
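The sAM and sFM stimuli in such experiments have a standard construction: a carrier at the neuron's best frequency with either a sinusoidal amplitude envelope (sAM) or a sinusoidal instantaneous-frequency trajectory (sFM). A minimal synthesis sketch, with illustrative parameter values rather than those of the study:

```python
import numpy as np

def sam_tone(fc, fm, depth, dur, fs=44100):
    """sAM tone: carrier fc (Hz) with envelope (1 + depth*sin(2*pi*fm*t))."""
    t = np.arange(int(dur * fs)) / fs
    return (1 + depth * np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)

def sfm_tone(fc, fm, df, dur, fs=44100):
    """sFM tone: instantaneous frequency fc + df*sin(2*pi*fm*t).

    The phase is the integral of the instantaneous frequency, giving a
    modulation index of df/fm.
    """
    t = np.arange(int(dur * fs)) / fs
    return np.sin(2 * np.pi * fc * t - (df / fm) * np.cos(2 * np.pi * fm * t))

# A 4 kHz carrier modulated at 16 Hz, near the population-best range reported above.
am_stim = sam_tone(fc=4000, fm=16, depth=0.8, dur=0.5)
fm_stim = sfm_tone(fc=4000, fm=16, df=400, dur=0.5)
```

Sweeping `fm` over, say, 4-512 Hz while holding the carrier fixed yields the stimulus set from which rate- and synchrony-based modulation transfer functions are measured.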
Chapter
Neurons selective for signal duration have been reported from the auditory midbrain and cortex in a variety of echolocating bats. The first part of this chapter discusses the importance of signal duration to echolocation by bats. It examines the different types of auditory duration-tuned neurons that have been described, explores the neural mechanisms that create their temporally selective response properties in the auditory midbrain or inferior colliculus, and ends by speculating on the possible function(s) of duration tuning to hearing and echolocation by bats. The second part of this chapter describes the neural representation of echoes of complex objects and species-specific vocalizations in the auditory cortex of echolocating bats. It highlights recent findings on how the coding of complex spectrotemporal echo features is related to important tasks in object recognition such as the normalization of object size or the processing of time-variant echoes from complex moving targets. To close the loop between neural processing mechanisms and perception, electrophysiological findings are related to the behavioral performance of bats in psychophysical tasks. The chapter concludes with a section on neural processing of conspecific vocalizations in the auditory cortex and amygdala of echolocating bats.
Chapter
Understanding how the brain processes vocal communication sounds remains one of the most challenging problems in neuroscience. Species-specific vocalizations of nonhuman primates are communication sounds used in intraspecies interactions, analogous to speech in humans. Primate vocalizations are of special interest to us because, compared with other animal species, primates share the most similarities with humans in the anatomical structures of their central nervous systems, including the cerebral cortex. Therefore, neural mechanisms underlying perception and production of species-specific primate vocalizations may have direct implications for those operating in the human brain for speech processing. Although field studies provide full access to the natural behavior of primates, it is difficult to combine them with physiological studies at the single neuron level in the same animals. The challenge is to develop appropriate primate models for laboratory studies where both vocal behavior and underlying physiological structures and mechanisms can be systematically investigated. This is a crucial step in understanding how the brain processes vocal communication sounds at the cellular and systems levels. Most primates have a well-developed and sophisticated vocal repertoire in their natural habitats; however, for many primate species such as macaque monkeys, vocal activities largely diminish under the captive conditions commonly found in research institutions, due in part to the lack of proper social housing environments. Fortunately, some primate species such as New World monkeys (e.g., marmosets, squirrel monkeys) remain highly vocal in properly configured captive conditions. These primate species can serve as excellent models to study neural mechanisms responsible for processing species-specific vocalizations.
Chapter
Communication involves transmitting a signal encoded with information that can be interpreted by a receiver and used to mediate behavioral responses and decisions. In order for communication to function properly, some level of co-evolution between signal producer and receiver must have occurred; otherwise, signals would either be ignored or misinterpreted. Although this relationship is evident in every communication system, specialized systems offer a unique opportunity to observe how the specific features of a system interact to facilitate communication. For example, in vocal communication systems, individuals communicate by emitting vocalizations that are then interpreted by the auditory system of conspecifics. The information content and structure of the signals are manifested in a suite of acoustic variables that conform to species-typical boundaries. In a specialized system, particular acoustic features are encoded in the vocal signal and transmitted to the receiver who, in turn, has evolved a perceptual system to interpret particular features of the signal. Subtle differences in acoustic structure can be interpreted to indicate vastly different pieces of information. To decipher these signals, researchers must understand the interaction between acoustic features within the call and the behaviors that are elicited by such features. Specialized systems of vocal communication offer us an important opportunity to investigate this relationship most effectively.
Article
The explanation of animal communication by means of concepts like information, meaning and reference is one of the central foundational issues in animal behaviour studies. This book explores these issues, revolving around questions such as:
• What is the nature of information?
• What theoretical roles does information play in animal communication studies?
• Is it justified to employ these concepts in order to explain animal communication?
• What is the relation between animal signals and human language?
The book approaches the topic from a variety of disciplinary perspectives, including ethology, animal cognition, theoretical biology and evolutionary biology, as well as philosophy of biology and mind. A comprehensive introduction familiarises non-specialists with the field and leads on to chapters ranging from philosophical and theoretical analyses to case studies involving primates, birds and insects. The resulting survey of new and established concepts and methodologies will guide future empirical and theoretical research.
Article
The main functions of hearing are (a) to identify sounds, largely for the purpose of auditory communication, and (b) to localize sounds in space, mostly for the purpose of tracking and navigation. The brain seems to solve these two tasks in largely segregated cortical processing streams, a ventral and a dorsal stream. Besides processing of space and motion, the dorsal stream also participates in other important forms of audio-motor behavior, including sensorimotor control and integration for speech and music in humans.
Article
Full-text available
We review studies that identify midbrain mechanisms in fish, amphibians, and reptiles that solve acoustic problems common to all vertebrates, including humans. The homologue of the inferior colliculus (IC) in fish, amphibians, and reptiles is the torus semicircularis (TS) (Nieuwenhuys et al. 1998). The TS, like the IC, "is a nexus of the auditory system because it processes and integrates almost all ascending acoustic information from lower centers, and it determines the form in which information is conveyed to higher regions in the forebrain" (Pollak et al. 2003). To remind the readers of this homology, we will use the term TS/IC.
Article
Full-text available
Recent data obtained by various methods of clinical investigations suggest an organization of language in the human brain involving compartmentalization into separate systems subserving different language functions. Each system includes multiple essential areas localized in the frontal and temporoparietal cortex of the dominant hemisphere, as well as widely dispersed neurons. All components of a system are activated in parallel, possibly by ascending thalamocortical circuits. The features peculiar to cerebral language organization include not only the lateralization of essential areas to one hemisphere, but also a substantial variance in the individual patterns of localization within that hemisphere, a variance that in part relates to individual differences in verbal skills.
Article
Full-text available
Cerebral activation was measured with positron emission tomography in ten human volunteers. The primary auditory cortex showed increased activity in response to noise bursts, whereas acoustically matched speech syllables activated secondary auditory cortices bilaterally. Instructions to make judgments about different attributes of the same speech signal resulted in activation of specific lateralized neural systems. Discrimination of phonetic structure led to increased activity in part of Broca's area of the left hemisphere, suggesting a role for articulatory recoding in phonetic perception. Processing changes in pitch produced activation of the right prefrontal cortex, consistent with the importance of right-hemisphere mechanisms in pitch perception.
Article
Full-text available
Ten Japanese macaques were trained to discriminate between two types of Japanese macaque coo vocalizations before and after auditory cortex ablation. Five of the animals were tested following left unilateral ablation, whereas the other five were tested following right unilateral ablation. After postoperative testing, symmetrical lesions were made in the remaining hemisphere in two animals from each group and the effect of bilateral lesions was assessed. The animals were tested using a shock avoidance procedure. Unilateral ablation of left auditory cortex consistently resulted in an initial impairment in the ability to discriminate between the vocalizations with the animals regaining normal performance in 5-15 sessions. In contrast, right unilateral ablation had no detectable effect on the discrimination. Bilateral auditory cortex ablation rendered the animals permanently unable to discriminate between the coos. Although the monkeys could learn to discriminate the coos from noise and from 2- and 4-kHz tones, they had great difficulty in discriminating between the coos and tones in the same frequency range as the coos (i.e., 500 Hz and 1 kHz). The initial impairment following left unilateral lesions indicates that the ability to perceive species-specific vocalizations is lateralized to the left hemisphere. The observation that bilateral lesions abolish the discrimination indicates that the recovery in the left lesion cases was the result of the right hemisphere mediating the discrimination.
Article
Full-text available
The primate somatosensory cortex, which processes tactile stimuli, contains a topographic representation of the signals it receives, but the way in which such maps are maintained is poorly understood. Previous studies of cortical plasticity indicated that changes in cortical representation during learning arise largely as a result of hebbian synaptic change mechanisms. Here we show, using owl monkeys trained to respond to specific stimulus sequence events, that serial application of stimuli to the fingers results in changes to the neuronal response specificity and maps of the hand surfaces in the true primary somatosensory cortical field (S1 area 3b). In this representational remodelling, stimuli applied synchronously to the fingers resulted in these fingers being integrated in their representation, whereas fingers to which stimuli were applied asynchronously were segregated in their representation. Ventroposterior thalamus response maps derived in these monkeys were not equivalently reorganized. This representational plasticity appears to be cortical in origin.
Article
Full-text available
Neurons in the superior temporal gyrus of anesthetized rhesus monkeys were exposed to complex acoustic stimuli. Bandpassed noise bursts with defined center frequencies evoked responses that were greatly enhanced over those evoked by pure tones. This finding led to the discovery of at least one new cochleotopic area in the lateral belt of the nonprimary auditory cortex. The best center frequencies of neurons varied along a rostrocaudal axis, and the best bandwidths of the noise bursts varied along a mediolateral axis. When digitized monkey calls were used as stimuli, many neurons showed a preference for some calls over others. Manipulation of the calls' frequency structure and playback of separate components revealed different types of spectral integration. The lateral areas of the monkey auditory cortex appear to be part of a hierarchical sequence in which neurons prefer increasingly complex stimuli and may form an important stage in the preprocessing of communication sounds.
Article
Full-text available
Mustached bats, Pteronotus parnellii parnellii, spend most of their lives in the dark and use their auditory system for acoustic communication as well as echolocation. The sound spectrograms of their communication sounds or "calls" revealed that this species produces a rich variety of calls. These calls consist of one or more of the 33 different types of discrete sounds or "syllables" that are emitted singly and/or in combination. These syllables can be further classified as 19 simple syllables, 14 composites, and three subsyllables. Simple syllables consist of characteristic geometric patterns of CF (constant frequency), FM (frequency modulation), and NB (noise burst) sounds that are defined quantitatively using statistical criteria. Composites consist of simple syllables or subsyllables conjoined without any silent interval. Most syllable types exhibit a large intrinsic variation in their physical structure compared to the stereotypic echolocation pulses. Syllable domains are defined on the basis of multiple parameters, although these can be collapsed onto three dimensions that capture 99% of the measured variation among different types of syllables. Temporal analysis of multisyllabic constructs reveals several syntactical rules for syllable transitions.
Article
Full-text available
Previous studies have shown that the tonotopic organization of primary auditory cortex is altered subsequent to restricted cochlear lesions (Robertson and Irvine, 1989) and that the topographic reorganization of the primary somatosensory cortex is correlated with changes in the perceptual acuity of the animal (Recanzone et al., 1992a-d). Here we report an increase in the cortical area of representation of a restricted frequency range in primary auditory cortex of adult owl monkeys that is correlated with the animal's performance at a frequency discrimination task. Monkeys trained for several weeks to discriminate small differences in the frequency of sequentially presented tonal stimuli revealed a progressive improvement in performance with training. At the end of the training period, the tonotopic organization of A1 was defined by recording multiple-unit responses at 70-258 cortical locations. These responses were compared to those derived from three normal monkeys and from two monkeys that received the same auditory stimuli but that were engaged in a tactile discrimination task. The cortical representation, the sharpness of tuning, and the latency of the response were greater for the behaviorally relevant frequencies of trained monkeys when compared to the same frequencies of control monkeys. The cortical area of representation was the only studied parameter that was correlated with behavioral performance. These results demonstrate that attended natural stimulation can modify the tonotopic organization of A1 in the adult primate, and that this alteration is correlated with changes in perceptual acuity.
Article
Full-text available
Functional magnetic resonance imaging (FMRI) was used to identify candidate language processing areas in the intact human brain. Language was defined broadly to include both phonological and lexical-semantic functions and to exclude sensory, motor, and general executive functions. The language activation task required phonetic and semantic analysis of aurally presented words and was compared with a control task involving perceptual analysis of nonlinguistic sounds. Functional maps of the entire brain were obtained from 30 right-handed subjects. These maps were averaged in standard stereotaxic space to produce a robust "average activation map" that proved reliable in a split-half analysis. As predicted from classical models of language organization based on lesion data, cortical activation associated with language processing was strongly lateralized to the left cerebral hemisphere and involved a network of regions in the frontal, temporal, and parietal lobes. Less consistent with classical models were (1) the existence of left hemisphere temporoparietal language areas outside the traditional "Wernicke area," namely, in the middle temporal, inferior temporal, fusiform, and angular gyri; (2) extensive left prefrontal language areas outside the classical "Broca area"; and (3) clear participation of these left frontal areas in a task emphasizing "receptive" language functions. Although partly in conflict with the classical model of language localization, these findings are generally compatible with reported lesion data and provide additional support for ongoing efforts to refine and extend the classical model.
Article
Syntax denotes a rule system that allows one to predict the sequencing of communication signals. Despite its significance for both human speech processing and animal acoustic communication, the representation of syntactic structure in the mammalian brain has not been studied electrophysiologically at the single-unit level. In the search for a neuronal correlate for syntax, we used playback of natural and temporally destructured complex species-specific communication calls-so-called composites-while recording extracellularly from neurons in a physiologically well defined area (the FM-FM area) of the mustached bat's auditory cortex. Even though this area is known to be involved in the processing of target distance information for echolocation, we found that units in the FM-FM area were highly responsive to composites. The finding that neuronal responses were strongly affected by manipulation in the time domain of the natural composite structure lends support to the hypothesis that syntax processing in mammals occurs at least at the level of the nonprimary auditory cortex.
Article
'What' and 'where' visual streams define ventrolateral object and dorsolateral spatial processing domains in the prefrontal cortex of nonhuman primates. We looked for similar streams for auditory-prefrontal connections in rhesus macaques by combining microelectrode recording with anatomical tract-tracing. Injection of multiple tracers into physiologically mapped regions AL, ML and CL of the auditory belt cortex revealed that anterior belt cortex was reciprocally connected with the frontal pole (area 10), rostral principal sulcus (area 46) and ventral prefrontal regions (areas 12 and 45), whereas the caudal belt was mainly connected with the caudal principal sulcus (area 46) and frontal eye fields (area 8a). Thus separate auditory streams originate in caudal and rostral auditory cortex and target spatial and non-spatial domains of the frontal lobe, respectively.
Article
The present study investigated neural responses to rapid, repetitive stimuli in the primary auditory cortex (A1) of cats. We focused on two important issues regarding cortical coding of sequences of stimuli: temporal discharge patterns of A1 neurons as a function of inter-stimulus interval and cortical mechanisms for representing successive stimulus events separated by very short intervals. These issues were studied using wide- and narrowband click trains with inter-click intervals (ICIs) ranging from 3 to 100 ms as a class of representative sequential stimuli. The main findings of this study are 1) A1 units displayed, in response to click train stimuli, three distinct temporal discharge patterns that we classify as regions I, II, and III. At long ICIs nearly all A1 units exhibited typical stimulus-synchronized response patterns (region I) consistent with previously reported observations. At intermediate ICIs, no clear temporal structures were visible in the responses of most A1 units (region II). At short ICIs, temporal discharge patterns are characterized by the presence of either intrinsic oscillations (at approximately 10 Hz) or a change in discharge rate that was a monotonically decreasing function of ICI (region III). In some A1 units, temporal discharge patterns corresponding to region III were absent. 2) The boundary between regions I and II (synchronization boundary) had a median value of 39.8 ms ICI ([25%, 75%] = [20.4, 58.8] ms ICI; n = 131). The median boundary between regions II and III was estimated at 6.3 ms ([25%, 75%] = [5.2, 9.7] ms ICI; n = 47) for units showing rate changes (rate-change boundary). 3) The boundary values between different regions appeared to be relatively independent of stimulus intensity (at modest sound levels) or the bandwidth of the clicks used. 4) There is a weak correlation between a unit's synchronization boundary and its response latency. Units with shorter latencies appeared to also have smaller boundary values.
5) Based on these findings, we proposed a two-stage model for A1 neurons to represent a wide range of ICIs. In this model, A1 uses a temporal code for explicitly representing long ICIs and a rate code for implicitly representing short ICIs.
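The two-stage model described in this abstract lends itself to a compact sketch. In the following Python fragment the function names are hypothetical; the region boundaries default to the median synchronization boundary (39.8 ms ICI) and rate-change boundary (6.3 ms ICI) reported above:

```python
import numpy as np

def vector_strength(spike_times_ms, ici_ms):
    # Degree of phase locking of spikes to the click period (1 = perfect).
    phases = 2 * np.pi * (np.asarray(spike_times_ms, float) % ici_ms) / ici_ms
    return np.hypot(np.cos(phases).sum(), np.sin(phases).sum()) / len(phases)

def represent_ici(spike_times_ms, duration_ms, ici_ms,
                  sync_boundary_ms=39.8, rate_boundary_ms=6.3):
    """Classify how a model A1 unit represents a click train of a given ICI."""
    if ici_ms >= sync_boundary_ms:
        # Region I: explicit temporal code, stimulus-synchronized discharges.
        return "temporal", vector_strength(spike_times_ms, ici_ms)
    if ici_ms <= rate_boundary_ms:
        # Region III: implicit rate code, discharge rate varies with ICI.
        return "rate", 1000.0 * len(spike_times_ms) / duration_ms  # spikes/s
    # Region II: no clear temporal structure.
    return "unstructured", None
```

For example, a unit firing one spike per click at a 50-ms ICI falls in region I with a vector strength of 1, whereas the same analysis at a 4-ms ICI reports only a mean rate.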
Article
Aphasia was studied by the complementary techniques of direct cortical stimulation in conscious man and of cortical excision (mostly for relief of epilepsy). The results suggest the paramount importance of the left hemisphere for speech, even in left-handers; less dependence of speech on cortex; and great dependence on certain subcortical structures.
Article
Sixty-three cells in the superior temporal gyrus of awake squirrel monkeys were tested with 8 species-specific vocalizations plus noise, clicks and tones. Identical series of stimuli were repeatedly presented over 1-5 hour intervals. The responses elicited by both vocalizations and artificial stimuli in primary and secondary cortical neurons often varied over time. In several cases the selectivity of a cell to specific vocalizations appeared to change, i.e., a vocalization which was effective in eliciting a response at one point in the experiment, later became ineffective. In the primary cortex 50% of the cells gave variable responses to one or more of the vocalizations. Twenty percent of the primary cortical cells appeared to change the selectivity of their responses to specific vocalizations. In the secondary cortex 62% of the cells varied in their responses to vocalizations; 42% showing apparent changes in selectivity.
Article
This paper is concerned with the representation of the spectra of synthesized steady-state vowels in the temporal aspects of the discharges of auditory-nerve fibers. The results are based on a study of the responses of large numbers of single auditory-nerve fibers in anesthetized cats. By presenting the same set of stimuli to all the fibers encountered in each cat, we can directly estimate the population response to those stimuli. Period histograms of the responses of each unit to the vowels were constructed. The temporal response of a fiber to each harmonic component of the stimulus is taken to be the amplitude of the corresponding component in the Fourier transform of the unit's period histogram. At low sound levels, the temporal response to each stimulus component is maximal among units with CFs near the frequency of the component (i.e., near its place). Responses to formant components are larger than responses to other stimulus components. As sound level is increased, the responses to the formants, particularly the first formant, increase near their places and spread to adjacent regions, particularly toward higher CFs. Responses to nonformant components, except for harmonics and intermodulation products of the formants (2F1, 2F2, F1 + F2, etc.), are suppressed; at the highest sound levels used (approximately 80 dB SPL), temporal responses occur almost exclusively at the first two or three formants and their harmonics and intermodulation products. We describe a simple calculation which combines rate, place, and temporal information to provide a good representation of the vowels' spectra, including a clear indication of at least the first two formant frequencies. This representation is stable with changes in sound level at least up to 80 dB SPL; its stability is in sharp contrast to the behavior of the representation of the vowels' spectra in terms of discharge rate, which degenerates at stimulus levels within the conversational range.
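The core computation described here — taking a fiber's temporal response to each harmonic as the amplitude of the corresponding component in the Fourier transform of its period histogram — can be sketched as follows; the function names are hypothetical:

```python
import numpy as np

def period_histogram(spike_times_s, f0_hz, n_bins=64):
    # Fold spike times on one period of the stimulus fundamental.
    phases = (np.asarray(spike_times_s, float) * f0_hz) % 1.0
    hist, _ = np.histogram(phases, bins=n_bins, range=(0.0, 1.0))
    return hist

def harmonic_amplitude(hist, k):
    # Amplitude of the k-th harmonic of the period histogram, normalized
    # per spike (for k = 1 this is the classical synchronization index).
    spectrum = np.fft.rfft(hist)
    return np.abs(spectrum[k]) / hist.sum()
```

Spikes perfectly locked to one phase of the fundamental give a normalized amplitude of 1 at k = 1; spikes spread uniformly in phase give 0.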
Article
Responses of large populations of auditory-nerve fibers to synthesized steady-state vowels were recorded in anesthetized cats. Driven discharge rate to vowels, normalized by dividing by saturation rate (estimated from the driven rate to CF tones 50 dB above threshold), was plotted versus fiber CF for a number of vowel levels. For the vowels /I/ and /e/, such rate profiles showed a peak in the region of the first formant and another in the region of the second and third formants, for sound levels below about 70 dB SPL. For /a/ at levels below about 40 dB SPL there are peaks in the region of the first and second formants. At higher levels these peaks disappear for all the vowels because of a combination of rate saturation and two-tone suppression. This must be qualified by saying that rate profiles plotted separately for units with spontaneous rates less than one spike per second may retain peaks at higher levels. Rate versus level functions for units with CFs above the first formant can saturate at rates less than the saturation rate to CF tones or they can be nonmonotonic; these effects are most likely produced by the same mechanism as that involved in two-tone suppression.
Article
The organization and connections of auditory cortex in owl monkeys, Aotus trivirgatus, were investigated by combining microelectrode mapping methods with studies of architecture and connections in the same animals. In most experiments, portions of auditory cortex were first explored with microelectrodes, neurons were characterized as responsive or not to auditory stimuli, and best frequencies were determined whenever possible. Most recordings were in cortex previously designated as primary (A-I) and rostral (R) auditory fields (Imig et al. J Comp Neurol 171:111, '77) and in a newly defined rostrotemporal field (RT) located rostral to R. Injections of wheat germ agglutinin-horseradish peroxidase (WGA-HRP) and fluorescent tracers were placed in electrophysiologically identified locations of A-I, R, and RT; the posterolateral (PL) and anterolateral (AL) divisions of a narrow belt of auditory cortex lateral and adjacent to A-I and R; cortex of the superior temporal gyrus lateral and rostrolateral to PL and AL; and regions of prefrontal cortex that receive inputs from auditory cortex. There were several major findings:
Article
Single cells in the primary auditory cortex of the awake squirrel monkey were tested for their responses to intraspecific communication calls presented to the monkey normally ("calls") and backwards ("llacs"). These two groups of signals were similarly effective in eliciting responses, and response patterns were of the same nature and equally diverse. In about 2% of the cells the time structure of a response to at least one "llac" was virtually a "mirror image" of the response to the corresponding "call". In about 34% of the cells, for at least one vocalization, at one intensity or other, the time distribution of response peaks closely approximated in time with the envelope of a particular spectral component of the call, corresponding with the cell's best frequency. These results suggest that complex sounds may be represented in the auditory cortex by the synchronized activity of functional cell ensembles in which differently tuned individual members are distributed throughout the cochleotopic space according to their best frequencies.
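Comparing response-peak timing against "the envelope of a particular spectral component" requires extracting that envelope from the call waveform. A minimal numpy-only sketch (band-pass around an assumed best frequency, with the analytic signal built in the frequency domain; all names are hypothetical):

```python
import numpy as np

def component_envelope(waveform, fs_hz, bf_hz, bandwidth_hz):
    """Envelope of the spectral component of a call near a cell's best
    frequency (bf_hz). Keeping only the positive frequencies within the
    band and doubling them yields the band-limited analytic signal,
    whose magnitude is the envelope."""
    n = len(waveform)
    spectrum = np.fft.fft(waveform)
    freqs = np.fft.fftfreq(n, d=1.0 / fs_hz)
    in_band = (freqs > bf_hz - bandwidth_hz / 2) & (freqs < bf_hz + bandwidth_hz / 2)
    analytic = np.fft.ifft(np.where(in_band, 2.0 * spectrum, 0.0))
    return np.abs(analytic)
```

For a 1-kHz carrier amplitude-modulated at 10 Hz, the recovered envelope tracks the 10-Hz modulator; response peaks of a cell tuned near 1 kHz could then be compared against this trace.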
Article
Recent data obtained by various methods of clinical investigations suggest an organization of language in the human brain involving compartmentalization into separate systems subserving different language functions. Each system includes multiple essential areas localized in the frontal and temporoparietal cortex of the dominant hemisphere, as well as widely dispersed neurons. All components of a system are activated in parallel, possibly by ascending thalamocortical circuits. The features peculiar to cerebral language organization include not only the lateralization of essential areas to one hemisphere, but also a substantial variance in the individual patterns of localization within that hemisphere, a variance that in part relates to individual differences in verbal skills.
Article
1. We have recorded the responses of neurons in the anteroventral cochlear nucleus (AVCN) of barbiturate-anesthetized cats to the synthetic, steady-state vowel sound /e/, presented over a range of stimulus intensities. 2. The responses of (putative) spherical bushy cells [primary-like (Pri) units] to the vowel resemble those of auditory-nerve fibers (ANFs) in terms of both rate and temporal encoding at low and moderate stimulus levels. It was not possible to study the responses of most Pri units at the highest stimulus level because of the large neurophonic component present in recordings from most primary-like units at higher stimulus levels. 3. The responses of many (putative) globular bushy cells [primary-like with notch (PN) units] to the vowel resemble those of ANFs; however, there appears to be greater heterogeneity in the responses of units in the PN population than in the Pri population in terms of both temporal and rate encoding. 4. Populations of stellate cells (chopper units) have degraded representations of the temporal information in ANF population discharge patterns in response to the vowel; this is consistent with the responses of these units to pure tones. Both regular (ChS) and irregular (ChT) chopper subpopulations, however, maintain better rate-place representations of the vowel spectrum than does the population of ANFs as a whole. The rate-place representations of the vowel spectrum by both chopper populations closely resemble those of low and medium spontaneous rate ANFs at most stimulus levels. 5.
The data presented in this paper suggest that a functional partition of the AVCN chopper population could yield two distinct rate representations in response to a complex stimulus: one that is graded with stimulus level (over a 30 to 40 dB range) and that, even at rate saturation, maintains a "low contrast" stimulus representation; and a second that maintains a robust, "high contrast" stimulus representation at all levels but that confers less information about stimulus level.
Article
The location and characteristics of the primary auditory cortex of the common marmoset, Callithrix jacchus jacchus, were determined in five anesthetized male adult animals by mapping the responses of cortical units and unit clusters to pure tone stimuli presented to the contralateral ear. The primary auditory cortex lies largely ventral to the lateral sulcus, the only major fissure on the lateral cortex of this smooth-brained primate, but in some animals it may extend significantly down the ventral bank of this sulcus. Responses are distributed such that low best frequencies are found rostroventrally whereas high best frequencies occur caudally. The disposition of frequency-band contours is fan-shaped, with contours separating low-frequency octaves nearly parallel to the lateral sulcus and high-frequency (greater than 8 kHz) contours perpendicular to that sulcus. Best frequencies range from 0.6 to 30 kHz across the primary field, but there is a disproportionate representation of the three octaves between 2 and 16 kHz. The most sensitive thresholds (as low as -2 dB SPL) are found between 7 and 9 kHz. The primary auditory cortex is similar in cytoarchitecture to that reported for the cat, showing a blurring of lamination in the middle layers (II-IV) and a preponderance of small cells in these merged layers, giving a highly granular appearance. The accessibility of the cochlear representation on the gyral surface makes the marmoset an attractive animal for studies of primate auditory cortex.
Article
Action potentials of single auditory cortical neurons of the squirrel monkey were recorded in a chronic, unanesthetized preparation. The responsiveness of units was tested with various types of simple and complex acoustic stimuli in a free field situation. As simple auditory stimuli, bursts of pure tones, clicks, and white noise were utilized. Species-specific vocalizations served as complex, biologically significant stimuli. The data are based on 48 neurons which showed a discrete response to species-specific vocalizations. In 63% the response to calls could be predicted from the units' responses to simple stimuli. Thirty-seven percent of the neurons were classified as unpredictable with respect to their responsiveness to vocalizations. The response of most units was restricted to call stimuli which showed similarities in their frequency-time characteristics. About 7% of the 116 units responding to calls were classified as selective responders because they were not excited by any other stimulus tested. It was not possible to single out the acoustic features to which these units responded.
Article
Most of the neurons tested in the superior temporal cortex of awake squirrel monkeys responded to recorded species-specific vocalizations. Some cells responded with temporally complex patterns to many vocalizations. Other cells responded with simpler patterns to only one call. Most cells lay between these two extremes. On-line deletion of parts of a vocalization revealed the role of temporal interactions in determining the nature of some responses.
Article
Two hundred and fifty vocalizations of the squirrel monkey (Saimiri sciureus) were selected for spectrographic analysis from a total of 200 hrs. of tape recordings. The vocalizations were classified into six groups according to their physical characteristics. Both intra- and intergroup variability of calls was observed. Calls of similar shape were found to have similar functions. Thus each group of calls could be characterized by a functional designation. The functional significance of calls was determined by qualitative and quantitative observations. Four methods were employed: 1. stereotyped vocalizations were elicited by visual stimuli; 2. motor and vocal reactions were evoked through adequate vocal signals; 3. vocalizations were observed when external conditions were held constant and internal factors were permitted to vary; 4. vocal events were related to the total social situation. By these methods the complexity as well as the specificity of the vocal communication system is demonstrated and its evolutionary significance is discussed.
Article
In an earlier study (Neuroscience 8, 33–55, 1983), we found that the cortex representing the skin of the median nerve within parietal somatosensory fields 3b and 1 was completely occupied by ‘new’ inputs from the ulnar and radial nerves, 2–9 months after the median nerve was cut and tied in adult squirrel and owl monkeys. In this report, we describe the results of studies directed toward determining the time course and likely mechanisms underlying this remarkable plasticity. Highly detailed maps of the hand surface representation were derived in monkeys before, immediately after, and at subsequent short and intermediate time stages after median nerve section. In one monkey, maps were derived before nerve section, immediately after nerve section, and 11, 22 and 144 days later. Thus, direct comparisons in cortical map structure could be made over time in this individual monkey. In other experiments, single maps were derived at given post-section intervals.
Article
The cortical representations of the hand in area 3b in adult owl monkeys were defined with use of microelectrode mapping techniques 2–8 months after surgical amputation of digit 3, or of both digits 2 and 3. Digital nerves were tied to prevent their regeneration within the amputation stump. Successive maps were derived in several monkeys to determine the nature of changes in map organization in the same individuals over time. In all monkeys studied, the representations of adjacent digits and palmar surfaces expanded topographically to occupy most or all of the cortical territories formerly representing the amputated digit(s). With the expansion of the representations of these surrounding skin surfaces (1) there were severalfold increases in their magnification and (2) roughly corresponding decreases in receptive field areas. Thus, with increases in magnification, surrounding skin surfaces were represented in correspondingly finer grain, implying that the rule relating receptive field overlap to separation in distance across the cortex (see Sur et al., '80) was dynamically maintained as receptive fields progressively decreased in size. These studies also revealed that: (1) the discontinuities between the representations of the digits underwent significant translocations (usually by hundreds of microns) after amputation, and sharp new discontinuous boundaries formed where usually separated, expanded digital representations (e.g., of digits 1 and 4) approached each other in the reorganizing map, implying that these map discontinuities are normally dynamically maintained. (2) Changes in receptive field sizes with expansion of representations of surrounding skin surfaces into the deprived cortical zone had a spatial distribution and time course similar to changes in sensory acuity on the stumps of human amputees. This suggests that experience-dependent map changes result in changes in sensory capabilities. 
(3) The major topographic changes were limited to a cortical zone 500–700 μm on either side of the initial boundaries of the representation of the amputated digits. More distant regions did not appear to reorganize (i.e., were not occupied by inputs from surrounding skin surfaces) even many months after amputation. (4) The representations of some skin surfaces moved in entirety to locations within the former territories of representation of amputated digits in every monkey studied. In man, no mislocation errors or perceptual distortions result from stimulation of surfaces surrounding a digital amputation. This constitutes further evidence that any given skin surface can be represented by many alternative functional maps at different times of life in these cortical fields (Merzenich et al., '83b). These studies further demonstrate that basic features of somatosensory cortical maps (receptive field sizes, cortical sites of representation of given skin surfaces, representational discontinuities, and probably submodality column boundaries) are dynamically maintained. They suggest that cortical skin surface maps are alterable by experience in adults, and that experience-dependent map changes reflect and possibly account for concomitant changes in tactual abilities. Finally, these results bear implications for mechanisms underlying these cortical map dynamics.
Article
1. The temporal and spectral characteristics of neural representations of a behaviorally important species-specific vocalization were studied in neuronal populations of the primary auditory cortex (A1) of barbiturate-anesthetized adult common marmosets (Callithrix jacchus), using both natural and synthetic vocalizations. The natural vocalizations used in electrophysiological experiments were recorded from the animals under study or from their conspecifics. These calls were frequently produced in vocal exchanges between members of our marmoset colony and are part of the well-defined and highly stereotyped vocal repertoire of this species. 2. The spectrotemporal discharge pattern of spatially distributed neuron populations in cortical field A1 was found to be correlated with the spectrotemporal acoustic pattern of a complex natural vocalization. However, the A1 discharge pattern was not a faithful replication of the acoustic parameters of a vocalization stimulus, but had been transformed into a more abstract representation than that in the auditory periphery. 3. Subpopulations of A1 neurons were found to respond selectively to natural vocalizations as compared with synthetic variations that had the same spectral but different temporal characteristics. A subpopulation responding selectively to a given monkey's call shared some but not all of its neuronal memberships with other individual-call-specific neuronal subpopulations. 4. In the time domain, responses of individual A1 units were phase-locked to the envelope of a portion of a complex vocalization, which was centered around a unit's characteristic frequency (CF). As a whole, discharges of A1 neuronal populations were phase-locked to discrete stimulus events but not to their rapidly changing spectral contents. The consequence was a reduction in temporal complexity and an increase in cross-population response synchronization. 5. 
In the frequency domain, major features of the stimulus spectrum were reflected in rate-CF profiles. The spectral features of a natural call were equally or more strongly represented by a subpopulation of A1 neurons that responded selectively to that call as compared with the entire responding A1 population. 6. Neuronal responses to a complex call were distributed very widely across cortical field A1. At the same time, the responses evoked by a vocalization scattered in discrete cortical patches were strongly synchronized to stimulus events and to each other. As a result, at any given time during the course of a vocalization, a coherent representation of the integrated spectrotemporal characteristics of a particular vocalization was present in a specific neuronal population. 7. These results suggest that the representation of behaviorally important and spectrotemporally complex species-specific vocalizations in A1 is 1) temporally integrated and 2) spectrally distributed in nature, and that the representation is carried by spatially dispersed and synchronized cortical cell assemblies that correspond to each individual's vocalizations in a specific and abstracted way.
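The rate-CF profiles used here — mean driven rate as a function of units' characteristic frequencies across the responding population — can be sketched as below; the function name and logarithmic binning scheme are illustrative, and positive CFs are assumed:

```python
import numpy as np

def rate_cf_profile(cfs_khz, rates, n_bins=20):
    """Mean discharge rate in logarithmically spaced CF bins: a simple
    rate-place representation of a stimulus across a neural population."""
    cfs = np.asarray(cfs_khz, float)
    rates = np.asarray(rates, float)
    edges = np.logspace(np.log10(cfs.min()), np.log10(cfs.max()), n_bins + 1)
    idx = np.clip(np.digitize(cfs, edges) - 1, 0, n_bins - 1)
    profile = np.array([rates[idx == b].mean() if np.any(idx == b) else 0.0
                        for b in range(n_bins)])
    return edges, profile
```

Restricting `cfs_khz` and `rates` to a call-selective subpopulation, versus the entire responding population, would allow the comparison of spectral representations described in point 5.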
Article
Auditory information is relayed from the ventral nucleus of the medial geniculate complex to a core of three primary or primary-like areas of auditory cortex that are cochleotopically organized and highly responsive to pure tones. Auditory information is then distributed from the core areas to a surrounding belt of about seven areas that are less precisely cochleotopic and generally more responsive to complex stimuli than tones. Recent studies indicate that the belt areas relay to the rostral and caudal divisions of a parabelt region at a third level of processing in the cortex lateral to the belt. The parabelt and belt regions have additional inputs from dorsal and magnocellular divisions of the medial geniculate complex and other parts of the thalamus. The belt and parabelt regions appear to be concerned with integrative and associative functions involved in pattern perception and object recognition. The parabelt fields connect with regions of temporal, parietal, and frontal cortex that mediate additional auditory functions, including space perception and auditory memory.
Article
Two fundamental aspects of frequency analysis shape the functional organization of primary auditory cortex. For one, the decomposition of complex sounds into different frequency components is reflected in the tonotopic organization of auditory cortical fields. Second, recent findings suggest that this decomposition is carried out in parallel for a wide range of frequency resolutions by neurons with frequency receptive fields of different sizes (bandwidths). A systematic representation of the range of frequency resolution and, equivalently, spectral integration shapes the functional organization of the iso-frequency domain. Distinct subregions, or "modules," along the iso-frequency domain can be demonstrated with various measures of spectral integration, including pure-tone tuning curves, noise masking, and electrical cochlear stimulation. This modularity in the representation of spectral integration is expressed by intrinsic cortical connections. This organization has implications for our understanding of psychophysical spectral integration measures such as the critical band and general cortical coding strategies.
Kanwal, J. S., Matsumura, S., Ohlemiller, K. & Suga, N. (1994) J. Acoust. Soc. Am. 96, 1229–1254.
Winter, P. & Funkenstein, H. H. (1973) Exp. Brain Res. 18, 489–504.
Wang, X., Merzenich, M. M., Beitel, R. & Schreiner, C. E. (1995) J. Neurophysiol. 74, 2685–2706.
Romanski, L. M., Tian, B., Fritz, J., Mishkin, M., Goldman-Rakic, P. S. & Rauschecker, J. P. (1999) Nat. Neurosci. 2, 1131–1136.
Recanzone, G. H., Schreiner, C. E. & Merzenich, M. M. (1992) J. Neurosci. 13, 87–104.
Kaas, J. H., Hackett, T. A. & Tramo, M. J. (1999) Curr. Opin. Neurobiol. 9, 164–170.
Lu, T. & Wang, X. (2000) J. Neurophysiol. 84, 236–246.
Newman, J. D. & Wollberg, Z. (1973a) Exp. Neurol. 40, 821–824.
  • Zatorre