Giving speech a hand: Gesture modulates activity in auditory cortex during speech perception

Ahmanson-Lovelace Brain Mapping Center, University of California, Los Angeles, California 90095-7085, USA.
Human Brain Mapping (Impact Factor: 5.97). 03/2009; 30(3):1028-37. DOI: 10.1002/hbm.20565
Source: PubMed


Viewing hand gestures during face-to-face communication affects speech perception and comprehension. Despite the visible role played by gesture in social interactions, relatively little is known about how the brain integrates hand gestures with co-occurring speech. Here we used functional magnetic resonance imaging (fMRI) and an ecologically valid paradigm to investigate how beat gesture, a fundamental type of hand gesture that marks speech prosody, might impact speech perception at the neural level. Subjects underwent fMRI while listening to spontaneously produced speech accompanied by beat gesture, nonsense hand movement, or a still body; as additional control conditions, subjects also viewed beat gesture, nonsense hand movement, or a still body, all presented without speech. Validating behavioral evidence that gesture affects speech perception, bilateral nonprimary auditory cortex showed greater activity when speech was accompanied by beat gesture than when speech was presented alone. Further, the left superior temporal gyrus/sulcus showed stronger activity when speech was accompanied by beat gesture than when speech was accompanied by nonsense hand movement. Finally, the right planum temporale was identified as a putative multisensory integration site for beat gesture and speech (i.e., activity in response to speech accompanied by beat gesture was greater than the summed responses to speech alone and beat gesture alone), indicating that this area may be pivotally involved in synthesizing the rhythmic aspects of both speech and gesture. Taken together, these findings suggest a common neural substrate for processing speech and gesture, likely reflecting their joint communicative role in social interactions.
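As a compact formalization of the superadditivity criterion described in the abstract (a sketch with generic condition labels, not notation taken from the paper), the integration test applied at the right planum temporale can be written as:

\[
\hat{\beta}_{\text{speech + beat}} \;>\; \hat{\beta}_{\text{speech alone}} \;+\; \hat{\beta}_{\text{beat alone}},
\]

where each \(\hat{\beta}\) denotes the estimated BOLD response to the corresponding condition. A region satisfying this inequality responds to the audiovisual combination more strongly than predicted by the sum of its unisensory responses, which is the standard signature of multisensory integration invoked here.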

    • "If the deltatheta rhythmic aspects in the auditory signal can play the role of anchors for predictive coding during speech segmentation (Arnal and Giraud, 2012; Peelle and Davis, 2012; Park et al., 2015), then preceding visual gestural information, naturally present in face to face conversations, may convey very useful information for decoding the signal and thus, be taken into account. For instance, beats are not only exquisitely tuned to the prosodic aspects of the auditory spectro-temporal structure, but also engage language-related brain areas during continuous AV speech perception (Hubbard et al., 2009). This idea is in "
    ABSTRACT: During social interactions, speakers often produce spontaneous gestures to accompany their speech. These coordinated body movements convey communicative intentions and modulate how listeners perceive the message in a subtle but important way. In the present perspective, we focus on the role that congruent non-verbal information from beat gestures may play in the neural responses to speech. Whilst delta-theta oscillatory brain responses reflect the time-frequency structure of the speech signal, we argue that beat gestures promote phase resetting at relevant word onsets. This mechanism may facilitate the anticipation of associated acoustic cues relevant for prosodic/syllabic-based segmentation in speech perception. We report recently published data supporting this hypothesis, and discuss the potential of beats (and gestures in general) for further studies investigating continuous AV speech processing through low-frequency oscillations.
    Frontiers in Human Neuroscience 10/2015; 9:527. DOI: 10.3389/fnhum.2015.00527 · 3.63 Impact Factor
    • "speech and gesture processing as demonstrated using patient samples and functional neuroimaging [Green et al., 2009; Holle et al., 2008; Hubbard et al., 2009; Willems and Hagoort, 2007; Xu et al., 2009]. Indeed, one landmark study observed that people who are blind from birth use gesture when they speak in the same manner that sighted people do [Iverson and Goldin-Meadow, 1998], and gesture is also impacted in callosotomy patients [Lausberg et al., 2003]. "
    ABSTRACT: Gestures represent an integral aspect of interpersonal communication, and they are closely linked with language and thought. Brain regions for language processing overlap with those for gesture processing. Two types of gesticulation, beat gestures and metaphoric gestures, are particularly important for understanding the taxonomy of co-speech gestures. Here, we investigated gesture production during taped interviews with respect to regional brain volume. First, we were interested in whether beat gesture production is associated with similar regions as metaphoric gesture. Second, we investigated whether cortical regions associated with metaphoric gesture processing are linked to gesture production based on correlations with brain volumes. We found that beat gestures are uniquely related to regional volume in cerebellar regions previously implicated in discrete motor timing. We suggest that these gestures may be an artifact of the timing processes of the cerebellum that are important for the timing of vocalizations. Second, our findings indicate that brain volumes in regions of the left hemisphere previously implicated in metaphoric gesture processing are positively correlated with metaphoric gesture production. Together, this novel work extends our understanding of left hemisphere regions associated with gesture to indicate their importance in gesture production, and also suggests that beat gestures may be especially unique. This provides important insight into the taxonomy of co-speech gestures, and also further insight into the general role of the cerebellum in language. Hum Brain Mapp, 2015. © 2015 Wiley Periodicals, Inc.
    Human Brain Mapping 07/2015; 36(10). DOI: 10.1002/hbm.22894 · 5.97 Impact Factor
    • "Point-light display contains little or no static spatial information and enables complex manipulation of different features such as temporal coordination (Bertenthal and Pinto, 1994) or position of points (Cutting, 1981; Verfaillie, 1993). We chose point-lights over full-body displays to avoid any emotional bias that could be associated with cues such as identity, clothing or body shape, and to make sure we are primarily looking at the effects of body movement with visual displays (Hill et al., 2003). Point-light displays also enable us to easily manipulate various parameters of displays (e.g., viewpoint, number of points), and therefore help us to " future proof " our stimuli set for other studies without the need to re-capture a new interactions. "
    ABSTRACT: Audiovisual perception of emotions has been typically examined using displays of a solitary character (e.g., the face-voice and/or body-sound of one actor). However, in real life, humans often face more complex multisensory social situations, involving more than one person. Here we ask if the audiovisual facilitation in emotion recognition previously found in simpler social situations extends to more complex and ecological situations. Stimuli consisting of the biological motion and voice of two interacting agents were used in two experiments. In Experiment 1, participants were presented with visual, auditory, auditory filtered/noisy, and audiovisual congruent and incongruent clips. We asked participants to judge whether the two agents were interacting happily or angrily. In Experiment 2, another group of participants repeated the same task, as in Experiment 1, while trying to ignore either the visual or the auditory information. The findings from both experiments indicate that when the reliability of the auditory cue was decreased, participants weighted the visual cue more in their emotional judgments. This in turn translated into increased emotion recognition accuracy for the multisensory condition. Our findings thus point to a common mechanism of multisensory integration of emotional signals irrespective of social stimulus complexity.
    Frontiers in Psychology 05/2015; 9. DOI: 10.3389/fpsyg.2015.00611 · 2.80 Impact Factor
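    The reliability-based weighting described in the abstract above is commonly formalized as maximum-likelihood cue combination; the following is a standard textbook sketch (not taken from the cited study), with \(\hat{s}_A\), \(\hat{s}_V\) the unisensory estimates and \(\sigma_A^2\), \(\sigma_V^2\) their variances (inverse reliabilities):

    \[
    \hat{s} = w_A \hat{s}_A + w_V \hat{s}_V, \qquad
    w_A = \frac{1/\sigma_A^2}{1/\sigma_A^2 + 1/\sigma_V^2}, \qquad
    w_V = \frac{1/\sigma_V^2}{1/\sigma_A^2 + 1/\sigma_V^2}.
    \]

    Degrading or filtering the auditory signal increases \(\sigma_A^2\), which lowers \(w_A\) and shifts weight toward the visual cue, consistent with the pattern of judgments reported above.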