Book

Crossmodal Space and Crossmodal Attention

Authors: Charles Spence, Jon Driver

Abstract

Many organisms possess multiple sensory systems, such as vision, hearing, touch, smell, and taste. The possession of such multiple ways of sensing the world offers many benefits. These benefits arise not only because each modality can sense different aspects of the environment, but also because different senses can respond jointly to the same external object or event, thus enriching the overall experience - for example, looking at an individual while listening to them speak. However, combining information from different senses also poses many challenges for the nervous system. In recent years there has been dramatic progress in understanding how information from different sensory modalities gets integrated in order to construct useful representations of external space, and in how such multimodal representations constrain spatial attention. Such progress has involved numerous different disciplines, including neurophysiology, experimental psychology, neurological work with brain-damaged patients, neuroimaging studies, and computational modelling. This volume brings together the leading researchers from all these approaches to present an integrative overview of this central topic in cognitive neuroscience.
... Auditory cues presented in space facilitate the perception of a visual target in the vicinity, which is known as the audio-visual cross-modal spatial cuing effect (Spence, 2015, 2017; Santangelo et al., 2006; Spence and Driver, 2004; Spence et al., 1998, 2000). The audio-visual cross-modal spatial cuing effect has been studied for a long time in applied psychology and human factors (Spence and Soto-Faraco, 2020). ...
... The larger the distance between an auditory cue and a visual target, the weaker the audio-visual cross-modal spatial cuing effect (Mock et al., 2015; Mondor and Zatorre, 1995; Schmitt et al., 2001; Teder-Sälejärvi and Hillyard, 1998). By contrast, some studies have argued that this effect can be obtained when the auditory cue and visual target are approximately in the same functional field, i.e., both events are presented from the near field (Ho et al., 2006; Lee and Spence, 2017; Schmitt et al., 2001; Spence and Driver, 2004). That is, the auditory cue need not come from exactly the same direction as the visual target. ...
... When the auditory cues were presented within 40° of the visual targets, the participants responded at speeds similar to those obtained when the auditory cue and visual target were presented from the same direction, regardless of the location of the visual targets. This result is consistent with the implications of previous studies, suggesting that auditory cues can draw visual spatial attention when they are approximately in the same functional field as the visual target, i.e., the two events are not presented from exactly the same location but both lie within the near field (Ho et al., 2006; Lee and Spence, 2017; Schmitt et al., 2001; Spence and Driver, 2004). This study quantitatively demonstrated the angular differences between the auditory cue and visual target that elicit responses similar to those obtained when the cue and target are in the same direction, under a condition in which the participants' attentional resources were continuously engaged by the frontal view. ...
Article
Full-text available
Auditory cues can draw individuals' spatial attention to visual targets, enabling individuals to find visual targets quickly. Some previous studies have suggested that auditory cues should be in approximately the same functional field as visual targets (i.e., both events presented within the near field) to obtain quick responses, even in real situations such as driving. However, the quantitative angular differences between auditory cues and visual targets have rarely been investigated. The present study aimed to determine the angular differences that elicit responses similar to those obtained when auditory cues and visual targets are presented from the same direction under workload conditions. Twenty-two participants were asked to perform visual search and tracking tasks simultaneously. The visual targets were at ±20°, ±40°, and ±60° in azimuth, with 0° defined as the frontal view, at a distance of 1.0 m from the participants. The auditory cues were presented simultaneously with the visual targets from 0°, ±20°, ±40°, and ±60°. The results of a response time analysis using a fitting approach revealed that auditory cues within 40° of the visual targets elicited responses similar to those obtained when the auditory cues and visual targets were presented from the same direction, even when the participants' attentional resources were taxed by the other task. When the angular difference between the auditory cue and visual target was larger than 60°, the participants' spatial attention was drawn to a direction different from that of the visual target. This delayed finding the visual targets because the participants had to reallocate their spatial attention to the targets. These results have meaningful implications for audio-visual user interface designs.
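To make the reported fitting approach concrete, the sketch below fits mean response time against the cue-target angular difference with a logistic model in Python. It is a minimal illustration only: the RT values and the choice of a logistic functional form are assumptions for demonstration, not the authors' data or analysis.

```python
# Illustrative sketch only: fitting mean response time (RT) as a function of the
# angular difference between an auditory cue and a visual target. The data values
# and the logistic functional form are assumptions, not the authors' actual data
# or model.
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical mean RTs (seconds) at each cue-target angular difference (degrees).
angle_diff = np.array([0, 20, 40, 60, 80, 100, 120], dtype=float)
mean_rt = np.array([0.62, 0.63, 0.65, 0.74, 0.81, 0.84, 0.86])

def rt_model(angle, baseline, gain, midpoint, slope):
    """Logistic rise of RT above a baseline once the cue-target separation
    exceeds a transition angle (midpoint)."""
    return baseline + gain / (1.0 + np.exp(-(angle - midpoint) / slope))

params, _ = curve_fit(rt_model, angle_diff, mean_rt, p0=[0.6, 0.25, 50.0, 10.0])
baseline, gain, midpoint, slope = params
print(f"Estimated transition angle: {midpoint:.1f} deg")
# Angles well below the fitted midpoint yield RTs close to the same-direction
# baseline, mirroring the reported tolerance of roughly 40 degrees.
```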
... The reason may be that textual and visual information represent distinct modalities that undergo separate cognitive processing pathways. Cross-modal processing involves the integration of information from diverse sensory modalities, and research indicates that distinct modalities (e.g., auditory, textual, visual) undergo largely separate cognitive processing pathways in the brain, without substantial mutual interference (Spence, 2011; Spence & Driver, 2004). Thus, the relative length of textual responses may not interact with the volume of pictorial content in influencing usefulness appraisals. ...
... Because humans process information through different sensory modalities (Spence, 2011; Spence & Driver, 2004), an opportunity exists for tour managers to provide review responses that incorporate visual or video evidence, where appropriate, and thereby to offer multidimensional substantiation to aid consumers in processing the product information. Further research could explore the viability and impacts of enabling pictorial and video-based managerial responses. ...
... Specifically, multimodal monitoring displays allow the monitoring of information even when visual contact with the display is not maintained. Furthermore, they may lead to multimodal facilitation, as in aurally-aided visual search (Lee & Spence, 2008; Ngo et al., 2012; Spence & Driver, 2004; Van der Burg et al., 2008). Finally, they minimize perceptual interference with concurrent, often visual, tasks, which can help users perform monitoring in parallel with other activities (Neumann, 1996; Stanney et al., 2004; Wickens, 1991, 2002). ...
... Dual-task facilitation may be reduced due to interference at higher stages. Sound localization precision was affected significantly when localization was performed in parallel with visual tasks that require spatial memory resources (Klauer & Stegmaier, 1997; Merat & Groeger, 2003), in line with supramodal views of spatial attention (Driver & Spence, 1998; Eimer, 1999; Eimer & Driver, 2001; Katus & Eimer, 2016; Spence, 2007; Spence & Driver, 2004; Stahl & Marentakis, 2017). Nevertheless, the cost due to between-modality competition for spatial attention is smaller (or even insignificant) in comparison to within-modality competition (Arrighi et al., 2011; Santangelo et al., 2010; Soto-Faraco et al., 2005). ...
Article
Full-text available
Location monitoring is a common task that is typically performed using visual displays, which may constrain user location and visual attention. Using spatial audio to present the location of the monitored target could help relax such constraints. To evaluate this hypothesis, we conducted three experiments in which the location monitoring display modality, location, cognitive load, and the spatial resolution of the task were varied. Visual and audiovisual location monitoring resulted in higher location monitoring accuracy and speed, but induced a significantly higher dual-task cost compared with auditory monitoring when the displays were not within peripheral vision. Furthermore, auditory location monitoring accuracy approximated visual accuracy when the spatial resolution required by the task was adapted to auditory localization accuracy. The results show that using spatial audio to create multimodal location monitoring can reduce visual attention load and increase the flexibility of user placement relative to the monitoring display without incurring an additional location monitoring cost.
... Recently, cross-modal information processing and design (Spence and Driver, 2004; Spence, 1998) have emerged as major research topics in the design of automotive warning systems. Presenting information via multiple modalities such as vision, audition, and touch is expected to be a promising means of reducing transmission errors and enhancing safety. ...
... The nature of cross-modal links in spatial attention demonstrates that responses to a target presented in one sensory modality can be facilitated by the prior presentation of a cue (warning) in another sensory modality (Spence and Driver, 2004). On the basis of these results, a vibrotactile warning is expected to be very promising as a warning signal, especially in noisy environments. ...
Conference Paper
If a warning signal is presented via a visual or auditory stimulus, it might interfere with other visual or auditory information. On the other hand, if a vibrotactile cue is used, such interference would be greatly reduced. Therefore, a vibrotactile signal is expected to be very promising as a warning signal, especially in noisy environments. In order to clarify the most suitable modality of cue (warning) to a visual hazard in a noisy environment, the following two cues were used in the experiment: (1) an auditory cue and (2) a vibrotactile cue. The SOA (Stimulus Onset Asynchrony) was set to 0 s, 0.5 s, and 1 s. The noise level inside the experimental chamber was 60 dB(A), 70 dB(A), 80 dB(A), and 90 dB(A). It was hypothesized that a vibrotactile cue is more effective than an auditory cue for quickening the reaction to a hazard in a noisy environment. As a result, it was verified that the vibrotactile warning became increasingly effective as the noise level increased. The reaction time to the auditory warning was markedly affected by the noise level, while the reaction time to the vibrotactile warning was not affected by the noise level at all. Moreover, the SOA condition did not markedly affect the reaction time to either the auditory or the vibrotactile warning.
... The influence of the visual, proprioceptive, and vestibular systems on sound localization has been investigated in various studies [1][2][3][4][5][6][7][8], including a good review of this issue [2]. For instance, Wallach [7] reported that sound-source localization requires an interaction of auditory-spatial and head-position cues. ...
... This evidence [1][2][3][4][5][6][7][8] demonstrates that sound localization is a multisensory integration process involving self-motion [10,11]. In fact, previous studies have demonstrated that a listener's head/body movement facilitates sound localization by leveraging dynamic changes in the information input to the ears [10][11][12][13][14][15][16][17][18]. ...
Article
Full-text available
The deterioration of sound localization accuracy during a listener’s head/body rotation is independent of the listener’s rotation velocity (Honda et al., 2016). However, whether this deterioration occurs only during physical movement in a real environment remains unclear. In this study, we addressed this question by subjecting physically stationary listeners to visually induced self-motion, i.e., vection. Two conditions—one with a visually induced perception of self-motion (vection) and the other without vection (control)—were adopted. Under both conditions, a short noise burst (30 ms) was presented via a loudspeaker in a circular array placed horizontally in front of a listener. The listeners were asked to determine whether the acoustic stimulus was localized relative to their subjective midline. The results showed that in terms of detection thresholds based on the subjective midline, the sound localization accuracy was lower under the vection condition than under the control condition. This indicates that sound localization can be compromised under visually induced self-motion perception. These findings support the idea that self-motion information is crucial for auditory space perception and can potentially enable the design of dynamic binaural displays requiring fewer computational resources.
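As an illustration of how detection thresholds relative to the subjective midline can be estimated, the sketch below fits a cumulative Gaussian psychometric function to hypothetical left/right judgments as a function of speaker azimuth. All numbers are invented, and the authors' exact procedure may differ.

```python
# Illustrative sketch: estimating a listener's subjective auditory midline and a
# discrimination threshold by fitting a cumulative Gaussian to the proportion of
# "right of midline" responses at each speaker azimuth. All numbers are made up.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

azimuth_deg = np.array([-15, -10, -5, 0, 5, 10, 15], dtype=float)
p_right = np.array([0.04, 0.10, 0.28, 0.55, 0.79, 0.93, 0.98])  # hypothetical data

def psychometric(x, pse, sigma):
    # pse: point of subjective equality (subjective midline); sigma: spread
    return norm.cdf(x, loc=pse, scale=sigma)

(pse, sigma), _ = curve_fit(psychometric, azimuth_deg, p_right, p0=[0.0, 5.0])
threshold_75 = pse + sigma * norm.ppf(0.75)   # azimuth judged "right" 75% of the time
print(f"Subjective midline: {pse:.2f} deg, 75% threshold: {threshold_75:.2f} deg")
# A larger fitted sigma under the vection condition would correspond to the
# reported drop in localization accuracy relative to the control condition.
```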
... Fortunately, however, such a disjointed and partial experience would appear not to be the case. This is presumably because the stimulation of specific areas of the skin surface leads to a person's attention being captured by (and thus directed towards) those parts of the body that are stimulated [37][38][39]. This may help to make even limited tactile feedback feel particularly immediate and immersive [8]. ...
Article
Full-text available
In this narrative historical review, we take a closer look at the role of tactile/haptic stimulation in enhancing people’s immersion (and sense of presence) in a variety of entertainment experiences, including virtual reality (VR). An important distinction is highlighted between those situations in which digital tactile stimulation and/or haptic feedback are delivered to those (i.e., users/audience members) who passively experience the stimulation and those cases, including VR, where the user actively controls some aspects of the tactile stimulation/haptic feedback that they happen to be experiencing. A further distinction is drawn between visual and/or auditory VR, where some form of tactile/haptic stimulation is added, and what might be classed as genuinely haptic VR, where the active user/player experiences tactile/haptic stimulation that is effortlessly interpreted in terms of the objects and actions in the virtual world. We review the experimental evidence that has assessed the impact of adding a tactile/haptic element to entertainment experiences, including those in VR. Finally, we highlight some of the key challenges to the growth of haptic VR in the context of multisensory entertainment experiences: these include those of a technical, financial, psychological (namely, the fact that tactile/haptic stimulation often needs to be interpreted and can reduce the sense of immersion in many situations), psycho-physiological (such as sensory overload or fatigue), physiological (e.g., relating to the large surface area of the skin that can potentially be stimulated), and creative/artistic nature.
... Although the literature documents certain cognitive benefits of immersive visuohaptic object exploration, a research gap remains regarding the memorability of such interactions. While previous studies have primarily focused on different aspects of sensory processing, attention, and perception, the impact of visuohaptic encoding on the retention of object characteristics is not sufficiently understood and requires further investigation [37,117]. ...
Conference Paper
Full-text available
Although Virtual Reality (VR) has undoubtedly improved human interaction with 3D data, users still face difficulties retaining important details of complex digital objects in preparation for physical tasks. To address this issue, we evaluated the potential of visuohaptic integration to improve the memorability of virtual objects in immersive visualizations. In a user study (N=20), participants performed a delayed match-to-sample task where they memorized stimuli of visual, haptic, or visuohaptic encoding conditions. We assessed performance differences between these encoding modalities through error rates and response times. We found that visuohaptic encoding significantly improved memorization accuracy compared to unimodal visual and haptic conditions. Our analysis indicates that integrating haptics into immersive visualizations enhances the memorability of digital objects. We discuss its implications for the optimal encoding design in VR applications that assist professionals who need to memorize and recall virtual objects in their daily work.
... Integrating visual cues with airflow significantly improved airflow perception accuracy, especially for lateral directions. This enhanced detection might result from multisensory integration processes that allow the brain to form a more comprehensive understanding by combining information from multiple modalities [49,50]. Multimodal feedback also reduced the cognitive workload required for the task by distributing sensory processing [51]. ...
Article
Virtual reality (VR) technology has increasingly focused on incorporating multimodal outputs to enhance the sense of immersion and realism. In this work, we developed AirWhisper, a modular wearable device that provides dynamic airflow feedback to enhance VR experiences. AirWhisper simulates wind from multiple directions around the user's head via four micro fans and 3D-printed attachments. We conducted a Just Noticeable Difference (JND) study to support the design of the control system and to explore users' perception of the characteristics of airflow from different directions. Through multimodal comparison experiments, we found that vision-airflow multimodal output can improve the user's VR experience from several perspectives. Finally, we designed scenarios with different airflow change patterns and different levels of interaction to test AirWhisper's performance in various contexts and to explore differences in users' perception of airflow under different virtual environment conditions. Our work shows the importance of developing human-centered, adaptive multimodal feedback models that can make real-time dynamic changes based on the user's perceptual characteristics and environmental features.
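The abstract does not detail the JND procedure; as a generic illustration, the sketch below implements a standard 1-up/2-down adaptive staircase of the kind commonly used in JND studies. The stimulus-presentation function and the simulated observer are hypothetical placeholders, not part of the AirWhisper system.

```python
# Illustrative 1-up/2-down adaptive staircase (converges near 70.7% correct), a
# common procedure in JND studies. present_airflow_pair() is a hypothetical
# stand-in for the device/participant loop.
import random

def present_airflow_pair(reference, comparison):
    """Hypothetical placeholder: returns True if the participant correctly
    reports which interval contained the stronger airflow (simulated observer)."""
    return random.random() < min(0.99, 0.5 + abs(comparison - reference) * 2.0)

def staircase_jnd(reference=0.5, start_delta=0.30, step=0.02, reversals_needed=8):
    delta, correct_streak, last_direction = start_delta, 0, 0
    reversals = []
    while len(reversals) < reversals_needed:
        correct = present_airflow_pair(reference, reference + delta)
        if correct:
            correct_streak += 1
            if correct_streak == 2:            # two correct in a row -> make it harder
                correct_streak = 0
                if last_direction == +1:       # direction change counts as a reversal
                    reversals.append(delta)
                last_direction = -1
                delta = max(step, delta - step)
        else:                                  # one error -> make it easier
            correct_streak = 0
            if last_direction == -1:
                reversals.append(delta)
            last_direction = +1
            delta += step
    tail = reversals[-6:]                      # average the last few reversal points
    return sum(tail) / len(tail)

print(f"Estimated JND: {staircase_jnd():.3f} (arbitrary airflow-intensity units)")
```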
... First, the multimodal facilitation effect is not wholly accounted for by semantics, as the NV condition improves RTs with respect to the unimodal conditions in the two weapons categories, acting as a trigger. This finding fits well with recent proposals that consider attention orientation as a multi-sensory construction [97] rather than a primarily visual process [98], a consideration supported by neurophysiological investigations [99,100]. Secondly, the influence of white noise may be conditioned by the level of familiarity. ...
Article
Full-text available
The label-feedback hypothesis states that language can modulate visual processing. In particular, hearing or reading aloud target names (labels) speeds up performance in visual search tasks by facilitating target detection, and such an advantage is often measured against a condition where the target name is shown visually (i.e. via the same modality as the search task). The current study conceptually complements and expands previous investigations. The effect of a multimodal label presentation (i.e., an audio+visual, AV, priming label) in a visual search task is compared to that of a multimodal (i.e. white noise+visual, NV, label) and two unimodal (i.e. audio, A, label or visual, V, label) control conditions. The name of a category (i.e. a label at the superordinate level) is used as a cue, instead of the more commonly used target name (a basic level label), with targets belonging to one of three categories: garments, improper weapons, and proper weapons. These categories vary in their structure, improper weapons being an ad hoc category (i.e. context-dependent), unlike proper weapons and garments. The preregistered analysis shows an overall facilitation of visual search performance in the AV condition compared to the NV condition, confirming that the label-feedback effect may not be explained away by the effects of multimodal stimulation only and that it extends to superordinate labels. Moreover, exploratory analyses show that such facilitation is driven by the garments and proper weapons categories, rather than improper weapons. Thus, the superordinate label-feedback effect is modulated by the structural properties of a category. These findings are consistent with the idea that the AV condition prompts an "up-regulation" of the label, a requirement for enhancing the label's beneficial effects, but not when the label refers to an ad hoc category. They also highlight the peculiar status of the category of improper weapons and set it apart from that of proper weapons.
... Although the literature documents certain cognitive benefits of immersive visuohaptic object exploration, a research gap remains regarding the memorability of such interactions. While previous studies have primarily focused on different aspects of sensory processing, attention, and perception, the impact of visuohaptic encoding on the retention of an object is not sufficiently understood and requires further investigation [36,112]. ...
Preprint
Full-text available
Although Virtual Reality (VR) has undoubtedly improved human interaction with 3D data, users still face difficulties retaining important details of complex digital objects in preparation for physical tasks. To address this issue, we evaluated the potential of visuohaptic integration to improve the memorability of virtual objects in immersive visualizations. In a user study (N=20), participants performed a delayed match-to-sample task where they memorized stimuli of visual, haptic, or visuohaptic encoding conditions. We assessed performance differences between the conditions through error rates and response time. We found that visuohaptic encoding significantly improved memorization accuracy compared to unimodal visual and haptic conditions. Our analysis indicates that integrating haptics into immersive visualizations enhances the memorability of digital objects. We discuss its implications for the optimal encoding design in VR applications that assist professionals who need to memorize and recall virtual objects in their daily work.
... The greater the consistency among the different stimuli perceived through various senses, the more likely the user's perceptual system will recognize virtual objects and environments as real, providing the sensation of being in a genuine world. To elaborate, integrating diverse sensory stimuli is particularly critical since research in neuroscience and psychology indicates that human perception is inherently multisensory, incorporating vision, touch, hearing, and other senses [16]. Whenever we interact with the external world, our brain integrates information from various sensory modalities. ...
Conference Paper
Nowadays, industrial training is gaining popularity, and in particular, Virtual Reality (VR) technologies are often adopted to recreate training experiences that can transform the learning process by exploiting the concept of learning by doing. Moreover, VR allows us to recreate realistic and immersive environments that provide trainees with hands-on experience in a safe and controlled setting. One of the advantages of using VR technologies to perform training simulations lies in the ability that these tools offer to mix different senses as if they were independent variables and to manipulate them one at a time. In this way, the effects of using one sense on the specific task under analysis can be discovered in a scenario in which the other senses are held constant. One of the least used senses in VR is undoubtedly that of smell. In this paper, we use a new olfactory device specifically designed to be adopted with VR technologies, integrate it with an immersive VR training application, and propose a series of uses allowed by this combination. In the paper, we report the details of the multisensory approach that integrates the different senses. Additionally, the paper presents the findings of a pilot experiment conducted to validate the adopted approach and assess the results. The study demonstrates that the developed Olfactory Display had no adverse impact on users' interaction with VR content.
... In the AV condition, there was a significant IOR in both the high-reward and low-reward conditions, and the IOR effect size in the high-reward condition was significantly lower than that in the low-reward condition, that is, the high reward weakened the IOR effect. Some studies have suggested that spatial attention might be cross-modal and that visual and auditory attentional orienting might have the same mechanisms (Spence & Driver, 2004; Störmer, 2019), and studies on IOR have validated this conclusion (Pierce et al., 2018; Spence et al., 2000). From an attentional orientation perspective, orienting networks direct attention to an attended (cued) location (Corbetta et al., 2000; Thiel et al., 2004), whereas reorienting networks direct attention to unattended (uncued) locations (Corbetta et al., 2000). ...
Article
Previous studies have shown that rewards weaken visual inhibition of return (IOR). However, the specific mechanisms underlying the influence of rewards on cross-modal IOR remain unclear. Based on the Posner exogenous cue-target paradigm, the present study was conducted to investigate the effect of rewards on exogenous spatial cross-modal IOR in both visual cue with auditory target (VA) and auditory cue with visual target (AV) conditions. The results showed the following: in the AV condition, the IOR effect size in the high-reward condition was significantly lower than that in the low-reward condition. However, in the VA condition, there was no significant IOR in either the high- or low-reward condition and there was no significant difference between the two conditions. In other words, the use of rewards modulated exogenous spatial cross-modal IOR with visual targets; specifically, high rewards may have weakened IOR in the AV condition. Taken together, our study extended the effect of rewards on IOR to cross-modal attention conditions and demonstrated for the first time that higher motivation among individuals under high-reward conditions weakened the cross-modal IOR with visual targets. Moreover, the present study provided evidence for future research on the relationship between reward and attention.
... Such examples show the dominance of vision over audition. The sense of hearing is highly affected not only by vision but also by other senses such as touch and proprioception [6,7]. As an example, it has been shown that tactile information in the form of a puff of air facilitates speech intelligibility [8]. ...
Article
Full-text available
Virtual Reality (VR) technologies have the potential to be applied in a clinical context to improve training and rehabilitation for individuals with hearing impairment. The introduction of such technologies in clinical audiology is in its infancy and requires devices that can be taken out of laboratory settings as well as a solid collaboration between researchers and clinicians. In this paper, we discuss the state of the art of VR in audiology with applications to measurement and monitoring of hearing loss, rehabilitation, and training, as well as the development of assistive technologies. We review papers that utilize VR delivered through a head-mounted display (HMD) and used individuals with hearing impairment as test subjects, or presented solutions targeted at individuals with hearing impairments, discussing their goals and results, and analyzing how VR can be a useful tool in hearing research. The review shows the potential of VR in testing and training individuals with hearing impairment, as well as the need for more research and applications in this domain.
... Spatial coincidence of multisensory stimuli leads to enhanced evoked potentials. Behavioral studies have shown that the better multisensory stimuli coincide spatially, the stronger the attentional effects are between them [29]. This carries over to ERPs too. ...
Chapter
Full-text available
Electroencephalography (EEG) is one of the major tools to non-invasively investigate cortical activations from somatosensation in humans. EEG is useful for delineating influences on the processing pathways of tactile stimulation and for mapping the dynamics between the cortical areas involved in and linked to tactile perception. This chapter focuses on the process of recording somatosensory EEG from mechanical tactile stimulation, including affective touch, and the related cortical activations. Practical and participant-specific challenges are detailed, and best practices are shared. In addition, the main areas of research in tactile perception using EEG are discussed. These include perception, attention, and multisensory perception, as well as emotional and self-other processing. We discuss the major considerations when conducting these types of research. Key words: Tactile, Somatosensory, Electroencephalography (EEG), ERPs, SEPs, Multisensory, Attention, Affect
... Meredith and Stein 1986b; Sparks 1986; Stein and Meredith 1993) and especially in recent years (e.g. Spence and Driver 2004; Stein 2012b; Trommershäuser et al. 2011). Models of MSI, and in particular dSC models with a focus on the integration of stimuli, by now form a large and diverse corpus (see Ursino et al. 2014, for a review with a more general perspective). ...
... When human senses are simultaneously stimulated, objects are appreciated more intensely. From a neurological perspective, this effect is obtained by integrating the information from the different sensory modalities, which results in a final integrated experience [31]. ...
Chapter
eXtended Reality (XR) technology can enhance visitors' experience of museums. Given the variety of XR technologies available, which differ in performance, quality of the experience they provide, and cost, it is helpful to refer to evaluations of the various technologies performed through user studies in order to select the most suitable ones. This paper presents a set of empirical studies on XR applications for museums aimed at selecting the appropriate technologies to meet visitors' expectations and maximise the willingness to repeat and recommend the experience. They provide valuable insights for developing Virtual Museum applications that increase the level of presence and experience economy. Keywords: Virtual museum, User experience, Extended reality, Multisensory experience, Sense of smell
... Although the McGurk effect clearly showed that vision can modulate what we hear (McGurk and MacDonald, 1976), the modulation of auditory information on visual perception has also been extensively reported (Morein-Zamir et al., 2003; Sekuler et al., 1997; Shams et al., 2002; Stein et al., 1996), and the integration of auditory and visual information is rearranged according to the auditory signal during AVI (Spence and Squire, 2003). Wahn and König reported that shared or separate attentional resources across sensory modalities are task-dependent (Wahn and König, 2017), where auditory and visual attentional resources are separate during the discrimination of stimulus attributes (Alais et al., 2006; Arrighi et al., 2011) but are shared during stimulus localization (Driver and Spence, 1998a, 1998b; Spence, 2010a, 2010b; Spence and Driver, 2004). The rapid serial visual presentation (RSVP) task, a stimulus-attribute discrimination task, was applied in the studies of Alsius et al. (2005) and Ren et al. (2020), during which the visual and auditory attentional resources were separate. ...
Article
Studies have revealed that visual attentional load greatly modulates audiovisual integration (AVI); however, auditory and visual attentional resources are separate to some degree, and task-irrelevant auditory information can produce much faster and larger attentional alerting effects than visual information. Here, we aimed to explore how auditory attentional load influences AVI and how aging affects this influence. Thirty older and 30 younger adults participated in an AV discrimination task with an additional auditory distractor competing for attentional resources. The race model analysis revealed the highest AVI in the low auditory attentional load condition (low > no > medium > high, pairwise comparison, all p ≤ 0.047) for younger adults, and a higher AVI under the no auditory attentional-load condition (p = 0.008) but a lower AVI under the low (p = 0.019), medium (p < 0.001), and high (p = 0.021) auditory attentional-load conditions for older adults than for younger adults. The time-frequency analysis revealed higher theta- and alpha-band AVI oscillation under the no and low auditory attentional-load conditions than under the medium and high auditory attentional-load conditions for both older (all p ≤ 0.011) and younger (all p ≤ 0.024) adults. Additionally, weighted phase lag index (WPLI) analysis revealed higher theta-band and lower alpha-band global functional connectivity for older adults during AV stimulus processing (all p ≤ 0.031). These results suggest that AVI was higher in the low attentional-load condition than in the no attentional-load condition but decreased as the attentional load increased further, and that there was a significant aging effect in older adults. In addition, the strengthened theta-band global functional connectivity in older adults during AV stimulus processing might be an adaptive phenomenon for age-related perceptual decline.
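For readers unfamiliar with the race model analysis mentioned above, the sketch below computes the standard Miller (1982) race model inequality from response time distributions; positive violations of the bound are commonly taken as an index of audiovisual integration. The RT samples are simulated, and the authors' exact implementation (binning, percentiles, statistics) may differ.

```python
# Illustrative race model (Miller, 1982) analysis, a common way to quantify
# audiovisual integration from response time (RT) distributions. RT samples are
# simulated placeholders.
import numpy as np

rng = np.random.default_rng(0)
rt_a = rng.normal(420, 60, 300)    # auditory-only RTs (ms), simulated
rt_v = rng.normal(450, 60, 300)    # visual-only RTs (ms), simulated
rt_av = rng.normal(380, 55, 300)   # audiovisual RTs (ms), simulated

t = np.linspace(200, 700, 51)      # evaluation time points (ms)

def cdf(samples, times):
    """Empirical cumulative distribution of RTs at the given time points."""
    return np.array([(samples <= ti).mean() for ti in times])

# Race model bound: without integration, P(RT_AV <= t) should not exceed
# P(RT_A <= t) + P(RT_V <= t); positive violations index multisensory gain.
violation = cdf(rt_av, t) - np.minimum(1.0, cdf(rt_a, t) + cdf(rt_v, t))
positive_area = np.trapz(np.clip(violation, 0, None), t)
print(f"Positive area under the violation curve: {positive_area:.1f} ms")
```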
... Research on crossmodal spatial attention has shown that spatial cueing works even when the cue and stimulus are from different modalities [16][17][18], and also under more realistic conditions. For instance, Begault [19] found that the performance of airline crews in a visual search task improved when spatial auditory cues were presented at the same location as the target, compared with when the cue was a warning message ("traffic, traffic!") with no spatial information. ...
Article
Full-text available
Making decisions is an important aspect of people's lives. Decisions can be highly critical in nature, with mistakes possibly resulting in extremely adverse consequences. Yet, such decisions often have to be made within a very short period of time and with limited information. This can result in decreased accuracy and efficiency. In this paper, we explore the possibility of increasing the speed and accuracy of users engaged in the discrimination of realistic targets presented for a very short time, in the presence of unimodal or bimodal cues. More specifically, we present results from an experiment in which users were asked to discriminate between targets rapidly appearing in an indoor environment. Unimodal (auditory) or bimodal (audio-visual) cues could shortly precede the target stimulus, warning the users about its location. Our findings show that, when used to facilitate perceptual decisions under time pressure, and in conditions of limited information in real-world scenarios, spoken cues can be effective in boosting performance (accuracy, reaction times, or both), and even more so when presented in bimodal form. However, we also found that cue timing plays a critical role and, if the cue-stimulus interval is too short, cues may offer no advantage. In a post-hoc analysis of our data, we also show that congruency between the response location and both the target location and the cues can interfere with speed and accuracy in the task. These effects should be taken into consideration, particularly when investigating performance in realistic tasks.
... Forming rapid and accurate perceptual decisions in our everyday life benefits from the use of complementary information coming from multiple sensory modalities. The synthesis of signals from different senses has been shown to improve perceptual performance, leading to more accurate (Gingras et al., 2009; Lippert et al., 2007; Spence and Driver, 2004) and faster responses (Diederich and Colonius, 2004; Hershenson, 1962). Previous research has shown that crossmodal interactions are governed by neural oscillations in different frequency bands that can occur at both early and late stages of processing and involve bottom-up and top-down mechanisms (Bauer et al., 2020; Keil and Senkowski, 2018). ...
Article
The combination of signals from different sensory modalities can enhance perception and facilitate behavioral responses. While previous research described crossmodal influences in a wide range of tasks, it remains unclear how such influences drive performance enhancements. In particular, the neural mechanisms underlying performance-relevant crossmodal influences, as well as the latency and spatial profile of such influences are not well understood. Here, we examined data from high-density electroencephalography (N = 30) recordings to characterize the oscillatory signatures of crossmodal facilitation of response speed, as manifested in the speeding of visual responses by concurrent task-irrelevant auditory information. Using a data-driven analysis approach, we found that individual gains in response speed correlated with larger beta power difference (13-25 Hz) between the audiovisual and the visual condition, starting within 80 ms after stimulus onset in the secondary visual cortex and in multisensory association areas in the parietal cortex. In addition, we examined data from electrocorticography (ECoG) recordings in four epileptic patients in a comparable paradigm. These ECoG data revealed reduced beta power in audiovisual compared with visual trials in the superior temporal gyrus (STG). Collectively, our data suggest that the crossmodal facilitation of response speed is associated with reduced early beta power in multisensory association and secondary visual areas. The reduced early beta power may reflect an auditory-driven feedback signal to improve visual processing through attentional gating. These findings improve our understanding of the neural mechanisms underlying crossmodal response speed facilitation and highlight the critical role of beta oscillations in mediating behaviorally relevant multisensory processing.
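As a rough illustration of how a condition-wise beta power difference could be related to individual response-speed gains, the sketch below estimates 13-25 Hz power with Welch's method and correlates the per-subject audiovisual minus visual difference with RT gains across subjects. The signals, sampling rate, and RT gains are simulated placeholders, not the authors' pipeline.

```python
# Illustrative sketch (not the authors' pipeline): estimate beta-band (13-25 Hz)
# power per condition with Welch's method and correlate the per-subject
# audiovisual-minus-visual power difference with the individual response-speed
# gain. All signals and gains are simulated.
import numpy as np
from scipy.signal import welch
from scipy.stats import pearsonr

fs = 500                                   # sampling rate in Hz (assumed)
rng = np.random.default_rng(1)
n_subjects = 30

def beta_power(signal):
    freqs, psd = welch(signal, fs=fs, nperseg=fs)
    band = (freqs >= 13) & (freqs <= 25)
    return psd[band].mean()

beta_diff, rt_gain = [], []
for _ in range(n_subjects):
    eeg_av = rng.standard_normal(5 * fs)   # audiovisual-condition signal (simulated)
    eeg_v = rng.standard_normal(5 * fs)    # visual-condition signal (simulated)
    beta_diff.append(beta_power(eeg_av) - beta_power(eeg_v))
    rt_gain.append(rng.normal(20, 10))     # ms of speeding on audiovisual trials (simulated)

r, p = pearsonr(beta_diff, rt_gain)
print(f"Beta power difference vs. RT gain: r = {r:.2f}, p = {p:.3f}")
```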
... For example, we often immediately turn our heads if someone calls our name in a crowded area, indicating that combined visual and auditory sensory information accelerates detection. Many studies have highlighted the importance of MSI and how it shapes perceptual processes (Driver & Spence, 1998; Calvert & Thesen, 2004; Spence, 2010; Spence & Driver, 2012; Shi & Müller, 2013). For example, Calvert and Thesen (2004) suggested that interaction of the senses is vital for maximizing how efficiently and effectively individuals interact with the environment. Therefore, benefiting from the simultaneous stimulation of multiple sensory sources, such as visual, tactile, and auditory information, is at the core of successful perceptual experiences, i.e., human interaction. ...
Article
In order to have a comprehensible representation of scenes and events, the human brain must combine information from different sensory sources. Integration of visual, tactile, and proprioceptive information is considered vital to this process as it underpins the subjective sense of self and body ownership, which has been linked to the development of social processes such as empathy and imitation. This issue has been investigated using sensory illusions, and the findings suggest that individuals with autism are less prone to multimodal illusions due to atypical sensory integration, i.e. they tend to rely more on a single sensory source rather than integrating concurrent sources of information (i.e. over-reliance on proprioception). Studies that have measured illusion susceptibility and ownership, especially with regard to body ownership, have provided mixed results. Therefore, it is important to understand and advance our knowledge of illusion susceptibility using sensory illusions. In order to conduct this research, it was first required to identify typically developing individuals who have high and low autism tendencies using the Autistic Spectrum Quotient (Baron-Cohen et al., 2001b). This was important because previous research has indicated behavioral similarities between individuals with high autism traits and those with high-functioning autism (HfA). The primary aim of this research was to investigate whether individuals with high autism traits and those with a diagnosis of autism perform in a similar way in terms of illusion susceptibility and illusion ownership, as previous research has reported differences in illusion susceptibility (Palmer et al., 2013; Paton et al., 2012). Three different multisensory illusions were presented to all the participants using the MIRAGE mediated reality device. This device enables the experimenter to present various illusions on the participants' limbs, where manipulations can be applied over the hand. Illusion ownership and susceptibility statements were used to measure the subjective experience of the participants, whereas finger localization tasks were used as an objective measure of susceptibility to the illusions. Experiments One and Two investigated the effects of the crawling skin illusion, a visual illusion that can produce somatosensory sensations without any tactile input, as this illusory percept manipulates an individual's existing knowledge regarding their own hand (McKenzie & Newport, 2015). The results indicated that individuals with high AQ scores (compared to low AQ, Experiment 1) and HfA (compared to typically developing adults, Experiment 2) showed less influence of visual context. They reported reduced effects of the illusion, which could be due to a higher reliance on top-down knowledge. However, all the participating groups showed high ownership of their hand as viewed through the MIRAGE. Participants with high and low autism traits (Experiment 3) and adults with HfA as well as typically developing adults (Experiment 4) were presented with the finger stretching illusion (Newport et al., 2015), which involves an interplay of vision, touch, and proprioception. The results obtained showed that participants across all groups had high ownership scores; however, only the low AQ group and the control group were susceptible to the illusion.
An estimation task was used to measure whether participants embodied the illusion; adults with high AQ scores and HfA showed superior performance during the estimation task, whereas the control groups' estimates were significantly further off, making them more susceptible to the visuo-tactile manipulation. The third illusion measured visuo-proprioceptive integration in individuals with high and low AQ scores (Experiment 5) and adults with HfA as well as typically developing adults (Experiment 6). The task involved participants estimating the location of their hidden index finger under different conditions, i.e. participants were either able to view their hand or the view of their hand was hidden. Participants first took part in an adaptation procedure (Newport & Gilpin, 2011), which involved relocating the hand from where the participants last saw it. This was to test whether individuals with high autism traits and those with HfA showed superior proprioceptive performance in estimating their index finger location. The results indicated that the HfA and the high AQ groups were less affected by the visuo-proprioceptive misalignment caused during the adaptation procedure. The estimates of participants with low AQ scores and the typically developing group were more influenced by the visual input. In conclusion, none of the experiments found strong evidence of over-reliance on proprioception in individuals with high AQ or those with HfA; however, they showed superior estimation abilities compared with the control group. My findings suggest that there is a preference for, but not over-reliance on, proprioception as opposed to visual and tactile information in the high AQ scoring group and the HfA group. Over-relying on a single sensory source while not integrating multisensory information could have a detrimental impact on sensory processing and social interactions, especially for the visuo-tactile system, as it enables an individual to experience the environment through touch and to understand everyday sensations such as temperature, pressure, itching, and pain. For future research, this work highlights the importance of studying the visual-tactile domain. An individual's ability to process tactile input is related to their ability to visually discriminate and to have appropriate body awareness, which in turn helps in developing emotional security, academic learning, and social skills, some of the core issues often reported in individuals with autism (Corbett et al., 2009; Happé & Frith, 2006; Piek & Dyck, 2004; Tager-Flusberg, 2008). Moreover, research investigating such processes should involve the whole autism spectrum rather than focusing on a smaller subset.
... This behavioral benefit extends to spatial cueing studies in which subjects are cued to covertly attend (i.e., holding fixation without saccading to the target location). In general, subjects respond more rapidly and more accurately when the cue and target are presented from the same rather than opposite sides of fixation (Spence and McDonald 2004), and these results can be interpreted either in terms of the spatial rule or as evidence of a robust link in crossmodal spatial attention (Spence and Driver 2004). ...
Chapter
Visual cues help listeners follow conversation in a complex acoustic environment. Many audiovisual research studies focus on how sensory cues are combined to optimize perception, either in terms of minimizing the uncertainty in the sensory estimate or maximizing intelligibility, particularly in speech understanding. From an auditory perception perspective, a fundamental question that has not been fully addressed is how visual information aids the ability to select and focus on one auditory object in the presence of competing sounds in a busy auditory scene. In this chapter, audiovisual integration is presented from an object-based attention viewpoint. In particular, it is argued that a stricter delineation of the concepts of multisensory integration versus binding would facilitate a deeper understanding of the nature of how information is combined across senses. Furthermore, using an object-based theoretical framework to distinguish binding as a distinct form of multisensory integration generates testable hypotheses with behavioral predictions that can account for different aspects of multisensory interactions. In this chapter, classic multisensory illusion paradigms are revisited and discussed in the context of multisensory binding. The chapter also describes multisensory experiments that focus on addressing how visual stimuli help listeners parse complex auditory scenes. Finally, it concludes with a discussion of the potential mechanisms by which audiovisual processing might resolve competition between concurrent sounds in order to solve the cocktail party problem.
... Studies of multisensory collicular neurons suggest that their crossmodal receptive fields (RFs) often overlap (Spence and Driver 2004). This pattern is also found in multisensory neurons present in other brain regions. ...
Preprint
Full-text available
Information integration from different modalities is an active area of research. Human beings and, in general, biological neural systems are quite adept at using a multitude of signals from different sensory perceptive fields to interact with the environment and each other. Recent work on deep fusion models via neural networks has led to substantial improvements over unimodal approaches in areas like speech recognition, emotion recognition and analysis, captioning, and image description. However, such research has mostly focused on architectural changes that allow for the fusion of different modalities while keeping the model complexity manageable. Inspired by recent neuroscience ideas about multisensory integration and processing, we investigate the effect of synergy-maximizing loss functions. Experiments on multimodal sentiment analysis tasks (CMU-MOSI and CMU-MOSEI) with different models show that our approach provides a consistent performance boost.
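The abstract does not spell out the loss function; purely as one possible reading, the sketch below shows a late-fusion sentiment regressor with an auxiliary hinge term that penalizes the fused head whenever it fails to beat the best unimodal head on a sample. The architecture, feature dimensions, and this specific "synergy" term are assumptions for illustration, not the formulation used in the cited work.

```python
# Loose illustration only: late-fusion regressor plus an auxiliary term that is
# nonzero when the fused prediction is worse than the best unimodal prediction
# (i.e., no synergy). Dimensions and architecture are assumed, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LateFusion(nn.Module):
    def __init__(self, dim_text=300, dim_audio=74, dim_video=35, hidden=64):
        super().__init__()
        self.text = nn.Sequential(nn.Linear(dim_text, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.audio = nn.Sequential(nn.Linear(dim_audio, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.video = nn.Sequential(nn.Linear(dim_video, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.fused = nn.Sequential(nn.Linear(dim_text + dim_audio + dim_video, hidden),
                                   nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, xt, xa, xv):
        return (self.fused(torch.cat([xt, xa, xv], dim=-1)),
                self.text(xt), self.audio(xa), self.video(xv))

def synergy_loss(y_fused, y_t, y_a, y_v, target, alpha=0.5):
    err_fused = (y_fused - target) ** 2
    best_unimodal = torch.min(torch.stack([(y_t - target) ** 2,
                                           (y_a - target) ** 2,
                                           (y_v - target) ** 2]), dim=0).values
    # Base regression loss plus a hinge that activates only when the fused head
    # is worse than the best unimodal head for that sample.
    return err_fused.mean() + alpha * F.relu(err_fused - best_unimodal).mean()

# Toy usage with random tensors (batch of 8):
xt, xa, xv = torch.randn(8, 300), torch.randn(8, 74), torch.randn(8, 35)
target = torch.randn(8, 1)
loss = synergy_loss(*LateFusion()(xt, xa, xv), target)
loss.backward()
```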
... The results revealed that audiovisual integration was comparable under the high and low sustained perceptual load conditions, indicating that sustained visual attentional load does not significantly affect audiovisual integration. It has been reported that shared or distinct attentional resources across sensory modalities are task dependent, where auditory and visual attentional resources are distinct during the discrimination of stimulus attributes (Alais et al., 2006; Arrighi et al., 2011) but are shared during stimulus localization (Driver & Spence, 1998a, 1998b; Spence, 2010a, 2010b; Spence & Driver, 2004). In the studies by Alsius et al. (2005, 2014) and Wahn & König (2015), the auditory/visual discrimination task involved the discrimination of stimulus attributes, and the second distractor task was from a visual modality. ...
Article
Full-text available
Attention modulates numerous stages of audiovisual integration, and studies have shown that audiovisual integration is higher in attended conditions than in unattended conditions. However, attentional resources are limited for each person, and it is not yet clear how audiovisual integration changes under different attentional loads. Here, we explored how auditory attentional load affects audiovisual integration by applying an auditory/visual discrimination task to evaluate audiovisual integration and a rapid serial auditory presentation (RSAP) task to manipulate auditory attentional resources. The results for the peak benefit and the positive area under the curve of the probability difference showed that audiovisual integration was highest in the low attentional load condition and lowest in the high attentional load condition (low > no = medium > high). The peak latency and time window revealed that audiovisual integration was delayed as the attentional load increased (no < low < medium < high). Additionally, audiovisual depression was found in the no, medium, and high attentional load conditions but not in the low attentional load condition. These results suggest that mild auditory attentional load increases audiovisual integration, whereas high auditory attentional load decreases it.
... In recent decades, a number of studies have highlighted how crossmodal interactions can affect stimulus selection and stimulus processing (see, for reviews, [1][2][3][4][5][6][7]). For instance, a stimulus presented in a given sensory modality (e.g., audition) and in a given spatial location (e.g., on the right) can affect the processing of a target stimulus presented in a different sensory modality (e.g., visual) in the same spatial location, even when the presentation of the former stimulus happens to be entirely nonpredictive with regard to the location of the target, as evidenced by research using the orthogonal cuing paradigm [8,9]. Crossmodal signals that are task irrelevant can also affect perceptual processing in nonspatial settings. ...
Article
Full-text available
Object sounds can enhance the attentional selection and perceptual processing of semantically-related visual stimuli. However, it is currently unknown whether crossmodal semantic congruence also affects the post-perceptual stages of information processing, such as short-term memory (STM), and whether this effect is modulated by the object consistency with the background visual scene. In two experiments, participants viewed everyday visual scenes for 500 ms while listening to an object sound, which could either be semantically related to the object that served as the STM target at retrieval or not. This defined crossmodal semantically cued vs. uncued targets. The target was either in- or out-of-context with respect to the background visual scene. After a maintenance period of 2000 ms, the target was presented in isolation against a neutral background, in either the same or different spatial position as in the original scene. The participants judged the same vs. different position of the object and then provided a confidence judgment concerning the certainty of their response. The results revealed greater accuracy when judging the spatial position of targets paired with a semantically congruent object sound at encoding. This crossmodal facilitatory effect was modulated by whether the target object was in- or out-of-context with respect to the background scene, with out-of-context targets reducing the facilitatory effect of object sounds. Overall, these findings suggest that the presence of the object sound at encoding facilitated the selection and processing of the semantically related visual stimuli, but this effect depends on the semantic configuration of the visual scene.
... In everyday life, using complementary information from multiple sensory modalities is often critical to make rapid and accurate perceptual decisions. The synthesis of signals from different senses has been shown to improve perceptual performance, leading to more accurate (Spence and Driver, 2004; Lippert et al., 2007) and faster responses ... shorter RTs for unisensory and bisensory audiovisual stimuli. In an electrocorticography (ECoG) study, Mercier et al. (2015) observed that delta band (<4 Hz) phase alignment in a sensorimotor network was related to crossmodal facilitation of response speed. ...
Preprint
Full-text available
The combination of signals from different sensory modalities can enhance perception and facilitate behavioral responses. While previous research described crossmodal influences in a wide range of tasks, it remains unclear how such influences drive performance enhancements. In particular, the neural mechanisms underlying performance-relevant crossmodal influences, as well as the latency and spatial profile of such influences are not well understood. Here, we examined data from high-density electroencephalography (N = 30) and electrocorticography (N = 4) recordings to characterize the oscillatory signatures of crossmodal facilitation of response speed, as manifested in the speeding of visual responses by concurrent task-irrelevant auditory information. Using a data-driven analysis approach, we found that individual gains in response speed correlated with reduced beta power (13-25 Hz) in the audiovisual compared with the visual condition, starting within 80 ms after stimulus onset in multisensory association and secondary visual areas. In addition, the electrocorticography data revealed a beta power suppression in audiovisual compared with visual trials in the superior temporal gyrus (STG). Our data suggest that the crossmodal facilitation of response speed is associated with early beta power in multisensory association and secondary visual areas, presumably reflecting the enhancement of early sensory processing through selective attention. This finding furthers our understanding of the neural correlates underlying crossmodal response speed facilitation and highlights the critical role of beta oscillations in mediating behaviorally relevant audiovisual processing. Significance Statement: The use of complementary information across multiple senses can enhance perception. Previous research established a central role of neuronal oscillations in multisensory perception, but it remains poorly understood how they relate to multisensory performance enhancement. To address this question, we recorded electrophysiological signals from scalp and intracranial electrodes (implanted for presurgical monitoring) in response to simple visual and audiovisual stimuli. We then associated the difference in oscillatory power between the two conditions with the speeding of responses in the audiovisual trials. We demonstrate that the crossmodal facilitation of response speed is associated with beta power in multisensory association areas during early stages of sensory processing. This finding highlights the importance of beta oscillations in mediating behaviorally relevant audiovisual processing.
... Importantly, one should note here that the specific effects reported in the literature might be due, at least in part, to the way in which the stimuli were presented. In fact, in some of the studies mentioned above (Cann and Ross [21]), the odours were released in the room rather than presented close to the participant's nostrils (leading to reduced spatial congruency between the two sources of information); this might have reduced the perceived association between the odour and the face presented (Spence and Driver [25,26]). ...
Article
Facial attractiveness plays an important role in everyday social interactions. In the present study, we investigated whether people's evaluation of attractiveness can be modulated by odours. In Experiment 1, twelve participants rated a series of odours on several perceptual and synaesthetic characteristics (gender, pleasantness, cheerfulness, intensity, arousal, and association with food) along visual analogue scales. In Experiment 2, the participants judged the attractiveness of female and male faces while smelling an odour that had been rated in Experiment 1 as more feminine (caramel) or more masculine (licorice), or while odourless water was presented. The results showed that the participants evaluated female faces as more attractive when the caramel odour was concurrently presented. By contrast, the participants evaluated the male faces as more attractive when the licorice odour was presented. These results highlight the importance of the synaesthetic associations between "gender" and odours in people's judgements of facial attractiveness. Both male and female facial attractiveness was enhanced by the odours.
... Although there has been increasing interest in how various sensory cues are weighted and integrated to enable a multisensory representation of peripersonal space (e.g. Rizzolatti et al.1997;Spence and Driver 2004), this is the first study to investigate this interaction for obstacle avoidance. Our aim was, therefore, to investigate whether predicting the tactile consequences of contact with the object is not only relevant for goal-directed actions but also for avoiding potential collisions with obstacles. ...
Article
Full-text available
Multisensory coding of the space surrounding our body, the peripersonal space, is crucial for motor control. Recently, it has been proposed that an important function of multisensory coding is that it allows anticipation of the tactile consequences of contact with a nearby object. Indeed, performing goal-directed actions (i.e. pointing and grasping) induces a continuous visuotactile remapping as a function of on-line sensorimotor requirements. Here, we investigated whether visuotactile remapping can be induced by obstacles, e.g. objects that are not the target of the grasping movement. In the current experiment, we used a cross-modal obstacle avoidance paradigm, in which participants reached past an obstacle to grasp a second object. Participants indicated the location of tactile targets delivered to the hand during the grasping movement, while a visual cue was sometimes presented simultaneously on the to-be-avoided object. The tactile and visual stimulation was triggered when the reaching hand passed a position that was drawn randomly from a continuous set of predetermined locations (between 0 and 200 mm depth at 5 mm intervals). We observed differences in visuotactile interaction during obstacle avoidance dependent on the location of the stimulation trigger: visual interference was enhanced for tactile stimulation that occurred when the hand was near the to-be-avoided object. We show that to-be-avoided obstacles, which are relevant for action but are not to-be-interacted with (as the terminus of an action), automatically evoke the tactile consequences of interaction. This shows that visuotactile remapping extends to obstacle avoidance and that this process is flexible.
... [1] There have been numerous neurophysiological and behavioral studies demonstrating extensive interactions among the senses, as well as research elucidating the processes underlying multisensory information processing. [2] In this study, we focus on information integration between visual inputs and vestibular inputs. Both convey information about heading direction. ...
Preprint
Full-text available
Neurons in visual and vestibular information integration areas of the macaque brain, such as the dorsal medial superior temporal area (MSTd) and the ventral intraparietal area (VIP), have been classified into congruent neurons and opposite neurons, which prefer congruent inputs and opposite inputs from the two sensory modalities, respectively. In this work, we propose a mechanistic spiking neural model that can account for the emergence of congruent and opposite neurons and their interactions in a neural circuit for multisensory integration. The spiking neural circuit model is adopted from an established model of primary visual cortex circuits with few changes in parameters. Based on the basic Hebbian learning principle, the network can learn the correct topological organization and behaviors of the congruent and opposite neurons that have been proposed to play a role in multisensory integration. This work explores the constraints and conditions that lead to the development of the proposed neural circuit for cue integration. It also demonstrates that such a neural circuit might indeed be a canonical circuit shared by computations in many cortical areas.
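As a toy illustration of the congruent/opposite distinction described in this abstract (the learning dynamics of the spiking model itself are not reproduced here), the Python sketch below defines two units by their visual and vestibular heading preferences and shows that the "congruent" unit responds most to matched cues while the "opposite" unit responds most to conflicting cues. The tuning shape and parameters are assumptions made for illustration.

# Illustrative sketch (not the authors' spiking model): congruent vs opposite units
# defined by their visual and vestibular heading preferences.
import numpy as np

def tuning(pref, heading, kappa=2.0):
    """Circular (von Mises-like) tuning curve peaking at the preferred heading."""
    return np.exp(kappa * (np.cos(heading - pref) - 1.0))

def unit_response(vis_pref, ves_pref, vis_heading, ves_heading):
    """Additive combination of the two unisensory drives."""
    return tuning(vis_pref, vis_heading) + tuning(ves_pref, ves_heading)

pref = 0.0
congruent = lambda hv, hs: unit_response(pref, pref, hv, hs)          # same preferences
opposite = lambda hv, hs: unit_response(pref, pref + np.pi, hv, hs)   # preferences 180 deg apart

matched = (0.0, 0.0)            # visual and vestibular cues agree
conflicting = (0.0, np.pi)      # cues 180 deg apart

print("congruent unit: matched", congruent(*matched), "conflict", congruent(*conflicting))
print("opposite unit:  matched", opposite(*matched), "conflict", opposite(*conflicting))
# The congruent unit responds most to matched cues; the opposite unit to conflicting cues.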
... Research on crossmodal spatial attention has shown that spatial cueing works even when the cue and stimulus come from different modalities [16][17][18], and also under more realistic conditions. For instance, Begault [19] found that the performance of airline crews in a visual search task improved when a spatial auditory cue was presented at the same location as the target, compared to when the cue was a warning message ("traffic, traffic!") with no spatial information. ...
Article
In an era characterized by the increasing complexity of products and the rapid turnover of the workforce across different companies, there is a growing need to invest significantly in quick and efficient training methods. Concurrently, the advancement of digitalization has rendered certain training practices anchored to paper-based materials obsolete. Numerous companies are directing their investments toward digital training, yet the optimal format to exploit the full advantages of digitalization remains unclear. This study undertakes a comparison of four distinct digital versions of the same training process with the aim of comprehending the tangible benefits. The findings indicate that to fully capitalize on the advantages of digital technology, a complete rethinking of training practices is necessary.
Article
Full-text available
We asked whether, in the first year of life, the infant brain can support the dynamic crossmodal interactions between vision and somatosensation that are required to represent peripersonal space. Infants aged 4 (n = 20, 9 female) and 8 (n = 20, 10 female) months were presented with a visual object that moved towards their body or receded away from it. This was presented in the bottom half of the screen and not fixated upon by the infants, who were instead focusing on an attention getter at the top of the screen. The visual moving object then disappeared and was followed by a vibrotactile stimulus occurring later in time and in a different location in space (on their hands). The 4-month-olds’ somatosensory evoked potentials (SEPs) were enhanced when tactile stimuli were preceded by unattended approaching visual motion, demonstrating that the dynamic visual-somatosensory cortical interactions underpinning representations of the body and peripersonal space begin early in the first year of life. Within the 8-month-olds’ sample, SEPs were increasingly enhanced by (unexpected) tactile stimuli following receding visual motion as age in days increased, demonstrating changes in the neural underpinnings of the representations of peripersonal space across the first year of life.
Chapter
Representation of object features can help visually impaired people better comprehend their surrounding environment. The tacton (tactile icon) is an effective method to extract and express information non-visually, utilizing users' tactile perception capacities. However, existing vibrotactile displays mainly place emphasis on directional guidance, and the number of representable object features is very limited. To leverage the egocentric spatial cognition habits and high tactile perception sensitivity of visually impaired users, this research proposes a user-centered vibrotactile cueing strategy to convey 30 kinds of spatial information through 30 tactons played by 4 vibrators on the back and front side of a pair of gloves. Three parameters, including vibration sequence, stimulus location, and intensity, are used to encode 10 typical objects located in 3 directions with 2 alert levels. User tests in both laboratory and natural settings were conducted to evaluate the validity of the strategy. The recognition accuracy of the designed tactons reached 98.99% within a recognition time of less than 0.6 s, indicating that this strategy can provide practical assistance for visually impaired users in perceiving and responding to the pre-defined spatial information. The multi-parameter tactons make it possible to encode a wide variety of spatial information by exploiting the communication capacity of the tactile channel of visually impaired users.
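A purely hypothetical sketch of how such a multi-parameter tacton code could be organized is given below: 10 object types by 3 directions yield 30 base tactons, with intensity carrying the alert level. All labels and the mapping itself are illustrative assumptions, not the chapter's actual design.

# Hypothetical tacton codebook: (object, direction) -> (vibration sequence, location),
# with the alert level expressed through vibration intensity.
from itertools import product

objects = [f"object_{i}" for i in range(10)]      # placeholders for the 10 object types
directions = ["left", "front", "right"]

codebook = {
    (obj, direction): {"sequence": seq, "location": direction}
    for seq, (obj, direction) in enumerate(product(objects, directions))
}

def render(obj, direction, alert):
    """Return the vibration parameters for one piece of spatial information."""
    base = codebook[(obj, direction)]
    return {**base, "intensity": "high" if alert == "urgent" else "low"}

print(len(codebook), "tactons")                   # 10 x 3 = 30
print(render("object_3", "front", "urgent"))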
Chapter
This chapter provides an overview of eXtended Reality (XR), a term that encompasses technologies such as Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR). XR allows the real world to be enriched with virtual data, objects, and information, offering varying levels of immersion and interaction. The chapter explores the components of XR and discusses classification frameworks, including Milgram’s Reality-Virtuality Continuum and the 3I model. It further delves into the developments and technologies of VR, highlighting stereoscopic vision, VR headsets and immersive walls. The chapter also explores AR, its accessibility through everyday devices, and its applications. The importance of considering all sensory modalities for creating immersive XR experiences is emphasized, and the role of multimodal interaction in enhancing user experiences is discussed. The chapter highlights the significance of a multisensory approach in achieving a sense of presence and explores examples of early immersive and multisensory technology. Finally, it discusses the impact of touch in human interaction and the incorporation of tactile sensations in VR applications, as well as of olfaction, which plays a significant role in human perception and memory, as certain smells can evoke strong emotional responses and trigger vivid recollections of past experiences. Incorporating odors into XR applications can enhance the overall sense of presence and realism by providing users with scent cues that align with the virtual environment. Olfactory displays are devices that release specific odors or scents to accompany virtual content.
Chapter
Fuzzy set theory and its role in psychology are introduced in this chapter. Fuzzy models accept that the human perception of the world is not black and white but includes a degree of grayness (e.g., in diagnosis, where the presence or absence of symptoms may or may not lead to a diagnosis of a particular illness, and in the use of language where, for example, a person saying "I am famous" is only expressing their personal subjective opinion). These models attempt to account for this uncertainty in perception when model building by using a fuzzy layer based upon expert perception expressed in language and quantifying these opinions using probabilities in a fuzzy perceptual map. Properties of the set of perceptions are presented, and simple mathematical distributions used to illustrate the membership of these fuzzy sets of perceptions are defined and illustrated, such as cardinality, support, core, height, normalization, and crossover points. Finally, a worked example with code using procedures in R is given, looking at the relationship between depression and multiple sclerosis. Keywords: Fuzzy sets, Indeterminacy, Language, Expert opinion, Fuzzy inference system, Fuzzy maps, T-norms, S-norms, Mamdani systems.
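As a minimal illustration of the fuzzy-set properties listed in this chapter summary (the chapter's own worked example uses R), the Python sketch below defines a triangular membership function over a discrete universe and computes its support, core, height, cardinality, and crossover points; the specific numbers are arbitrary.

# Minimal fuzzy-set sketch: triangular membership function and the set properties
# named in the chapter (support, core, height, cardinality, crossover points).
import numpy as np

x = np.linspace(0, 10, 101)                      # discrete universe of discourse

def triangular(x, a, b, c):
    """Membership rising from a to a peak at b and falling back to zero at c."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

mu = triangular(x, 2.0, 5.0, 8.0)                # membership degrees of a fuzzy set

support = x[mu > 0]                              # elements with any membership
core = x[mu == mu.max()]                         # elements with maximal membership
height = mu.max()                                # maximal membership degree
cardinality = mu.sum()                           # sigma-count of the discrete fuzzy set
crossover = x[np.isclose(mu, 0.5, atol=0.01)]    # points with membership of about 0.5

print(f"height={height:.2f}, |A|={cardinality:.2f}, core={core}, crossover={crossover}")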
Article
Full-text available
The adult brain demonstrates remarkable multisensory plasticity by dynamically recalibrating itself based on information from multiple sensory sources. After a systematic visual-vestibular heading offset is experienced, the unisensory perceptual estimates for subsequently presented stimuli are shifted toward each other (in opposite directions) to reduce the conflict. The neural substrate of this recalibration is unknown. Here, we recorded single-neuron activity from the dorsal medial superior temporal (MSTd), parietoinsular vestibular cortex (PIVC), and ventral intraparietal (VIP) areas in three male rhesus macaques during this visual-vestibular recalibration. Both visual and vestibular neuronal tuning curves in MSTd shifted - each according to their respective cues' perceptual shifts. Tuning of vestibular neurons in PIVC also shifted in the same direction as the vestibular perceptual shifts (these cells were not robustly tuned to the visual stimuli). By contrast, VIP neurons demonstrated a unique phenomenon: both vestibular and visual tuning shifted in accordance with vestibular perceptual shifts, such that visual tuning shifted, surprisingly, contrary to the visual perceptual shifts. Therefore, while unsupervised recalibration (to reduce cue conflict) occurs in early multisensory cortices, higher-level VIP reflects only a global shift, in vestibular space.
Article
Full-text available
Although multisensory integration (MSI) has been extensively studied, the underlying mechanisms remain a topic of ongoing debate. Here we investigate these mechanisms by comparing MSI in healthy controls to a clinical population with spinal cord injury (SCI). Deafferentation following SCI induces sensorimotor impairment, which may alter the ability to synthesize cross-modal information. We applied mathematical and computational modeling to reaction time data recorded in response to temporally congruent cross-modal stimuli. We found that MSI in both SCI and healthy controls is best explained by cross-modal perceptual competition, highlighting a common competition mechanism. Relative to controls, MSI impairments in SCI participants were better explained by reduced stimulus salience leading to increased cross-modal competition. By combining traditional analyses with model-based approaches, we examine how MSI is realized during normal function, and how it is compromised in a clinical population. Our findings support future investigations identifying and rehabilitating MSI deficits in clinical disorders.
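The article's own competition models are not reproduced here, but the following Python sketch with simulated reaction times illustrates the kind of baseline comparison such analyses typically start from: checking the multisensory RT distribution against the race-model bound G_AV(t) <= G_A(t) + G_V(t). All RT values below are made up.

# Illustrative race-model-inequality check on simulated reaction times (ms).
# Violations indicate facilitation beyond an independent-channels (race) account;
# the paper goes further and fits competition models to such data.
import numpy as np

rng = np.random.default_rng(2)
rt_a = rng.normal(330, 40, 1000)                  # simulated auditory RTs
rt_v = rng.normal(350, 40, 1000)                  # simulated visual RTs
rt_av = rng.normal(290, 35, 1000)                 # simulated audiovisual RTs

def ecdf(rts, t):
    """Empirical cumulative distribution of RTs evaluated at times t."""
    return np.searchsorted(np.sort(rts), t, side="right") / len(rts)

t = np.linspace(150, 500, 200)
violation = ecdf(rt_av, t) - np.minimum(ecdf(rt_a, t) + ecdf(rt_v, t), 1.0)

# Positive values mark time points where the AV distribution exceeds the race-model bound.
print(f"max race-model violation: {violation.max():.3f}")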
Article
Full-text available
Deep learning has achieved state-of-the-art performance in several research applications nowadays: from computer vision to bioinformatics, from object detection to image generation. In the context of such newly developed deep-learning approaches, we can define the concept of multimodality. The objective of this research field is to implement methodologies which can use several modalities as input features to perform predictions. Here there is a strong analogy with human cognition, since we rely on several different senses to make decisions. In this article, we present a short survey on multimodal integration using deep-learning methods. First, we comprehensively review the concept of multimodality, describing it from a two-dimensional perspective: we provide a taxonomical description of the multimodality concept, and we then define the second multimodality dimension as the one describing the fusion approaches in multimodal deep learning. Finally, we describe four applications of multimodal deep learning in the following fields of research: speech recognition, sentiment analysis, forensic applications and image processing.
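To make the fusion dimension mentioned in the survey concrete, here is a toy NumPy sketch contrasting early fusion (concatenating modality features before a joint classifier) with late fusion (averaging per-modality predictions). The feature dimensions, weights, and class count are arbitrary placeholders rather than anything taken from the article.

# Toy early-fusion vs late-fusion sketch with random features and random linear classifiers.
import numpy as np

rng = np.random.default_rng(3)
audio_feat = rng.standard_normal(64)              # e.g. an audio embedding (placeholder)
image_feat = rng.standard_normal(128)             # e.g. an image embedding (placeholder)

def softmax(z):
    z = z - z.max()
    return np.exp(z) / np.exp(z).sum()

# Early fusion: one classifier on the concatenated representation.
w_early = rng.standard_normal((10, 64 + 128)) * 0.1
early_pred = softmax(w_early @ np.concatenate([audio_feat, image_feat]))

# Late fusion: separate classifiers per modality, predictions averaged afterwards.
w_audio = rng.standard_normal((10, 64)) * 0.1
w_image = rng.standard_normal((10, 128)) * 0.1
late_pred = 0.5 * softmax(w_audio @ audio_feat) + 0.5 * softmax(w_image @ image_feat)

print(early_pred.argmax(), late_pred.argmax())    # class decisions under each scheme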
Thesis
Full-text available
Forty years have passed since the coining of the term "peripersonal space" (PPS), that region of space in which our daily life takes place and in which we can interact with the objects and people around us. The first electrophysiological studies of this spatial representation observed, in specific regions of the macaque brain, the existence of multisensory neurons capable of encoding tactile, visual and/or auditory stimuli according to their distance from specific parts of the body. These bi- or trimodal neurons, indeed, show tactile receptive fields centered on a specific part of the body, such as the face or hand, and visual and/or auditory receptive fields that overlap spatially with the former. In this way, the same neurons are able to respond to tactile, visual and auditory stimulations delivered on or close to a specific body part. Furthermore, these multisensory receptive fields are "anchored" to each other: the movement of the monkey's hand entails a coherent displacement not only of the tactile receptive fields, but also of the visual ones. This body-part-centered reference frame for coding multisensory stimuli within PPS keeps the information about the position of the different parts of the body and of surrounding objects constantly updated, with the aim of planning and implementing effective actions. Neurophysiological and behavioral studies on patients suffering from extinction and neglect following brain lesions of the right hemisphere have demonstrated, in humans as well, the existence and modularity of the PPS. Subsequent neuroimaging studies have lent further support to this evidence, highlighting a network of fronto-parietal and subcortical regions capable of coding multimodal stimulations according to their distance from the body. The functions of this spatial representation are manifold: mediating the relationship between the perception of external stimuli and the execution of goal-directed actions, monitoring the space around the body in order to identify potential threats and implement defensive reactions, organizing and managing the space between us and others in different types of social interaction, and allowing us to identify ourselves with our body, giving it a localization in space. However, despite the great scientific interest that this region of space has elicited over the past forty years, a direct comparison of its neural underpinnings in non-human primates and humans is still missing. For this reason, in the first chapter of this doctoral dissertation we report the results of an fMRI study, conducted on human and macaque participants, which investigated the neural response patterns to stimulations close to or far from different body parts, minimizing the differences among the experimental protocols used in the two species. For the first time PPS is tested in two different species with the same experimental protocol, highlighting similarities and differences between the human and monkey PPS circuits, as well as between the response patterns associated with the stimulation of different body parts. Starting from the second chapter, we focus only on human participants, to try to shed light on a definitional problem that has conflated the concept of PPS representation with that of a second spatial representation: the arm reaching space (ARS).
The latter, understood as the space around the body that we can reach by extending our arm, has over time often been used as a synonym for the PPS representation, leading researchers to define PPS as ARS or to test the two spatial representations with the same experimental protocols. However, the different neural bases and the different characteristics of the encoding of stimuli within these two regions of space suggest that they should be distinguished. To this purpose, in Chapter II we present a series of five behavioral experiments that investigated the differences and similarities between PPS and ARS … [etc]
Article
Studies in virtual reality (VR) have introduced numerous multisensory simulation techniques for more immersive VR experiences. However, although they primarily focus on expanding sensory types or increasing individual sensory quality, they lack consensus in designing appropriate interactions between different sensory stimuli. This paper explores how the congruence between auditory and visual (AV) stimuli, which are the sensory stimuli typically provided by VR devices, affects the cognition and experience of VR users as a critical interaction factor in promoting multisensory integration. We defined the types of (in)congruence between AV stimuli, and then designed 12 virtual spaces with different types or degrees of congruence between AV stimuli. We then evaluated the presence, immersion, motion sickness, and cognition changes in each space. We observed the following key findings: 1) there is a limit to the degree of temporal or spatial incongruence that can be tolerated, with few negative effects on user experience until that point is exceeded; 2) users are tolerant of semantic incongruence; 3) a simulation that considers synesthetic congruence contributes to the user's sense of immersion and presence. Based on these insights, we identified the essential considerations for designing sensory simulations in VR and proposed future research directions.
Article
Full-text available
There is a cross-modal mapping between auditory pitch and many visual properties, but the relationship between auditory pitch and motion speed is unexplored. In this article, a falling ball and a baffle are used as the experimental objects, and an object-collision task is used to explore the perceptual influence of auditory pitch on motion speed. Since cross-modal mapping can influence perceptual experience, this article also explores the influence of auditory pitch on action measures. In Experiment 1, 12 participants attempted to release a baffle to block a falling ball on the basis of a speed judgment, and after each trial, they were asked to rate the speed of the ball. The speed rating and baffle release time were recorded and submitted to analysis of variance. Since making explicit judgments about speed can alter the processing of visual paths, another group of participants in Experiment 2 completed the experiment without making explicit judgments about speed. Our results show that there is a cross-modal mapping between auditory pitch and motion speed: high and low tones shift perceived speed toward faster and slower values, respectively.
Article
Full-text available
The modulation of attentional load on the perception of auditory and visual information has been widely reported; however, whether attentional load alters audiovisual integration (AVI) has seldom been investigated. Here, to explore the effect of sustained auditory attentional load on AVI and the effects of aging, nineteen older and 20 younger adults performed an AV discrimination task with a rapid serial auditory presentation task competing for attentional resources. The results showed that responses to audiovisual stimuli were significantly faster than those to auditory and visual stimuli (AV > V ≥ A), and that the integration effect decreased as the attentional load increased (load_2 > load_3 > load_4) for both older and younger adults. In addition, AVI was lower and more delayed in older adults than in younger adults in all attentional load conditions. These results suggested that sustained auditory attentional load decreased AVI and that AVI was reduced in older adults.
Article
Full-text available
Imitation of human behaviors is one of the effective ways to develop artificial intelligence. Human dancers, standing in front of a mirror, always achieve autonomous aesthetics evaluation on their own dance motions, which are observed from the mirror. Meanwhile, in the visual aesthetics cognition of human brains, space and shape are two important visual elements perceived from motions. Inspired by the above facts, this paper proposes a novel mechanism of automatic aesthetics evaluation of robotic dance motions based on multiple visual feature integration. In the mechanism, a video of robotic dance motion is firstly converted into several kinds of motion history images, and then a spatial feature (ripple space coding) and shape features (Zernike moment and curvature-based Fourier descriptors) are extracted from the optimized motion history images. Based on feature integration, a homogeneous ensemble classifier, which uses three different random forests, is deployed to build a machine aesthetics model, aiming to make the machine possess human aesthetic ability. The feasibility of the proposed mechanism has been verified by simulation experiments, and the experimental results show that our ensemble classifier can achieve a high correct ratio of aesthetics evaluation of 75%. The performance of our mechanism is superior to those of the existing approaches.
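As a hedged sketch of the ensemble idea described in this abstract, the Python code below trains three random forests on three (here randomly generated) feature views and combines them by majority vote. The actual feature extraction (motion history images, ripple space coding, Zernike moments, curvature-based Fourier descriptors) is not reproduced; all arrays are placeholders.

# Homogeneous ensemble sketch: three random forests on three feature views, majority vote.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
n = 200
space_feat = rng.standard_normal((n, 16))        # stand-in for ripple space coding
zernike_feat = rng.standard_normal((n, 12))      # stand-in for Zernike moments
fourier_feat = rng.standard_normal((n, 10))      # stand-in for Fourier descriptors
labels = rng.integers(0, 2, n)                   # aesthetic / not aesthetic (placeholder)

views = (space_feat, zernike_feat, fourier_feat)
forests = [RandomForestClassifier(n_estimators=50, random_state=0).fit(v, labels) for v in views]

votes = np.stack([rf.predict(v) for rf, v in zip(forests, views)])
majority = (votes.mean(axis=0) >= 0.5).astype(int)   # majority vote across the three forests
print("training accuracy:", (majority == labels).mean())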
Chapter
The automotive industry is facing economic and technical challenges. The economic situation calls for more efficient processes, not only production processes but also renewals of the development process. Accelerating design work while simultaneously securing a safe process outcome leads to products in good correspondence with market demands and institutional goals for a safe traffic environment. The technical challenge is the move from almost purely mechanical constructions to mechatronic systems, where computer-based solutions may affect core vehicle functionality. Since subcontractors often develop this new technology, system integration is increasingly important for car manufacturers. To meet these challenges we suggest the simulator-based design approach. This chapter focuses on human-in-the-loop simulation, which ought to be used for the design and integration of all car functionality affecting the driver. This approach has proved successful in the aerospace industry, which recognized a corresponding technology shift in the late 1960s.
Article
Full-text available
Eight experiments examined the conditions necessary for covert orienting and inhibition of return (IOR) to occur in audition. Spatially uninformative auditory cues facilitated responses to auditory targets at short stimulus onset asynchronies (SOAs) and inhibited them at longer SOAs when the decision to respond was based on the location of the target (Experiments 1, 3, and 4). The same cues did not influence performance when the decision to respond was based on nonspatial criteria (Experiments 2, 5, and 7) unless the cues predicted the location of the target (Experiment 6). In the absence of cues, the location of a previous target influenced performance when the decision to respond was based on spatial, but not nonspatial, criteria (Experiment 8). These findings demonstrate that covert orienting and IOR occur in audition only when spatial relevance is established, presumably inducing use of location-sensitive neurons in generating responses. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Compared audiovisual interactions in the ventriloquism effect and the McGurk illusion. The ventriloquism effect was estimated through a discordance detection (auditory localization) task and the McGurk illusion through an identification task. The stimuli (4 consonant-vowel monosyllables articulated by a man filmed from the top of the nose to the tip of the chin) were visually displayed on a screen located in front of the Ss' (32 19–33 yr olds) heads and acoustically delivered through 1 of 9 hidden loudspeakers located from straight ahead to 80 degrees left and right. The speaker's face was either upright or inverted. Results show that the ventriloquism effect was affected by the degree of spatial separation, but unaffected by upright vs inverted presentation of the face, or by the congruency of the stimuli. The McGurk illusion was of the same size whatever the loudspeaker location but was reduced by face inversion. The differences in the spatial and cognitive rules that govern both interactions are discussed in terms of the specific functionality of the underlying mechanisms. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Covert orienting in hearing was examined by presenting auditory spatial cues prior to an auditory target, requiring either a choice or detection response. Targets and cues appeared on the left or right of Ss' midline. Localization of the target in orthogonal directions (up vs down or front vs back, independent of target side) was faster when cue and target appeared on the same rather than opposite sides. This benefit was larger and more durable when the cue predicted target side. These effects cannot reflect criterion shifts, suggesting that covert orienting enhances auditory localization. Fine frequency discriminations also benefited from predictive spatial cues, although uninformative cues only affected spatial discriminations. No cuing effects were observed in a detection task. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Presents a historical overview of the study of visual attention shifts. Contemporary research on this problem is outlined, and models of attention shift mechanisms are briefly described. In addition, several methodological variables are described that warrant consideration when evaluating claims about shifts of visual attention. These factors are cue type, temporal parameters, task and response type, trial presentation order, S expertise, and neutral trials and cost/benefit analyses. (French abstract) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
When the image of a speaker saying the bisyllable /aga/ is presented in synchrony with the sound of a speaker saying /aba/, subjects tend to report hearing the sound /ada/. The present experiment explores the effects of spatial separation on this class of perceptual illusion known as the McGurk effect. Synchronous auditory and visual speech signals were presented from different locations. The auditory signal was presented from positions 0°, 30°, 60° and 90° in azimuth away from the visual signal source. The results show that spatial incongruencies do not substantially influence the multimodal integration of speech signals.
Article
Full-text available
This study is concerned with three questions about the role of attention in peripheral detection: (1) Is the increased detection rate for spatial locations with high probabilities of target occurrence assigned to them due to sensitivity or criterion effects? (2) Does the effect of spatial cuing (probabilistic priming) require different explanations for letter detection and detection of luminance increments (Shaw, 1984)? (3) Can attention be shared between two separate locations cued to be most likely (Posner, Snyder, & Davidson, 1980)? These questions were investigated in two experiments, both using a signal detection plus localization task (rating method). In Experiment 1 (symbol detection), single or double cues indicating one or two most likely locations (three or two least likely locations) were presented. Introducing the second cued location resulted in a marked sensitivity gain for this position, relative to uncued locations in the single-cue condition. Decision criteria were more liberal for cued and more conservative for uncued locations. In Experiment 2, a luminance increment (single target probe) and two symbol detection (target plus distractors) tasks were compared. For symbol detection, there was a marked priming effect; but for luminance detection, cued locations showed no advantage in sensitivity. However, all tasks showed differential criterion setting for cued and for uncued locations. These results suggest that letter detection is capacity limited, whereas luminance increment detection is not, and furthermore, that decision criteria are largely preset according to a priori target probabilities assigned to particular locations.
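The sensitivity-versus-criterion question posed in this abstract is usually cast in signal detection terms. The short Python sketch below computes d' (sensitivity) and c (criterion) from hit and false-alarm rates under the standard equal-variance Gaussian model; the rates for the cued and uncued locations are made up for illustration.

# Signal detection measures: d' = z(H) - z(FA), c = -(z(H) + z(FA)) / 2.
from scipy.stats import norm

def dprime_and_criterion(hit_rate, fa_rate):
    """Equal-variance Gaussian SDT sensitivity and criterion."""
    z_h, z_fa = norm.ppf(hit_rate), norm.ppf(fa_rate)
    return z_h - z_fa, -(z_h + z_fa) / 2.0

# Hypothetical cued vs uncued locations: a cue could raise the hit rate either by
# improving sensitivity (larger d') or by shifting to a more liberal criterion (smaller c).
print("cued:   d'=%.2f c=%.2f" % dprime_and_criterion(0.80, 0.15))
print("uncued: d'=%.2f c=%.2f" % dprime_and_criterion(0.65, 0.15))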
Article
Full-text available
Thesis (Ph. D.)--University of London, 2000. Includes bibliographical references. Photocopy.
Article
We previously found that the gradient of attention can express itself visually in an illusory motion perceived in a line. The motion illusion was produced by both stimulus-induced (bottom-up) and voluntary (top-down) attention, which suggested that the two kinds of attention act on relatively early stages of visual processing. The objective of our study was to examine how various modes of spatial attention might be represented and reorganized in the brain. Using the induction of illusory line motion as a measure we found that (1) once attention is captured by a moving object, it follows the object as it moves; and (2) attention moves with a saccade in the retinal coordinates such that its focus remains fixed in space. We then asked whether attention acts across different sensory modalities. We found that both auditory and somatosensory cues induced focal visual attention at the location in space where the cue was presented. Based on these findings, we propose a model that would allow (1) matching of visual spatial information obtained across saccades; and (2) matching of spatial information obtained in different sensory modalities.
Chapter
This chapter focuses on a particular neuropsychological symptom related to impaired spatial functions, called 'extinction'. This phenomenon has been investigated in an attempt to reveal the functioning of multisensory spatial representation in humans. The chapter presents a series of highly convergent neuropsychological findings that provide strong evidence in favour of the existence, in humans, of integrated systems representing space through the multisensory coding of tactile and visual or auditory events occurring proximal to our bodies, that is, near peripersonal space.
Article
Physiological studies have demonstrated that inputs from different sensory modalities converge on, and are integrated by, individual superior colliculus neurons and that this integration is governed by specific spatial rules. The present experiments were an attempt to relate these neural processes to overt behavior by determining if behaviors believed to involve the circuitry of the superior colliculus would show similar multisensory dependencies and be subject to the same rules of integration. The neurophysiological-behavioral parallels proved to be striking. The effectiveness of a stimulus of one modality in eliciting attentive and orientation behaviors was dramatically affected by the presence of a stimulus from another modality in each of the three behavioral paradigms used here. Animals trained to approach a low intensity visual cue had their performance significantly enhanced when a brief, low intensity auditory stimulus was presented at the same location as the visual cue, but their performance was significantly depressed when the auditory stimulus was disparate to it. These effects were independent of the animals' experience with the modifying (i.e. auditory) stimulus and exceeded what might have been predicted statistically based on the animals' performance with each single-modality cue. The multiplicative nature of these multisensory interactions and their dependence on the relative positions and intensities of the two stimuli were all very similar to those observed physiologically for single cells. The few differences that were observed appeared to reflect the fact that understanding integration at the level of the single cell requires reference to the individual cell's multisensory receptive field properties, while at the behavioral level populations of receptive fields must be evaluated. These data illustrate that the rules governing multisensory integration at the level of the single cell also predict responses to these stimuli in the intact behaving organism.
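The comparison against what "might have been predicted statistically" is commonly quantified in this line of work with a multisensory enhancement index; the formulation below is the standard one rather than an expression quoted from the article:

\[
  \mathrm{ME} \;=\; \frac{R_{\mathrm{VA}} - \max\left(R_{\mathrm{V}},\, R_{\mathrm{A}}\right)}
                         {\max\left(R_{\mathrm{V}},\, R_{\mathrm{A}}\right)} \times 100\%,
\]

where R_VA is the response to the combined stimulus and R_V, R_A are the responses to the visual and auditory stimuli presented alone; positive values indicate enhancement beyond the best unisensory response.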
Article
There has been a rapid growth of interest in the study of crossmodal attentional capture in recent years, as more and more researchers have started to address the question of whether or not the presentation of a spatially-nonpredictive peripheral event in one sensory modality will lead to a reflexive shift of attention in another modality (such as, for example, whether a sudden white noise burst or tap on the hand will capture visual attention). Although there has been a great deal of controversy regarding the existence of crossmodal capture between audition and vision (e.g., Spence & Driver, 1997a; Ward, 1994; Ward, McDonald, & Lin, 2000), empirical research now supports the view that crossmodal capture effects occur between all combinations of auditory, visual, and tactile stimuli, at least under certain conditions. In the present chapter, the key behavioural findings on crossmodal capture are reviewed, and an attempt is made to resolve this controversy over the existence of audiovisual capture effects.
Article
Previous experiments showing the importance of visual factors in auditory localization are shown to have been insufficiently quantitative. In the first Experiment, bells were rung and lights shone on the same or different vectors, eleven subjects indicating which bell had rung. In the second Experiment, a puff of steam was seen to issue from a kettle whistle with no whistling sound, while similar whistles were sounded by compressed air on that or another vector. Twenty-one subjects cooperated. The addition of a visual stimulus at 0° deviation increased the percentage of correct responses significantly in the second, and insignificantly in the first experiment. At 20°-30° deviation the proportion of naive responses to the visual cue was 43 per cent. in the first and 97 per cent, in the second experiment. At greater angular deviations, the proportion of naive responses fell to chance level in the first, but remained significant in the second experiment, even at 90°. The “visuo-auditory threshold” was found to be 20°-30°, but might be much larger if there were more grounds for supposing the two stimuli to be from the same source in space.
Article
The present study examines mechanisms of endogenous covert spatial orienting in audition as revealed by event-related brain potentials (ERPs) and reaction times (RTs). In one experimental condition, subjects were instructed to respond to any target tone irrespective of whether it was presented in a valid (spatially predictive cue), neutral (uninformative cue), or invalid (misleading cue) trial. In another experimental condition, only target tones presented at a cued position required a response - that is, subjects could completely ignore tones presented at the uncued ear. Cue validity had an effect on RT, which consisted in benefits for valid trials and in costs for invalid trials relative to the RTs in neutral trials. There were also distinct ERP effects of cue validity in the 100-300 msec time range. These ERP effects were enlarged in the condition in which uncued tones could be ignored. The effects of cue validity on RTs and ERPs demonstrated covert orienting in audition both for stimuli requiring an overt response and also for stimuli that did not require a behavioural response. It is argued that this attentional selection is located at intermediate stages of information processing, rather than at peripheral stages such as basic sensory-specific processing or response selection.
Article
Selective visual attention to objects and locations depends both on deliberate behavioral goals that regulate even early visual representations (goal-directed influences) and on autonomous neural responses to sensory input (stimulus-driven influences). In this chapter, I argue that deliberate goal-directed attentional strategies are always constrained by involuntary, "hard-wired" computations, and that an appropriate research strategy is to delineate the nature of the interactions imposed by these constraints. To illustrate the interaction between goal-directed and stimulus-driven attentional control, four domains of visual selection are reviewed. First, selection by location is both spatially and temporally limited, reflecting in part early visual representations of the scene. Second, selection by feature is an available attentional strategy, but it appears to be mediated by location, and feature salience alone does not govern the deployment of attention. Third, early visual segmentation processes that parse a scene into perceptual object representations enable object-based selection, but they also enforce selection of entire objects, and not just isolated features. And fourth, the appearance of a new perceptual object captures attention in a stimulus-driven fashion, but even this is subject to some top-down attentional control. Possible mechanisms for the interaction between bottom-up and top-down control are discussed.
Article
Judgments of the intensity of a stimulus are dependent on the level of central nervous system activity it generates. Generally, it is assumed that such judgments are based on activity along modality-specific pathways. Thus, visual intensity judgments would be based on unimodal visual activity. However, many neurons do not fit neatly within modality-specific categories, but can be influenced by more than one sensory modality. Often the multisensory effect is quite pronounced. If these multisensory neurons participate in such fundamental functions as perceived intensity, the presence of a nonvisual (i.e., auditory) cue may have a significant effect on the perceived intensity of a visual cue. The results of the present study were consistent with such a hypothesis. A brief, broad-band auditory stimulus was found to significantly enhance the perceived intensity of an LED. The effect was most pronounced at the lowest visual intensities, and was evident regardless of the location of the auditory cue. However, it was present only at the location of visual fixation. Yet, despite the significant influence of the auditory cue, and its differential effect at different visual intensities, a power function that maintains the proportionality among perceived visual intensities was retained.
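One standard way to write the "power function that maintains the proportionality among perceived visual intensities" is a Stevens-type relation; the particular parameterization below is an assumption for illustration, not taken from the article:

\[
  \psi(I) \;=\; k\, I^{\,n}, \qquad
  \frac{\psi(I_1)}{\psi(I_2)} \;=\; \left(\frac{I_1}{I_2}\right)^{n},
\]

so a multiplicative (gain-like) enhancement from the auditory cue changes k but leaves the ratios among perceived visual intensities unchanged.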
Article
Tactile-visual links in spatial attention were examined by presenting spatially nonpredictive tactile cues to the left or right hand, shortly prior to visual targets in the left or right hemifield. To examine the spatial coordinates of any crossmodal links, different postures were examined. The hands were either uncrossed, or crossed so that the left hand lay in the right visual field and vice versa. Visual judgments were better on the side where the stimulated hand lay, though this effect was somewhat smaller with longer intervals between cue and target, and with crossed hands. Event-related brain potentials (ERPs) showed a similar pattern. Larger amplitude occipital N1 components were obtained for visual events on the same side as the preceding tactile cue, at ipsilateral electrode sites. Negativities in the Nd2 interval at midline and lateral central sites, and in the Nd1 interval at electrode Pz, were also enhanced for the cued side. As in the psychophysical results, ERP cueing effects during the crossed posture were determined by the side of space in which the stimulated hand lay, not by the anatomical side of the initial hemispheric projection for the tactile cue. These results demonstrate that crossmodal links in spatial attention can influence sensory brain responses as early as the N1, and that these links operate in a spatial frame-of-reference that can remap between the modalities across changes in posture.
Article
A first series of experiments had demonstrated certain conditions eliciting or inhibiting a “pendulum” phenomenon in the visual perception of apparent movement. The present study consists of five further variations designed to show more clearly conditions of occurrence and non-occurrence of this type of movement. The main findings are: (1) Altering the axis of display to vertical significantly reduces the frequency of pendular-movement perception; (2) Altering the position of metronome from behind to the side of the visual display, gives results almost identical with those where the metronome was inaudible, but, when the metronome is illuminated in this position, all forms of movement perception are reduced, and no pendular movement is reported. The results for all the ten conditions, including the five of the first series are summarized, and the following possible factors are discussed: past experience, physiological nystagmus, and intervening adaptation. All three may be required to account for the perceptual phenomena under investigation and the dichotomizing of explanations into “experiential,” or “physiological,” appears to be arbitrary and inconsistent with the complexity of the observed facts.
Article
Vision is both an important part of our sensory and perceptual capacities and an essential source of information for guiding actions and decisions in robot systems. Roger Watt provides a unified approach to human and computer vision in this book, in the first part developing a theoretical framework for understanding visual processes in both types of system, and in the second part explaining the nature of psychophysical enquiry into human vision. The material covered includes the functions of vision, the physical and statistical nature of optical images, the mathematical nature of elementary image operations and descriptions, and the use of visual models in interpreting the visual output. Key features of the book are: the formal analysis of vision supported by introductory explanations of all necessary concepts and mathematics; extensive illustrations and student exercises; detailed description and analysis of psychophysical experiments and data; [and] a classified bibliography for each chapter. This is an important text for students of vision, psychology, engineering and computer science, providing the foundations and material for understanding vision in its human and machine domains. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
as a midbrain site essential for mediating attentive and orientation behaviors to stimuli from different senses, the superior colliculus has provided an effective model for understanding the neural dynamics of multisensory integration and their effects on overt behavior / many of the fundamental determinants of multisensory integration revealed by this model appear to be operative at many sites in the CNS and may govern such higher-order functions as perception and cognition as well as the more immediate behavioral responses involving the midbrain / furthermore, the neural principles that guide multisensory integration appear to apply equally well to such widely divergent species as hamsters, rats, cats, monkeys, and humans (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Reviews previous research regarding the control of visual attention and proposes an integrative model that has implications for how we understand attention more generally. Specific issues addressed include location cueing and shifts of visual attention, preattentive and attentive visual processing, spatial indexing and preattentive localization, visual attention and eye movements, and the cognitive architecture of stimulus-driven attention shifts. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Describes a unified experimental approach to the study of the mind based on experiments in the time course of human information processing. New studies on the role of intensity in information processing, on vigilance, and on orienting and detecting are presented. A historical introduction to mental chronometry together with an integration of performance and physiological techniques for its study are provided. (15 p ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Traditional studies of spatial attention consider only a single sensory modality at a time (e.g. just vision, or just audition). In daily life, however, our spatial attention often has to be coordinated across several modalities. This is a non-trivial problem, given that each modality initially codes space in entirely different ways. In the last five years, there has been a spate of studies on crossmodal attention. These have demonstrated numerous crossmodal links in spatial attention, such that attending to a particular location in one modality tends to produce corresponding shifts of attention in other modalities. The spatial coordinates of these crossmodal links illustrate that the internal representation of external space depends on extensive crossmodal integration. Recent neuroscience studies are discussed that suggest possible brain mechanisms for the crossmodal links in spatial attention.
Article
Normal subjects performed simple reaction time responses to lateralized visual target stimuli (Experiment 1) and lateralized tactile target stimuli (Experiment 2). In each experiment, the lateralized targets were preceded at one of four intervals by a visual or tactile cue located on the same (valid cue) or opposite (invalid cue) side, or on both sides (neutral cue). The validity of the visual and tactile cues influenced the speed of response to either target stimulus. These findings, together with those previously reported (Buchtel and Butter, Neuropsychologia 26, 499–509, 1988), are consistent with the view that intra- and inter-modal spatial cueing is effective with modalities that are linked to orienting systems in which movements of the sensory array serve to improve sensory analysis.
Article
This study deals with the relationship between the momentary objective probability of the delivery of a stimulus and the reaction time in a simple reaction-time task. The hypothesis was that the reaction time is closely related to the objective probability via expectancy, i.e., the momentary probability of the delivery of the stimulus as experienced by the subject. This problem was experimentally approached from two directions: (1) by varying the objective probability, in which case the reaction times should change inversely with the objective probability, and (2) by keeping the objective probability constant (by using the Bernoulli process), in which case the reaction times should not change. Eight male subjects were used. The first assumption proved to be correct, whereas the second held only when certain mean inter-stimulus intervals were used. Finally, the status of the expectancy concept as an explanatory variable in the relationship between the fore-period and the reaction time was discussed, with emphasis on the need for some other explanatory concepts, which were proposed.
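The contrast between the two schedules can be made explicit with a small worked example: under a Bernoulli process the momentary (conditional) probability of stimulus delivery stays constant across waiting time, whereas with a fixed set of equiprobable foreperiods it rises as time passes. The Python sketch below uses illustrative numbers only.

# Conditional (momentary) delivery probability under two foreperiod schedules.
p = 0.2                       # per-interval delivery probability in the Bernoulli schedule
n_intervals = 5               # number of equiprobable foreperiods in the comparison schedule

for t in range(1, n_intervals + 1):
    bernoulli_hazard = p                              # constant by construction
    uniform_hazard = 1.0 / (n_intervals - t + 1)      # grows as fewer intervals remain
    print(f"interval {t}: Bernoulli {bernoulli_hazard:.2f}, uniform {uniform_hazard:.2f}")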
Article
Two experiments examined auditory spatial facilitation of visual search performance under conditions varying in auditory cue precision and visual distractor density. The auditory cue was spatially coincident with the target, was displaced from the target by 6°, or was uninformative. Distractors were manipulated globally (throughout the field) and locally (within 6.5° of the target) separately at densities of 0%, 20%, and 80%. In Experiment 1, auditory cue precision was constant and distractor densities varied within a trial block; in Experiment 2, auditory precision varied and distractor densities were constant within a trial block. Coincident auditory cues minimized local and global distractor effects in both experiments, suggesting that auditory spatial cues facilitate both target localization and identification. The effectiveness of displaced auditory cues depended on cue reliability: In some conditions, displaced cues caused higher mean search latencies than did centered cues, indicating that participants were unable to ignore inaccurate auditory stimuli. Actual or potential applications of this research include virtual audio environments and auditory displays in cockpits.
Article
40 female undergraduates performed a unimanual choice RT task, pressing either a left or right button in response to the onset of a light which illuminated the button. In a predetermined random sequence of trials, the stimulus light was either presented alone or concurrently with a monaural or binaural tone. The right response was facilitated by the tone to the right ear but inhibited by the tone to the left ear, while the left response was facilitated by the tone to the left ear, but inhibited by the tone to the right ear. RT on the binaural trials was faster than on the no-tone trials.
Article
In the present study we investigated cross-modal orienting in vision and hearing by using a cueing task with four different horizontal locations. Our main interest concerned cue–target distance effects, which might further our insight into the characteristics of cross-modal spatial attention mechanisms. A very consistent pattern was observed for both the unimodal (cue and target were both visual or auditory) and the cross-modal conditions (cue and target from different modalities). RTs to valid trials were faster than to invalid trials, and, most interestingly, there was a distance effect: RTs increased with greater cue–target distance. This applied to the detection of visual targets and to the localisation of both visual and auditory targets. The time interval between cue and target was also varied. Surprisingly, there was no indication of inhibition of return, even with the longest cue–target intervals. In order to assess the role of endogenous (strategic) factors in exogenous spatial attention, we increased the cue validity from 25% to 80% in two additional experiments. This appeared to have no large influence on the cueing pattern in either the detection or the localisation task. Currently, it is assumed that spatial attention is organised in multiple, strongly linked modality-specific systems. The foregoing results are discussed with respect to this supposed organisation.
Article
In the past two decades, attention has been one of the most investigated areas of research in perception and cognition. The Psychology of Attention presents a systematic review of the main lines of research on attention; the topics range from perception of threshold stimuli to memory storage and decisionmaking. The book develops empirical generalizations about the major issues and suggests possible underlying theoretical principles. Harold E. Pashler argues that widely assumed notions of processing resources and automaticity are of limited value in understanding human information processing. He proposes a central bottleneck for decisionmaking and memory retrieval, and describes evidence that distinguishes this limitation from perceptual limitations and limited-capacity short-term memory.