Article

Novel sonification designs: Compressed, iconic, and pitch-dynamic auditory icons boost driving behavior


Abstract

With the development of connected vehicles, in-vehicle auditory alerts enable drivers to avoid hazards effectively by presenting critical information quickly and in advance. Auditory icons can be understood quickly and evoke a better user experience; however, their design and application as collision warnings still need further exploration. This study therefore investigated the effects of internal semantic mapping and external acoustic characteristics (compression and dynamics design) on driver performance and subjective experience. Thirty-two participants (17 female) experienced 15 warnings in a driving simulator, crossing 3 dynamics mappings (0 vs. 1 vs. 2) with 5 warning types (original iconic vs. original metaphorical vs. compressed iconic vs. compressed metaphorical auditory icon vs. earcon). We found that the compression design supported rapid risk avoidance and was most effective in iconic and highly pitch-dynamic sounds. This study provides additional ideas and principles for the design of auditory icon warnings.
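The 3 × 5 factorial structure described above can be enumerated in a short sketch. The level labels below are paraphrased from the abstract and are not the authors' exact identifiers:

```python
from itertools import product

# Paraphrased factor levels; the authors' exact labels may differ.
dynamics = ["mapping-0", "mapping-1", "mapping-2"]
warning_types = [
    "original iconic", "original metaphorical",
    "compressed iconic", "compressed metaphorical", "earcon",
]

# The full crossing yields the 15 warning conditions each participant experienced.
conditions = list(product(dynamics, warning_types))
print(len(conditions))  # 15
```

Each participant hearing every cell of this crossing is what makes the design fully within-subjects for the warning factor.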


... Auditory warnings in vehicles can be mainly categorized into speech-based and non-speech-based warnings (Noyes et al., 2006). Speech-based warnings mainly include speech and spearcon warnings (Song et al., 2022). Speech warnings contain specific semantic information that alerts drivers of driving-related events. ...
... This characteristic is important in in-vehicle interface design, where a late response may result in a serious accident. Therefore, in the manual driving field, several studies have proposed the use of auditory icons as in-vehicle warnings (Bakowski et al., 2015; Larsson et al., 2009; McKeown and Isherwood, 2007; Song et al., 2022). Nevertheless, few studies have investigated the effectiveness of using auditory icons as TORs in automated vehicles. ...
... They found that spearcons lead to a faster response time than speech. McKeown and Isherwood (2007) and Song et al. (2022) also compared auditory icons with earcons in the vehicle warning system, and their results showed that auditory icons result in a faster response time and better driving performance than earcons. Hence, auditory icons and spearcons may have great application potential as TORs. ...
Article
With the era of automated driving approaching, designing an effective auditory takeover request (TOR) is critical to ensure automated driving safety. The present study investigated the effects of speech-based (speech and spearcon) and non-speech-based (earcon and auditory icon) TORs on takeover performance and subjective preferences. The potential impact of the non-driving-related task (NDRT) modality on auditory TORs was considered. Thirty-two participants were recruited in the present study and assigned to two groups, with one group performing the visual N-back task and another performing the auditory N-back task during automated driving. They were required to complete four simulated driving blocks corresponding to four auditory TOR types. The earcon TOR was found to be the most suitable for alerting drivers to return to the control loop because of its advantageous takeover time, lane change time, and minimum time to collision. Although participants preferred the speech TOR, it led to relatively poor takeover performance. In addition, the auditory NDRT was found to have a detrimental impact on auditory TORs. When drivers were engaged in the auditory NDRT, the takeover time and lane change time advantages of earcon TORs no longer existed. These findings highlight the importance of considering the influence of auditory NDRTs when designing an auditory takeover interface. The present study also has some practical implications for researchers and designers when designing an auditory takeover system in automated vehicles.
Article
Full-text available
Auditory alarms in hospitals are ambiguous and do not provide enough information to support doctors and nurses' awareness of patient events. A potential alternative is the use of short segments of time-compressed speech, or spearcons. However, sometimes it might be desirable for patients to understand spearcons and sometimes not. We used reverse hierarchy theory to hypothesize that there will be a degree of compression where spearcons are intelligible for trained listeners but not for untrained listeners. In Experiment 1, spearcons were compressed to either 20% or 25% of their original duration. Their intelligibility was very high for trained participants, but also quite high for untrained participants. In Experiment 2 each word within each spearcon was compressed to a different degree based on the results of Experiment 1. This technique was effective in creating the desired difference in spearcon intelligibility between trained and untrained listeners. An implication of these results is that manipulating the degree of compression of spearcons "by word" can increase the effect of training so that untrained listeners reliably do not understand the content of the spearcons. (PsycInfo Database Record (c) 2020 APA, all rights reserved).
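The per-word compression technique from Experiment 2 can be sketched as follows. This is a naive resampling illustration with made-up word lengths and the 20%/25% ratios from Experiment 1; production spearcons use pitch-preserving time-scale modification rather than plain resampling:

```python
import numpy as np

def compress(signal: np.ndarray, ratio: float) -> np.ndarray:
    """Naively shorten a waveform to `ratio` of its original duration by
    linear resampling. Note this also raises pitch; real spearcon pipelines
    use pitch-preserving time-scale modification instead."""
    n_out = max(1, round(len(signal) * ratio))
    src = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(src, np.arange(len(signal)), signal)

# Hypothetical "words": 1.0 s and 0.5 s of audio at 16 kHz.
sr = 16_000
rng = np.random.default_rng(0)
words = [rng.standard_normal(sr), rng.standard_normal(sr // 2)]

# Each word gets its own ratio, then the words are re-concatenated.
spearcon = np.concatenate([compress(w, r) for w, r in zip(words, [0.20, 0.25])])
print(len(spearcon) / sr)  # 0.325 s total
```

Compressing each word to a different degree, as the experiment did, is just a matter of varying the per-word ratio list.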
Article
Full-text available
Objective The effectiveness of three types of in-vehicle warnings was assessed in a driving simulator across different noise conditions. Background Although there has been much research comparing different types of warnings in auditory displays and interfaces, many of these investigations have been conducted in quiet laboratory environments with little to no consideration of background noise. Furthermore, the suitability of some auditory warning types, such as spearcons, as car warnings has not been investigated. Method Two experiments were conducted to assess the effectiveness of three auditory warnings (spearcons, text-to-speech, auditory icons) with different types of background noise while participants performed a simulated driving task. Results Our results showed that both the nature of the background noise and the type of auditory warning influenced warning recognition accuracy and reaction time. Spearcons outperformed text-to-speech warnings in relatively quiet environments, such as in the baseline noise condition where no music or talk-radio was played. However, spearcons were not better than text-to-speech warnings with other background noises. Similarly, the effectiveness of auditory icons as warnings fluctuated across background noise, but, overall, auditory icons were the least efficient of the three warning types. Conclusion Our results supported that background noise can have an idiosyncratic effect on a warning’s effectiveness and illuminated the need for future research into ameliorating the effects of background noise. Application This research can be applied to better present warnings based on the anticipated auditory environment in which they will be communicated.
Article
Full-text available
Recent studies have shown that a similarity between sound and meaning of a word (i.e., iconicity) can help more readily access the meaning of that word, but the neural mechanisms underlying this beneficial role of iconicity in semantic processing remain largely unknown. In an fMRI study, we focused on the affective domain and examined whether affective iconic words (e.g., high arousal in both sound and meaning) activate additional brain regions that integrate emotional information from different domains (i.e., sound and meaning). In line with our hypothesis, affective iconic words, compared to their non‐iconic counterparts, elicited additional BOLD responses in the left amygdala known for its role in multimodal representation of emotions. Functional connectivity analyses revealed that the observed amygdalar activity was modulated by an interaction of iconic condition and activations in two hubs representative for processing sound (left superior temporal gyrus) and meaning (left inferior frontal gyrus) of words. These results provide a neural explanation for the facilitative role of iconicity in language processing and indicate that language users are sensitive to the interaction between sound and meaning aspect of words, suggesting the existence of iconicity as a general property of human language.
Conference Paper
Full-text available
The paper discusses a design challenge around the use of adaptive audio to support the experience and uptake of autonomous driving. It outlines a collaboration currently being established between researchers at Swansea University and a major OEM to examine user-centred approaches to designing audio that enhances and enriches the human experience of driving. The paper describes how we will address the challenge of designing adaptive audio for unsupervised/autonomous driving, outlines the research question we will address, and explains how we will apply a tool/method that supports rapid prototyping for novice designers, alongside addressing ideas around interface aesthetics and the relationship between sound as a means of communication and sound as experience.
Article
Full-text available
This paper discusses visual metaphors and aspects of similarity in relation to metaphors. The concept of metaphor should here be understood as a semiotic unit that is also a sign (cf. Ricœur, P. 1986. The Rule of Metaphor: Multi-Disciplinary Studies of the Creation of Meaning in Language . London: Routledge and Kegan Paul.). This implies that not all semiotic units are signs, but also that not all signs are typical metaphors. The metaphor is a particular kind of sign because of its making use of the openness present in similarity relations. Metaphorical meaning making is related to a quality of vagueness in iconic sign relations. Furthermore, a notion of iconic attitude is proposed as a designation of subjective and intersubjective perspectives that might be taken on meanings founded on similarity. The iconic attitude mirrors the flexibility of thought and responds to the potentiality of vagueness in iconic sign relations; but, at the same time, the iconic attitude works as a stabilizing factor for meaning. Moreover, this attitude is crucial for the specification of the similarity relation in an actual sign experience with an iconic ground.
Article
Full-text available
The measurement of cognitive resource allocation during listening, or listening effort, provides valuable insight into the factors influencing auditory processing. In recent years, many studies inside and outside the field of hearing science have measured the pupil response evoked by auditory stimuli. The aim of the current review was to provide an exhaustive overview of these studies. The 146 studies included in this review originated from multiple domains, including hearing science and linguistics, but the review also covers research into motivation, memory, and emotion. The present review provides a unique overview of these studies and is organized according to the components of the Framework for Understanding Effortful Listening. A summary table presents the sample characteristics, an outline of the study design, the stimuli, the pupil parameters analyzed, and the main findings of each study. The results indicate that the pupil response is sensitive to various task manipulations as well as to interindividual differences. Many of the findings have been replicated. Frequent interactions between the independent factors affecting the pupil response have been reported, which indicates complex processes underlying cognitive resource allocation. This complexity should be taken into account in future studies, which should focus more on interindividual differences and also include older participants. This review facilitates the careful design of new studies by indicating the factors that should be controlled for. In conclusion, measuring the pupil dilation response to auditory stimuli has been demonstrated to be a sensitive method applicable to numerous research questions. The sensitivity of the measure calls for carefully designed stimuli.
Article
Full-text available
The cooperative adaptive cruise control (CACC) aims to achieve active safe driving that avoids vehicle accidents and traffic jams by exchanging road traffic information (e.g., traffic flow, traffic density, velocity variation) among neighboring vehicles. However, in CACC a butterfly effect arises when vehicles brake asynchronously, which easily produces backward shockwaves that are difficult to remove. Driving stability is thus degraded significantly by backward shockwaves, which impair safe driving performance in CACC. Several critical issues should be addressed in CACC, including: (1) the difficulty of adaptively controlling inter-vehicle distances and vehicle speed among neighboring vehicles, (2) the butterfly effect, and (3) unstable vehicle traffic flow. To address these issues, this paper proposes the cooperative adaptive driving (CAD) approach, which comprises three contributions: cooperative vehicle platooning (CVP), shockwave-avoidance driving (SAD), and adaptive platoon synchronization (APS). First, CVP provides platoon-based cooperative driving among neighboring vehicles. Second, SAD uses predictive shockwave detection to avoid shockwaves efficiently. Third, based on traffic state, APS determines the adaptive platoon length and velocity to achieve synchronous control and reduce the butterfly effect when vehicles brake suddenly. Numerical results demonstrate that the proposed CAD approach outperforms the compared approaches in the number of shockwaves, the average affected range of shockwaves, average vehicle velocity, and average travel time. Additionally, the adaptive platoon length is determined according to traffic information gathered from the global and local clouds.
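The abstract does not give the CAD equations. As background, platoon spacing controllers in the CACC literature are often built on a constant time-headway policy; the sketch below shows that generic textbook form, not the paper's CVP/SAD/APS algorithms, and the gains are purely illustrative:

```python
def headway_accel(gap_m, own_v, lead_v, t_headway=1.2, standstill=2.0,
                  k_gap=0.23, k_vel=0.07):
    """Constant time-headway spacing policy (generic CACC form, not the
    paper's CAD algorithm). Returns a commanded acceleration (m/s^2):
    positive when the gap to the leader exceeds the desired gap."""
    desired_gap = standstill + t_headway * own_v   # grows with own speed
    gap_error = gap_m - desired_gap
    vel_error = lead_v - own_v                     # closing-speed term
    return k_gap * gap_error + k_vel * vel_error

# Follower 30 m behind a leader, both at 20 m/s: the desired gap is
# 2 + 1.2 * 20 = 26 m, so the controller commands a mild acceleration.
print(headway_accel(30.0, 20.0, 20.0))
```

Shockwave damping in such schemes comes from the velocity-error term: a follower starts slowing as soon as the leader does, rather than waiting for the gap to shrink.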
Article
Full-text available
The aim of this study was to explore operator experience and performance for semantically congruent and incongruent auditory icons and abstract alarm sounds. It was expected that performance advantages for congruent sounds would be present initially but would reduce over time for both alarm types. Twenty-four participants (12M/12F) were placed into auditory icon or abstract alarm groupings. For each group, both congruent and incongruent alarms were used to represent different driving task scenarios. Once an alarm sounded, participants were required to respond by selecting the corresponding driving scenario. User performance for all sound types improved over time; however, even with experience, a decrement remained in speed of response for the incongruent iconic sounds and in accuracy of performance for the abstract warning sounds when compared with the congruent auditory icons. Semantic congruency was found to be more important for auditory icons than for abstract sounds. Practitioner summary: Alarms are used in many operating systems, for instance as emergency, alerting, or continuous monitoring signals. This study found that the type and representativeness of an auditory warning influence operator performance over time. Semantically congruent iconic sounds produced performance advantages over both incongruent iconic sounds and abstract warnings.
Article
Full-text available
Objective: Auditory displays could be essential to helping drivers maintain situation awareness in autonomous vehicles, but to date, few or no studies have examined the effectiveness of different types of auditory displays for this application scenario. Background: Recent advances in the development of autonomous vehicles (i.e., self-driving cars) have suggested that widespread automation of driving may be tenable in the near future. Drivers may be required to monitor the status of automation programs and vehicle conditions as they engage in secondary leisure or work tasks (entertainment, communication, etc.) in autonomous vehicles. Method: An experiment compared memory for alerted events (a component of Level 1 situation awareness) using speech alerts, auditory icons, and a visual control condition during a video-simulated self-driving car ride with a visual secondary task. The alerts gave information about the vehicle's operating status and the driving scenario. Results: Speech alerts resulted in better memory for alerted events. Both auditory display types resulted in less perceived effort devoted toward the study tasks but also greater perceived annoyance with the alerts. Conclusion: Speech auditory displays promoted Level 1 situation awareness during a simulation of a ride in a self-driving vehicle under routine conditions, but annoyance remains a concern with auditory displays. Application: Speech auditory displays showed promise as a means of increasing Level 1 situation awareness of routine scenarios during an autonomous vehicle ride with an unrelated secondary task.
Conference Paper
Full-text available
As automated vehicles currently do not provide sufficient feedback relating to the primary driving task, drivers have no assurance that an automated vehicle has understood and can cope with upcoming traffic situations [16]. To address this we conducted two user evaluations to investigate auditory displays in automated vehicles using different types of sound cues related to the primary driving sounds: acceleration, deceleration/braking, gear changing and indicating. Our first study compared earcons, speech and auditory icons with existing vehicle sounds. Our findings suggested that earcons were an effective alternative to existing vehicle sounds for presenting information related to the primary driving task. Based on these findings a second study was conducted to further investigate earcons modulated by different sonic parameters to present primary driving sounds. We discovered that earcons containing naturally mapped sonic parameters such as pitch and timbre were as effective as existing sounds in a simulated automated vehicle.
Article
Full-text available
Forward Collision Warning (FCW) systems are intended to alert drivers to an imminent forward collision threat to reduce the frequency and severity of rear-end collisions and mitigate injury and property damage for occupants of both vehicles. This between-subjects driving simulator study examined the effects of three variables – FCW system training, auditory warning type, and gender – on a visually distracted driver's response to an unexpected, imminent, forward collision threat. First, only half of the participants were provided with a brief description of the FCW system. Second, four different auditory warning conditions were compared: no-auditory warning (baseline), average-urgency warning, highest-urgency warning, and an auditory icon car horn. Drivers who received FCW system training had faster reaction times and 68% fewer collisions than drivers who did not receive training. While the highest-urgency warning produced the fastest initial glance to the forward scene, it was the car horn warning that produced robust reaction time, glance behavior, and collision benefits. Unexpectedly, 34.7% of distracted drivers glanced back to the center console display, following their initial forward glance, prior to braking. This unanticipated behavior resulted in longer reaction times, shorter minimum time-to-collision, and a 63.7% increase in collisions as compared with drivers who did not glance back to the console display. Results suggest that even brief system training can aid interaction with an FCW system and glance behavior, reaction times can improve with an auditory icon car horn as the auditory warning, and that uninterrupted forward attention upon detection of a collision threat is imperative for an effective collision avoidance response.
Conference Paper
Full-text available
In previous work, we discussed reducing accident levels by improving the driver's perception of the environment through multisensory interaction [27]. Because vision is the primary sense used during driving, it carries a large cognitive overhead and therefore leaves room for increased human error. For this reason, the use of alternative communication channels has been proposed, adding the sense of hearing to the vehicular interface. Using audition as a complementary interface involves several issues that have been identified, such as rhythm and intensity. The research described in this paper instead aims, first, to contribute to reducing accidents caused by speeding and, second, to use multisensory information to help the driver maintain more regular driving and controlled speed. It is a system for conscientious users, who are given the choice of establishing their own limits, using their goals and needs as reference. Based on sound attributes, auditory communication, and the aim of helping the driver maintain a more regular speed for greater safety, a prototype system was developed. As the research methodology, tests were conducted in a driving simulator to evaluate the efficiency of the system, including users' preferences and comfort. From the data obtained in the simulator, we observed the variation of average speed under the influence of time pressure. The questionnaire indicated no discomfort in using the auditory icons, which also helped maintain greater concentration on the road compared with using the speedometer alone. Tests indicated that trip duration as well as the dynamics of the landscape are important variables.
Article
Full-text available
Objective: Driver distraction and inattention are the main causes of accidents. The fact that devices such as navigation displays and media players are part of the distraction problem has led to the formulation of guidelines advocating various means for minimizing the visual distraction from such interfaces. However, even when design guidelines and recommendations are followed, certain interface interactions, such as menu browsing, still require off-road visual attention that increases crash risk. In this article, we investigate whether adding sound to an in-vehicle user interface can provide the support necessary to create a significant reduction in glances toward a visual display when browsing menus. Methods: Two sound concepts were developed and studied: spearcons (time-compressed speech sounds) and earcons (musical sounds). A simulator study was conducted in which 14 participants between the ages of 36 and 59 took part. Participants performed 6 different interface tasks while driving along a highway route. A 3 × 6 within-group factorial design was employed with sound (no sound/earcons/spearcons) and task (6 different task types) as factors. Eye glances and corresponding measures were recorded using a head-mounted eye tracker. Participants' self-assessed driving performance was also collected after each task with a 10-point scale ranging from 1 = very bad to 10 = very good. Separate analyses of variance (ANOVAs) were conducted for the different eye glance measures and self-rated driving performance. Results: The added spearcon sounds significantly reduced total glance time as well as number of glances while retaining task time, compared with the baseline (no sound) condition (total glance time M = 4.15 for spearcons vs. M = 7.56 for baseline, p = .03). The earcon sounds did not produce such distraction-reducing effects.
Furthermore, participants' ratings of their driving performance were statistically significantly higher in the spearcon conditions than in the baseline and earcon conditions (M = 7.08 vs. M = 6.05 and M = 5.99, respectively; p = .035 and p = .002). Conclusions: The spearcon sounds seem to efficiently reduce visual distraction, whereas the earcon sounds did not reduce distraction measures or increase subjective driving performance. An aspect that must be further investigated is how well spearcons and other types of auditory displays are accepted by drivers in general and how they work in real traffic.
Article
Full-text available
The work demonstrates for the first time a thermally regenerated grating (RG) operating at ultra-high temperatures up to 1400°C. A new class of photosensitive optical fiber based on erbium-doped yttrium-stabilized zirconia-calcium-alumina-phospho silica (Er-YZCAPS) glass is fabricated using the modified chemical vapor deposition (MCVD) process, followed by the solution doping technique and conventional fiber drawing. A type-I seed grating inscribed in this fiber is thermally regenerated using the conventional thermal annealing technique. The investigation indicates that the produced RG can sustain ultra-high temperatures up to 1400°C. The measured temperature sensitivities are 14.1 and 15.1 pm/°C for the temperature ranges 25°C–1000°C and 1000°C–1400°C, respectively.
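The two reported sensitivities define a piecewise-linear calibration from temperature to Bragg wavelength shift. A quick sketch of that arithmetic follows; the breakpoint handling and the 25°C reference are assumptions for illustration, not taken from the paper:

```python
def bragg_shift_pm(t_c: float) -> float:
    """Piecewise-linear wavelength shift (pm) relative to 25 degC, using the
    reported sensitivities: 14.1 pm/degC (25-1000 degC) and
    15.1 pm/degC (1000-1400 degC). Breakpoint handling is an assumption."""
    if t_c <= 1000.0:
        return 14.1 * (t_c - 25.0)
    return 14.1 * (1000.0 - 25.0) + 15.1 * (t_c - 1000.0)

# Predicted total shift at the 1400 degC extreme, converted to nm.
print(bragg_shift_pm(1400.0) / 1000.0)  # about 19.8 nm
```

Interrogators typically invert this mapping, recovering temperature from the measured wavelength shift.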
Conference Paper
Full-text available
Previous studies have evaluated Audio, Visual and Tactile warnings for drivers, highlighting the importance of conveying the appropriate level of urgency through the signals. However, these modalities have never been combined exhaustively with different urgency levels and tested while using a driving simulator. This paper describes two experiments investigating all multimodal combinations of such warnings along three different levels of designed urgency. The warnings were first evaluated in terms of perceived urgency and perceived annoyance in the context of a driving simulator. The results showed that the perceived urgency matched the designed urgency of the warnings. More urgent warnings were also rated as more annoying but the effect of annoyance was lower compared to urgency. The warnings were then tested for recognition time when presented during a simulated driving task. It was found that warnings of high urgency induced quicker and more accurate responses than warnings of medium and of low urgency. In both studies, the number of modalities used in warnings (one, two or three) affected both subjective and objective responses. More modalities led to higher ratings of urgency and annoyance, with annoyance having a lower effect compared to urgency. More modalities also led to quicker responses. These results provide implications for multimodal warning design and reveal how modalities and modality combinations can influence participant responses during a simulated driving task.
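Designed urgency in auditory warnings is commonly manipulated through pulse rate, inter-pulse interval, and fundamental frequency. The sketch below builds two beep trains at different designed-urgency levels; the parameter values are illustrative and are not the ones used in the paper:

```python
import numpy as np

def pulse_train(n_pulses, pulse_s, gap_s, freq_hz, sr=16_000):
    """Concatenate sine-tone beeps separated by silence (mono, 16 kHz)."""
    t = np.arange(round(pulse_s * sr)) / sr
    beep = np.sin(2 * np.pi * freq_hz * t)
    unit = np.concatenate([beep, np.zeros(round(gap_s * sr))])
    return np.tile(unit, n_pulses)

# Faster pulse rate, shorter gaps, and higher frequency are the usual levers
# for raising designed urgency (values here are hypothetical):
low_urgency = pulse_train(3, 0.20, 0.30, 440)    # slow, low-pitched
high_urgency = pulse_train(6, 0.10, 0.05, 1760)  # fast, high-pitched
```

For multimodal warnings of the kind studied here, matched tactile or visual pulse patterns would be generated from the same timing parameters so urgency is consistent across modalities.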
Conference Paper
Full-text available
For decades, auditory menus using both speech (usually text-to-speech, TTS) and non-speech sounds have been extensively studied. Researchers have developed situation-optimized auditory menus involving such cues as auditory icons, earcons, spearcons, and spindex. Spearcons have generally outperformed other cues in terms of providing both contextual information and item-specific information. However, little research has been devoted to exploration of spearcons in languages other than English, or the use of spearcon-only auditory menus. In this study, we evaluated the use of spearcons in Korean menus, as well as the use of spearcons alone. Twenty-five native Korean speakers navigated through a two-dimensional auditory menu presented via TTS, with or without spearcon enhancements. Korean spearcons were successful. Participants also rated the spearcon-enhanced menu as seeming speedier and more fun than the TTS-only menu. After a short learning period, mean time-to-target in the auditory menu was even faster with spearcons alone, compared to traditional TTS-only menus.
Article
When more than one audible alarm is heard simultaneously, discrimination may be compromised. This experiment compares near-simultaneous clinical alarms in two styles, the first are the tonal ‘melodies’ from the 2012/2006 version of a global medical device safety standard (IEC 60601-1-8) and the second are the auditory-icon-style recommended in the 2020 version of the same standard. Sixty-six participants were required to identify the meaning and priority of four different clinical alarms for one of the two styles of alarm (between-subjects). Alarms sounded both singly and in pairs (within-subjects). Results showed that the auditory icon alarms outperformed the tonal alarms on all measures except one, both for overall accuracy (recognizing both priority and function) and for partial accuracy (recognizing priority or function but not both). The results add to the growing body of evidence supporting the use of auditory icon alarms in clinical environments.
Chapter
Drivers often rely on navigation systems and traffic alerts to anticipate road events ahead, such as obstacles, accidents, and roadworks. We designed simple road situation alerts using visual, speech, and auditory modalities to warn drivers about upcoming road events. The prototype was tested with a driving simulator and evaluated with elderly drivers. We evaluated drivers' subjective trust, cognitive workload, and situational awareness in three experimental conditions, and also collected electrocardiograms to measure workload and stress in response to the stimuli. Results show that visual warnings were difficult to notice and were distracting. The combination of speech and sound produced the lowest cognitive load and the highest trust while maintaining the highest situational awareness. Both speech and visual warnings reduced distrust compared with the baseline. Weather did not affect any of the subjective measures. The physiological analysis showed that visual warnings induce lower stress than speech warning alerts. Speech alerts enabled the highest situational awareness.
Article
In semi-automated vehicles, non-speech sounds have been prevalently used as auditory displays for control transitions since these sounds convey urgency well. However, there are no standard specifications for warning sounds, so diverse non-speech sounds are being employed. To shed light on this, the effects of different non-speech auditory warnings on driver performance were investigated and quantified through an experimental study and human performance modeling. Twenty-four young drivers drove in a driving simulator and experienced both handover and takeover transitions between manual and automated modes while performing a secondary task. Reaction times for handover and takeover, mental workload, and subjective responses were reported. Overall, a traditional warning sound with many repetitions and an indicator sound with decreasing polarity outperformed the alternatives and were preferred. Additionally, a mathematical model using the Queuing Network-Model Human Processor (QN-MHP) framework was applied to quantify the effects of auditory warnings' acoustic characteristics on drivers' reaction times in response to takeover request displays. The acoustic characteristics used in modeling included the fundamental frequency, the number of repetitions, and the range of dominant frequencies. The model was able to explain 99.7% of the experimental data with a root mean square error (RMSE) of 0.148. The present study can contribute to establishing standards and design guidelines for takeover request displays in semi-automated vehicles.
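The abstract reports two fit statistics for the QN-MHP-based model: variance explained (99.7%) and RMSE (0.148 s). Both are straightforward to compute; the reaction-time values below are made up for illustration and are not the study's data:

```python
import numpy as np

def fit_stats(observed, predicted):
    """Return (RMSE, R^2) for model predictions against observed data.
    R^2 is computed as 1 - SS_res / SS_tot, i.e., variance explained."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    resid = observed - predicted
    rmse = float(np.sqrt(np.mean(resid ** 2)))
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((observed - observed.mean()) ** 2))
    return rmse, 1.0 - ss_res / ss_tot

# Hypothetical takeover reaction times in seconds (illustrative only):
observed = [1.8, 2.1, 2.5, 3.0]
predicted = [1.9, 2.0, 2.6, 2.9]
rmse, r2 = fit_stats(observed, predicted)
print(rmse, r2)  # rmse ~ 0.1 s, r2 ~ 0.95 for these made-up values
```

A model like the one described would report these statistics over the per-condition mean reaction times rather than raw trials.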
Article
With the development of cutting-edge technology in the area of driving performance, driver warning systems based on head-up displays (HUD) are considered to have the potential to improve driving safety in the future. The location of HUD warning graphics is a vital component in ensuring that drivers obtain information immediately and avoid cognitive tunneling when encountering hazards; however, few studies have critically examined this. The present study investigated the advantages of HUD over the traditional head-down display (HDD) for presenting warning graphics in vehicles, and further explored the effect of HUD location based on comprehensive indicators, including behavioral performance, eye movement data, and subjective assessment. The results revealed that, compared with HDD, presenting warning graphics on a HUD could significantly improve driving performance and eye movement patterns, and HUD was the mode drivers preferred. Results also demonstrated that presenting HUD warning graphics at a location 8° below the sight line was associated with the worst driving performance, eye movement patterns, and subjective assessment. Other HUD locations were not associated with significant differences for most indicators. These findings offer reference implications for automobile designers as they construct and implement HUD warning systems.
Article
Objective The objective of this study was to assess the effects of different warning messages for an Intersection Movement Assist (IMA) based on drivers’ ability to avoid a potential safety hazard. Background An IMA system can detect hazards and warn drivers when it is unsafe to enter an intersection. The effects of different warning information conveyed by these systems are still unknown. Method A driving simulator study with 80 participants was conducted with a red light running (RLR) scenario using a 5 (warning) × 2 (training) between-subjects design. IMA warnings included the messages “Danger,” “Brake now,” “Vehicle on your left,” a beep, and no IMA warning. Training was provided to half of the participants. Analysis of variance and logistic regression models were used to examine differences in drivers’ avoidance behavior. Results The analyses showed that all tested warning messages can significantly enhance drivers’ avoidance performance. Significant differences were observed in crash occurrence, avoidance behavior (i.e., reaction time and speed change), and eye movements (i.e., fixation pattern and time to first fixation). The effects of training also differed depending on the warning message provided. Conclusion The “Brake now” message performed best in reducing crash involvement and prompted better avoidance performance. The “Danger” and “Vehicle on your left” messages improved drivers’ hazard detection ability. Training showed potential to enhance the effectiveness of non-speech warning messages. Application The findings of this study can help designers and engineers better design IMA warning messages for RLR scenarios.
Conference Paper
Visual metaphors are a creative technique used in print media to convey a message through images. This message is not said directly, but implied through symbols and how those symbols are juxtaposed in the image. The messages we see affect our thoughts and lives, and it is an open research challenge to get machines to automatically understand the implied messages in images. However, it is unclear how people process these images or to what degree they understand the meaning. We test several theories about how people interpret visual metaphors and find people can interpret the visual metaphor correctly without explanatory text with 41.3% accuracy. We provide evidence for four distinct types of errors people make in their interpretation, which speaks to the cognitive processes people use to infer the meaning. We also show that people's ability to interpret a visual message is not simply a function of image content but also of message familiarity. This implies that efforts to automatically understand visual images should take into account message familiarity.
Conference Paper
Take-over is one of the most crucial user interactions in semi-automated vehicles. To improve communication between driver and vehicle, research has been conducted on various take-over request displays, yet their potential has not been fully investigated. The present paper explored the effects of adding auditory displays to visual text. Earcon and speech showed the best performance and acceptance, and spearcon the worst. This study is expected to provide basic data and guidelines for future research and design practice.
Article
Auditory icons are short sound messages that convey information about an object, event or situation. Originally, auditory icons have been used in computer interfaces, but are nowadays found in many other fields. In this review article, an overview is given of the main theoretical ideas behind the use and design of auditory icons. We identified the most common fields in which auditory icons have been used, and analyzed their acoustic characteristics. The review shows that few studies have provided a precise description of the physical characteristics of the sounds in auditory icons, e.g., their intensity level, duration, and frequency range. To improve the validity and replicability of research on auditory icons, and their universal design, precise descriptions of acoustic characteristics should thus be provided.
Article
Train-vehicle collisions at highway-rail grade crossings (RR crossings) continue to be a major issue in the US and across the world. To prevent several decades of safety improvements from plateauing, experts are turning towards novel warning devices that can be applied to all crossings with minimal cost. One of the potential approaches is in-vehicle auditory alerts (IVAAs) which are implementable with today’s technology and could potentially complement the existing warnings in a cost effective manner. Study 1 collected subjective data on a pool of potential in-vehicle auditory alerts from 31 participants. Study 2 recruited 20 participants to drive in a medium fidelity driving simulator with and without IVAAs for RR crossings. Results suggest IVAAs inform and remind drivers of how to comply at RR crossings, and have a lasting effect on driver behavior after the IVAA is no longer presented. Compliance scores were highest among combination RR crossing visual warnings, such as crossbucks featuring STOP or YIELD signs. Compliance was lowest for crossbucks alone and active gates in the off position. IVAAs had the largest impact on compliance scores at crossbucks and gates. The discussion includes implications for designing IVAA systems for RR crossings, the limitations of the study, and the participants’ perception of the novel warning type.
Article
Objective: The aim was to compare the effectiveness of two auditory displays, implemented with spearcons (time-compressed speech), for monitoring multiple patients. Background: Sequences of sounds can convey information about patients' vital signs, such as oxygen saturation (SpO2) and heart rate (HR). We tested whether participants could monitor five patients using spearcon-based sound sequences. Method: A 2 × 3 within-subjects design was used. The first factor was interface, with two levels: the ALL interface used spearcons to convey vital signs for all five patients, whereas the ABN (abnormal) interface represented patients who had normal vital signs with a low-pitched single-tone sound and patients who had at least one abnormal vital sign with spearcons. The second factor was the number of patients who had at least one abnormal vital sign: there were one, two, or three such patients in each monitoring sequence. Participants were 40 nonclinicians. Results: Participants identified abnormal patients' SpO2 and HR levels and located abnormal patients in the sound sequence more accurately with the ABN interface than the ALL interface. Accuracy declined as the number of abnormal patients increased. Participants associated ABN with easier identification of vital signs, resulting in higher ratings of confidence and pleasantness compared with ALL. Conclusion: Sequences of spearcons may support effective eyes-free monitoring of multiple patients. Application: Sequences of spearcons may be useful in monitoring multiple patients and the underlying design principles may extend to monitoring in other domains such as industrial process control or control of multiple autonomous vehicles.
Article
Various advanced driver assistance systems (ADAS) have been developed to improve drivers' behavior and perceptual ability; however, whether these ADAS have any measurable effect on driving performance needs to be verified by field operational tests. The purpose of this study was to evaluate the effectiveness of ADAS for Chinese drivers, as well as any possible influences of roadway type, gender, and experience on driving performance, measured by several variables including longitudinal, lateral, and braking behavior. The ADAS used in this study was a Mobileye M630 with forward collision warning (FCW) and lane departure warning (LDW) functions. Thirty-two participants were recruited to drive a vehicle equipped with the Mobileye M630. Participants drove the same test route twice. The route consisted of a 12 km urban road, a 34 km urban expressway, and a 45 km freeway, as well as a 14 km adaptation road. Vehicle dynamics, environmental information, and driving operational data were recorded via the CAN (Controller Area Network) bus and video cameras. The results show that ADAS significantly affects braking behavior. Braking time increased and relative speed decreased when drivers were exposed to ADAS. The ADAS also significantly affected several longitudinal behaviors, including longitudinal deceleration and time headway (THW). The occurrence of critically low THW decreased in the experiment. However, there was no significant effect on lateral behavior. Furthermore, driver acceptance of the FCW function was much higher than that of the LDW function, and acceptance on the expressway and freeway was much higher than on the urban road. The results also reveal significant influences of road type and experience on driving behaviors. These findings support policy development and technology improvements for the future development of ADAS.
Article
In this paper, we study the effects of acoustic characteristics of spoken disaster warnings in Japanese on listeners' perceived intelligibility, reliability, and urgency. Our findings are threefold: (a) For both speaking speed and f0, setting them to normal (compared with slow/fast (±20%) for speed, and with low/high (± up to 36 Hz) for f0) improved the average ratings of intelligibility and reliability. (b) For urgency only, setting speed faster (both slow to normal and normal to fast) or setting f0 higher (both low to normal and normal to high) improved the average rating. (c) For intelligibility, reliability, and urgency alike, the main effect of speaking speed was the most dominant. In particular, urgency could be influenced by the speed factor alone by up to 39%: by setting speed to fast (+20%), all other things being equal, the average perceived urgency rose to 4.0 on the 1–5 scale from 3.2 at normal speed. Based on these results, we argue that the speech rate may effectively be varied depending on the purpose of an evacuation call, that is, whether it prioritizes urgency, or intelligibility and reliability. Care should be taken regarding the possibility that respondent-specific variation and experimental conditions may interact with these results.
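The abstract reports two data points for the speed-urgency relationship: perceived urgency 3.2 at normal speed and 4.0 at +20% speed. As a minimal sketch, the helper below linearly interpolates between those two reported values; the linear form (and any extrapolation beyond the reported range) is an assumption for illustration only:

```python
def urgency_estimate(speed_change_pct: float) -> float:
    """Estimate perceived urgency (1-5 scale) from speaking-speed change.

    Anchored on the two values reported in the abstract:
      0% change  -> 3.2
      +20% change -> 4.0
    Linearity between (and beyond) these points is an assumption.
    """
    return 3.2 + (4.0 - 3.2) * (speed_change_pct / 20.0)

print(round(urgency_estimate(0), 2))   # normal speed
print(round(urgency_estimate(20), 2))  # +20% speed
print(round(urgency_estimate(10), 2))  # midpoint estimate, assuming linearity
```

A real urgency model would need more anchor points and likely a nonlinear form; this merely makes the abstract's reported effect size concrete.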
Article
Objective We compared the effectiveness of single-tone earcons versus spearcons in conveying information about two commonly monitored vital signs: oxygen saturation and heart rate. Background The uninformative nature of many medical alarms—and clinicians’ lack of response to alarms—is a widespread problem that can compromise patient safety. Auditory displays, such as earcons and spearcons (speech-based earcons), may help clinicians maintain awareness of patients’ well-being and reduce their reliance on alarms. Earcons are short abstract sounds whose properties represent different types and levels of information, whereas spearcons are time-compressed spoken phrases that directly state their meaning. Listeners might identify patient vital signs more accurately with spearcons than with earcons. Method In Experiment 1 we compared how accurately 40 nonclinician participants using either (a) single-tone earcons differentiated by timbre and tremolo or (b) Cantonese spearcons recorded using a female Cantonese voice could identify both oxygen saturation and heart rate levels. In Experiment 2 we tested the identification performance of six further nonclinician participants with spearcons recorded using a male Cantonese voice. Results In Experiment 1, participants using spearcons identified both vital signs together more accurately than did participants using earcons. Participants using Cantonese spearcons also learned faster, completed trials faster, identified individual vital signs more accurately, and felt greater ease and more confident when identifying oxygen saturation levels. Experiment 2 verified the previous findings with male-voice Cantonese spearcons. Conclusion Participants identified vital signs more accurately using spearcons than with the single-tone earcons. Application Spearcons may be useful for patient monitoring in situations in which intermittently presented information is desirable.
Conference Paper
Driver distraction and inattention are the main causes of accidents today, and one way for vehicle manufacturers to address this problem may be to replace or complement visual information in in-vehicle interfaces with auditory displays. In this paper, we address the specific problem of giving text input to an interface while driving. We test whether the handwriting input method, which has previously been shown to be promising in terms of reducing distraction, can be further improved by adding speech feedback. A driving simulator study was carried out in which 11 persons (3 female) drove in two different scenarios (curvy road and straight motorway) while performing three different handwriting text input tasks. Glance behavior was measured using a head-mounted eye tracker, and subjective responses were also acquired. An ANOVA revealed that speech feedback resulted in less distraction, as measured by total glance time, compared with the baseline condition (no speech). There were, however, also interaction effects indicating that the positive effect of speech feedback was not as prominent in the curvy road scenario. Post-experiment interviews nonetheless showed that the participants felt the speech feedback made the text input task safer, and that they preferred speech feedback over no speech.
Conference Paper
Within vehicle Human Machine Interface design, visual displays are predominant, taking up more and more of the visual channel for each new system added to the car, e.g. navigation systems, blind spot information and forward collision warnings. Sounds however, are mainly used to alert or warn drivers together with visual information. In this study we investigated the design of auditory displays for advisory information, by designing a 3D auditory advisory traffic information system (3DAATIS) which was evaluated in a drive simulator study with 30 participants. Our findings indicate that overall, drivers' performance and situation awareness improved when using this system. But, more importantly, the results also point towards the advantages and limitations of the use of advisory 3D-sounds in cars, e.g. attention capture vs. limited auditory resolution. These findings are discussed and expressed as design implications.
Article
Auditory warning devices used on typical industrial vehicles or material handling equipment are produced in quantity as standard items, affording the equipment purchaser little or no choice of signal characteristics. Industrial experience has indicated that masking of auditory warning signals by industrial background noise occurs frequently, often causing serious incidents or accidents. This paper presents a statistical model and a practical method whereby the effectiveness of auditory warning devices in industrial environments can be predicted from spectral analyses of the signal and the background in which it is to be used.
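The paper's actual statistical model is not reproduced in the abstract. As an illustrative sketch of the general band-by-band approach it describes, the following compares a warning's octave-band levels against background noise and flags the signal as detectable when it exceeds the masked threshold (approximated here as the noise level plus a fixed margin) in at least one band; all levels and the 13 dB margin are assumptions, not the paper's values:

```python
import numpy as np

# Invented octave-band levels (dB SPL) for a warning signal and a noisy
# factory background. The detectability margin is a common rule-of-thumb
# assumption, not taken from the paper.
octave_bands_hz = [250, 500, 1000, 2000, 4000]
signal_db = np.array([70.0, 78.0, 85.0, 80.0, 72.0])  # warning signal
noise_db = np.array([75.0, 74.0, 70.0, 66.0, 60.0])   # background noise

margin_db = 13.0  # assumed required excess above the masked threshold
excess = signal_db - (noise_db + margin_db)

# Detectable if any band clears the margin; report the strongest band
detectable = bool(np.any(excess > 0))
best_band = octave_bands_hz[int(np.argmax(excess))]
print(f"detectable={detectable}, strongest band={best_band} Hz")
```

With these made-up levels, only the mid-frequency bands clear the margin, which matches the practical advice to place warning energy where industrial noise is weakest.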
Article
When a highly automated car reaches its operational limits, it needs to provide a takeover request (TOR) in order for the driver to resume control. The aim of this simulator-based study was to investigate the effects of TOR modality and left/right directionality on drivers' steering behaviour when facing a head-on collision without having received specific instructions regarding the directional nature of the TORs. Twenty-four participants drove three sessions in a highly automated car, each session with a different TOR modality (auditory, vibrotactile, and auditory-vibrotactile). Six TORs were provided per session, warning the participants about a stationary vehicle that had to be avoided by changing lane left or right. Two TORs were issued from the left, two from the right, and two from both the left and the right (i.e., nondirectional). The auditory stimuli were presented via speakers in the simulator (left, right, or both), and the vibrotactile stimuli via a tactile seat (with tactors activated at the left side, right side, or both). The results showed that the multimodal TORs yielded statistically significantly faster steer-touch times than the unimodal vibrotactile TOR, while no statistically significant differences were observed for brake times and lane change times. The unimodal auditory TOR yielded relatively low self-reported usefulness and satisfaction ratings. Almost all drivers overtook the stationary vehicle on the left regardless of the directionality of the TOR, and a post-experiment questionnaire revealed that most participants had not realized that some of the TORs were directional. We conclude that between the three TOR modalities tested, the multimodal approach is preferred. Moreover, our results show that directional auditory and vibrotactile stimuli do not evoke a directional response in uninstructed drivers. More salient and semantically congruent cues, as well as explicit instructions, may be needed to guide a driver into a specific direction during a takeover scenario.
Article
The ability to detect changes is crucial for safe driving. Previous research has demonstrated that drivers often experience change blindness, which refers to failed or delayed change detection. The current study explored how susceptibility to change blindness varies as a function of the driving environment, the type of object changed, and the safety relevance of the change. Twenty-six fully licenced drivers completed a driving-related change detection task. Changes occurred to seven target objects (road signs, cars, motorcycles, traffic lights, pedestrians, animals, or roadside trees) across two environments (urban or rural). The contextual safety relevance of the change was systematically manipulated within each object category, ranging from high safety relevance (i.e., requiring a response by the driver) to low safety relevance (i.e., requiring no response). When viewing rural scenes, compared with urban scenes, participants were significantly faster and more accurate at detecting changes, and were less susceptible to “looked-but-failed-to-see” errors. Interestingly, safety relevance of the change differentially affected performance in urban and rural environments. In urban scenes, participants were more efficient at detecting changes with higher safety relevance, whereas in rural scenes safety relevance had marginal to no effect on change detection. Finally, even after accounting for safety relevance, change blindness varied significantly between target types. Overall, the results suggest that drivers are less susceptible to change blindness for objects that are likely to change or move (e.g., traffic lights vs. road signs), and for moving objects that pose greater danger (e.g., wild animals vs. pedestrians).
Article
Our ability to sustain attention for prolonged periods of time is limited. Studies on the relationship between lapses of attention and psychophysiological markers of attentional state, such as pupil diameter, have yielded contradicting results. Here, we investigated the relationship between tonic fluctuations in pupil diameter and performance on a demanding sustained attention task. We found robust linear relationships between baseline pupil diameter and several measures of task performance, suggesting that attentional lapses tended to occur when pupil diameter was small. However, these observations were primarily driven by the joint effects of time-on-task on baseline pupil diameter and task performance. The linear relationships disappeared when we statistically controlled for time-on-task effects and were replaced by consistent inverted U-shaped relationships between baseline pupil diameter and each of the task performance measures, such that most false alarms and the longest and most variable response times occurred when pupil diameter was both relatively small and large. Finally, we observed strong linear relationships between the temporal derivative of pupil diameter and task performance measures, which were largely independent of time-on-task. Our results help to reconcile contradicting findings in the literature on pupil-linked changes in attentional state, and are consistent with the adaptive gain theory of locus coeruleus-norepinephrine function. Moreover, they suggest that the derivative of baseline pupil diameter is a potentially useful psychophysiological marker that could be used in the on-line prediction and prevention of attentional lapses.
Article
Cooperative warning systems have a great potential to prevent traffic accidents. However, because of their predictive nature, they might also go along with an increased frequency of incorrect alarms that could limit their effectiveness. To better understand the consequences associated with incorrect alarms, a driving simulator study with N = 80 drivers was conducted to investigate how situational context and warning urgency jointly influence drivers’ compliance with an unreliable advisory warning system (AWS). The participants encountered several critical urban driving situations and were either assisted by a 100% reliable AWS, a 60% reliable AWS that generated false alarms (without obvious reason) or a 60% reliable AWS that generated unnecessary alarms (with plausible reason). A baseline drive without any assistance was also introduced to the study. The warnings were presented either only visually or visual-auditory. In line with previous research, drivers’ compliance and effectiveness of the AWS was reduced by false alarms but not by unnecessary alarms. However, this so-called cry wolf effect (Breznitz, 1984) was only found in the visual-auditory condition, whereas there was no effect of warning reliability in the condition with visual AWS. Furthermore, false but not unnecessary alarms caused the participants to rate the AWS less favourably during a follow-up interview. In spite of these negative effects of false alarms, a reduction in the frequency of safety-critical events (SCEs) and an earlier braking onset were evident in all assisted drives compared with that of non-assisted driving, even when the AWS was unreliable. The results may thus lower concerns about the negative consequences of warning drivers unnecessarily about upcoming traffic conflicts if the reasons of these alarms are comprehensible. From a perspective of designing AWS, we recommend to use less urgent warnings to prevent the cry wolf effect.
Chapter
Due to the mobile Internet revolution, people increasingly communicate via social networks and instant messaging applications on their smartphones. To stay “always connected,” they even use their smartphones while driving, which puts driver safety at risk. To reduce driver distraction, an intuitive speech interface that proactively presents incoming events to the driver needs to be developed. Before developing a new speech dialog system, developers have to examine the user’s preferred interaction style. This paper reports on a recent driving simulation study in which several speech-based proactive notification concepts for incoming events in different contextual situations were evaluated. Four different speech dialog concepts and two graphical user interface concepts, one including an avatar, were designed and evaluated for usability and driving performance. The results show significant differences between the speech dialog concepts. Informing the user verbally achieved the best usability result. Earcons were perceived to be the least distracting. The presence of an avatar was not accepted by the participants and led to impaired steering performance.
Conference Paper
Designing appropriate auditory warnings is a well-known challenge. The present work focuses on a new type of auditory warning for within-vehicle use, combining a signal that conveys urgency information with a signal that conveys more detailed information about the urgent event. In the study, three concepts of "combined warnings" are compared. The concepts differ in terms of the sound type used to convey event information. The results support the usefulness and potential of combined warnings. However, using information sounds that are too abstract can have a severe degrading effect on warning efficiency and cognitive effort. Interestingly, these abstract sounds may also negatively impact the user's ability to respond accurately to the urgency level of the warning.
Article
Two experiments are reported that investigate the effects of acoustics and semantics in verbal warnings. In the first experiment subjects rated the urgency of warning signal words spoken in different presentation styles (URGENT, NON-URGENT, MONOTONE). Significant differences in urgency ratings were found between presentation styles. Acoustic analysis revealed how acoustic parameters differed within these different presentation styles. These acoustic measurements were used to construct synthesised speech warnings that differed in urgency. They were rated in experiment 2 and the predicted differences between the urgency of the words were found. These studies indicate that urgency in natural speech is produced by alterations in a few acoustic parameters and that these alterations can easily be incorporated into synthetic speech to reproduce variations in urgency.
Conference Paper
The quantitative prediction and understanding of human performance in responses to speech warnings is essential to improving warning effectiveness. The Queuing Network-Model Human Processor (QN-MHP), as a computational architecture, enables researchers to model dual-task information processing. The current study extended QN-MHP by modelling the effects of loudness and semantics on human responses to speech warning messages. The model's predictions of crash rate were validated against two empirical studies of collision warning systems, with resulting R² values of 0.73 and 0.77, respectively. The developed mathematical model could be further utilized to optimize the design of speech warnings for maximal safety benefit.
Article
With shrinking displays and increasing technology use by visually impaired users, it is important to improve usability with non-GUI interfaces such as menus. Using non-speech sounds called earcons or auditory icons has been proposed to enhance menu navigation. We compared search time and accuracy of menu navigation using four types of auditory representations: speech only, hierarchical earcons, auditory icons, and a new type called spearcons. Spearcons are created by speeding up a spoken phrase until it is no longer recognized as speech. Using a within-subjects design, participants searched a 5 × 5 menu for target items using each type of audio cue. Spearcons and speech only both led to faster and more accurate menu navigation than auditory icons and hierarchical earcons. There was a significant practice effect for search time within each type of auditory cue. These results suggest that spearcons are more effective than previous auditory cues in menu-based interfaces, and may lead to better performance and accuracy, as well as more flexible menu structures.
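The spearcon technique above rests on time-compressing speech. Production pipelines use pitch-preserving time-scale modification (e.g., SOLA/WSOLA or a phase vocoder); as a minimal sketch of the idea only, the following naive overlap-add compressor shortens a signal by reading analysis frames farther apart than it writes them. All parameters are illustrative assumptions, and this simple version does not fully preserve pitch quality the way real spearcon generation would:

```python
import numpy as np

def time_compress(x: np.ndarray, rate: float = 2.5,
                  frame: int = 1024, hop_out: int = 256) -> np.ndarray:
    """Naive overlap-add time-scale compression (illustrative only)."""
    hop_in = int(hop_out * rate)  # read frames farther apart than we write them
    win = np.hanning(frame)
    n_frames = max(1, (len(x) - frame) // hop_in + 1)
    out_len = (n_frames - 1) * hop_out + frame
    y = np.zeros(out_len)
    norm = np.zeros(out_len)
    for i in range(n_frames):
        seg = x[i * hop_in : i * hop_in + frame]
        if len(seg) < frame:
            break
        y[i * hop_out : i * hop_out + frame] += seg * win
        norm[i * hop_out : i * hop_out + frame] += win
    norm[norm == 0] = 1.0  # avoid division by zero at the window edges
    return y / norm

# A 1 s, 440 Hz test tone at 16 kHz, compressed roughly 2.5x
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
short = time_compress(tone, rate=2.5)
print(round(len(short) / len(tone), 2))  # close to 1/2.5
```

For actual spearcons, a recorded phrase (not a tone) would be compressed until it is no longer recognized as speech, typically with a higher-quality algorithm than this sketch.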
Conference Paper
Pupil diameter is an important measure of cognitive load. However, pupil diameter is also influenced by the amount of light reaching the retina. In this study we explore the interaction between these two effects in a simulated driving environment. Our results indicate that it is possible to separate the effects of illumination and visual cognitive load on pupil diameter, at least in certain situations.
Article
Previous research has shown that urgent auditory warnings are likely to annoy drivers. Increased urgency could also raise drivers' stress levels, which in turn could impact their ability to detect and react to subsequent changes in the traffic environment. The authors conducted a simulator experiment with 24 truck drivers to investigate the potential of urgent alarms to raise annoyance and negatively affect drivers' subsequent responses to unrelated, critical events on the road. The drivers received two types of warnings that were designed to significantly differ in perceived urgency. Several times in the trial, an unexpected event occurred just seconds after drivers were presented with an unrelated warning, and the drivers had to brake immediately to avoid a collision. The results indicate that acoustic characteristics and semantic meaning may impact the perceived annoyance of in-vehicle warnings. Interestingly, the authors found a significant, negative correlation between the drivers' experience (years of truck driving experience) and the rated annoyance for both types of warnings. Also, the drivers who received the high-urgency warning braked significantly harder and tended to brake later than the drivers who received a low-urgency warning. These results have implications for ITS systems for heavy vehicles that intend to implement auditory warning signals.
Article
Speech reminders can severely disrupt list recall. Spearcons, time-compressed speech messages, might be less disruptive because they are much shorter. In this study, we asked 24 younger participants to recall 64 short lists of digit, animal, food, or furniture names. List items were presented one at a time; the number of items presented depended on individual digit spans. Spearcons affected list recall to the same extent as speech. However, people with higher digit spans had significantly worse recall. This could be due to short-term memory overload or the longer presentation time of long lists. We discuss implications for menu design.
Article
This article reviews the use and design of speech warnings in terms of ergonomic considerations. Firstly, it considers the benefits of using the auditory channel and technological approaches to producing artificial speech. Secondly, the characteristics of human and machine-generated speech are reviewed, the latter focusing on naturalness, intelligibility, rate of presentation, emotional content, and quality. Thirdly, non-speech and speech warnings and their potential uses are considered, together with the design of speech for warning applications. Given technological developments, greater use of the auditory channel for warnings is likely; taking human factors considerations into account should lead to better-designed warnings.