Figure 2 - uploaded by Busra Sarigul
Results of Condition 1 and Condition 2, and Comparison of Condition 1 and Condition 2.

Contexts in source publication

Context 1
... When the visual cue was Human, subjects performed similarly in congruent and incongruent conditions (t(16) = 0.58, p = 0.60) (see Figure 2). Comparison of Condition 1 and Condition 2: We investigated whether the human-likeness of the visual cue (Human, Android, Robot) affects how people categorize an auditory target, and whether it interacts with congruency. ...
Context 2
... A closer look at the pattern of results showed that the difference between the congruent and incongruent conditions was largest for Robot, followed by Android, followed by Human (See Figure 2, right). ...

Citations

... In addition, some participants suggested that a match between the anthropomorphism levels of the voice and verbal style would be important in improving the impressions of the robots. Prior studies revealed that the incongruent anthropomorphism level of the robots' appearance and voice elicited more eeriness and violated users' expectations [47,48]. Likewise, investigating whether the incongruent anthropomorphism level of the robot's voice and verbal style affects user experience might be interesting. ...
... Bayesian models are used to quantify human cognitive prediction processes, which can give credence to studies which link prediction errors to the occurrence of the uncanny valley effect. This has been particularly successful under conditions of mismatched voice and appearance [43] such as ours. ...
Article
In this paper, we investigate the effect of a realism mismatch in the voice and appearance of a photorealistic virtual character in both immersive and screen-mediated virtual contexts. While many studies have investigated voice attributes for robots, not much is known about the effect voice naturalness has on the perception of realistic virtual characters. We conducted the first experiment in Virtual Reality (VR) with over two hundred participants investigating the mismatch between realistic appearance and unrealistic voice on the feeling of presence, and the emotional response of the user to the character expressing a strong negative emotion. We predicted that the mismatched voice would lower social presence and cause users to have a negative emotional reaction and feelings of discomfort towards the character. We found that the concern for the virtual character was indeed altered by the unnatural voice, though interestingly it did not affect social presence. The second experiment was conducted with a view towards heightening the appearance realism of the same character for the same scenarios, with an additional lower level of voice realism employed to strengthen the mismatch of perceptual cues. While voice type did not appear to impact reports of empathic responses towards the character, there was an observed effect of voice realism on reported social presence, which was not detected in the first study. There were also significant results on affinity and voice trait measurements that provide evidence in support of perceptual mismatch theories of the Uncanny Valley.
... A higher level of humanoid is expected to carry a more sociable speech style, which promotes acceptance and preference. [29] also suggested that the congruent features of voice and appearance may promote efficiency in human-robot interaction. In addition, previous research has found that warmth and competence among all RoSAS and GodSpeed subscales are the two most crucial and decisive predictors for human preferences [30]. ...
... McGinn and Torre [114] found that participants were only able to match the voice of one robot, the PR2, to its body. Sarigul et al. [158] found that people were quicker to assign a robot voice to a robot image, rather than a human voice to a robot image. Gong and Lai [62] found that mixing a human voice with a TTS at the same time led to poorer performance, even though people thought that they had performed better and found that version of the system easier to use. ...
Article
Social robots, conversational agents, voice assistants, and other embodied AI are increasingly a feature of everyday life. What connects these various types of intelligent agents is their ability to interact with people through voice. Voice is becoming an essential modality of embodiment, communication, and interaction between computer-based agents and end-users. This survey presents a meta-synthesis on agent voice in the design and experience of agents from a human-centered perspective: voice-based human-agent interaction (vHAI). Findings emphasize the social role of voice in HAI as well as circumscribe a relationship between agent voice and body, corresponding to human models of social psychology and cognition. Additionally, changes in perceptions of and reactions to agent voice over time reveal a generational shift coinciding with the commercial proliferation of mobile voice assistants. The main contributions of this work are a vHAI classification framework for voice across various agent forms, contexts, and user groups, a critical analysis grounded in key theories, and an identification of future directions for the oncoming wave of vocal machines.
Article
Appearance and voice are essential factors impacting users' affective preferences for humanoid robots. However, little is known about how the appearance and voice of humanoid robots jointly influence users' affective preferences and visual attention. We conducted a mixed-design eye-tracking experiment to examine the multisensory integration effect of humanoid robot appearances and voices on users' affective preferences and visual attention. The results showed that the combinations of affectively preferred voices and appearances attracted stronger affective preferences and shorter average fixation durations. The combinations of non-preferred voices and preferred appearances attracted weaker affective preferences and longer fixation durations. The results suggest that congruent combinations of affectively preferred voices and appearances might produce a facilitation effect on users' affective preference and the depth of visual attention through audiovisual complements. Incongruent combinations of non-preferred voices and preferred appearances might produce an attenuation effect and result in weaker affective preferences and a deeper retrieval of visual information. In addition, the head attracted the most visual attention regardless of voice condition. This paper contributes to deepening the understanding of the multisensory integration effect on users' affective preferences and visual attention and provides practical implications for designing humanoid robots that satisfy users' affective preferences.
Conference Paper
Mind perception is considered to be the ability to attribute mental states to non-human beings. As social robots increasingly become part of our lives, one important question for HRI is to what extent we attribute mental states to these agents and the conditions under which we do so. In the present study, we investigated the effect of appearance and the type of action a robot performs on mind perception. Participants rated videos of two robots with different appearances (one metallic, the other human-like), each of which performed four different actions (manipulating an object, verbal communication, non-verbal communication, and an action that depicts a biological need) on Agency and Experience dimensions. Our results show that the type of action that the robot performs affects the Agency scores. When the robot performs human-specific actions such as communicative actions or an action that depicts a biological need, it is rated to have more agency than when it performs a manipulative action. On the other hand, the appearance of the robot did not have any effect on the Agency or the Experience scores. Overall, our study suggests that the behavioral skills we build into social robots could be quite important in determining the extent to which we attribute mental states to them.