The human face is the most natural interface for face-to-face communication, and the human form is the most effective design for traversing the human-made areas of the planet. Developing realistic humanoid robots (RHRs) with artificial intelligence (AI) therefore permits humans to interact with technology and integrate it into society in a naturalistic manner unmatched by any other form of
non-biological human emulation. However, RHRs have yet to attain a level of emulation indistinguishable from the human condition and fall into the uncanny valley (UV). The UV represents a dip in human perception, where affinity decreases as human likeness rises. According to established research into the UV, artificial eyes and mouths are the primary propagators of the uncanny valley effect (UVE) and reduce the human likeness of, and affinity towards, RHRs. In response, this thesis introduces, tests and comparatively assesses a pair of novel robotic eye prototypes with dilating pupils capable of simultaneously replicating the natural pupillary responses of the human eye to light and emotion. The robotic pupil systems act as visual signifiers of sentience and emotion to enhance eye-contact interfacing in human-robot interaction (HRI).
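The thesis does not reproduce the controller itself here, but as a minimal sketch of the idea, assuming a simple weighted blend of light-driven and emotion-driven dilation within the typical human pupil range of roughly 2–8 mm (all names, weights and mappings below are illustrative assumptions, not the published design):

```python
# Illustrative sketch only: a pupil controller blending light- and
# emotion-driven dilation. The 2-8 mm limits approximate the human pupil
# range; the log-lux mapping, blend weight and function names are
# assumptions for illustration.
import math

PUPIL_MIN_MM = 2.0  # typical human pupil diameter in bright light
PUPIL_MAX_MM = 8.0  # typical human pupil diameter in darkness

def light_dilation(lux: float) -> float:
    """Map ambient light (lux) to a 0..1 dilation factor (1 = dark)."""
    return min(1.0, max(0.0, 1.0 - math.log10(max(lux, 1.0)) / 5.0))

def emotion_dilation(arousal: float) -> float:
    """Map emotional arousal (0..1) to a 0..1 dilation factor."""
    return min(1.0, max(0.0, arousal))

def pupil_diameter_mm(lux: float, arousal: float,
                      emotion_weight: float = 0.3) -> float:
    """Blend both responses into a single target pupil diameter."""
    d = ((1.0 - emotion_weight) * light_dilation(lux)
         + emotion_weight * emotion_dilation(arousal))
    return PUPIL_MIN_MM + d * (PUPIL_MAX_MM - PUPIL_MIN_MM)

# Example: a dim room (50 lux) during a moderately arousing exchange.
print(f"target pupil diameter: {pupil_diameter_mm(50.0, 0.5):.2f} mm")
```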
Secondly, this study presents the design, development and application of a novel robotic mouth system with buccinator actuators and a custom machine learning (ML) speech-synthesis-to-mouth-articulation pipeline for forming complex lip shapes (visemes), emulating human mouth and lip patterns in response to vowel and consonant sounds. The robotic eyes and mouth were installed in two RHRs, named 'Euclid' and 'Baudi', and tested for accuracy and processing rate against a living human counterpart. The results of these experiments indicated that the robotic eyes dilate within the average pupil range of the human eye in response to light and emotion, and that the robotic mouth operated with an 86.7% accuracy rating when compared against the lip movement of a human mouth during verbal communication. An HRI experiment was conducted using the RHRs and biometric sensors to monitor the physiological responses of test subjects for cross-analysis with a questionnaire. The sample consisted of twenty individuals with experience in AI, robotics and related fields, who examined the authenticity, accuracy and functionality of the prototypes. The robotic mouth prototype achieved 20/20 for aesthetic and lip-synchronisation accuracy compared to a robotic mouth with the buccinator actuators deactivated, underlining the system's potential in future RHR design. However, to reduce influencing factors, test subjects were not informed of the dilating eye system, and only 2/20 test subjects noticed the pupil dilation sequences in response to emotive facial expressions (FEs) and light. Moreover, the eye-contact behaviours of the RHRs were more significant than pupil dilation, FEs and eye aesthetics during HRI, counter to previous research on the UVE in HRI.
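As a minimal sketch of how a frame-level figure such as 86.7% can be computed, assuming a simplified phoneme-to-viseme grouping and a per-frame label comparison against a human reference (the mapping table, function names and sample data below are illustrative assumptions, not the thesis's pipeline):

```python
# Illustrative sketch: score robotic lip movement against a human reference
# by comparing per-frame viseme labels. The grouping is a common simplified
# phoneme-to-viseme mapping, assumed here for illustration.

PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "ae": "open",
    "iy": "spread", "ih": "spread",
    "uw": "rounded", "ow": "rounded",
}

def viseme_accuracy(robot_frames: list[str], human_frames: list[str]) -> float:
    """Fraction of frames where robot and human viseme labels agree."""
    if not human_frames:
        raise ValueError("reference sequence is empty")
    matches = sum(r == h for r, h in zip(robot_frames, human_frames))
    return matches / len(human_frames)

# Example: 13 of 15 frames match, i.e. ~86.7%, the headline figure above.
human = ["bilabial"] * 5 + ["open"] * 5 + ["rounded"] * 5
robot = (["bilabial"] * 5 + ["open"] * 4 + ["spread"]
         + ["rounded"] * 4 + ["open"])
print(f"lip-sync accuracy: {viseme_accuracy(robot, human):.1%}")
```

A frame-level metric of this kind rewards both the correct lip shape and correct timing, since a viseme produced early or late fails the comparison on the affected frames.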
Finally, this study outlines a novel theoretical evaluation framework founded on the 1950 Turing Test (TT) for AI, named the Multimodal Turing Test (MTT), for evaluating human-likeness and interpersonal and intrapersonal intelligence in the design of RHRs and realistic virtual humanoids (RVHs) with embodied artificial intelligence (EAI). The MTT is significant for RHR development because current evaluation methods, such as the Total Turing Test (TTT), Truly Total Turing Test (TTTT), Robot Turing Test (RTT), Turing Handshake Test (THT), Handshake Turing Test (HTT) and TT, are not nuanced or comprehensive enough to evaluate the functions of an RHR/RVH simultaneously and pinpoint the causes of the UVE. Furthermore, unlike previous methods, the MTT provides engineers with a developmental framework for assessing degrees of human-likeness in RHR and RVH design towards more advanced and accurate modes of human emulation.
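Since the MTT is presented here as a theoretical framework, any implementation detail is speculative; purely to illustrate the simultaneous multimodal evaluation idea, a sketch might record independent per-modality judgements and flag the failing modalities as candidate causes of the UVE (the modality names and data structure below are assumptions):

```python
# Speculative sketch of the MTT's multimodal scoring idea: judge each
# modality independently, then report which modalities break human-likeness.
# Modality names and this structure are illustrative assumptions only.
from dataclasses import dataclass, field

@dataclass
class MTTResult:
    """Per-modality human-likeness judgements for one RHR/RVH evaluation."""
    scores: dict[str, bool] = field(default_factory=dict)

    def record(self, modality: str, passed: bool) -> None:
        self.scores[modality] = passed

    def failures(self) -> list[str]:
        """Modalities that failed: candidate causes of the UVE."""
        return [m for m, ok in self.scores.items() if not ok]

result = MTTResult()
result.record("eye_contact", True)
result.record("pupil_dilation", True)
result.record("lip_synchronisation", True)
result.record("skin_texture", False)
print("UVE candidates:", result.failures())  # -> ['skin_texture']
```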