Conference Paper

Design guidelines for hands-free speech interaction

Authors:
  • Christine Murad
  • Cosmin Munteanu
  • Leigh Clark
  • Benjamin R. Cowan

Abstract

As research on speech interfaces continues to grow within HCI, there is a need for design guidelines that address the usability and learnability issues found in hands-free speech interfaces. While several sets of established guidelines exist for GUIs, no equivalent set of principles exists for speech interfaces. This matters because speech interfaces are so widely used in mobile contexts, a field whose own design guidelines likewise evolved as it matured. We explore design guidelines for GUIs and analyze how they apply to speech interfaces. To this end, we identified 21 papers that reflect on the challenges of designing (predominantly mobile) voice interfaces. We present an investigation of how GUI design principles apply to such hands-free interfaces, and discuss how this can serve as the foundation for a taxonomy of design guidelines for hands-free speech interfaces.


... Along with this, we wanted to understand how designers map their experience and practices from general UI to VUI design. Previous papers show that, while GUI heuristics and patterns cannot be easily mapped to VUI design [37,44], existing GUI heuristics may prove to be a good base for developing new VUI-specific heuristics [23,26,42]. Some research encourages using heuristics like Nielsen's as a framework around which to base new VUI heuristics [36]. ...
... One area that has grown over the past 5 years is the development of design guidelines or heuristics for VUIs. Several sets of VUI heuristics have been developed in previous research [15,23,35,40,41] and have received varying levels of validation. Furthermore, books such as "Designing Voice User Interfaces: Principles of Conversational Experiences" [31] by Cathy Pearl have worked to develop a set of industry-adoptable guidelines and practices for voice user interfaces, alongside guidelines from companies like Amazon [6], Google [5], and Apple [7]. ...
... Even so, previous research shows that there is very little discussion and training on VUIs in current HCI curricula [22], making it difficult for designers to transition from the familiar space of designing general UIs to designing VUIs. We can see this in present VUI designs and their serious usability issues [10,11,23]. ...
... As Murad, Munteanu, Clark, and Cowan (2018) point out, feedback is of central importance in the design of IVAs. Audio-gamification can help to provide the user with the necessary feedback in a familiar way. ...
... At the same time, care must be taken not to overwhelm users with additional elements that could distract or bore them. Our study complies with design recommendations for non-complex interaction models (Murad et al., 2018), and our results show that audio-gamification can improve the experience with IVAs. Our results have implications for improving user motivation with IVAs, which may be helpful in many application domains. ...
... An important point in designing applications for IVAs is to make the menu navigation as efficient and straightforward as possible (Murad et al., 2018). Unfortunately, the integration of game design elements counteracts this principle and could therefore harm usability. ...
Article
Full-text available
Intelligent virtual assistants (IVAs) like Amazon Alexa or Google Assistant have become increasingly popular in recent years, and research into the topic is growing accordingly. A major challenge in designing IVA applications is making them appealing. Gamification as a concept might help boost motivation when using IVAs. Visual representation of progress and feedback is an essential component of gamification; when using IVAs, however, visual information is generally not available. To this end, this article reports the results of a lab experiment with 81 subjects describing how gamification, delivered entirely through audio, can help subjects work faster and improve their motivation. Game design elements such as points and levels are integrated within an Alexa Skill via audio output to motivate subjects to complete household tasks. The results show a substantial effect on the subjects: both their attitude and the processing time of the given tasks were positively influenced by the audio-gamification. The outcomes indicate that audio-gamification has huge potential in the field of voice assistants. Differences in experimental conditions were also considered, but no statistically significant difference was found between the cooperative and competitive groups. Finally, we discuss how these insights affect IVA design principles and future research questions.
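The abstract describes points and levels conveyed purely through audio output. A minimal sketch of how such spoken gamification feedback could be composed (an illustrative assumption, not the study's actual Alexa Skill; the point values, thresholds, and function names are invented):

```python
# Illustrative sketch of audio-gamification feedback for a voice assistant.
# All names and values are hypothetical; in a real Alexa Skill the returned
# string would become the speech output of an intent handler.

POINTS_PER_TASK = 10
LEVEL_THRESHOLD = 50  # points needed to advance one level

def complete_task(state, task_name):
    """Award points for a finished household task and report progress aloud."""
    state["points"] += POINTS_PER_TASK
    new_level = state["points"] // LEVEL_THRESHOLD + 1
    leveled_up = new_level > state["level"]
    state["level"] = new_level

    # With no screen available, progress must be spoken rather than shown.
    speech = f"Great, you finished {task_name} and earned {POINTS_PER_TASK} points."
    if leveled_up:
        speech += f" Congratulations, you reached level {new_level}!"
    else:
        remaining = LEVEL_THRESHOLD * new_level - state["points"]
        speech += f" You need {remaining} more points to reach level {new_level + 1}."
    return speech

state = {"points": 45, "level": 1}
print(complete_task(state, "doing the dishes"))
```

Keeping the feedback to one or two short sentences per turn reflects the minimal-dialog recommendations discussed elsewhere on this page.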
... Customization and personalization features have been shown to enhance user satisfaction and improve performance [18,20,21,40,42]. Customized features are those among which the user explicitly selects from specific options, whereas personalized features are driven by the computer based on users' individual needs [44]. ...
... Providing users with control over interaction can improve performance and user satisfaction [58]. A large body of research has highlighted the importance of customization in VAs, particularly for people with special needs [1,2,40,42,58]. Molnar and Kletke [40] emphasize that the lack of flexibility reduces productivity and satisfaction. Murad et al. [42] identified lack of control as a common cause of user frustration and highlighted the need for user control and freedom in speech interfaces. ...
... Molnar and Kletke [40] emphasize that the lack of flexibility reduces productivity and satisfaction. Murad et al. [42] identified lack of control as a common cause of user frustration and highlighted the need for user control and freedom in speech interfaces. Therefore, to enhance the experience, VAs should allow for the configuration of speed, tone, and volume along with other characteristics of the virtual agents [2]. ...
... While heuristics already exist for the design of GUIs [27,30,37], VUI heuristics are not yet widely available, or at best are in their infancy. The few available VUI heuristics are not yet settled into a widely adopted and universally understood canon (with research on what such VUI heuristics may consist of also not yet complete [24]). While summative usability evaluation methods are often used for VUIs, heuristics allow designers to ground the entire process (e.g. ...
... design, formative research, summative evaluations) in a more comprehensive framework, which necessitates the exploration of developing widely-adoptable VUI heuristics. Though research shows that existing GUI-based heuristics cannot be mapped directly "as is" to VUI interaction [24,35,46], recent empirical work on VUI usability suggests that some existing GUI heuristics can provide a groundwork for the development of VUI heuristics [44]. Therefore, VUI heuristics may not need to be developed from scratch. ...
... Therefore, VUI heuristics may not need to be developed from scratch. There are also critical arguments in recent speech-related HCI literature that GUI heuristics can evolve or be adapted to VUIs [24,25]. ...
... The rules and conventions of conversations are well understood, but it is the nature of the stream that constrains voice user interfaces. You can only remember so much, and you can only go back so far; these limits resurface as usability problems already identified around cognitive load and information [10]. In that vein, numerous papers have worked towards offering guidelines and heuristics to improve usability [10,12,14]. ...
... You can only remember so much, and you can only go back so far; these limits resurface as usability problems already identified around cognitive load and information [10]. In that vein, numerous papers have worked towards offering guidelines and heuristics to improve usability [10,12,14]. As one of them points out, "we may be in the same situation Mobile UIs were a decade ago" [10]. ...
... In that vein, numerous papers have worked towards offering guidelines and heuristics to improve usability [10,12,14]. As one of them points out, "we may be in the same situation Mobile UIs were a decade ago" [10]. We argue that we could continue down that path, patching and improving usability as we go, but if we do not expand the design scope of VUIs, progress will eventually stall. ...
... Current HCI research finds that there are still many key usability issues in even the most current voice-enabled devices, such as Google Home and Amazon Alexa. Some of these core issues include difficulties with the amount of information that can be remembered, system feedback, learnability, and recognition errors [11,16,26,27]. Yet there is currently a disconnect between the research being performed in academia and the practical application of CUI design in industry. ...
... Yet there is currently a disconnect between the research being performed in academia and the practical application of CUI design in industry. Though some have been developed [17,19,26,32,33], there is a perceived lack of industry-focused tools and heuristics to aid in CUI design. Current expert designers in industry also often lack the training and resources needed to develop good CUIs [24,25]. ...
... We consider that the design principles of CUIs cannot be directly transferred from those of GUIs (cf. voice-based CUIs [56,57,77,78]); designing voice-based and text-based CUIs is partly converging because of the conversational nature of the interaction, but differs significantly because of the medium (with or without voice, i.e., phonological features such as prosody), the technology, and the situations of use. In this paper, we provide a literature review of linguistics, especially of language in interaction, which serves as the foundation for deriving the characteristics of CUIs from an HCI point of view, along with the checkpoints. ...
... Previous approaches to define heuristics and checklists for chatbots from a usability point of view have not explicitly mapped linguistic or technical checkpoints to Nielsen's heuristics (e.g., the Trindi tick-list [9] or [73,77]), or researchers have grouped checkpoints into several proprietary categories [51]. More recently, Murad [56,57] mapped Nielsen's 10 heuristics to the empirical findings of voice user interfaces. Our approach is rather the development of CUI-specific checkpoints based on HCI perspectives combined with conversation and language studies, and focuses on text-based CUIs/chatbots. ...
... When designing dialogs, it is best to keep things as simple as possible in accordance with current design guidelines (Murad et al., 2018). However, integrating purely acoustically presented game design elements is a challenge because, due to fundamental differences between vision and hearing, elements from classic visual computer games cannot be transferred without considerable alterations to their properties (Friberg & Gärdenfors, 2004). ...
Article
Full-text available
Gamification can increase motivation in learning, and intelligent virtual assistants (IVAs) can support foreign language learning at home. However, there is a lack of design concepts that motivate learners to practice with their IVA. This study combines both concepts and analyzes whether audio-gamification can increase engagement to address this research gap. To this end, a year-long field experiment with 230 subjects using a German language learning skill for Amazon Alexa was conducted. A between-subjects design determined differences in learning behavior and learning outcomes between a control group and two gamified groups (achievements and leaderboard). The findings reveal a positive effect on the amount of vocabulary translated and on learning success. However, a statistically significant effect on the amount of vocabulary translated was found only in the leaderboard group. These findings imply that audio-gamification can be a helpful tool for increasing motivation to use IVAs for foreign language learning.
... As a result, people who have negative experience using speech input may resist using it again. Another limitation is related to capturing personal data in public spaces: some people feel embarrassed talking to their phone in front of others [40,41] or concerned about their privacy being disclosed [40,51]. ...
Conference Paper
Speech as a natural and low-burden input modality has great potential to support personal data capture. However, little is known about how people use speech input, together with traditional touch input, to capture different types of data in self-tracking contexts. In this work, we designed and developed NoteWordy, a multimodal self-tracking application integrating touch and speech input, and deployed it in the context of productivity tracking for two weeks (N = 17). Our participants used the two input modalities differently, depending on the data type as well as personal preferences, error tolerance for speech recognition issues, and social surroundings. Additionally, we found speech input reduced participants' diary entry time and enhanced the data richness of the free-form text. Drawing from the findings, we discuss opportunities for supporting efficient personal data capture with multimodal input and implications for improving the user experience with natural language input to capture various self-tracking data.
... The voice app was designed to have the same functionalities as those of the mobile app, which include the following: (1) asking the user to measure their weight, blood pressure, and heart rate; (2) saving and storing those values in the Medly clinical dashboard; (3) asking the user a series of yes or no questions relating to HF symptoms; (4) processing the data using the Medly algorithm; and (5) outputting a message to the user based on the algorithm result. These requirements helped create an app that was appropriate for voice interaction and were based on research as well as guidelines related to VUI design [27,28]. Conversational flow diagrams were also created using VUI design guidelines, and each scenario was tested on a VUI following its implementation [29,30]. ...
Article
Background The use of digital therapeutics (DTx) in the prevention and management of medical conditions has increased through the years, with an estimated 44 million people using one as part of their treatment plan in 2021, nearly double the number from the previous year. DTx are commonly accessed through smartphone apps, but offering these treatments through additional platforms can improve the accessibility of these interventions. Voice apps are an emerging technology in the digital health field; not only do they have the potential to improve DTx adherence, but they can also create a better user experience for some user groups. Objective This research aimed to identify the acceptability and feasibility of offering a voice app for a chronic disease self-management program. The objective of this project was to design, develop, and evaluate a voice app of an already-existing smartphone-based heart failure self-management program, Medly, to be used as a case study. Methods A voice app version of Medly was designed and developed through a user-centered design process. We conducted a usability study and semistructured interviews with patients with heart failure (N=8) at the Peter Munk Cardiac Clinic in Toronto General Hospital to better understand the user experience. A Medly voice app prototype was built using a software development kit in tandem with a cloud computing platform and was verified and validated before the usability study. Data collection and analysis were guided by a mixed methods triangulation convergence design. Results Common themes were identified in the results of the usability study, which involved 8 participants with heart failure. Almost all participants (7/8, 88%) were satisfied with the voice app and felt confident using it, although half of the participants (4/8, 50%) were unsure about using it in the future. 
Six main themes were identified: changes in physical behavior, preference between the voice app and the smartphone, the importance of music during voice app interaction, a lack of privacy concerns, desired reassurances during voice app interaction, and helpful aids during voice app interaction. Triangulating these findings with the quantitative data showed that the main area for improvement was ease of use; design changes were then implemented to improve the user experience. Conclusions This work offered preliminary insight into the acceptability and feasibility of a Medly voice app. Given the recent emergence of voice apps in health care, we believe this research offers invaluable insight into successfully deploying DTx for chronic disease self-management using this technology.
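The citing text above describes a five-step voice-app flow: prompt for readings, store them on the clinical dashboard, ask yes/no symptom questions, run the triage algorithm, and speak a result. That flow can be sketched roughly as follows. This is a hypothetical outline, not the Medly implementation: the `ask` and `send_to_dashboard` callbacks and the simple triage rule are invented stand-ins, since the real Medly algorithm is not public.

```python
# Hypothetical sketch of a voice-based self-management check-in.
# `ask` abstracts a spoken prompt/response pair; `send_to_dashboard`
# abstracts storage on a clinical dashboard. Both are assumptions.

def run_checkin(ask, send_to_dashboard):
    """One check-in: collect readings, ask a symptom question, triage."""
    # Steps 1-2: prompt for measurements and store them.
    readings = {
        "weight_kg": float(ask("What is your weight in kilograms?")),
        "systolic_bp": int(ask("What is your systolic blood pressure?")),
        "heart_rate": int(ask("What is your heart rate?")),
    }
    # Step 3: yes/no symptom question.
    symptoms = ask("Are you feeling more short of breath than usual? Yes or no.")
    send_to_dashboard({**readings, "short_of_breath": symptoms.lower() == "yes"})

    # Steps 4-5: placeholder triage rule standing in for the Medly algorithm,
    # followed by the spoken result.
    if symptoms.lower() == "yes" or readings["heart_rate"] > 100:
        return "Your readings suggest you should contact your care team today."
    return "Your readings look stable. Keep up your current routine."
```

Structuring the check-in as one short question per turn, with the result spoken only at the end, matches the minimal-dialog guidance cited throughout this page.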
... As a basis for our design guideline development, we adopt the guidelines of Murad et al. [36], which meet the criteria in the appendix [54]. Among the criteria, the guidelines had to be general enough to apply to agents with many different purposes, had to offer specific evaluative usability recommendations, and had to have been used widely in the literature [55]. ...
Preprint
Full-text available
A majority of researchers who develop design guidelines have WEIRD, adult perspectives. This means technology may not be developed appropriately for people from non-WEIRD countries and for children. We present five design recommendations to empower designers to consider diverse users' desires and perceptions of agents. For one, designers should consider the degree of task-orientation of agents appropriate to end-users' cultural perspectives. For another, designers should consider how competence, predictability, and integrity in an agent's persona affect end-users' trust of agents. We developed these recommendations following our study, which analyzed the perspectives of children and parents from WEIRD and non-WEIRD countries on agents as they created them. We found that perceptions differed across subsets of participants. For instance, non-WEIRD and child perspectives emphasized agent artificiality, whereas WEIRD and parent perspectives emphasized human-likeness. Children also consistently felt agents were warmer and more human-like than parents did. Finally, participants generally trusted technology, including agents, more than people.
... • The difficulty of discovering the CA's abilities. • The cognitive load of recalling information instead of recognizing it, as in GUIs (Murad et al., 2018). • The slowness of conversation compared to other modes. ...
Book
Full-text available
Artificial intelligence is more-or-less covertly entering our lives and houses, embedded into products and services that are acquiring novel roles and agency over users. Products such as virtual assistants represent the first wave of materialization of artificial intelligence in the domestic realm and beyond. They are new interlocutors in an emerging, redefined relationship between humans and computers. They are agents, with miscommunicated or unclear properties, performing actions to reach human-set goals. They embed capabilities that industrial products never had. They can learn users' preferences and adapt their responses accordingly, but they are also powerful means to shape people's behavior and build new practices and habits. Nevertheless, the way these products are used does not fully exploit their potential, and frequently they entail poor user experiences, relegating their role to that of gadgets or toys. Furthermore, AI-infused products need vast amounts of personal data to work accurately, and the gathering and processing of this data are often opaque to end users. As well, how, whether, and when it is preferable to implement AI in products and services is still an open debate. This condition raises critical ethical issues about their usage and may dramatically impact users' trust and, ultimately, the quality of the user experience. The design discipline and the Human-Computer Interaction (HCI) field are just beginning to explore the wicked relationship between design and AI, looking for a definition of its borders, still blurred and ever-changing. The book approaches this issue from a human-centered standpoint, proposing designerly reflections on AI-infused products. It addresses one main guiding question: what are the design implications of embedding intelligence into everyday objects?
... • The difficulty of discovering the CA's abilities. • The cognitive load of recalling information instead of recognizing it, as in GUIs (Murad et al., 2018). • The slowness of conversation compared to other modes. ...
Chapter
Full-text available
The current panorama of AI-infused devices portrays a significant dominance of first-party smart speakers, which appear to be the first massive embodiment of AI in the domestic landscape. Judging by their simple physical appearance, these devices are nothing more than discreet ornaments. The simple appearance, however, belies a complexity determined by numerous features that make such products challenging to analyze from a UX point of view. The most evident characteristic is that they are not just "simple products" but ecosystems consisting of several interfaces and touchpoints. Most of them integrate multiple interfaces, namely physical, digital, and conversational, sometimes overlapping. The second element of complexity resides in their technological core, based on learning algorithms. Therefore, the same device can provide different outputs for the same input over time, a condition that can affect the user experience. Adding to the complexity of these devices, at least from a UX standpoint, is the fact that their real potential is rarely exploited: most users mainly perform routine actions such as reading news, checking the weather forecast, and controlling simple home appliances. Accordingly, the chapter frames the wicked relationship between user experience and AI-infused products. Moving from the three identified elements of complexity of AI-infused products, it advances reflection on how these products could be analyzed from a UX standpoint.
... As a result of the interviews, 30 questions from the FAQ were carried over into the skill. The corresponding answers were shortened or split up to comply with the recommendations in usability guidelines for CUIs [18,26]. These recommend keeping the dialog between human and CUI as minimal as possible so as not to cognitively overload users. ...
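The preprocessing described in the snippet above, shortening or splitting long FAQ answers so that a single CUI turn stays small, can be sketched as follows. The sentence-aligned 40-word budget is an assumption for illustration, not a value from the cited guidelines.

```python
import re

def split_answer(answer, max_words=40):
    """Split an answer into sentence-aligned chunks of at most max_words words.

    A single sentence longer than max_words becomes its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    chunks, current = [], []
    for sentence in sentences:
        words = sentence.split()
        # Flush the current chunk if adding this sentence would exceed the budget.
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be spoken as one turn, with the assistant offering to continue, so the user is never handed a monologue that exceeds what they can hold in working memory.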
Conference Paper
This study presents a first experimental approach to using intelligent virtual assistants (IVAs) to support political participation. In order to involve as many citizens as possible in participatory political processes, such as the search for a repository site for high-level radioactive waste, IVAs could offer a way to convey information interactively and arouse interest in such complex topics. However, the question arises whether an IVA can adequately convey such a topic and ensure appropriate usability despite many complex dialogues with the user. This exploratory study presents results from a prototype implemented as an Amazon Alexa Skill. Compared to a website addressing the same questions as the Skill, slightly poorer usability was found. This first study raised various questions to be investigated in future work, including questions related to the trustworthiness of such applications and challenges related to the auditory representation of different political opinions.
... Several studies reported challenges in prototyping the user experience of AI systems [21,93]. In response, researchers have developed practitioner-facing AI tools, methods, guidelines, and design patterns to aid designers in accounting for AI systems' UX breakdowns [2,3,31,51,61,65], such as planning for AI inference errors [38] or setting user expectations [45]. ...
Conference Paper
Full-text available
HCI research has explored AI as a design material, suggesting that designers can envision AI's design opportunities to improve UX. Recent research claimed that enterprise applications offer an opportunity for AI innovation at the user experience level. We conducted design workshops to explore the practices of experienced designers who work on cross-functional AI teams in the enterprise. We discussed how designers successfully work with, and struggle with, AI. Our findings revealed that designers can innovate at the system and service levels. We also discovered that making a case for an AI feature's return on investment is a barrier for designers when they propose AI concepts and ideas. Our discussions produced novel insights on designers' roles on AI teams and the boundary objects they use for collaborating with data scientists. We discuss the implications of these findings as opportunities for future research aiming to empower designers in working with data and AI.
... Recent research in the Ubicomp domain has focused on increasing the learnability of CUIs [37], on designing CUIs that help people overcome problems caused by such interfaces when they arise [36], and on studying the differences between CUIs and classical graphical user interfaces (GUIs) to derive design patterns and optimise CUIs' effectiveness [52], as well as on using CUIs to assist in lab work [10]. Similarly, Murad et al. [35] discuss a first taxonomy of design guidelines for hands-free speech interfaces. While studies on understanding the ways people interact with CUIs in everyday scenarios are very recent [29], [41], [30], [45], [49], [32], there is only a small amount of research on how CUIs can be used in other (more extreme) environments. ...
Conference Paper
Full-text available
Long-term space missions are challenging and demanding for astronauts. Confined spaces and long-duration sensory deprivation may cause psychological problems for astronauts. In this paper, we envision how extraterrestrial habitats (e.g., a habitat on the Moon or Mars) can maintain the well-being of crews by augmenting the astronauts. In particular, we report on the design, implementation, and evaluation of conversational user interfaces (CUIs) for extraterrestrial habitats. The goal of such CUIs is to support scientists during their daily and scientific routines on missions within the extraterrestrial habitat and to provide emotional support. During a week-long so-called analog mission with four scientists using a Wizard of Oz prototype, we derived design guidelines for such CUIs. Subsequently, based on the derived guidelines, we present the implementation and evaluation of two CUIs named CASSIOPEIA and PEGASUS.
... Studies suggest that VA usage can even 'adversely affect traffic safety' in these situations [7]. As of today, in-car voice user interfaces (VUIs) are often designed based on GUI solutions rather than pursuing a voice-first approach [8]. This adds to the abovementioned issue, as the two kinds of interfaces differ in many regards. ...
... Instead, the interaction with such a device is rather integrated within an actual conversation and used as a utility. Hence, the interaction with a conversational agent does not automatically follow the same rules as a real conversation between people does, at least if the voice assistant comes in a smart speaker form factor. Attempts to derive VUI guidelines from existing, proven GUI guidelines have already been made (in distinction to the GUI guidelines for mild cognitive impairment mentioned above): Murad et al. [26,27] performed a matching between commonly used GUI guidelines, such as those by Nielsen [28], and existing challenges of voice interaction identified by means of a literature review. Their research indicates that, although GUI guidelines might be generalisable and interpretable as VUI guidelines as well, there is a strong need for additional design principles not yet covered by GUI-related guidelines that address challenges such as privacy and context-awareness. ...
Conference Paper
Elderly people and especially people with dementia often experience social isolation and need assistance while performing activities of daily living. We investigate a novel approach to cope with this problem by integrating voice assistants and social assistance robots. Due to the special communication needs of people with mild cognitive impairment, the design of interfaces for such systems must be based on the particular requirements of the target user group. This paper investigates how a voice user interface should be designed for elderly users with mild cognitive impairment, such as an early stage of dementia, to provide personalised support throughout activities of daily living. A context and user analysis delivered a set of 11 guidelines for voice user interfaces for people with dementia. For a pilot study, we selected those strategies often applied by caregivers in their communication with people with dementia and evaluated the voice user interface among elderly participants and healthcare workers, who reported high feasibility, usefulness, and acceptance of the designed system.
... An interesting approach to creating guidelines for CUIs and VUIs for older users is to take established, proven guidelines for graphical user interfaces (GUIs) and translate them into guidelines for speech-based interfaces. Murad et al. [23,24] followed this idea and concluded that GUI-focused guidelines might serve as a foundation for CUI guidelines, but that further guidelines and design principles are needed to form a sufficient set of rules for usable CUIs, as the underlying interaction paradigms are too different. ...
Conference Paper
In the past few years, voice assistants have become broadly available in different forms of presentation and on different devices: not only as personal assistants within smartphones but also as smart speakers, within TV sets, or as part of in-car infotainment systems. Furthermore, we live in an ageing society, and considering elderly people as users of voice assistants gains relevance from both trends. The goal of this study is to identify the specific age-related preferences of older people when using a conversational user interface in the form of a voice assistant. We conducted a survey based on 26 elderly-related communication strategies among participants of different ages. The participants evaluated the strategies according to their own preferences for using voice assistants. As a result, we identified 11 preferences specific to older users. Surprisingly, most of the communication strategies, when applied to voice assistants, seem to be relevant for users of all ages, and a few of the communication strategies do not apply when used in voice assistants. The preferences specific to older people help to develop new guidelines for voice user interfaces, or conversational user interfaces in general. They do not automatically lead to those guidelines but provide a foundation from which to derive requirements, develop guidelines, and evaluate those guidelines by means of user-based usability tests.
... Even though our dataset can be considered old, this is the first log analysis of a bespoke conversational system. While many researchers have suggested design guidelines [9,16] for conversational systems from a user's perspective, little is known about the log-analysis aspect. Lastly, we could not fully utilise the timestamp series due to differing speaking speeds and the lack of end timestamps. ...
Conference Paper
Full-text available
Studies of interaction log analysis are a common tool to investigate behavioural data and contribute insights into users' interaction patterns with a system [11, 18]. We present a log analysis from a bespoke conversational system, RealSAM, an audio-only interactive media assistant in which users can navigate and interact with media content through natural language. The assistant is designed for people with a vision impairment or another disability that prevents them from accessing printed material. The exploratory analysis was conducted to provide initial insight into communication and interaction behaviours, focusing on how users utilise the application. The results are twofold: we highlight (i) implications for the design of future voice-enabled systems, such as an "infinite-reading" mode, enhanced interaction management enabling file navigation, or time-compression techniques, and (ii) challenges of analysing conversational logs, and we suggest guidelines for making such logs more accessible for future research.
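The snippet above notes that such logs often record only start timestamps, which complicates timing analysis. As a purely illustrative sketch (the log format and field order here are hypothetical, not RealSAM's actual schema), gaps between consecutive turn starts can still be extracted and used as an upper bound on inter-turn silence:

```python
from datetime import datetime

def turn_gaps(log):
    """Given a list of (iso_timestamp, speaker, utterance) tuples sorted by
    time, return the gaps in seconds between consecutive turn starts.
    With only start timestamps available (no end timestamps), each gap
    upper-bounds the true silence between the two turns, since it also
    includes the duration of the earlier utterance."""
    times = [datetime.fromisoformat(t) for t, _, _ in log]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

For example, a two-turn log whose entries start five seconds apart yields a single gap of 5.0 seconds; thresholding these gaps is one simple way to segment a raw log into sessions.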
... This "new genre of conversation" [29] has recently become a topic of analysis for HCI researchers. Numerous studies [1,3,27,57,77,78] indicate the need for customization in these interactions with VAs. Abdolrahmani et al. identified that the inability to adjust the speed of a VA's voice hindered its ability to handle complex tasks [3]. ...
... A voice interface aims to make a task faster and easier to perform, so the user must be clear about what can be done with the voice interface, in a way that feels comfortable and is easy to learn. At the same time, the system must give the user a sense of consistency, control, and confidence so that they do not feel lost during the interaction with the system [14]. Many systems with VUIs rely on cloud-based processing services, so the system's response speed can be affected by poor data network connectivity. ...
Article
Full-text available
Voice user interfaces (VUIs) are increasingly used in everyday settings and are growing in popularity. These interfaces support predominantly eyes-free and hands-free interactions. This kind of experience remains a nascent field compared to other input methods such as touch or the keyboard and mouse. It is therefore important to identify the tools used to evaluate the usability of VUIs. This article presents a systematic review in which we analyzed 57 articles and describe nine questionnaires used for evaluating the usability of VUIs, assessing their potential suitability for measuring different types of interactions and various usability dimensions. We found that these questionnaires were used to evaluate the usability of voice-only and voice-added VUIs: AttrakDiff, ICF-US, MOS-X, SUISQ-R, SUS, SASSI, UEQ, PARADISE, and USE, with the SUS questionnaire being the most commonly used. However, its items do not directly assess voice quality, although it evaluates general user interaction with a system. All the questionnaires include items related to three usability dimensions (effectiveness, efficiency, and satisfaction). The questionnaire with the most homogeneous coverage, in terms of the number of items per aspect of usability, is SASSI. It is common practice to use multiple questionnaires to obtain a more complete measurement of usability. We see a need for more usability research on the differences between voice interaction across display types (voice-first, voice-only, voice-added) and dialog types (command-based and conversational), and on how usability affects user expectations of VUIs.
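The review above identifies SUS as the most commonly used questionnaire. Its standard scoring rule (Brooke's original formulation, shown here only as an illustration, not as part of the review's own analysis) maps ten 1-5 Likert responses to a 0-100 score:

```python
def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses.
    Odd-numbered items (positively worded) contribute (response - 1);
    even-numbered items (negatively worded) contribute (5 - response).
    The summed contributions (0-40) are scaled by 2.5 to give 0-100."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten responses")
    total = 0
    for i, r in enumerate(responses):
        # i is 0-based, so even indices correspond to odd-numbered items
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5
```

A fully positive response pattern (5 on odd items, 1 on even items) yields 100, and an all-neutral pattern of 3s yields 50, which is one reason SUS scores are interpreted against benchmarks rather than as percentages.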
... The reason behind this different feeling is that users tend to associate voice and talking with other people, not with interacting with a technology [18]. Voice user interfaces are growing in number [19], sophistication, and affordable availability; hence, defining best practices and/or guidelines for designing the user experience of a voice user interface is of strategic importance [20]. ...
... We focus on VAs, which have received much less research attention in this field than voice-input systems and other voice-activated technologies, due in part to the novelty of the technology. VA research examines a large (and increasing) number of issues, including parenting [29], user satisfaction [20], anthropomorphism [13,18], communication [17,8,3,21,14], privacy, trust and security [10,7], design guidelines [16,4], adoption [11,23], emotional expressions [24], social presence [6], and self-disclosure [30]. Whilst VA research explores the use of VAs by different user groups, including older people [18,15], children [9], and people with disabilities [1,2,19,31], BLVP have been mostly overlooked. ...
Conference Paper
Full-text available
People who are blind or have severe low vision (BLVP) often rely on synthesized voice (output) to interact with computers. Thanks to Voice Assistants (VAs), BLVP can now use voice commands to interact (input) with a range of devices. Yet, very little is known about how they use VAs. This exploratory paper reports on semi-structured, face-to-face interviews with (N=10) legally blind adults, including typhlotechnicians, who teach other BLVP to use digital technologies and may themselves be blind. While the current impact of VAs on everyday life centres on aiding in the completion of simple day-to-day activities, the results show that the 'couple' of Apple Siri and VoiceOver has a strong, positive impact on the everyday lives of our participants. They reported using VAs mostly as a tool, not as a social actor, and that productivity was more important to them than privacy in their everyday use of Siri. Implications for design and research are outlined.
... Consistent with the private-context usage pattern worked out in the previous chapter, the IPA supports short commands for information retrieval and task execution. During the development of the voice assistant, guidelines (Murad et al., 2018, 2019) were taken into account to create a state-of-the-art user experience and to reduce the risks and training costs of ERP implementations (ElFarmawi, 2019). Table 1 gives a brief summary of the IPA's functions for user interaction, derived from the detailed literature research mentioned previously and from feedback of conference and business-summit participants, as explained later in this chapter. ...
Article
Full-text available
Background: The usage of intelligent personal assistants (IPAs), such as Amazon Alexa or Google Assistant, is increasing significantly, and voice interaction is relevant for workflows in a business context. Objectives: This research aims to determine IPA characteristics in order to evaluate the usefulness of specific functions in a simulated production system of an Enterprise Resource Planning (ERP) software. A new function called explanation mode is introduced to the scientific community and business world. Methods/Approach: As part of a design science research effort, an artefact, i.e. an add-on for speech interaction in business software, was developed and evaluated using a survey among ERP users and researchers. Results: Among the IPA features, the search function and speech input for textual fields were recognised as most useful. The newly introduced feature, the explanation mode, was positively received too. There is no significant correlation between the usefulness of features and participant characteristics, affinity for technology, or previous experience with IPAs in a private context, which is in line with previous studies in the private environment, leading to the conclusion that task attraction is the most important element of usefulness. Conclusions: Most of the participants agreed that speech input cannot fully substitute standard input devices, such as a keyboard or a mouse, so the IPA is recognised as an addition to traditional input methods. The usefulness is rated high especially for speech input for long text fields, calling up masks, and search functions.
... A log analysis study by Guy [13] showed that voice web search queries were longer and used richer natural language compared to text queries. Pointing out the rapid growth of voice user interfaces, Murad et al. [23] reflected on design guidelines for visual and voice interfaces, noting that a high cognitive load poses design challenges for the latter. Demberg et al. [11] showed that preferences for a voice-based interactive system can vary depending on the usage scenario. ...
Conference Paper
Full-text available
Voice-based assistants have become a popular tool for conducting web search, particularly for factoid question answering. However, for more complex web searches, their functionality remains limited, as does our understanding of the ways in which users can best interact with audio-based search results. In this paper, we compare and contrast user behaviour through the representation of search results over two mediums: text and audio. We begin by conducting a crowdsourced study exposing the differences in user selection of search results when those are presented in text and audio formats. We further confirm these differences and investigate the reasons behind them through a mixed-methods laboratory study. Through a qualitative analysis of the collected data, we produce a list of guidelines for an audio-based presentation of search results.
Preprint
Full-text available
The use of digital therapeutics (DTx) in the prevention and management of medical conditions has increased over the years, with an estimated 44 million people using one as part of their treatment plan in 2021, nearly double the previous year's figure. DTx are commonly accessed through smartphone apps, but offering these treatments through an alternative input modality can improve the accessibility of these interventions. Voice apps are an emerging technology in the digital health field and may be an appropriate alternative platform for some patients. This research aimed to identify the acceptability and feasibility of offering a voice app as an alternative input modality for a chronic disease self-management program. The objective of this project was to design, develop, and evaluate a voice app version of an existing smartphone-based heart failure self-management program, Medly, as a case study. The voice app was designed and developed through a user-centered design process. We conducted a usability study and semi-structured interviews with representative end users (n=8) at the Peter Munk Cardiac Clinic in Toronto General Hospital to better understand the user experience. A Medly voice app prototype was built using a software development kit in tandem with a cloud computing platform. Three of the eight participants completed the usability session successfully; the rest did not, due to various errors. Almost all (7 of 8) participants were satisfied with the voice app and felt confident using it. However, half of the participants were unsure about using the voice app in the future. Based on these findings, design changes were made to improve the user experience. With rapid advancements in voice user interfaces, we believe this technology will play an integral role in providing access to DTx for chronic disease management.
Conference Paper
Full-text available
Aging usually involves major changes in roles and social status. The adoption of a sedentary lifestyle, which causes loneliness, depression, and a variety of diseases, is the greatest health risk for older adults. However, persuading an older adult to participate in daily physical activity is not always an easy task. Exergames are a compelling approach to encouraging physical activity among older adults. With the growing power of mobile devices, mobile exergaming can now be used to encourage physical activity as well. This paper presents the outcome of a literature review conducted in SCOPUS, Web of Science, and Google Scholar, driven by the PRISMA Statement review method. On this basis, the paper proposes a personalized-persuasive mobile exergame design model to encourage physical activity for older adults.
Article
Always-on speech recognition terminals (ASRTs), which continuously detect a user's speech and convert it into text for the speech interaction system, have broad prospects. However, conventional implementations of ASRTs, which are based on exact-computing designs, suffer from redundant power consumption, high processing latency, and extensive memory access. Since the algorithms used in ASRTs are inherently error-tolerant, this article adopts analog and digital approximate computing techniques to address these challenges.
Conference Paper
Full-text available
The concept of “social affordances” is commonly used in HCI research. However, the advantages and limitations associated with employing the concept are yet to be fully understood. This paper presents a critical examination of “social affordances”, which includes a discussion of current uses of the concept in HCI and a comparison of “social affordances” with more traditional interpretations of “affordances”. We argue that making full use of “social affordances” as an analytical tool in HCI requires an unpacking of the relationship between perceiving a potential action, supported by the environment, and utilizing the potential and actually carrying out the action. We also argue that in case of “social affordances” it is particularly apparent that the perception of an affordance does not automatically result in a problem-free execution of the respective action, and needs to be integrated with other processes within the overall structure of action regulation. We propose a tentative framework for the analysis of the interplay between perception and action in the enactment of social affordances. Implications of the framework for employing the concept of social affordances in HCI research are discussed
Chapter
In recent years, voice user interfaces have evolved substantially, enabling seamless and efficient human–machine interaction through spoken language. Despite increasing research, there are no explicit evaluation methods for voice user interfaces (VUIs) that would enable their improvement. Presently, evaluation criteria are primarily based on subjective metrics such as user reviews, ratings, or likelihood of future use. While these metrics have utility, they are subjective and pose hurdles to research, such as the cost of recruitment, time, and resources. Other alternatives are performance-based, such as response time and error rates, offering value to developers but little insight into the design of these systems. There is a need for a usability-based evaluation method for VUIs, analogous to the heuristic evaluation method proposed by Nielsen and Molich for screen-based interfaces. To address this, our study presents a set of heuristics for the usability evaluation of VUIs. Initially, existing literature in the VUI domain was analysed to identify prevalent themes of usability issues. These themes were then categorised to define 11 usability heuristics. The set of heuristics will enable designers to rapidly evaluate VUIs and investigate areas for improvement. A between-subjects study with 12 HCI professionals, involving the usability evaluation of a VUI application, was conducted to test this hypothesis. The study reveals a statistically significant increase in the number and diversity of usability issues identified using the presented heuristics.
Chapter
The concept of androgynous or gender-neutral fashion is known for its distinctive blending of conventional masculine and feminine design characteristics. In the history of fashion, the notion of androgynous fashion has been evolving since the 1920s, although irregularly at times. In postmodern Western cultures, the androgynous aesthetic in fashion is increasingly accepted, encouraging a multiplicity of gender expressions. With significant influencers of the generation identifying as gender-neutral and speaking out on the topic, the concept of being gender-fluid has recently attracted much attention in the international fashion industry. Androgynous fashion is an emergent trend, reflected on fashion runways where models showcase silhouettes and design elements that break down gender stereotypes. With this in mind, the current research aims to study androgynous fashion from both conceptual and user-centric perspectives in the Indian context. Data were collected through primary and secondary sources. Relevant secondary data were gathered from various books, research papers, and fashion publications to set the conceptual context of the research. Additionally, to gather primary information about Indian LGBTQ consumers' perception of androgynous fashion, a questionnaire was circulated amongst young Indian fashion consumers using convenience and snowball sampling methods. The results and analysis of the study reveal the aspirations behind the gender-neutral design genre. This study also brings out the emotional needs of Indian LGBTQ community members, who are the primary consumers of the androgynous aesthetic.
This book constitutes the proceedings of the 4th International Workshop on Chatbot Research and Design, CONVERSATIONS 2020, which was held during November 23-24, 2020, hosted by the University of Amsterdam. The conference was planned to take place in Amsterdam, The Netherlands, but changed to an online format due to the COVID-19 pandemic. The 14 papers included in this volume were carefully reviewed and selected from a total of 36 submissions. The papers in the proceedings are structured in four topical groups: Chatbot UX and user perceptions, social and relational chatbots, chatbot applications, and chatbots for customer service. The papers provide new knowledge through empirical, theoretical, or design contributions.
Chapter
Chatbots are becoming increasingly important in the customer service sector due to their service automation, cost-saving opportunities, and broad customer satisfaction. Similarly, in the business-to-business (B2B) sector, more and more companies use chatbots on their websites and social media channels to establish sales team contact, to provide information about their products and services, or to help customers with their requests and claims. Customer relations in the B2B environment are especially characterized by a high level of personal contact, service, and support through expert explanations, owing to the complexity of the product and service offerings. To support these efforts, chatbots can be used to assist buying centers along the purchase decision process. However, B2B chatbots have so far only been marginally addressed in the scientific human-computer interaction and information systems literature. To provide both researchers and practitioners with knowledge about the characteristics and archetypal patterns of chatbots currently existing in B2B customer services, we develop and discuss a 17-dimensional chatbot taxonomy for B2B customer services based on Nickerson et al. [1]. By classifying 40 chatbots in a cluster analysis, this study identifies three archetypal structures prevailing in B2B customer service chatbot usage.
Chapter
Full-text available
The fundamentals of verbal communication skills are developed during childhood, and existing studies pinpoint the benefits of stimulating language and expression skills from an early age. Our research is a preliminary evaluation of conversational technology to support this process. In this paper, we describe the design process of a speech-based conversational agent for children, which involved a Wizard-of-Oz empirical study with 20 primary school children aged 9–10 y.o. in order to identify design guidelines for the automated version of the system. Our agent, called ISI, is integrated into a web application and exploits oral and visual interaction modes. ISI enables children to practice verbal skills related to the description of a person's physical characteristics. It provides opportunities for them to learn and use words and linguistic constructs. ISI also helps them develop body awareness and self-expression (when describing themselves) or attention to "the other" (when describing someone else). ISI engages users in a speech-based conversational flow composed of two main repeated steps. It talks to the children and stimulates them with questions about a specific part of their body (e.g., "What color is your hair?"). When the users describe the required feature adequately, ISI provides a cheerful real-time visual representation of the answer; otherwise, it provides hints.
Article
Commercialised voice user interface devices for the home, like Amazon Echo, Google Home, and Apple HomePod, with integrated digital personal assistants, have rapidly grown in popularity. These devices embody intelligent software agents that support users in their everyday life through easy and intuitive conversational interactions. While their use in everyday activities is largely unexplored, their proliferation in homes presents a valuable opportunity to add to our understanding of in-home digital personal assistants. In this paper, we investigate their home use in a broad context to learn more about people's experiences, attitudes, interactions, and expectations with these devices, contributing new insights to current knowledge of this use. Applying the digital ethnography method, we collected 3542 reviews and comments about Amazon Echo, Google Home, and Apple HomePod on Amazon, eBay, and Reddit. Six main themes and 29 categories were derived through filtering, thematic analysis, and affinity diagramming. These findings constitute a conceptual framework characterising the current landscape of home use of digital personal assistants. Additionally, we identify and discuss unique issues discovered around the invisible interface, interactive freedom, and creative appropriation. We use our findings to propose implications for the interaction design of DPAs for home use.
Article
Digital personal assistants (DPAs) have recently grown in popularity because they are both a commercially available new technology and reasonably affordable to the average household. This opens opportunities for new ways to assist people in everyday activities in their homes through voice-interaction. Physical activity has significant health benefits, and yet globally, 1 in 4 adults are not active enough. To address this, we investigate the persuasive potential of DPAs in increasing people’s physical activity at home. We conducted a study with 48 participants to understand the effect of applying three of Fogg’s persuasive principles to the design of a DPA exercise programme: Suggestion, Virtual Reward, and Praise. Our findings show that DPAs have the potential, within their current technical and reactive capabilities, to persuade people to increase their physical activity at home, using Suggestion to encourage physical effort, Virtual Reward to encourage endurance, and Praise to create reassurance for beginners. Based on this, we offer three alternate perspectives for developing persuasive DPAs. We also discuss limitations of the study and suggest future research directions around using persuasion with DPAs.
Conference Paper
Full-text available
Voice User Interfaces (VUIs) are growing in popularity. However, even the most current VUIs regularly cause frustration for their users. Very few studies exist on what people do to overcome VUI problems they encounter, or how VUIs can be designed to aid people when these problems occur. In this paper, we analyze empirical data on how users (n=12) interact with our VUI calendar system, DiscoverCal, over three sessions. In particular, we identify the main obstacle categories and types of tactics our participants employ to overcome them. We analyzed the patterns of how different tactics are used in each obstacle category. We found that while NLP Error obstacles occurred the most, other obstacles are more likely to frustrate or confuse the user. We also found patterns that suggest participants were more likely to employ a "guessing" approach rather than rely on visual aids or knowledge recall.
Conference Paper
Full-text available
In this panel, we discuss the challenges faced by HCI practitioners and researchers as they study how voice assistants (VAs) are used on a daily basis. Voice has become a widespread and commercially viable interaction mechanism with the introduction of VAs such as Amazon's Alexa, Apple's Siri, the Google Assistant, and Microsoft's Cortana. Despite their prevalence, the design of VAs and their embeddedness with other personal technologies and daily routines have yet to be studied in detail. In a roundtable format, we will examine these issues through a number of VA use scenarios that panel members will discuss. Issues that researchers will address in this panel include: (1) obtaining VA data and privacy concerns around the processing and storage of user data; (2) the personalization of VAs and the user value derived from this interaction; and (3) relevant UX work that reflects on the design of VAs.
Article
Full-text available
In human-human dialogue, the way in which a piece of information is added to the partners’ common ground (i.e., presented and accepted) constitutes an important determinant of subsequent dialogue memory. The aim of this study was to determine whether this is also the case in human-system dialogue. An experiment was conducted in which a naïve participant and a simulated dialogue system took it in turns to present references to various landmarks featured on a list. The kind of feedback used to accept these references (verbatim repetition vs. implicit acceptance) was manipulated. The participant then performed a recognition test during which he or she attempted to identify the references mentioned previously. Self-presented references were recognised better than references presented by the system; however, such presentation bias was attenuated when the initial presentation of these references was followed by verbatim repetition. Implications for the design of automated dialogue systems are discussed.
Conference Paper
Full-text available
While the technology underlying speech interfaces has improved in recent years, our understanding of the human side of speech interactions remains limited. This paper provides new insight on one important human aspect of speech interactions: the sense of agency, defined as the experience of controlling one's own actions and their outcomes. Two experiments are described. In each case a voice command is compared with keyboard input. Agency is measured using an implicit metric: intentional binding. In both experiments we find that participants' sense of agency is significantly reduced for voice commands as compared to keyboard input. This finding presents a fundamental challenge for the design of effective speech interfaces. We reflect on this finding and, based on current theory in HCI and cognitive neuroscience, offer possible explanations for the reduced sense of agency observed in speech interfaces. Keywords: speech interfaces; voice commands; sense of agency.
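Intentional binding, the implicit metric mentioned in the abstract above, is commonly quantified as a shift in temporal judgments between a baseline condition (action or outcome presented alone) and an operant condition (action causes outcome). The sketch below is a generic simplification of that style of analysis, not the authors' exact procedure:

```python
def binding_shift(baseline_errors, operant_errors):
    """Quantify intentional binding as the change in mean temporal
    judgment error (judged time minus actual time, in ms) from a
    baseline condition to an operant condition.  In the standard
    interpretation, a positive shift for action judgments (actions
    perceived later, toward the outcome) and a negative shift for
    outcome judgments (outcomes perceived earlier, toward the action)
    both indicate stronger binding, i.e. a stronger sense of agency."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(operant_errors) - mean(baseline_errors)
```

Comparing these per-participant shifts between input conditions (e.g. voice command vs. keyboard press) is one way such studies operationalise differences in the sense of agency.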
Conference Paper
Full-text available
We present the first user study of out-of-turn interaction in menu-based, interactive voice-response systems. Out-of-turn interaction is a technique which empowers the user (unable to respond to the current prompt) to take the conversational initiative by supplying information that is currently unsolicited, but expected later in the dialog. The technique permits the user to circumvent any flows of navigation hardwired into the design and navigate the menus in a manner which reflects their model of the task. We conducted a laboratory experiment to measure the effect of the use of out-of-turn interaction on user performance and preference in a menu-based, voice interface to voicemail. Specifically, we compared two interfaces with the exact same hierarchical menu design: one with the capability of accepting out-of-turn utterances and one without this feature. The results indicate that out-of-turn interaction significantly reduces task completion time, improves usability, and is preferred to the baseline. This research studies an unexplored dimension of the design space for automated telephone services, namely the nature of user-addressable input (utterance) supplied (in-turn vs. out-of-turn), in contrast to more traditional dimensions such as input modality (touch-tone vs. text vs. voice) and style of interaction (menu-based vs. natural language). Author Keywords: Out-of-turn interaction, Interactive Voice-Response systems (IVRs), Automated Telephone Services (ATS), speech user interfaces, user studies, Automatic Speech Recognition (ASR), mixed-initiative interaction, usability.
Conference Paper
Full-text available
As emphasis is placed on developing mobile, educational, and other applications that minimize cognitive load on users, it is becoming more essential to explore interfaces based on implicit engagement techniques so users can remain focused on their tasks. In this research, data were collected with 12 pairs of students who solved complex math problems using a tutorial system that they engaged over 100 times per session entirely implicitly via speech amplitude or pen pressure cues. Results revealed that users spontaneously, reliably, and substantially adapted these forms of communicative energy to designate and repair an intended interlocutor in a computer-mediated group setting. Furthermore, this behavior was harnessed to achieve system engagement accuracies of 75-86%, with accuracies highest using speech amplitude. However, students had limited awareness of their own adaptations. Finally, while continually using these implicit engagement techniques, students maintained their performance level at solving complex mathematics problems throughout a one-hour session.
Conference Paper
Full-text available
Currently there are no dialog systems that enable purely voice-based access to the unstructured information on websites such as Wikipedia. Such systems could be revolutionary for non-literate users in the developing world. To investigate interface issues in such a system, we developed VoicePedia, a telephone-based dialog system for searching and browsing Wikipedia. In this paper, we present the system, as well as a user study comparing the use of VoicePedia to SmartPedia, a Smartphone GUI-based alternative. Keyword entry through the voice interface was significantly faster, while search result navigation, and page browsing were significantly slower. Although users preferred the GUI-based interface, task success rates between both systems were comparable - a promising result for regions where Smartphones and data plans are not viable. Index Terms: dialog system, information access
Conference Paper
HCI research has for long been dedicated to better and more naturally facilitating information transfer between humans and machines. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities to be understood by machines - despite, and perhaps, because it is the highest-bandwidth communication channel we possess. While significant research efforts, from engineering, to linguistic, and to cognitive sciences, have been spent on improving machines' ability to understand speech, the MobileHCI community (and the HCI field at large) has been relatively timid in embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been void of any user-centered design principles or consideration for usability or usefulness. The goal of this course is to inform the MobileHCI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, as well as to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what are their limitations, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centred principles in designing more usable and useful speech-based interactive systems.
Conference Paper
In this paper we unpack the use of conversational agents, or so-called intelligent personal assistants (IPAs), in multi-party conversation amongst a group of friends while they are socialising in a café. IPAs such as Siri or Google Now can be found on a large proportion of personal smartphones and tablets, and are promoted as ‘natural language’ interfaces. The question we pursue here is: how are they actually drawn upon in conversational practice? In our work we examine the use of these IPAs in a mundane and commonplace setting and employ an ethnomethodological perspective to draw out the character of IPA use in conversation. Additionally, we highlight a number of nuanced practicalities of their use in multi-party settings. By providing a depiction of the nature and methodical practice of their use, we are able to contribute our findings to the design of IPAs.
Conference Paper
Voice interactions on mobile phones are most often used to augment or supplement touch-based interactions for users' convenience. However, for people with limited hand dexterity caused by various forms of motor impairment, voice interactions can have a significant impact and in some cases even enable independent interaction with a mobile device for the first time. For these users, a Mobile Voice User Interface (M-VUI), which allows for completely hands-free, voice-only interaction, would provide a high level of accessibility and independence. Implementing such a system requires research to address long-standing usability challenges introduced by voice interactions that negatively affect user experience due to difficulty learning and discovering voice commands. In this paper we address these concerns, reporting on research conducted to improve the visibility and learnability of voice commands of an M-VUI application being developed on the Android platform. Our research confirmed long-standing challenges with voice interactions while exploring several methods for improving the onboarding and learning experience. Based on our findings we offer a set of implications for the design of M-VUIs.
Conference Paper
The past four years have seen the rise of conversational agents (CAs) in everyday life. Apple, Microsoft, Amazon, Google and Facebook have all embedded proprietary CAs within their software and, increasingly, conversation is becoming a key mode of human-computer interaction. Whilst we have long been familiar with the notion of computers that speak, the investigative concern within HCI has been upon multimodality rather than dialogue alone, and there is no sense of how such interfaces are used in everyday life. This paper reports the findings of interviews with 14 users of CAs in an effort to understand the current interactional factors affecting everyday use. We find user expectations dramatically out of step with the operation of the systems, particularly in terms of known machine intelligence, system capability and goals. Using Norman's 'gulfs of execution and evaluation' [30] we consider the implications of these findings for the design of future systems.
Article
This paper explores the differences in users' responses to a spoken language search interface through voice and touch gesture input when compared with a textual input search interface. A Wizard of Oz user experiment was conducted with 48 participants who were asked to complete an entry questionnaire and then six tasks on a spoken search interface and six tasks on a textual search interface. Post-task and post-system questionnaires were also completed followed by an exit interview. The content analysis method was used to analyze the transcribed exit interview data. Results from the content analysis indicated that users' familiarity with the system, ease-of-use of the system, speed of the system, as well as trust, comfort level, fun factor and novelty were all factors that affected users' perception. We identified several major factors that may have implications for the design of future spoken language search interfaces and potential improvements in the user experience of such interfaces or systems.
Article
A review of popular technology adoption models identified several factors that are likely to influence Voice Activated Personal Assistant (VAPA) use in public spaces. To inform design decisions of how to make the private use of the VAPA in public spaces more acceptable from the users’ point of view, an online survey was conducted to investigate the likelihood of usage of the smartphone VAPA such as Apple’s Siri (compared to the usage of smartphone keyboard) as a function of location (private vs. public) and type of information (private vs. nonprivate). Responses from participants showed that users were more cautious of transmitting private than nonprivate information. This effect of type of information was amplified in the social context of public locations and when using conspicuous methods of information input such as the VAPA. Participants also preferred using the VAPA in private locations and showed no preference of location for keyboard entries. Correlations between likelihood of usage of VAPA and the social acceptability ratings were positive and predicted similar patterns of smartphone usage.
Conference Paper
Several published sets of usability heuristics were compared with a database of existing usability problems drawn from a variety of projects in order to determine which heuristics best explain actual usability problems. Based on a factor analysis of the explanations, as well as an analysis of the heuristics providing the broadest explanatory coverage of the problems, a new set of nine heuristics was derived: visibility of system status, match between system and the real world, user control and freedom, consistency and standards, error prevention, recognition rather than recall, flexibility and efficiency of use, aesthetic and minimalist design, and helping users recognize, diagnose, and recover from errors.
Conference Paper
It's an old story. A relationship built on promises turns to bitterness and recriminations. But speech technology has changed: Yes, we know we hurt you, we know things didn't turn out the way we hoped, but can't we put the past behind us? We need you, we need design. And you? You need us. How can you fulfill a dream of pervasive technology without us? So let's look at what went wrong. Let's see how we can fix this thing. For the sake of little Siri, she needs a family. She needs to grow into more than a piece of PR, and maybe, if we could only work out our differences, just maybe, think of the magic we might make together.
Conference Paper
In a study of users' interactions with Siri, the iPhone personal assistant application, we noticed the emergence of overlaps and blurrings between explanatory categories such as "human" and "machine". We found that users work to purify these categories, thus resolving the tensions related to the overlaps. This "purification work" demonstrates how such categories are always in flux and are redrawn even as they are kept separate. Drawing on STS analytic techniques, we demonstrate the mechanisms of such "purification work." We also describe how such category work remained invisible to us during initial data analysis, due to our own forms of latent purification, and outline the particular analytic techniques that helped lead to this discovery. We thus provide an illustrative case of how categories come to matter in HCI research and design.
Article
This study examined the effects of the number of options in a message and different message endings on the memorisation of multiple-option messages. Twenty-seven participants were told to pay attention to the quality of interactions between users and an interactive voice response system and were asked to recall system messages. The multiple-option messages contained three, five or seven options and ended either in a pseudoword suffix, in a natural-language prompt or in a beep. Results showed that option recall was impaired when messages were longer and contained a suffix. The interaction between the number of options and the presence of a suffix was not significant. Results also showed that, in messages with five or more options, the recency effect was greater than the primacy effect. These results bolster our knowledge about the design of spoken menus.
Article
We evaluated two strategies for alleviating working memory load for users of voice interfaces: presenting fewer options per turn and providing confirmations. Forty-eight users booked appointments using nine different dialogue systems, which varied in the number of options presented and the confirmation strategy used. Participants also performed four cognitive tests and rated the usability of each dialogue system on a standardised questionnaire. When systems presented more options per turn and avoided explicit confirmation subdialogues, both older and younger users booked appointments more quickly without compromising task success. Users with lower information processing speed were less likely to remember all relevant aspects of the appointment. Working memory span did not affect appointment recall. Older users were slightly less satisfied with the dialogue systems than younger users. We conclude that the number of options is less important than an accurate assessment of the actual cognitive demands of the task at hand.
Article
Interaction with electronic speech products is becoming a fact of life through telephone answering systems and speech-driven booking systems, and is set to increase in the future. Older adults will be obliged to use more of these electronic products and, because of their special interactional needs due to age-related impairment, it is important that such interactions are designed to suit the needs of such users, and in particular to support learning about the interaction. Drawing upon the expertise of tutors at Age Concern Oxfordshire, and the results of preliminary investigations with older adults using dialogues in a speech system, this paper explores the conditions which best provide for the learning experience of older adults and looks at special features which enable instructions and help for learning to be embedded within speech dialogue design.
Article
This experiment investigated the effect of interface metaphor and context of use (private/public) on the usability of a hierarchically structured speech-activated mobile city guide service. Two different versions of the service were evaluated using a Wizard of Oz methodology. The first was a non-metaphor standard service with numbered menu options. The second was a service based on an office filing system metaphor, with different metaphor-related menu options at each level. User performance and attitudes to the services were recorded over a six week period, and post-trial interviews conducted. Results showed that the interface metaphor improved participants' performance compared to the standard service, but had no effect on attitudes. Context of use did not affect the usability of the services, which supports their use for mobile interaction. Visualisation of the metaphor-based service significantly affected participants' attitudes, suggesting an additional benefit of using interface metaphor for the design of speech-based mobile phone services.
Article
This paper explores the consequences of adopting an alternative strategy to that of explicitly listing all options within the main menu of a speech-driven automated telephone banking service. An existing service was augmented with an overdraft request dialogue, accessible at its main menu, which could be triggered using the keyword “overdraft”. However, the keyword was not explicitly mentioned as an option in the main menu. Instead, system-initiated proposals for an overdraft were introduced into the call flow, notifying callers that they could apply for an overdraft by saying “overdraft” at the main menu. An experiment with 114 participants was carried out to investigate the effectiveness of this strategy as a way of offering new services without increasing the length of the main menu. Results showed that a significant proportion of participants (37%) did not succeed in completing an overdraft request. The reasons for this failure are discussed.
Article
The objective of this study was to compare empirically the effect on performance and satisfaction of a menu and a front-end voice interface to a commonly used spreadsheet software package. In this study, the type of human-computer interface used (standard keyboard/mouse use of menus or keyboard/mouse with a voice front-end) is expected to influence user performance (task completion time and error rates) and satisfaction. The user's novice/expert classification is expected to interact with these two types of interfaces to influence the efficiency of the user's performance. The results suggest that there are significant relationships between task performance and level of expertise/type of interface and between user attitudes and type of interface. In general, the front-end voice interface users performed worse and had less favorable attitudes towards the software tool than the menu interface users.
Conference Paper
We present an approach to control information flow in object-oriented systems. The decision of whether an information flow is permitted or denied depends on both the authorizations specified on the objects and the process by which information is obtained ...
Article
To improve speech recognition applications, designers must understand acoustic memory and prosody. Human-human relationships are rarely a good model for designing effective user interfaces. Spoken language is effective for human-human interaction but often has severe limitations when applied to human-computer interaction. Speech is slow for presenting information, is transient and therefore difficult to review or edit, and interferes significantly with other cognitive tasks. However, speech has proved useful for store-and-forward messages, alerts in busy environments, and input-output for blind or motor-impaired users.
Article
SpeechActs is an experimental conversational speech system. Experience with redesigning the system based on user feedback indicates the importance of adhering to conversational conventions when designing speech interfaces, particularly in the face of speech recognition errors. Study results also suggest that speech-only interfaces should be designed from scratch rather than directly translated from their graphical counterparts. This paper examines a set of challenging issues facing speech interface designers and describes approaches to address some of these challenges. Keywords: speech interface design, speech recognition, auditory I/O, discourse, conversational interaction.
Progress in Mobile User Experience
  • Raluca Budiu
Raluca Budiu. Progress in Mobile User Experience. Nielsen Norman Group. Retrieved May 16, 2018 from www.nngroup.com.
Mobile Usability, First Findings
  • Jakob Nielsen
Jakob Nielsen. Mobile Usability, First Findings. Nielsen Norman Group. Retrieved May 16, 2018 from https://www.nngroup.com.
Do animals have accents?
  • Martin Porcheron
  • Joel Fischer
  • Sarah Sharples
Martin Porcheron, Joel Fischer, and Sarah Sharples. 2017. "Do animals have accents?" Proc. CSCW '17.
Hidden menu options in automated human-computer telephone dialogues
  • J Wilke
  • F Mcinnes
  • M A Jack
  • P Littlewood
J. Wilke, F. McInnes, M. A. Jack, and P. Littlewood. 2007. Hidden menu options in automated human-computer telephone dialogues. Behaviour & Information Technology 26, 6.
What can I help you with?
  • Benjamin R. Cowan
  • Nadia Pantidi
  • David Coyle
  • Kellie Morrissey
  • Peter Clarke
  • Sara Al-Shehri
  • David Earley
  • Natasha Bandeira
Benjamin R. Cowan, Nadia Pantidi, David Coyle, Kellie Morrissey, Peter Clarke, Sara Al-Shehri, David Earley, and Natasha Bandeira. 2017. "What can I help you with?" Proc. MobileHCI '17: 1-12.
Factors Affecting User Perception of a Spoken Language vs. Textual Search Interface
  • Grace M. Begany
  • Ning Sa
  • Xiaojun Yuan
Investigating memory constraints on recall of options in interactive voice response system messages
  • Ludovic Le Bigot
  • Loïc Caroux
  • Christine Ros
Voice Assistants, UX Design and Research
  • Joseph Kaye
  • Joel Fischer
  • Jason Hong
  • Frank Bentley
  • Cosmin Munteanu
  • Alexis Hiniker
  • Janice Tsai
  • Tawfiq Ammari
Joseph Kaye, Joel Fischer, Jason Hong, Frank Bentley, Cosmin Munteanu, Alexis Hiniker, Janice Tsai, and Tawfiq Ammari. 2018. Voice Assistants, UX Design and Research. In Proc. of CHI EA '18.
Like Having a Really Bad PA
  • Ewa Luger
  • Abigail Sellen
Ewa Luger and Abigail Sellen. 2016. "Like Having a Really Bad PA". Proc. CHI '16.
Speech and Hands-free Interaction
  • Cosmin Munteanu
  • Gerald Penn
Cosmin Munteanu and Gerald Penn. 2017. Speech and Hands-free Interaction. Proc. CHI EA '17.
The design of everyday things
  • Donald Norman
Donald Norman. 1988. The Design of Everyday Things.