Construction of UEQ+ Scales for Voice Quality
Measuring User Experience Quality of Voice Interaction
Andreas M. Klein
Faculty of Technology
University of Applied Sciences
Emden/Leer
Emden, Germany
andreas.klein@hs-emden-leer.de
Andreas Hinderks
Department of Computer
Languages and Systems
University of Seville
Seville, Spain
andreas.hinderks@iwt2.org
Martin Schrepp
SAP SE
Walldorf, Germany
martin.schrepp@sap.de
Jörg Thomaschewski
Faculty of Technology
University of Applied Sciences Emden/Leer
Emden, Germany
joerg.thomaschewski@hs-emden-leer.de
ABSTRACT
The UEQ+ is a modular framework for the construction of UX
questionnaires. The researcher can pick those scales that fit his or
her research question from a list of 16 available UX scales.
Currently, no UEQ+ scales are available to allow measuring the
quality of voice interactions. Given that this type of interaction is
increasingly essential for the usage of digital products, this is a
severe limitation of the possible products and usage scenarios that
can be evaluated using the UEQ+. We describe in this paper the
construction of three specific scales to measure the UX of voice
interactions. In addition, we discuss how these new scales can be
combined with existing UEQ+ scales in evaluation projects.
CCS CONCEPTS
• Human-centred computing • Human computer interaction • HCI design and evaluation methods
KEYWORDS
User Experience, Usability, Voice Systems, Voice Interaction,
Voice User Interfaces, Measurement, Questionnaires, UX, VUI
ACM Reference format:
Andreas M. Klein, Andreas Hinderks, Martin Schrepp, Jörg Thomaschewski. 2020. Construction of UEQ+ Scales for Voice Quality: Measuring User Experience Quality of Voice Interaction. In Proceedings of MuC'20, September 6–9, 2020, Magdeburg, Germany, 5 pages.
https://doi.org/10.1145/3404983.3410003
1 INTRODUCTION
The impression a user forms about the user experience (UX) of a product results from his or her perception of many distinct quality aspects, for example, efficiency of use, stimulation, trust, or visual aesthetics. The importance of such quality aspects for the overall UX impression varies between products supporting different tasks and use cases [1]. For example, intuitive use is mandatory for an infrequently used self-service (users forget how to use it between two usage points), while an unnecessary click does not hurt much, since efficiency is not crucial there. For a very often-used business application, in contrast, intuitive use is not essential (some learning is accepted), but due to the high usage frequency each unnecessary click hurts, so high efficiency is a key requirement. The many UX quality aspects and their varying importance for different products have led to the creation of many different UX questionnaires. Each of these questionnaires realizes, through its selection of scales, a different set of measured UX quality aspects and thus fits a certain group of products [2]. Of course, none of these questionnaires contains all UX quality aspects discussed in the research literature, since this would increase the length of the questionnaire beyond any reasonable limit. For a UX researcher evaluating a concrete product, this can be an issue. Even with a clear view of which UX aspects are important and should therefore be measured in an evaluation project, it can easily happen that none of the published UX questionnaires really fits these requirements. Sometimes it is possible to combine several UX questionnaires to cover all relevant aspects, but this also causes practical problems, since different questionnaires often have different item and answer formats [3, 4]. The UEQ+ [3, 4] is a modular framework that tries to address this issue. It consists of 16 scales that can be combined to form a concrete questionnaire. Thus, the researcher can decide which of these scales are important for the product that should be evaluated and can simply pick those scales and combine them for the evaluation. For a detailed description of the idea behind the UEQ+ and the scale construction, please refer to [3, 4]. The material needed to conduct a study with the UEQ+ is available free
of charge at https://ueqplus.ueq-research.org/. Currently, the UEQ+ framework does not provide scales that measure the quality of voice interaction. In recent years, however, voice and speech recognition software has become a leading-edge interface technology for a wide range of applications, for example in the healthcare, automotive, authentication and identification, voice commerce, customer service, and smart home sectors [5]. Current studies expect global sales of voice and speech recognition software to increase from about $1.3 billion in 2019 to nearly $7 billion in 2025 [5]. Thus, an increasingly important category of products cannot be evaluated with questionnaires built from the UEQ+ framework, for example popular voice assistants (VAs) such as Google Assistant (Google), Siri (Apple), or Alexa (Amazon). We try to fill this gap by constructing new UEQ+ scales that cover the UX aspects associated with voice interactions. This paper describes the scale construction and how the new scales can be used in the UEQ+ framework [3, 4] to measure the UX of systems based on voice interaction. There are a few questionnaires [6, 7, 8, 9] that measure the usability of voice systems. These questionnaires concentrate solely on the task-related aspects of an interaction and ignore non-task-related or hedonic aspects of voice interactions. In addition, they mix UX aspects of voice interaction with other, more general UX aspects and cannot, therefore, simply be reused within the UEQ+ framework.
2 THE UEQ+ FRAMEWORK
The UEQ+ is not a UX questionnaire in the sense that it can directly be applied to measure the UX of a specific product. It is a modular catalogue of 16 UX scales that can be combined to form a UX questionnaire. The name UEQ+ was chosen because the six scales of the UEQ (Attractiveness, Efficiency, Perspicuity, Dependability, Stimulation, Novelty) were used as a starting point. Besides these six scales, several others were designed and added: Trust [10], Haptics and Acoustics [11], and Aesthetics, Adaptability, Usefulness, Intuitive Use, Value, Trustworthiness of Content, and Content Quality [3, 4]. The UEQ+ and an Excel tool for evaluation are available free of charge at https://ueqplus.ueq-research.org/.
Since UEQ+ scales can be combined arbitrarily, a special modular scale format is used. Each scale consists of a short introductory sentence (this sentence sets the context for the items), followed by four items in the form of a semantic differential with a seven-point Likert scale. In addition, the importance of the UX aspect represented by the scale is asked directly below the scale. As an example, we show the scale Efficiency [4]:

To achieve my goals, I consider the product as
  slow          o o o o o o o   fast
  inefficient   o o o o o o o   efficient
  impractical   o o o o o o o   practical
  cluttered     o o o o o o o   organized
The product property described by these terms is for me
  completely irrelevant   o o o o o o o   highly relevant

The scales of the UEQ+ that are selected for a specific situation are simply displayed one below the other. In a product-related questionnaire, only a limited number of scales (for example, six scales) should be used in order to keep the effort required to complete the questionnaire manageable. Further information on the selection of scales can be found in the UEQ+ handbook [12].

The UEQ+ scales can be grouped into three different types:

Scale type 1 describes the user's interaction with the product. For example, the user holds the product in his or her hand (Haptics), or the product emits sounds during use (Acoustics). Aesthetics plays a role when interacting with graphical user interfaces (GUIs), for example, whether the user interface appears nice and appealing.

Scale type 2 summarizes fundamental and psychological qualities. These scales capture general user needs that arise while the product is being used. Examples are Perspicuity, whether the product is easy to understand and learn, or Efficiency, whether goals can be achieved with minimal effort.

Scale type 3 addresses specific needs of utilization and the resulting consequences. For example, Attractiveness, whether the user evaluates the product positively or rejects it summarily. Additional subjective impressions are Trust, whether the user considers his or her input data to be in safe hands, or Value, whether the product appears professional and of high quality.

If there is interaction with a specific product type, a scale of type 1 is selected, whereas scales of types 2 and 3 can be used independently of the interaction, that is, they can be combined with type 1 scales as desired. For Voice User Interfaces (VUIs), scale type 1 currently lacks the appropriate scales, and this article aims to close this gap.
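To make the scale format more tangible, the following is a minimal sketch in Python of how answers to one UEQ+ scale are typically scored. It assumes the usual UEQ convention of recoding the seven answer options from 1..7 to -3..+3 and averaging the four items of a scale; for the authoritative computation, the official UEQ+ Excel tool mentioned above should be used.

# Minimal sketch (assumption: UEQ-style scoring, i.e., answer options coded 1..7
# are recoded to -3..+3 and the four items of a scale are averaged).
# This is not the official UEQ+ Excel tool.

def scale_score(item_answers):
    """Mean of one participant's answers to the four items of a scale, recoded to -3..+3."""
    return sum(answer - 4 for answer in item_answers) / len(item_answers)

# Hypothetical answers of one participant to the four Efficiency items shown above:
print(scale_score([6, 7, 5, 6]))  # -> 2.0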
3 ASPECTS OF VOICE INTERACTION
Depending on the use case, different additional UX criteria may be relevant for voice interaction. According to an American study [13], typical use cases for smart speakers are found primarily in the areas of information (e.g., weather reports) and entertainment (e.g., listening to music streaming services). What Siri and Alexa are for consumers, SAP CoPilot will be for business users [14]. Key features of SAP CoPilot include natural language communication through a dialogue-oriented user interface as well as support for business contexts and machine learning based on predefined business rules [14]. Hassenzahl and Tractinsky [15] identify three components that significantly influence UX in human-computer interaction (HCI): first, the user with his or her expectations and motivation; second, the system with characteristics such as functionality and purpose; and third, the context with the organizational/social environment and the purpose of the activity. Cohen, Giangola, and Balogh [16] describe three elements (Goals/Context, User, and Application) that have to be understood in order to define criteria for measuring quality and improvements in the design process of a VUI application. In VA applications, the user expects a natural and trustful interaction, expects that the system fulfils his or her intention, and expects that this intent is recognized in context without requiring particular formulations. From these considerations, three UX aspects can be derived that are significant for speech interaction [17].
Response behaviour: Users expect a voice system to communicate like a human conversationalist. Thus, responses should be respectful, patient, polite, and trustworthy.

Response quality: The responses of the voice system cover the user's information needs. Thus, answers are perceived as clear, distinct, and up-to-date; the queries match the context; and the user's intention is fulfilled.

Comprehensibility: The user has the impression that the VA correctly understands his or her instructions and questions expressed in natural language. The intention of the user is recognized without forcing him or her to use an unnatural way of speaking.
4 STUDY FOR SCALE CONSTRUCTION
The empirical study described below covers the construction of the scales that represent the three UX aspects described above. For this purpose, an online survey was conducted with a German-language questionnaire containing, for each scale, a selection of bipolar items, in the following referred to as candidate items.
4.1 Participants
The online questionnaire was sent to several email distribution
lists of students and members of the University of Applied
Sciences in Emden/Leer (Germany). A total of 96 persons
participated voluntarily. The average age of the participants (59
male, 35 female, 2 no answer) was 35 years (SD 12).
4.2 Material
We created a selection of candidate items for each of the UX
aspects described above for VAs. The concrete items can be found
in the tables of the next section. The online questionnaire contains
all candidate items listed one below the other after an
introductory sentence that sets the appropriate context.
4.3 Procedure
The online survey took place between 7 and 24 January 2020. The participants were advised in the introductory email not to continue the survey if they had no experience with voice interaction. At the beginning, while providing socio-demographic data, the participants could choose which VA to rate. Then the scales followed in the order Response behaviour, Response quality, and Comprehensibility, with the corresponding candidate items in the order shown in Tables 1 to 3. The study was conducted in German. A detailed data analysis and screenshots of the online study pages are available in the research protocol [18].
4.4 Results
The following VAs were rated by the participants (number of participants who chose this system in parentheses): Alexa (35), Siri (27), Google Assistant (26), Others (8).
¹ Response behaviour (German original items): 1. 'technisch/menschlich', 2. 'künstlich/natürlich', 3. 'fremd/vertraut', 4. 'ungewöhnlich/gewöhnlich', 5. 'langsam/schnell', 6. 'unangenehm/angenehm', 7. 'unsympathisch/sympathisch', 8. 'unfreundlich/freundlich', 9. 'langweilig/unterhaltsam'.

The factor analysis of each of the three candidate item sets shows (Kaiser-Guttman criterion and analysis of the scree plot) that a single factor represents the data sufficiently well (see [18]). In addition, a joint analysis of all 30 items with a principal component analysis (Varimax rotation) confirmed the assumption of three factors. In the following, we show the items used and their loadings on the corresponding factor, which form the basis for selecting the four items that represent each scale.
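As an illustration of the reported analysis steps, the following is a minimal sketch in Python (NumPy only) covering the Kaiser-Guttman criterion, principal component loadings, and a Varimax rotation. The data matrix below is a hypothetical placeholder, not the study data; the exact procedure of the original analysis is documented in [18] and may differ in detail.

# Minimal sketch of the reported analysis steps; `answers` is placeholder data,
# not the ratings collected in the study.
import numpy as np

rng = np.random.default_rng(42)
answers = rng.integers(1, 8, size=(96, 9)).astype(float)  # placeholder: 96 participants x 9 items

# Kaiser-Guttman criterion: retain as many factors as there are eigenvalues > 1
# of the item correlation matrix.
corr = np.corrcoef(answers, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_factors = int(np.sum(eigenvalues > 1.0))

# Principal component loadings: eigenvectors scaled by the square roots of eigenvalues.
vals, vecs = np.linalg.eigh(corr)
order = np.argsort(vals)[::-1][:n_factors]
loadings = vecs[:, order] * np.sqrt(vals[order])

def varimax(L, gamma=1.0, iters=100, tol=1e-6):
    """Varimax rotation of a loading matrix (standard SVD-based algorithm)."""
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(iters):
        Lam = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (Lam ** 3 - (gamma / p) * Lam @ np.diag(np.sum(Lam ** 2, axis=0))))
        R = u @ vt
        d_new = np.sum(s)
        if d > 0 and d_new / d < 1 + tol:
            break
        d = d_new
    return L @ R

rotated = varimax(loadings) if n_factors > 1 else loadings
print("eigenvalues:", np.round(eigenvalues, 2))
print("factors suggested by Kaiser-Guttman:", n_factors)
print("rotated loadings:\n", np.round(rotated, 2))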
Table 1: Set of candidate items for response behaviour¹

No.  Items                     Loadings
1    technical / human         0.66
2    artificial / natural      0.80
3    unfamiliar / familiar     0.66
4    unusual / usual           0.25
5    slow / fast               0.48
6    unpleasant / pleasant     0.75
7    unlikeable / likable      0.81
8    unfriendly / friendly     0.66
9    boring / entertaining     0.68
Table 1 shows the items and loadings for the factor corresponding to the scale Response behaviour. Items 2, 6, 7, and 9 were selected, with the following introductory sentence:

In my opinion the response behaviour of the voice assistant is
  artificial    o o o o o o o   natural
  unpleasant    o o o o o o o   pleasant
  unlikeable    o o o o o o o   likable
  boring        o o o o o o o   entertaining
Table 2: Set of candidate items for response quality²

No.  Items                               Loadings
1    incomprehensible / understandable   0.14
2    illogical / logical                 0.55
3    inappropriate / suitable            0.74
4    useless / useful                    0.76
5    not helpful / helpful               0.82
6    laborious / simple                  0.42
7    uninteresting / interesting         0.41
8    unintelligent / intelligent         0.61
9    unclear / clear                     0.47
10   indistinct / distinct               0.53
11   outdated / current                  0.49

² Response quality (German original items): 1. 'unverständlich/verständlich', 2. 'unlogisch/logisch', 3. 'unpassend/passend', 4. 'nutzlos/nützlich', 5. 'nicht hilfreich/hilfreich', 6. 'umständlich/einfach', 7. 'uninteressant/interessant', 8. 'unintelligent/intelligent', 9. 'unklar/klar', 10. 'undeutlich/deutlich', 11. 'veraltet/aktuell'.
Table 2 shows all items and loadings for the scale Response quality. The highest loadings are found in items 3, 4, 5, and 8. The introductory sentence is as follows:

The answers and questions asked by the voice assistant are
  inappropriate   o o o o o o o   suitable
  useless         o o o o o o o   useful
  not helpful     o o o o o o o   helpful
  unintelligent   o o o o o o o   intelligent
Table 3: Set of candidate items for comprehensibility³

No.  Items                               Loadings
1    complicated / simple                0.84
2    inaccurate / accurate               0.78
3    nonsensical / apt                   0.72
4    ambiguous / unambiguous             0.82
5    illogical / logical                 0.75
6    incomprehensible / understandable   0.78
7    unexpected / expected               0.68
8    unclear / clear                     0.79
9    enigmatic / explainable             0.75
10   difficult / easy                    0.72

³ Comprehensibility (German original items): 1. 'kompliziert/einfach', 2. 'ungenau/genau', 3. 'unsinnig/sinnig', 4. 'nicht eindeutig/eindeutig', 5. 'unlogisch/logisch', 6. 'unverständlich/verständlich', 7. 'unerwartet/erwartet', 8. 'unklar/klar', 9. 'rätselhaft/erklärbar', 10. 'schwierig/leicht'.
Table 3 shows the corresponding results for the scale Comprehensibility. The three highest loadings are found in items 1, 4, and 8. Item 8 was not selected because it overlaps with item 9 of Response quality (see Table 2). Items 2 and 6, with the identical loading of 0.78, move up. Since item 6 overlaps with items of the existing UEQ+ scales Perspicuity and Quality of Content, it is replaced by item 9, which has a slightly lower loading. The focus here is on speech comprehensibility in the exemplary sense of: "Does the VA give enigmatic answers because it does not understand me?". The UEQ+ scales Perspicuity and Quality of Content represent UX aspects for voice assistance systems that will prospectively often be used in combination with the new scales, as described in the following section. The final scale is as follows:
In my opinion the voice assistant has understood my voice commands
  complicated   o o o o o o o   simple
  ambiguous     o o o o o o o   unambiguous
  inaccurate    o o o o o o o   accurate
  enigmatic     o o o o o o o   explainable
5 USING SCALES IN THE UEQ+ FRAMEWORK
The selection of relevant scales for creating a product-related questionnaire depends on various sources of information. Winter, Hinderks, Schrepp, and Thomaschewski [1] recommend that product-specific UX aspects be considered first, followed by other criteria. These can also be UX aspects that are essential for marketing and product placement but not vital for the user. Further information on the scale selection and the creation of a product-related questionnaire can be found in the UEQ+ handbook [12].
We conclude with two examples showing how the new scales for voice interaction can be combined with other UEQ+ scales in concrete evaluation projects. Assume we want to evaluate, for example, a smart home VA used for general tasks, such as asking questions or online shopping. In this case, the three voice scales can be combined with the scales that are highly relevant for information search on the web [1], that is, Perspicuity, Trust, and Quality of Content. If we want to evaluate the UX of a VA-based customer service, other criteria are relevant [1], since the main focus of the user here is to get his or her request or task done. In this case, the classical scales Efficiency, Dependability, and Perspicuity may be good candidates besides the new voice scales. Of course, the specific use case and goal of the evaluation determine which scales should be combined.
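To make the combination of scales concrete, the following is a minimal sketch of how the means of the selected scales and their importance ratings could be aggregated into a single figure. It assumes the importance-weighted mean described in the UEQ+ handbook [12]; all numbers are hypothetical, and the official UEQ+ Excel tool should be used for real evaluations.

# Minimal sketch (assumption: overall UEQ+ result = importance-weighted mean of the
# scale means, as described in the handbook [12]). Scale means are on the -3..+3
# range, importance ratings on 1..7; all values below are hypothetical.

def weighted_ux_score(scale_means, mean_importance):
    total = sum(mean_importance)
    return sum(m * w / total for m, w in zip(scale_means, mean_importance))

# Hypothetical smart home VA evaluation: the three voice scales combined with
# Perspicuity, Trust, and Quality of Content.
scales = ["Response behaviour", "Response quality", "Comprehensibility",
          "Perspicuity", "Trust", "Quality of Content"]
means = [1.2, 1.8, 0.9, 2.1, 0.5, 1.6]        # hypothetical scale means
importance = [5.0, 7.0, 6.0, 6.0, 7.0, 6.0]   # hypothetical mean importance ratings
print(round(weighted_ux_score(means, importance), 2))  # importance-weighted UX score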
6 SUMMARY
This article describes the construction of voice interaction scales for the UEQ+ framework. The modular concept of the UEQ+ is based on various scales, which allows the measurement of product-specific UX aspects [9]. The extension of UEQ+ scale type 1 by voice interaction scales closes a gap in the UEQ+ and demonstrates a new method for the flexible evaluation of voice assistance systems. In this empirical study, three new scales were developed, and it was shown how relevant UX aspects of voice interaction can be represented. The data were evaluated using factor analysis, yielding the factors Response behaviour, Response quality, and Comprehensibility with four items each. Two compact examples of possible questionnaires finally demonstrate how existing UEQ+ scales can be combined with the new voice interaction scales. The validation of the new voice interaction scales is planned for a further study that will include the creation and application of questionnaires for voice system evaluation in order to obtain benchmarks.
REFERENCES
[1] Winter, D., Hinderks, A., Schrepp, M. & Thomaschewski, J. (2017). Welche UX Faktoren sind für mein Produkt wichtig? In: S. Hess & H. Fischer (Eds.), Mensch und Computer 2017 - Usability Professionals. Regensburg: Gesellschaft für Informatik e.V. (pp. 191–200).
[2] Schrepp, M. (2018). User Experience mit Fragebögen messen [Measure user experience with questionnaires]. Amazon Kindle Direct Publishing, ISBN: 9781986843768.
[3] Schrepp, M., & Thomaschewski, J. (2019). Eine modulare Erweiterung des User Experience Questionnaire. Konferenz: Mensch und Computer 2019. DOI: 10.18420/muc2019-up-0108.
[4] Schrepp, M., & Thomaschewski, J. (2019). Design and validation of a framework for the creation of user experience questionnaires. International Journal of Interactive Multimedia and Artificial Intelligence. DOI: 10.9781/ijimai.2019.06.006.
[5] https://www.tractica.com/newsroom/press-releases/voice-and-speech-recognition-software-market-to-reach-6-9-billion-by-2025/, accessed January 23, 2020.
[6] Hone, K. S., & Graham, R. (2000). Towards a tool for the subjective assessment of speech system interfaces (SASSI). Natural Language Engineering, 6(3-4), 287–303.
[7] Polkosky, M. D. (2008). Machines as mediators: The challenge of technology for interpersonal communication theory and research. In E. Konjin (Ed.), Mediated Interpersonal Communication (pp. 34–57). New York, NY: Routledge.
[8] Polkosky, M. D., & Lewis, J. R. (2003). Expanding the MOS: Development and psychometric evaluation of the MOS-R and MOS-X. International Journal of Speech Technology, 6(2), 161–182.
[9] Bos, J., Larsson, S., Lewin, I., Matheson, C., & Milward, D. (1999). Survey of existing interactive systems. Trindi (Task Oriented Instructional Dialogue) report, (D1), 3.
[10] Hinderks, A. (2016). Modifikation des User Experience Questionnaire (UEQ) zur Verbesserung der Reliabilität und Validität. Unveröffentlichte Masterarbeit, University of Applied Sciences Emden/Leer.
[11] Boos, B. & Brau, H. (2017). Erweiterung des UEQ um die Dimensionen Akustik und Haptik. In: Hess, S. & Fischer, H. (Hrsg.), Mensch und Computer 2017 - Usability Professionals. Regensburg: Gesellschaft für Informatik e.V., S. 321–327.
[12] Schrepp, M. & Thomaschewski, J. (2019). Handbook for the modular extension of the User Experience Questionnaire. All you need to know to apply the UEQ+ to create your own UX questionnaire. DOI: 10.13140/RG.2.2.15485.20966.
[13] https://voicebot.ai/2019/03/12/smart-speaker-owners-agree-that-questions-music-and-weather-are-killer-apps-what-comes-next, accessed January 30, 2020.
[14] SAP CoPilot. https://blogs.sap.com/2016/10/06/the-human-touch-sap-introduces-a-digital-assistant-for-the-enterprise/, accessed February 5, 2020.
[15] Hassenzahl, M., & Tractinsky, N. (2006). User experience - a research agenda. Behaviour & Information Technology, 25(2), 91–97.
[16] Cohen, M. H., Giangola, J. P., & Balogh, J. (2004). Voice User Interface Design. Boston: Addison-Wesley Professional.
[17] Klein, A. M., Hinderks, A., Schrepp, M., & Thomaschewski, J. (2020). Measuring User Experience Quality of Voice Assistants. 2020 15th Iberian Conference on Information Systems and Technologies (CISTI), Sevilla, Spain, pp. 1–4. DOI: 10.23919/CISTI49556.2020.9140966.
[18] Klein, A. M., Hinderks, A., Schrepp, M., & Thomaschewski, J. (2020). Protocol for: Measuring User Experience Quality of Voice Assistants. DOI: 10.13140/RG.2.2.12816.35848.