
Construction of Voice Communication Scales for the UEQ+ Framework
Protocol for Measuring User Experience Quality of Voice Assistants
Version 1.0 / 2020

Abstract

Klein, Hinderks, Schrepp and Thomaschewski (2020) describe the construction of three scales to measure UX aspects specific to voice systems and how these scales can be used in the UEQ+ framework (Schrepp & Thomaschewski, 2019) to measure the UX of such systems. This research report gives detailed information about the data analysis done for the VUI scale construction and the first validations of the extension scales.
Andreas M. Klein, University of Applied Sciences Emden/Leer, Emden, Germany
Andreas Hinderks, University of Seville, Seville, Spain
Martin Schrepp, SAP SE, Walldorf, Germany
Jörg Thomaschewski, University of Applied Sciences Emden/Leer, Emden, Germany
17.04.2020
Contents
1 Research Goal
2 Scale Construction
2.1 Construction of candidate items
2.2 Setup of the study
2.3 Results
2.3.1 Un-rotated
2.3.2 Promax rotation
2.3.3 Varimax rotation
2.3.4 Response behaviour (varimax rotation)
2.3.5 Response quality (varimax rotation)
2.3.6 Comprehensibility (varimax rotation)
3 Conclusion
4 Future work
Bibliography
Appendix – Screenshots of the used questionnaires
1 Research Goal
Klein, Hinderks, Schrepp and Thomaschewski (2020) describe the construction of
three scales to measure UX aspects specific to voice systems and how these can be
used in the UEQ+ framework (Schrepp & Thomaschewski, 2019) to measure UX of
such systems.
The UEQ+ (Schrepp & Thomaschewski, 2019) is a modular framework that contains
several scales to measure different UX aspects. These scales can be combined to
create a product-related questionnaire that covers the relevant aspects of a given
research question.
The UEQ+ scales available so far cannot capture the qualities of voice user
interfaces (VUI), because they focus on graphical user interfaces (GUI).
This research report gives detailed information about data analysis done for VUI
scale construction and the first validations of the extension scales. The description of
the research context and the application of the extension scales are not part of this
document. The publication of Klein, Hinderks, Schrepp and Thomaschewski (2020)
provides detailed information about that topic.
2 Scale Construction
The new scales "Response behaviour", "Response quality" and "Comprehensibility"
are designed to capture the interaction, or more precisely, the UX aspects of the
user's communication with voice assistants (VA). These scales are derived from
a list of components that significantly influence UX aspects, as described by Klein,
Hinderks, Schrepp and Thomaschewski (2020).
2.1 Construction of candidate items
For each selected UX aspect, three experts constructed several potential German
items in the UEQ+ format (semantic differential with a 7-point scale). In the
following, we call these items the candidate items. The final selection of
candidate items per factor name is listed in Table 1:
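| No | Candidate items (German original version) | Candidate items (English translation) | Factor name (German original / English translation) |
|---|---|---|---|
| 1 | technisch - menschlich | human | Antwortverhalten / Response behaviour |
| 2 | künstlich - natürlich | natural | Antwortverhalten / Response behaviour |
| 3 | fremd - vertraut | familiar | Antwortverhalten / Response behaviour |
| 4 | ungewöhnlich - gewöhnlich | usual | Antwortverhalten / Response behaviour |
| 5 | langsam - schnell | fast | Antwortverhalten / Response behaviour |
| 6 | unangenehm - angenehm | pleasant | Antwortverhalten / Response behaviour |
| 7 | unsympathisch - sympathisch | likable | Antwortverhalten / Response behaviour |
| 8 | unfreundlich - freundlich | friendly | Antwortverhalten / Response behaviour |
| 9 | langweilig - unterhaltsam | entertaining | Antwortverhalten / Response behaviour |
| 10 | unverständlich - verständlich | understandable | Antwortqualität / Response quality |
| 11 | unlogisch - logisch | logical | Antwortqualität / Response quality |
| 12 | unpassend - passend | suitable | Antwortqualität / Response quality |
| 13 | nutzlos - nützlich | useful | Antwortqualität / Response quality |
| 14 | nicht hilfreich - hilfreich | helpful | Antwortqualität / Response quality |
| 15 | umständlich - einfach | simple | Antwortqualität / Response quality |
| 16 | uninteressant - interessant | interesting | Antwortqualität / Response quality |
| 17 | unintelligent - intelligent | intelligent | Antwortqualität / Response quality |
| 18 | unklar - klar | clear | Antwortqualität / Response quality |
| 19 | undeutlich - deutlich | exacting | Antwortqualität / Response quality |
| 20 | veraltet - aktuell | current | Antwortqualität / Response quality |
| 21 | kompliziert - einfach | simple | Verständlichkeit / Comprehensibility |
| 22 | ungenau - genau | accurate | Verständlichkeit / Comprehensibility |
| 23 | unsinnig - sinnig | apt | Verständlichkeit / Comprehensibility |
| 24 | nicht eindeutig - eindeutig | ambiguous | Verständlichkeit / Comprehensibility |
| 25 | unlogisch - logisch | logical | Verständlichkeit / Comprehensibility |
| 26 | unverständlich - verständlich | understandable | Verständlichkeit / Comprehensibility |
| 27 | unerwartet - erwartet | expected | Verständlichkeit / Comprehensibility |
| 28 | unklar - klar | clear | Verständlichkeit / Comprehensibility |
| 29 | rätselhaft - erklärbar | explainable | Verständlichkeit / Comprehensibility |
| 30 | schwierig - leicht | easy | Verständlichkeit / Comprehensibility |

Table 1: Candidate items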
2.2 Setup of the study
The online questionnaire was sent to several e-mail distribution lists of students and
members of the University of Applied Sciences Emden/Leer (Germany). A total of
96 persons participated voluntarily. Each participant could choose a VA that he or she
uses and rate it with the corresponding lists of candidate items. The introductory
mail instructed recipients not to start the questionnaire if they had no experience
with a speech system. The average age of the participants (59 male, 35 female,
2 no answer) was 35 years, with ages ranging from 16 to 78 years.
The survey took place between 7 and 24 January 2020.
2.3 Results
The resulting data were analysed by factor analysis using the function principal
(principal component analysis) of the R package psych (Revelle, 2018).
Figure 1: Scree plot - Results of Principal Component Analysis with all items.
The scree plot suggests a solution with 3 or 4 factors. For our scale construction
we chose the 3-factor solution because it allows a semantically clearer
interpretation.
| Factor | Eigenvalue | Variance | Cumulative |
|---|---|---|---|
| 1 | 10.13 | 23.51% | 23.51% |
| 2 | 3.34 | 9.81% | 33.32% |
| 3 | 2.35 | 9.37% | 42.69% |
| 4 | 1.55 | 9.35% | 52.04% |
| 5 | 1.51 | 6.66% | 58.71% |
| 6 | 1.31 | 6.63% | 65.34% |
| 7 | 1.11 | 5.67% | 71.01% |

Table 2: Eigenvalues and variance explained by the extracted factors.
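As a reading aid for Table 2, the following minimal Python sketch recomputes the cumulative column from the variance column and lists the step sizes between neighbouring eigenvalues, which is the information a scree plot visualises. The numbers are copied from Table 2. The function names are illustrative and not part of the original analysis, which was done in R.

```python
# Eigenvalues and explained variance (in %) copied from Table 2.
eigenvalues = [10.13, 3.34, 2.35, 1.55, 1.51, 1.31, 1.11]
variance_pct = [23.51, 9.81, 9.37, 9.35, 6.66, 6.63, 5.67]


def cumulative(values):
    """Return the running total of a list of numbers, rounded to 2 decimals."""
    totals, running = [], 0.0
    for value in values:
        running += value
        totals.append(round(running, 2))
    return totals


def successive_drops(values):
    """Return the difference between each value and its successor."""
    return [round(a - b, 2) for a, b in zip(values, values[1:])]


cumulative_pct = cumulative(variance_pct)
drops = successive_drops(eigenvalues)

for k, (ev, cum) in enumerate(zip(eigenvalues, cumulative_pct), start=1):
    print(f"factor {k}: eigenvalue {ev:5.2f}, cumulative variance {cum:5.2f}%")
print("drops between neighbouring eigenvalues:", drops)
```

Because the percentages in Table 2 are themselves rounded, the recomputed cumulative values can differ from the printed ones by 0.01 for factors 5 to 7.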
[Figure 1 (scree plot): x-axis: Factor Number (1 to 7); y-axis: Eigenvalue (0.00 to 12.00).]
In the following description of the extracted three-factor solution and the loadings
of the items on these factors, we use the original German items and factor names. The
English translations of the items can be found in Tables 1, 6, 7 and 8.
According to the guidelines of Comrey and Lee (2013), loadings with a value > 0.45
are relevant. These are highlighted in colour in the following Tables 3, 4 and 5.
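The 0.45 cut-off can be applied mechanically. The sketch below does so for a handful of rows taken from the varimax table (Table 5) and flags items that exceed the cut-off on more than one factor. It is only an illustration: the names `THRESHOLD`, `relevant_factors` and `cross_loading_items` are hypothetical, and the comparison uses the raw value as stated in the text. A general implementation would probably compare absolute values so that strong negative loadings are caught as well.

```python
THRESHOLD = 0.45  # cut-off attributed to Comrey and Lee (2013) in the text

# A few rows of Table 5 (varimax rotation):
# item number -> loadings on factor 1, factor 2, factor 3.
loadings = {
    4: (0.30, 0.25, 0.05),
    7: (0.09, 0.81, -0.01),
    14: (0.17, 0.09, 0.82),
    18: (0.61, 0.02, 0.47),
    21: (0.84, 0.19, 0.09),
}


def relevant_factors(row, threshold=THRESHOLD):
    """Return the 1-based factor numbers whose loading exceeds the threshold."""
    return [i for i, value in enumerate(row, start=1) if value > threshold]


relevant = {item: relevant_factors(row) for item, row in loadings.items()}

# Items that are relevant on more than one factor (cross-loadings).
cross_loading_items = [item for item, factors in relevant.items() if len(factors) > 1]

for item, factors in relevant.items():
    print(f"item {item}: relevant on factors {factors}")
print("cross-loading items:", cross_loading_items)
```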
2.3.1 Un-rotated
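| Pos | Factor name | Items | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|---|---|
| 1 | Antwortverhalten | technisch - menschlich | 0.46 | 0.51 | -0.02 |
| 2 | Antwortverhalten | künstlich - natürlich | 0.35 | 0.73 | -0.01 |
| 3 | Antwortverhalten | fremd - vertraut | 0.56 | 0.46 | -0.07 |
| 4 | Antwortverhalten | ungewöhnlich - gewöhnlich | 0.37 | 0.09 | -0.12 |
| 5 | Antwortverhalten | langsam - schnell | 0.42 | 0.33 | -0.07 |
| 6 | Antwortverhalten | unangenehm - angenehm | 0.52 | 0.58 | -0.07 |
| 7 | Antwortverhalten | unsympathisch - sympathisch | 0.42 | 0.69 | -0.14 |
| 8 | Antwortverhalten | unfreundlich - freundlich | 0.46 | 0.50 | -0.08 |
| 9 | Antwortverhalten | langweilig - unterhaltsam | 0.34 | 0.60 | 0.01 |
| 10 | Antwortqualität | unverständlich - verständlich | 0.35 | -0.20 | -0.05 |
| 11 | Antwortqualität | unlogisch - logisch | 0.50 | -0.22 | 0.34 |
| 12 | Antwortqualität | unpassend - passend | 0.51 | -0.21 | 0.55 |
| 13 | Antwortqualität | nutzlos - nützlich | 0.40 | 0.04 | 0.66 |
| 14 | Antwortqualität | nicht hilfreich - hilfreich | 0.53 | -0.08 | 0.64 |
| 15 | Antwortqualität | umständlich - einfach | 0.45 | -0.30 | 0.21 |
| 16 | Antwortqualität | uninteressant - interessant | 0.48 | 0.11 | 0.23 |
| 17 | Antwortqualität | unintelligent - intelligent | 0.43 | 0.22 | 0.49 |
| 18 | Antwortqualität | unklar - klar | 0.70 | -0.30 | 0.14 |
| 19 | Antwortqualität | undeutlich - deutlich | 0.65 | -0.05 | 0.26 |
| 20 | Antwortqualität | veraltet - aktuell | 0.51 | 0.09 | 0.30 |
| 21 | Verständlichkeit | kompliziert - einfach | 0.78 | -0.21 | -0.31 |
| 22 | Verständlichkeit | ungenau - genau | 0.75 | -0.21 | -0.22 |
| 23 | Verständlichkeit | unsinnig - sinnig | 0.73 | -0.12 | -0.21 |
| 24 | Verständlichkeit | nicht eindeutig - eindeutig | 0.76 | -0.23 | -0.27 |
| 25 | Verständlichkeit | unlogisch - logisch | 0.68 | -0.18 | -0.32 |
| 26 | Verständlichkeit | unverständlich - verständlich | 0.79 | -0.13 | -0.24 |
| 27 | Verständlichkeit | unerwartet - erwartet | 0.65 | -0.28 | -0.12 |
| 28 | Verständlichkeit | unklar - klar | 0.79 | -0.24 | -0.15 |
| 29 | Verständlichkeit | rätselhaft - erklärbar | 0.77 | -0.17 | -0.15 |
| 30 | Verständlichkeit | schwierig - leicht | 0.71 | -0.20 | -0.19 |

Table 3: Loadings of candidate items (un-rotated).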
2.3.2 Promax rotation
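| Pos | Factor name | Item | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|---|---|
| 1 | Antwortverhalten | technisch - menschlich | 0.04 | 0.06 | 0.66 |
| 2 | Antwortverhalten | künstlich - natürlich | -0.17 | 0.02 | 0.84 |
| 3 | Antwortverhalten | fremd - vertraut | 0.17 | 0.04 | 0.64 |
| 4 | Antwortverhalten | ungewöhnlich - gewöhnlich | 0.30 | -0.04 | 0.21 |
| 5 | Antwortverhalten | langsam - schnell | 0.15 | 0.01 | 0.46 |
| 6 | Antwortverhalten | unangenehm - angenehm | 0.08 | 0.02 | 0.74 |
| 7 | Antwortverhalten | unsympathisch - sympathisch | 0.00 | -0.10 | 0.84 |
| 8 | Antwortverhalten | unfreundlich - freundlich | 0.09 | 0.00 | 0.65 |
| 9 | Antwortverhalten | langweilig - unterhaltsam | -0.12 | 0.05 | 0.70 |
| 10 | Antwortqualität | unverständlich - verständlich | 0.40 | 0.06 | -0.10 |
| 11 | Antwortqualität | unlogisch - logisch | 0.23 | 0.53 | -0.13 |
| 12 | Antwortqualität | unpassend - passend | 0.07 | 0.77 | -0.15 |
| 13 | Antwortqualität | nutzlos - nützlich | -0.23 | 0.84 | 0.07 |
| 14 | Antwortqualität | nicht hilfreich - hilfreich | -0.05 | 0.87 | -0.02 |
| 15 | Antwortqualität | umständlich - einfach | 0.33 | 0.39 | -0.21 |
| 16 | Antwortqualität | uninteressant - interessant | 0.10 | 0.38 | 0.22 |
| 17 | Antwortqualität | unintelligent - intelligent | -0.20 | 0.65 | 0.28 |
| 18 | Antwortqualität | unklar - klar | 0.56 | 0.38 | -0.13 |
| 19 | Antwortqualität | undeutlich - deutlich | 0.28 | 0.48 | 0.09 |
| 20 | Antwortqualität | veraltet - aktuell | 0.08 | 0.47 | 0.19 |
| 21 | Verständlichkeit | kompliziert - einfach | 0.91 | -0.12 | 0.04 |
| 22 | Verständlichkeit | ungenau - genau | 0.81 | -0.02 | 0.02 |
| 23 | Verständlichkeit | unsinnig - sinnig | 0.74 | -0.02 | 0.10 |
| 24 | Verständlichkeit | nicht eindeutig - eindeutig | 0.87 | -0.07 | 0.00 |
| 25 | Verständlichkeit | unlogisch - logisch | 0.82 | -0.15 | 0.04 |
| 26 | Verständlichkeit | unverständlich - verständlich | 0.80 | -0.03 | 0.12 |
| 27 | Verständlichkeit | unerwartet - erwartet | 0.71 | 0.06 | -0.09 |
| 28 | Verständlichkeit | unklar - klar | 0.81 | 0.07 | -0.01 |
| 29 | Verständlichkeit | rätselhaft - erklärbar | 0.76 | 0.05 | 0.06 |
| 30 | Verständlichkeit | schwierig - leicht | 0.75 | 0.00 | 0.02 |

Table 4: Loadings of candidate items (promax rotation).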
2.3.3 Varimax rotation
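| Pos | Factor name | Item | Factor 1 | Factor 2 | Factor 3 |
|---|---|---|---|---|---|
| 1 | Antwortverhalten | technisch - menschlich | 0.15 | 0.66 | 0.13 |
| 2 | Antwortverhalten | künstlich - natürlich | -0.04 | 0.80 | 0.07 |
| 3 | Antwortverhalten | fremd - vertraut | 0.27 | 0.66 | 0.14 |
| 4 | Antwortverhalten | ungewöhnlich - gewöhnlich | 0.30 | 0.25 | 0.05 |
| 5 | Antwortverhalten | langsam - schnell | 0.22 | 0.48 | 0.09 |
| 6 | Antwortverhalten | unangenehm - angenehm | 0.19 | 0.75 | 0.11 |
| 7 | Antwortverhalten | unsympathisch - sympathisch | 0.09 | 0.81 | -0.01 |
| 8 | Antwortverhalten | unfreundlich - freundlich | 0.18 | 0.66 | 0.09 |
| 9 | Antwortverhalten | langweilig - unterhaltsam | 0.00 | 0.68 | 0.10 |
| 10 | Antwortqualität | unverständlich - verständlich | 0.38 | -0.02 | 0.14 |
| 11 | Antwortqualität | unlogisch - logisch | 0.34 | -0.01 | 0.55 |
| 12 | Antwortqualität | unpassend - passend | 0.24 | -0.03 | 0.74 |
| 13 | Antwortqualität | nutzlos - nützlich | 0.00 | 0.14 | 0.76 |
| 14 | Antwortqualität | nicht hilfreich - hilfreich | 0.17 | 0.09 | 0.82 |
| 15 | Antwortqualität | umständlich - einfach | 0.39 | -0.10 | 0.42 |
| 16 | Antwortqualität | uninteressant - interessant | 0.22 | 0.28 | 0.41 |
| 17 | Antwortqualität | unintelligent - intelligent | 0.02 | 0.33 | 0.61 |
| 18 | Antwortqualität | unklar - klar | 0.61 | 0.02 | 0.47 |
| 19 | Antwortqualität | undeutlich - deutlich | 0.41 | 0.21 | 0.53 |
| 20 | Antwortqualität | veraltet - aktuell | 0.23 | 0.26 | 0.49 |
| 21 | Verständlichkeit | kompliziert - einfach | 0.84 | 0.19 | 0.09 |
| 22 | Verständlichkeit | ungenau - genau | 0.78 | 0.16 | 0.17 |
| 23 | Verständlichkeit | unsinnig - sinnig | 0.72 | 0.23 | 0.15 |
| 24 | Verständlichkeit | nicht eindeutig - eindeutig | 0.82 | 0.15 | 0.13 |
| 25 | Verständlichkeit | unlogisch - logisch | 0.75 | 0.17 | 0.04 |
| 26 | Verständlichkeit | unverständlich - verständlich | 0.78 | 0.26 | 0.16 |
| 27 | Verständlichkeit | unerwartet - erwartet | 0.68 | 0.05 | 0.21 |
| 28 | Verständlichkeit | unklar - klar | 0.79 | 0.14 | 0.25 |
| 29 | Verständlichkeit | rätselhaft - erklärbar | 0.75 | 0.20 | 0.23 |
| 30 | Verständlichkeit | schwierig - leicht | 0.72 | 0.15 | 0.17 |

Table 5: Loadings of candidate items (varimax rotation).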
The selection of the items that finally represent the factors in the
corresponding UEQ+ scales is based on the varimax rotation.
2.3.4 Response behaviour (varimax rotation)
The following candidate items and their loadings are taken from Table 5.
| No. | Items (German original version) | Items (English translation) | Loading |
|---|---|---|---|
| 1 | technisch - menschlich | technical - human | 0.66 |
| 2 | künstlich - natürlich | artificial - natural | 0.80 |
| 3 | fremd - vertraut | unfamiliar - familiar | 0.66 |
| 4 | ungewöhnlich - gewöhnlich | unusual - usual | 0.25 |
| 5 | langsam - schnell | slow - fast | 0.48 |
| 6 | unangenehm - angenehm | unpleasant - pleasant | 0.75 |
| 7 | unsympathisch - sympathisch | unlikeable - likable | 0.81 |
| 8 | unfreundlich - freundlich | unfriendly - friendly | 0.66 |
| 9 | langweilig - unterhaltsam | boring - entertaining | 0.68 |

Table 6: Set of items representing factor 2 (varimax rotation).
The four items with the highest loadings on this factor (items 7, 2, 6 and 9) were
selected to represent the scale Response behaviour.
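The selection rule described here, taking the four items with the highest loadings, can be written down in a few lines. The following Python sketch applies it to the loadings from Table 6. The helper `top_items` and its tie-breaking rule (lower item number first) are assumptions made for illustration and do not come from the original analysis.

```python
# Varimax loadings on factor 2, copied from Table 6 (item number -> loading).
response_behaviour = {
    1: 0.66, 2: 0.80, 3: 0.66, 4: 0.25, 5: 0.48,
    6: 0.75, 7: 0.81, 8: 0.66, 9: 0.68,
}


def top_items(loadings, k=4):
    """Return the k item numbers with the highest loadings, best first.

    Ties are broken by the lower item number (an assumption of this sketch).
    """
    ranked = sorted(loadings, key=lambda item: (-loadings[item], item))
    return ranked[:k]


selected = top_items(response_behaviour)
print("selected items for Response behaviour:", selected)
```

For this factor the tie-breaking rule does not matter, because the fourth-highest loading (0.68) is strictly larger than the next ones (0.66).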
2.3.5 Response quality (varimax rotation)
The following candidate items and their loadings are taken from Table 5.
| No. | Items (German original version) | Items (English translation) | Loading |
|---|---|---|---|
| 10 | unverständlich - verständlich | incomprehensible - understandable | 0.14 |
| 11 | unlogisch - logisch | illogical - logical | 0.55 |
| 12 | unpassend - passend | inappropriate - suitable | 0.74 |
| 13 | nutzlos - nützlich | useless - useful | 0.76 |
| 14 | nicht hilfreich - hilfreich | not helpful - helpful | 0.82 |
| 15 | umständlich - einfach | laborious - simple | 0.42 |
| 16 | uninteressant - interessant | uninteresting - interesting | 0.41 |
| 17 | unintelligent - intelligent | unintelligent - intelligent | 0.61 |
| 18 | unklar - klar | not clear - clear | 0.47 |
| 19 | undeutlich - deutlich | indistinct - exacting | 0.53 |
| 20 | veraltet - aktuell | outdated - current | 0.49 |

Table 7: Set of items representing factor 3 (varimax rotation).
The four items with the highest loadings on this factor (items 14, 13, 12 and 17) were
selected to represent the scale Response quality.
2.3.6 Comprehensibility (varimax rotation)
The following candidate items and their loadings are taken from Table 5.
| No. | Items (German original version) | Items (English translation) | Loading |
|---|---|---|---|
| 21 | kompliziert - einfach | simple | 0.84 |
| 22 | ungenau - genau | accurate | 0.78 |
| 23 | unsinnig - sinnig | apt | 0.72 |
| 24 | nicht eindeutig - eindeutig | ambiguous | 0.82 |
| 25 | unlogisch - logisch | logical | 0.75 |
| 26 | unverständlich - verständlich | understandable | 0.78 |
| 27 | unerwartet - erwartet | expected | 0.68 |
| 28 | unklar - klar | clear | 0.79 |
| 29 | rätselhaft - erklärbar | explainable | 0.75 |
| 30 | schwierig - leicht | easy | 0.72 |

Table 8: Set of items representing factor 1 (varimax rotation).
Item 28 showed a slightly higher loading than items 22 and 26. However, we decided
against this item because it is semantically very close to item 21. Thus, items 21, 22,
24 and 26 were chosen to represent the scale Comprehensibility. Note that, according
to Table 5, item 18 loads on two factors (0.61 on factor 1 and 0.47 on factor 3).
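For Comprehensibility the ranking by loading is combined with a manual, semantic exclusion. The sketch below shows how excluding item 28 shifts the selection. The loadings are those of Table 8. The function `select_items`, its `excluded` parameter and the tie-breaking by item number are hypothetical and serve only to illustrate the reasoning above.

```python
# Varimax loadings on factor 1, copied from Table 8 (item number -> loading).
comprehensibility = {
    21: 0.84, 22: 0.78, 23: 0.72, 24: 0.82, 25: 0.75,
    26: 0.78, 27: 0.68, 28: 0.79, 29: 0.75, 30: 0.72,
}


def select_items(loadings, k=4, excluded=()):
    """Rank items by loading (highest first) and return the first k.

    Items listed in `excluded` are skipped, which models a manual decision
    such as dropping a semantically redundant item. Ties are broken by the
    lower item number (an assumption of this sketch).
    """
    candidates = [item for item in loadings if item not in excluded]
    ranked = sorted(candidates, key=lambda item: (-loadings[item], item))
    return ranked[:k]


by_loading_only = select_items(comprehensibility)
after_exclusion = select_items(comprehensibility, excluded={28})

print("top four by loading alone:", by_loading_only)
print("top four without item 28: ", after_exclusion)
```

By loading alone, item 28 would take a place among the four items and item 26 would be left out. With item 28 excluded, the result is the set reported in the text: items 21, 22, 24 and 26.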
3 Conclusion
Until now, the UEQ+ framework (Schrepp & Thomaschewski, 2019) offered no scales
for devices and systems with a VUI. As a result of this work, three new scales for
voice communication can be presented, each consisting of four item pairs.
The selection of the relevant scales for a product-related questionnaire depends on
various sources of information. Winter, Hinderks, Schrepp and Thomaschewski
(2017) recommend considering the product-specific UX aspects first and then further
criteria, for example, for marketing.
4 Future work
After a detailed validation, the new scales for voice communication can be integrated
into the existing UEQ+ framework. Furthermore, a short version and the construction of
benchmarks are planned, following the approach presented by Schrepp, Hinderks and
Thomaschewski (2017) in the articles "Design and Evaluation of a Short Version of the
User Experience Questionnaire (UEQ-S)" and "Construction of a Benchmark for the User
Experience Questionnaire (UEQ)".
Bibliography
Klein, A. M., Hinderks, A., Schrepp, M., & Thomaschewski, J. (2020). Measuring
User Experience Quality of Voice Assistants. Proceedings of the 15th Iberian
Conference on Information Systems and Technologies: ...
Schrepp, M., & Thomaschewski, J. (2019). Design and validation of a framework for
the creation of user experience questionnaires. International Journal of Interactive
Multimedia and Artificial Intelligence. DOI:10.9781/ijimai.2019.06.006.
Revelle, W. (2018). psych: Procedures for personality and psychological research.
Northwestern University, Evanston, Illinois, USA.
https://CRAN.R-project.org/package=psych, Version 1.8.12.
Comrey, A. L., & Lee, H. B. (2013). A First Course in Factor Analysis.
2nd ed. Hoboken: Taylor and Francis. ISBN: 978-0805810622. URL:
http://gbv.eblib.com/patron/FullRecord.aspx?p=1562106
Winter, D., Hinderks, A., Schrepp, M., & Thomaschewski, J. (2017). Welche UX
Faktoren sind für mein Produkt wichtig? In: Hess, S., & Fischer, H. (Eds.), Mensch
und Computer 2017 - Usability Professionals. Regensburg: Gesellschaft für
Informatik e.V. (pp. 191-200).
Schrepp, M., Hinderks, A., & Thomaschewski, J. (2017). Design and Evaluation of a
Short Version of the User Experience Questionnaire (UEQ-S). International Journal
of Interactive Multimedia and Artificial Intelligence. DOI:10.9781/ijimai.2017.09.001.
Schrepp, M., Hinderks, A., & Thomaschewski, J. (2017). Construction of a
Benchmark for the User Experience Questionnaire (UEQ). International Journal of
Interactive Multimedia and Artificial Intelligence. DOI:10.9781/ijimai.2017.445.
Appendix – Screenshots of the used questionnaires
The following pages show the used HTML questionnaires as full page screenshots.
Pages are shown in the original German version used for the collection of the data.
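Items 1, 2 and 3 below show the answer options for the socio-demographic data
collected in the online questionnaire (German original with English translation in
parentheses).
1) Geschlecht (Gender): ○ männlich (male) ○ weiblich (female)
2) Welchen Sprachassistenten nutzen Sie am meisten? (keine Mehrfachnennung)
(Which voice assistant do you use most? Only one answer possible)
○ Alexa
○ Siri
○ Cortana
○ Google Assistent
○ Sonstige und zwar: ___________________________ (Other, namely: ___)
3) Wie oft nutzen Sie den Sprachassistenten? (How often do you use the voice assistant?)
○ mehrmals täglich (several times a day)
○ ungefähr täglich (about once a day)
○ mehrmals wöchentlich (several times a week)
○ ungefähr wöchentlich (about once a week)
○ mehrmals monatlich (several times a month)
○ ungefähr monatlich oder seltener (about once a month or less often)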