Construction of
Voice Communication Scales for
the UEQ+ Framework
Protocol for Measuring User Experience Quality of Voice Assistants
Version 1.0 / 2020
Andreas M. Klein
University of Applied Sciences Emden/Leer
Emden, Germany
Andreas Hinderks
University of Seville
Seville, Spain
Martin Schrepp
SAP SE
Walldorf, Germany
Jörg Thomaschewski
University of Applied Sciences Emden/Leer
Emden, Germany
17.04.2020
Contents
1 Research Goal
2 Scale Construction
2.1 Construction of candidate items
2.2 Setup of the study
2.3 Results
2.3.1 Un-rotated
2.3.2 Promax rotation
2.3.3 Varimax rotation
2.3.4 Response behaviour (varimax rotation)
2.3.5 Response quality (varimax rotation)
2.3.6 Comprehensibility (varimax rotation)
3 Conclusion
4 Future work
Bibliography
Appendix – Screenshots of the used questionnaires
1 Research Goal
Klein, Hinderks, Schrepp and Thomaschewski (2020) describe the construction of
three scales to measure UX aspects specific to voice systems and how these can be
used in the UEQ+ framework (Schrepp & Thomaschewski, 2019) to measure UX of
such systems.
The UEQ+ (Schrepp & Thomaschewski, 2019) is a modular framework that contains
several scales to measure different UX aspects. These scales can be combined to
create a product-related questionnaire that covers the relevant aspects of a given
research question.
The modular construction kit of the UEQ+ available so far cannot capture the
qualities of voice user interfaces (VUI), because it focuses on graphical user
interfaces (GUI).
This research report gives detailed information about the data analysis performed
for the VUI scale construction and the first validations of the extension scales. The
research context and the application of the extension scales are not part of this
document; they are described in detail by Klein, Hinderks, Schrepp and
Thomaschewski (2020).
2 Scale Construction
The new scales "Response behaviour", "Response quality" and "Comprehensibility"
are designed to capture the interaction, or more precisely, the UX aspects of the
user's communication with voice assistants (VA). These scales are derived from
a list of components that significantly influence UX aspects, as described by Klein,
Hinderks, Schrepp and Thomaschewski (2020).
2.1 Construction of candidate items
For each selected UX aspect, three experts constructed several potential German
items in the UEQ+ format (semantic differential with a 7-point Likert scale).
In the following, we call these items the candidate items. The final selection of
candidate items per factor name is shown in Table 1:
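No | Candidate item (German, original version) | Candidate item (English translation) | Factor name (German / English)
---|---|---|---
1 | technisch - menschlich | technical - human | Antwortverhalten / Response behaviour
2 | künstlich - natürlich | artificial - natural | Antwortverhalten / Response behaviour
3 | fremd - vertraut | unfamiliar - familiar | Antwortverhalten / Response behaviour
4 | ungewöhnlich - gewöhnlich | unusual - usual | Antwortverhalten / Response behaviour
5 | langsam - schnell | slow - fast | Antwortverhalten / Response behaviour
6 | unangenehm - angenehm | unpleasant - pleasant | Antwortverhalten / Response behaviour
7 | unsympathisch - sympathisch | unlikeable - likable | Antwortverhalten / Response behaviour
8 | unfreundlich - freundlich | unfriendly - friendly | Antwortverhalten / Response behaviour
9 | langweilig - unterhaltsam | boring - entertaining | Antwortverhalten / Response behaviour
10 | unverständlich - verständlich | incomprehensible - understandable | Antwortqualität / Response quality
11 | unlogisch - logisch | illogical - logical | Antwortqualität / Response quality
12 | unpassend - passend | inappropriate - suitable | Antwortqualität / Response quality
13 | nutzlos - nützlich | useless - useful | Antwortqualität / Response quality
14 | nicht hilfreich - hilfreich | not helpful - helpful | Antwortqualität / Response quality
15 | umständlich - einfach | laborious - simple | Antwortqualität / Response quality
16 | uninteressant - interessant | uninteresting - interesting | Antwortqualität / Response quality
17 | unintelligent - intelligent | unintelligent - intelligent | Antwortqualität / Response quality
18 | unklar - klar | not clear - clear | Antwortqualität / Response quality
19 | undeutlich - deutlich | indistinct - exacting | Antwortqualität / Response quality
20 | veraltet - aktuell | outdated - current | Antwortqualität / Response quality
21 | kompliziert - einfach | complicated - simple | Verständlichkeit / Comprehensibility
22 | ungenau - genau | inaccurate - accurate | Verständlichkeit / Comprehensibility
23 | unsinnig - sinnig | nonsensical - apt | Verständlichkeit / Comprehensibility
24 | nicht eindeutig - eindeutig | ambiguous - unambiguous | Verständlichkeit / Comprehensibility
25 | unlogisch - logisch | illogical - logical | Verständlichkeit / Comprehensibility
26 | unverständlich - verständlich | incomprehensible - understandable | Verständlichkeit / Comprehensibility
27 | unerwartet - erwartet | unexpected - expected | Verständlichkeit / Comprehensibility
28 | unklar - klar | unclear - clear | Verständlichkeit / Comprehensibility
29 | rätselhaft - erklärbar | enigmatic - explainable | Verständlichkeit / Comprehensibility
30 | schwierig - leicht | difficult - easy | Verständlichkeit / Comprehensibility
Table 1: Candidate items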
2.2 Setup of the study
The online questionnaire was sent to several e-mail distribution lists of students and
members of the University of Applied Sciences Emden/Leer (Germany). A total of
96 persons participated voluntarily. Each participant could choose a VA that he or she
uses and rate it with the corresponding lists of candidate items; the introductory
e-mail instructed recipients not to start the questionnaire if they had no experience
with a speech system. The average age of the participants (59 male, 35 female,
2 no answer) was 35 years, with ages ranging from 16 to 78 years.
The survey took place between 7 and 24 January 2020.
2.3 Results
The resulting data were analysed by factor analysis (principal component analysis)
using the function principal of the R package psych (Revelle, 2018).
Figure 1: Scree plot - Results of Principal Component Analysis with all
items.
The scree plot suggests a solution with 3 or 4 factors. For our scale construction,
we chose the 3-factor solution because it allows a semantically clearer
interpretation.
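The analysis itself was carried out in R. As an illustration only, the following Python sketch shows how the eigenvalues behind a scree plot and the un-rotated component loadings can be obtained from an item correlation matrix. The function name `pca_eigenvalues_and_loadings` and the simulated responses are assumptions made for this example; the data are random placeholders, not the study data.

```python
import numpy as np

def pca_eigenvalues_and_loadings(responses, n_components):
    """Return all eigenvalues (descending) and the un-rotated loadings."""
    corr = np.corrcoef(responses, rowvar=False)   # item-by-item correlations
    eigvals, eigvecs = np.linalg.eigh(corr)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Loading of an item on a component = eigenvector entry * sqrt(eigenvalue).
    loadings = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    return eigvals, loadings

# Placeholder data only: 96 simulated respondents, 30 items, answers 1..7.
rng = np.random.default_rng(0)
responses = rng.integers(1, 8, size=(96, 30)).astype(float)

eigvals, loadings = pca_eigenvalues_and_loadings(responses, n_components=3)
```

Plotting `eigvals` against the component number gives a scree plot. For standardised items the eigenvalues sum to the number of items, which is why each eigenvalue can be read as a share of the total variance.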
Factor | Eigenvalue | Variance | Cumulative
---|---|---|---
1 | 10.13 | 23.51% | 23.51%
2 | 3.34 | 9.81% | 33.32%
3 | 2.35 | 9.37% | 42.69%
4 | 1.55 | 9.35% | 52.04%
5 | 1.51 | 6.66% | 58.71%
6 | 1.31 | 6.63% | 65.34%
7 | 1.11 | 5.67% | 71.01%
Table 2: Eigenvalues and variance explained by the extracted factors.
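As a worked check of Table 2, the following sketch relates the eigenvalues to shares of variance. It assumes the standard PCA relation for standardised items, in which an un-rotated component explains its eigenvalue divided by the number of items (30 candidate items, Table 1). The variable names are chosen for this example.

```python
eigenvalues = [10.13, 3.34, 2.35, 1.55, 1.51, 1.31, 1.11]  # Table 2
n_items = 30                                                 # Table 1

# Share of total variance per un-rotated component: eigenvalue / item count.
shares = [ev / n_items for ev in eigenvalues]

# Running total of the shares.
cumulative = []
total = 0.0
for share in shares:
    total += share
    cumulative.append(total)

for k, (share, cum) in enumerate(zip(shares, cumulative), start=1):
    print(f"component {k}: {share:.2%} of variance, {cum:.2%} cumulative")
```

Under this assumption the first un-rotated component accounts for about 33.8% of the variance, and the seven components together for about 71.0%. The total agrees with the cumulative value in Table 2, while the per-factor percentages differ. One possible explanation, which the report does not state, is that the Variance column refers to a rotated solution.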
[Figure 1 (scree plot): x-axis Factor Number (1 to 7), y-axis Eigenvalue (0.00 to 12.00)]
In the following description of the extracted three-factor solution and the loadings
of the items on these factors, we use the original German items and factor names. The
English translations of the items can be found in Tables 1, 6, 7 and 8.
According to the guidelines of Comrey and Lee (2013), loadings with a value > 0.45
are relevant. These are highlighted in colour in Tables 3, 4 and 5.
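The 0.45 cut-off can be applied mechanically, as the following minimal sketch shows. The helper `salient_factors` is a hypothetical name, and comparing the absolute value of a loading with the threshold is an assumption of this example. The three rows are taken from the varimax loadings in Table 5.

```python
THRESHOLD = 0.45  # cut-off for relevant loadings (Comrey and Lee, 2013)

def salient_factors(row, threshold=THRESHOLD):
    """Return the 1-based factor numbers whose loading exceeds the threshold."""
    return [i + 1 for i, value in enumerate(row) if abs(value) > threshold]

# Varimax loadings (Factor 1, Factor 2, Factor 3) of three items from Table 5.
varimax_rows = {
    "14 nicht hilfreich - hilfreich": (0.17, 0.09, 0.82),
    "15 umständlich - einfach": (0.39, -0.10, 0.42),
    "18 unklar - klar": (0.61, 0.02, 0.47),
}

flags = {item: salient_factors(row) for item, row in varimax_rows.items()}
for item, factors in flags.items():
    print(item, "->", factors)
```

An item flagged for exactly one factor is a clean indicator of that factor. An empty list means the item has no relevant loading, and more than one entry indicates a cross-loading.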
2.3.1 Un-rotated
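Pos | Factor name | Item | Factor 1 | Factor 2 | Factor 3
---|---|---|---|---|---
1 | Antwortverhalten | technisch - menschlich | 0.46 | 0.51 | -0.02
2 | Antwortverhalten | künstlich - natürlich | 0.35 | 0.73 | -0.01
3 | Antwortverhalten | fremd - vertraut | 0.56 | 0.46 | -0.07
4 | Antwortverhalten | ungewöhnlich - gewöhnlich | 0.37 | 0.09 | -0.12
5 | Antwortverhalten | langsam - schnell | 0.42 | 0.33 | -0.07
6 | Antwortverhalten | unangenehm - angenehm | 0.52 | 0.58 | -0.07
7 | Antwortverhalten | unsympathisch - sympathisch | 0.42 | 0.69 | -0.14
8 | Antwortverhalten | unfreundlich - freundlich | 0.46 | 0.50 | -0.08
9 | Antwortverhalten | langweilig - unterhaltsam | 0.34 | 0.60 | 0.01
10 | Antwortqualität | unverständlich - verständlich | 0.35 | -0.20 | -0.05
11 | Antwortqualität | unlogisch - logisch | 0.50 | -0.22 | 0.34
12 | Antwortqualität | unpassend - passend | 0.51 | -0.21 | 0.55
13 | Antwortqualität | nutzlos - nützlich | 0.40 | 0.04 | 0.66
14 | Antwortqualität | nicht hilfreich - hilfreich | 0.53 | -0.08 | 0.64
15 | Antwortqualität | umständlich - einfach | 0.45 | -0.30 | 0.21
16 | Antwortqualität | uninteressant - interessant | 0.48 | 0.11 | 0.23
17 | Antwortqualität | unintelligent - intelligent | 0.43 | 0.22 | 0.49
18 | Antwortqualität | unklar - klar | 0.70 | -0.30 | 0.14
19 | Antwortqualität | undeutlich - deutlich | 0.65 | -0.05 | 0.26
20 | Antwortqualität | veraltet - aktuell | 0.51 | 0.09 | 0.30
21 | Verständlichkeit | kompliziert - einfach | 0.78 | -0.21 | -0.31
22 | Verständlichkeit | ungenau - genau | 0.75 | -0.21 | -0.22
23 | Verständlichkeit | unsinnig - sinnig | 0.73 | -0.12 | -0.21
24 | Verständlichkeit | nicht eindeutig - eindeutig | 0.76 | -0.23 | -0.27
25 | Verständlichkeit | unlogisch - logisch | 0.68 | -0.18 | -0.32
26 | Verständlichkeit | unverständlich - verständlich | 0.79 | -0.13 | -0.24
27 | Verständlichkeit | unerwartet - erwartet | 0.65 | -0.28 | -0.12
28 | Verständlichkeit | unklar - klar | 0.79 | -0.24 | -0.15
29 | Verständlichkeit | rätselhaft - erklärbar | 0.77 | -0.17 | -0.15
30 | Verständlichkeit | schwierig - leicht | 0.71 | -0.20 | -0.19
Table 3: Loadings of candidate items (un-rotated).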
2.3.2 Promax rotation
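Pos | Factor name | Item | Factor 1 | Factor 2 | Factor 3
---|---|---|---|---|---
1 | Antwortverhalten | technisch - menschlich | 0.04 | 0.06 | 0.66
2 | Antwortverhalten | künstlich - natürlich | -0.17 | 0.02 | 0.84
3 | Antwortverhalten | fremd - vertraut | 0.17 | 0.04 | 0.64
4 | Antwortverhalten | ungewöhnlich - gewöhnlich | 0.30 | -0.04 | 0.21
5 | Antwortverhalten | langsam - schnell | 0.15 | 0.01 | 0.46
6 | Antwortverhalten | unangenehm - angenehm | 0.08 | 0.02 | 0.74
7 | Antwortverhalten | unsympathisch - sympathisch | 0.00 | -0.10 | 0.84
8 | Antwortverhalten | unfreundlich - freundlich | 0.09 | 0.00 | 0.65
9 | Antwortverhalten | langweilig - unterhaltsam | -0.12 | 0.05 | 0.70
10 | Antwortqualität | unverständlich - verständlich | 0.40 | 0.06 | -0.10
11 | Antwortqualität | unlogisch - logisch | 0.23 | 0.53 | -0.13
12 | Antwortqualität | unpassend - passend | 0.07 | 0.77 | -0.15
13 | Antwortqualität | nutzlos - nützlich | -0.23 | 0.84 | 0.07
14 | Antwortqualität | nicht hilfreich - hilfreich | -0.05 | 0.87 | -0.02
15 | Antwortqualität | umständlich - einfach | 0.33 | 0.39 | -0.21
16 | Antwortqualität | uninteressant - interessant | 0.10 | 0.38 | 0.22
17 | Antwortqualität | unintelligent - intelligent | -0.20 | 0.65 | 0.28
18 | Antwortqualität | unklar - klar | 0.56 | 0.38 | -0.13
19 | Antwortqualität | undeutlich - deutlich | 0.28 | 0.48 | 0.09
20 | Antwortqualität | veraltet - aktuell | 0.08 | 0.47 | 0.19
21 | Verständlichkeit | kompliziert - einfach | 0.91 | -0.12 | 0.04
22 | Verständlichkeit | ungenau - genau | 0.81 | -0.02 | 0.02
23 | Verständlichkeit | unsinnig - sinnig | 0.74 | -0.02 | 0.10
24 | Verständlichkeit | nicht eindeutig - eindeutig | 0.87 | -0.07 | 0.00
25 | Verständlichkeit | unlogisch - logisch | 0.82 | -0.15 | 0.04
26 | Verständlichkeit | unverständlich - verständlich | 0.80 | -0.03 | 0.12
27 | Verständlichkeit | unerwartet - erwartet | 0.71 | 0.06 | -0.09
28 | Verständlichkeit | unklar - klar | 0.81 | 0.07 | -0.01
29 | Verständlichkeit | rätselhaft - erklärbar | 0.76 | 0.05 | 0.06
30 | Verständlichkeit | schwierig - leicht | 0.75 | 0.00 | 0.02
Table 4: Loadings of candidate items (promax rotation).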
2.3.3 Varimax rotation
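Pos | Factor name | Item | Factor 1 | Factor 2 | Factor 3
---|---|---|---|---|---
1 | Antwortverhalten | technisch - menschlich | 0.15 | 0.66 | 0.13
2 | Antwortverhalten | künstlich - natürlich | -0.04 | 0.80 | 0.07
3 | Antwortverhalten | fremd - vertraut | 0.27 | 0.66 | 0.14
4 | Antwortverhalten | ungewöhnlich - gewöhnlich | 0.30 | 0.25 | 0.05
5 | Antwortverhalten | langsam - schnell | 0.22 | 0.48 | 0.09
6 | Antwortverhalten | unangenehm - angenehm | 0.19 | 0.75 | 0.11
7 | Antwortverhalten | unsympathisch - sympathisch | 0.09 | 0.81 | -0.01
8 | Antwortverhalten | unfreundlich - freundlich | 0.18 | 0.66 | 0.09
9 | Antwortverhalten | langweilig - unterhaltsam | 0.00 | 0.68 | 0.10
10 | Antwortqualität | unverständlich - verständlich | 0.38 | -0.02 | 0.14
11 | Antwortqualität | unlogisch - logisch | 0.34 | -0.01 | 0.55
12 | Antwortqualität | unpassend - passend | 0.24 | -0.03 | 0.74
13 | Antwortqualität | nutzlos - nützlich | 0.00 | 0.14 | 0.76
14 | Antwortqualität | nicht hilfreich - hilfreich | 0.17 | 0.09 | 0.82
15 | Antwortqualität | umständlich - einfach | 0.39 | -0.10 | 0.42
16 | Antwortqualität | uninteressant - interessant | 0.22 | 0.28 | 0.41
17 | Antwortqualität | unintelligent - intelligent | 0.02 | 0.33 | 0.61
18 | Antwortqualität | unklar - klar | 0.61 | 0.02 | 0.47
19 | Antwortqualität | undeutlich - deutlich | 0.41 | 0.21 | 0.53
20 | Antwortqualität | veraltet - aktuell | 0.23 | 0.26 | 0.49
21 | Verständlichkeit | kompliziert - einfach | 0.84 | 0.19 | 0.09
22 | Verständlichkeit | ungenau - genau | 0.78 | 0.16 | 0.17
23 | Verständlichkeit | unsinnig - sinnig | 0.72 | 0.23 | 0.15
24 | Verständlichkeit | nicht eindeutig - eindeutig | 0.82 | 0.15 | 0.13
25 | Verständlichkeit | unlogisch - logisch | 0.75 | 0.17 | 0.04
26 | Verständlichkeit | unverständlich - verständlich | 0.78 | 0.26 | 0.16
27 | Verständlichkeit | unerwartet - erwartet | 0.68 | 0.05 | 0.21
28 | Verständlichkeit | unklar - klar | 0.79 | 0.14 | 0.25
29 | Verständlichkeit | rätselhaft - erklärbar | 0.75 | 0.20 | 0.23
30 | Verständlichkeit | schwierig - leicht | 0.72 | 0.15 | 0.17
Table 5: Loadings of candidate items (varimax rotation).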
The selection of the items that finally represent the factors in the
corresponding UEQ+ scales is based on the varimax rotation (Table 5).
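To illustrate what a varimax rotation does, the following sketch implements the common SVD-based iteration in NumPy. It is not the routine used by the R package psych, and it applies no Kaiser normalisation, which some implementations use by default. The six input rows are un-rotated loadings taken from Table 3; because only a subset of items is rotated, the output is not expected to reproduce Table 5. The function name and parameters are assumptions of this example.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation; returns rotated loadings and rotation matrix."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated ** 3
            - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_items
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                  # nearest orthogonal matrix
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                          # no relevant improvement any more
        criterion = new_criterion
    return loadings @ rotation, rotation

# Un-rotated loadings of items 2, 7, 13, 14, 21 and 24 (Table 3).
unrotated = np.array([
    [0.35, 0.73, -0.01],
    [0.42, 0.69, -0.14],
    [0.40, 0.04, 0.66],
    [0.53, -0.08, 0.64],
    [0.78, -0.21, -0.31],
    [0.76, -0.23, -0.27],
])
rotated, rotation = varimax(unrotated)
```

Because the rotation matrix is orthogonal, each item keeps its communality (the row sum of squared loadings). Only the distribution of that variance across the factors changes, which is what makes the rotated solution easier to interpret.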
2.3.4 Response behaviour (varimax rotation)
Based on Table 5, the following candidate items were considered for this factor.
No. | Item (German, original version) | Item (English translation) | Loading
---|---|---|---
1 | technisch - menschlich | technical - human | 0.66
2 | künstlich - natürlich | artificial - natural | 0.80
3 | fremd - vertraut | unfamiliar - familiar | 0.66
4 | ungewöhnlich - gewöhnlich | unusual - usual | 0.25
5 | langsam - schnell | slow - fast | 0.48
6 | unangenehm - angenehm | unpleasant - pleasant | 0.75
7 | unsympathisch - sympathisch | unlikeable - likable | 0.81
8 | unfreundlich - freundlich | unfriendly - friendly | 0.66
9 | langweilig - unterhaltsam | boring - entertaining | 0.68
Table 6: Set of items representing factor 2 (varimax rotation).
The four items with the highest loadings (items 7, 2, 6 and 9) were selected to
represent the corresponding factor, named Response behaviour.
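The selection rule can be written down in a few lines. The following sketch applies it to the loadings in Table 6; the helper name `top_items` is an assumption of this example.

```python
# Varimax loadings on factor 2 for items 1 to 9 (Table 6).
table6 = {1: 0.66, 2: 0.80, 3: 0.66, 4: 0.25, 5: 0.48,
          6: 0.75, 7: 0.81, 8: 0.66, 9: 0.68}

def top_items(loadings, n=4):
    """Return the numbers of the n items with the highest loading, ascending."""
    ranked = sorted(loadings, key=lambda item: loadings[item], reverse=True)
    return sorted(ranked[:n])

selected = top_items(table6)
print(selected)
```

A purely numerical rule like this is only a starting point. Ties or semantically overlapping items still call for a judgement by the researchers.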
2.3.5 Response quality (varimax rotation)
Based on Table 5, the following candidate items were considered for this factor.
No. | Item (German, original version) | Item (English translation) | Loading
---|---|---|---
10 | unverständlich - verständlich | incomprehensible - understandable | 0.14
11 | unlogisch - logisch | illogical - logical | 0.55
12 | unpassend - passend | inappropriate - suitable | 0.74
13 | nutzlos - nützlich | useless - useful | 0.76
14 | nicht hilfreich - hilfreich | not helpful - helpful | 0.82
15 | umständlich - einfach | laborious - simple | 0.42
16 | uninteressant - interessant | uninteresting - interesting | 0.41
17 | unintelligent - intelligent | unintelligent - intelligent | 0.61
18 | unklar - klar | not clear - clear | 0.47
19 | undeutlich - deutlich | indistinct - exacting | 0.53
20 | veraltet - aktuell | outdated - current | 0.49
Table 7: Set of items representing factor 3 (varimax rotation).
The four items with the highest loadings (items 14, 13, 12 and 17) were selected to
represent the corresponding factor, named Response quality.
2.3.6 Comprehensibility (varimax rotation)
Based on Table 5, the following candidate items were considered for this factor.
No. | Item (German, original version) | Item (English translation) | Loading
---|---|---|---
21 | kompliziert - einfach | complicated - simple | 0.84
22 | ungenau - genau | inaccurate - accurate | 0.78
23 | unsinnig - sinnig | nonsensical - apt | 0.72
24 | nicht eindeutig - eindeutig | ambiguous - unambiguous | 0.82
25 | unlogisch - logisch | illogical - logical | 0.75
26 | unverständlich - verständlich | incomprehensible - understandable | 0.78
27 | unerwartet - erwartet | unexpected - expected | 0.68
28 | unklar - klar | unclear - clear | 0.79
29 | rätselhaft - erklärbar | enigmatic - explainable | 0.75
30 | schwierig - leicht | difficult - easy | 0.72
Table 8: Set of items representing factor 1 (varimax rotation).
Item 28 showed a slightly higher loading (0.79) than items 22 and 26 (0.78 each).
However, we decided against this item because it is semantically very close to
item 21. Thus, items 21, 22, 24 and 26 were chosen to represent the factor named
Comprehensibility. Note that, according to Table 5, item 18 (unklar - klar) loads
on two factors (0.61 on factor 1 and 0.47 on factor 3).
3 Conclusion
So far, the UEQ+ framework (Schrepp & Thomaschewski, 2019) has offered no scales
for devices and systems with a VUI. As a result of this work, three new scales for
voice communication can be presented, each consisting of four item pairs.
The selection of the relevant scales for a product-related questionnaire depends on
various sources of information. Winter, Hinderks, Schrepp and Thomaschewski
(2017) recommend considering the product-specific UX aspects first and then further
criteria, for example, for marketing.
4 Future work
After a detailed validation, the new scales for voice communication can be integrated
into the existing UEQ+ framework. Furthermore, a short version and the construction
of benchmarks are planned, following the approach presented by Schrepp, Hinderks and
Thomaschewski (2017) in the articles "Design and Evaluation of a Short Version of
the User Experience Questionnaire (UEQ-S)" and "Construction of a Benchmark for the
User Experience Questionnaire (UEQ)".
Bibliography
Klein, A. M., Hinderks, A., Schrepp, M., & Thomaschewski, J. (2020). Measuring
User Experience Quality of Voice Assistants. Proceedings of the 15th Iberian
Conference on Information Systems and Technologies: ...
Schrepp, M., & Thomaschewski, J. (2019). Design and validation of a framework for
the creation of user experience questionnaires. International Journal of Interactive
Multimedia and Artificial Intelligence. DOI: 10.9781/ijimai.2019.06.006.
Revelle, W. (2018). psych: Procedures for personality and psychological research.
Northwestern University, Evanston, Illinois, USA.
https://CRAN.R-project.org/package=psych, version 1.8.12.
Comrey, A. L., & Lee, H. B. (2013). A First Course in Factor Analysis.
2nd ed. Hoboken: Taylor and Francis. ISBN: 978-0805810622. URL:
http://gbv.eblib.com/patron/FullRecord.aspx?p=1562106
Winter, D., Hinderks, A., Schrepp, M., & Thomaschewski, J. (2017). Welche UX
Faktoren sind für mein Produkt wichtig? In: Hess, S., & Fischer, H. (Eds.), Mensch
und Computer 2017 - Usability Professionals. Regensburg: Gesellschaft für
Informatik e.V. (pp. 191-200).
Schrepp, M., Hinderks, A., & Thomaschewski, J. (2017). Design and Evaluation of a
Short Version of the User Experience Questionnaire (UEQ-S). International Journal
of Interactive Multimedia and Artificial Intelligence. DOI: 10.9781/ijimai.2017.09.001.
Schrepp, M., Hinderks, A., & Thomaschewski, J. (2017). Construction of a
Benchmark for the User Experience Questionnaire (UEQ). International Journal of
Interactive Multimedia and Artificial Intelligence. DOI: 10.9781/ijimai.2017.445.
Appendix – Screenshots of the used questionnaires
The following pages show the used HTML questionnaires as full page screenshots.
Pages are shown in the original German version used for the collection of the data.
The following points 1, 2 and 3 show the answer options for the socio-demographic
data collected in the online questionnaire.
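1) Geschlecht (Gender): ○ männlich (male) ○ weiblich (female)
2) Welchen Sprachassistenten nutzen Sie am meisten? (keine Mehrfachnennung)
(Which voice assistant do you use most? Single choice only)
○ Alexa
○ Siri
○ Cortana
○ Google Assistent (Google Assistant)
○ Sonstige und zwar (Other, namely): ___________________________
3) Wie oft nutzen Sie den Sprachassistenten?
(How often do you use the voice assistant?)
○ mehrmals täglich (several times a day)
○ ungefähr täglich (about once a day)
○ mehrmals wöchentlich (several times a week)
○ ungefähr wöchentlich (about once a week)
○ mehrmals monatlich (several times a month)
○ ungefähr monatlich oder seltener (about once a month or less often)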