Construction and first Validation of
Extension Scales for the User Experience
Questionnaire (UEQ)
Martin Schrepp
SAP SE
Jörg Thomaschewski
University of Applied Sciences Emden/Leer
26.06.2019
Contents
1 Research Goal
2 Scale Construction
2.1 Construction of candidate items
2.2 Setup of the study
2.3 Results
2.3.1 Aesthetics
2.3.2 Adaptability
2.3.3 Usefulness
2.3.4 Intuitive Use
2.3.5 Value
2.3.6 Content Quality
3 Scale Validation
3.1 Setup of the study
3.2 Results
3.2.1 Web-Shops
3.2.2 Video-Platforms
3.2.3 Programming Environments
3.2.4 Overall Results
List of Figures
List of Tables
Bibliography
Appendix – Screenshots of the used questionnaires
1 Research Goal
The UEQ (Laugwitz, Schrepp & Held, 2008) is a frequently used questionnaire that measures
user experience (UX) on 6 distinct scales (Attractiveness, Efficiency, Perspicuity,
Dependability, Stimulation, Novelty). Of course, these 6 scales do not cover the entire
spectrum of UX. For some products, special UX aspects not contained in the UEQ are highly
important for the overall UX impression. For this reason, several authors have already created
extension scales for the UEQ. Hinderks (2016) created a scale to measure Trust. Boos & Brau
(2017) created scales for Acoustics and Haptics (properties important for household
appliances). To cover a broader range of UX, we describe the construction and first validation
of several additional extension scales.
This research report gives a detailed description of the data analysis done for the scale
construction and the first validation of the extension scales. The context of the research and
the application of the extension scales are described in several other publications and are not
part of this document.
2 Scale Construction
A paper by Winter, Hinderks, Schrepp & Thomaschewski (2017) investigates the importance
of different UX aspects for typical product categories. The list of UX aspects used in that paper
is the basis for our scale construction. We selected the following UX aspects, which are
important for a wide range of product categories: Aesthetics, Adaptability, Usefulness,
Intuitive Use, Value and Content Quality. For each of these aspects we constructed a UEQ
extension scale.
2.1 Construction of candidate items
For each selected UX aspect, several potential German items in the UEQ format (semantic
differential with a 7-point Likert scale) were constructed by two experts. In the following we
call these items the candidate items. The candidate items per scale can be found below as part
of the data analysis.
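The UEQ item format can be made concrete with a small sketch. In UEQ-style questionnaires, each answer on the 7-point scale between the two poles is conventionally recoded from 1–7 to −3…+3 before scale means are computed. The helper below illustrates this convention only; it is not the authors' actual analysis code, and item polarity (which pole is the positive one) must be handled before averaging and is omitted here.

```python
def recode(answer: int) -> int:
    """Map a raw 7-point answer (1..7) to the -3..+3 UEQ value range."""
    if not 1 <= answer <= 7:
        raise ValueError("answer must be between 1 and 7")
    return answer - 4

def scale_value(answers):
    """Scale value for one participant: mean of the recoded item answers."""
    return sum(recode(a) for a in answers) / len(answers)

# scale_value([4, 5, 6, 7]) -> 1.5
```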
2.2 Setup of the study
191 students of the University of Applied Sciences Emden/Leer (119 male, 72 female, average
age 30.42) rated several products from 10 different product categories (for example, learning
platforms, online banking, social networks, video portals, programming environments) on all
constructed candidate items. Participation was voluntary.
2.3 Results
The data per scale were analysed with a principal component analysis (with varimax rotation)
under the assumption that all items of the scale describe the same UX aspect, i.e., the number
of factors was set to one. The R package psych (Revelle, 2017) was used for the analysis.
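For a one-factor solution, the PCA amounts to extracting the first eigenvector of the item correlation matrix: the loadings are that eigenvector scaled by the square root of its eigenvalue, the communality h2 of an item is its squared loading, and u2 = 1 − h2. The stdlib sketch below illustrates this via power iteration; it is an illustration of the technique only, since the report itself used the R package psych.

```python
import math

def first_pc_loadings(corr, iters=1000):
    """First-principal-component loadings of a correlation matrix.

    Uses power iteration to find the dominant eigenvector v and its
    eigenvalue; loading_i = sqrt(eigenvalue) * v_i, h2_i = loading_i**2.
    """
    n = len(corr)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue of the converged vector.
    cv = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * cv[i] for i in range(n))
    sign = math.copysign(1.0, sum(v))  # orient loadings positively
    loadings = [sign * math.sqrt(eigenvalue) * x for x in v]
    h2 = [l * l for l in loadings]     # communalities; u2 = 1 - h2
    return loadings, h2
```

For two items correlated at 0.8, the first component has eigenvalue 1.8 and both loadings equal sqrt(1.8)/sqrt(2) ≈ 0.95, so h2 ≈ 0.90.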
The results showed that, except for the scale Content Quality, a single factor explains the data
quite well. The items with the highest loadings on this factor were selected to represent the
scale.
For Content Quality a two-factor solution shows a better fit and the two factors had a
reasonable semantic interpretation. Thus, it was decided to split the items into two sets
representing different UX aspects, i.e. to construct two extension scales instead of one. These
scales were named as Quality of Content and Trustworthiness of Content based on the
meaning of the items that load on these scales.
2.3.1 Aesthetics
Figure 1: Results of Principal Component Analysis and Factorial Analysis for Aesthetics.
Proportion of Variance explained: 0.64
Fit based upon off diagonal values = 0.99 (values > 0.95 indicate a good fit).
Table 1: Loadings of the candidate items for Aesthetics.

Original German Item | PC1 | h2 | u2
hässlich / schön | 0.89 | 0.80 | 0.20
stillos / stilvoll | 0.86 | 0.75 | 0.25
nicht ansprechend / ansprechend | 0.88 | 0.78 | 0.22
farblich unschön / farblich schön | 0.79 | 0.62 | 0.38
unharmonisch / harmonisch | 0.84 | 0.71 | 0.29
unästhetisch / ästhetisch | 0.88 | 0.77 | 0.23
nicht kunstvoll / kunstvoll | 0.63 | 0.40 | 0.60
unüberlegt / durchdacht | 0.51 | 0.26 | 0.74

English translation of the four items selected for the scale: ugly / beautiful, lacking style /
stylish, unappealing / appealing, unpleasant / pleasant.
2.3.2 Adaptability
Figure 2: Results of Principal Component Analysis and Factorial Analysis for Adaptability.
Proportion of Variance explained: 0.75
Fit based upon off diagonal values = 0.99 (values > 0.95 indicate a good fit).
Table 2: Loadings of the candidate items for Adaptability.

Original German Item | PC1 | h2 | u2
starr / flexibel | 0.86 | 0.73 | 0.27
nicht anpassbar / anpassbar | 0.89 | 0.79 | 0.21
nicht veränderbar / veränderbar | 0.89 | 0.79 | 0.21
nicht erweiterbar / erweiterbar | 0.87 | 0.75 | 0.25
nicht einstellbar / einstellbar | 0.86 | 0.74 | 0.26
nicht anpassungsfähig / anpassungsfähig | 0.84 | 0.70 | 0.30

English translation of the four items selected for the scale: not adjustable / adjustable, not
changeable / changeable, inflexible / flexible, not extendable / extendable.
2.3.3 Usefulness
Figure 3: Results of Principal Component Analysis and Factorial Analysis for Usefulness.
Proportion of Variance explained: 0.65
Fit based upon off diagonal values = 0.99 (values > 0.95 indicate a good fit).
Table 3: Loadings of the candidate items for Usefulness.

Original German Item | PC1 | h2 | u2
unpraktisch / praktisch | 0.78 | 0.62 | 0.38
nutzlos / nützlich | 0.83 | 0.69 | 0.31
unbrauchbar / brauchbar | 0.79 | 0.62 | 0.38
nicht hilfreich / hilfreich | 0.84 | 0.70 | 0.30
nicht zweckmäßig / zweckmäßig | 0.76 | 0.58 | 0.42
nicht vorteilhaft / vorteilhaft | 0.86 | 0.74 | 0.26
nicht lohnend / lohnend | 0.80 | 0.64 | 0.36
unproduktiv / produktiv | 0.76 | 0.58 | 0.42

English translation of the four items selected for the scale: useless / useful, not helpful / helpful,
not beneficial / beneficial, not rewarding / rewarding.
2.3.4 Intuitive Use
Figure 4: Results of Principal Component Analysis and Factorial Analysis for Intuitive Use.
Proportion of Variance explained: 0.74
Fit based upon off diagonal values = 0.99 (values > 0.95 indicate a good fit).
Table 4: Loadings of the candidate items for Intuitive Use.

Original German Item | PC1 | h2 | u2
nicht intuitiv / intuitiv | 0.86 | 0.74 | 0.26
nicht direkt / direkt | 0.86 | 0.73 | 0.27
nicht spontan / spontan | 0.69 | 0.48 | 0.52
unklar / klar | 0.87 | 0.76 | 0.24
mühevoll / mühelos | 0.88 | 0.78 | 0.22
unlogisch / logisch | 0.90 | 0.82 | 0.18
nicht einleuchtend / einleuchtend | 0.89 | 0.80 | 0.20
nicht schlüssig / schlüssig | 0.90 | 0.81 | 0.19

English translation of the four items selected for the scale: difficult / easy, illogical / logical,
not plausible / plausible, inconclusive / conclusive.
2.3.5 Value
Figure 5: Results of Principal Component Analysis and Factorial Analysis for Value.
Proportion of Variance explained: 0.52
Fit based upon off diagonal values = 0.96 (values > 0.95 indicate a good fit).
Table 5: Loadings of the candidate items for Value.

Original German Item | PC1 | h2 | u2
minderwertig / wertvoll | 0.83 | 0.68 | 0.32
stilvoll / stillos | 0.48 | 0.23 | 0.77
nicht vorzeigbar / vorzeigbar | 0.79 | 0.63 | 0.37
nicht geschmackvoll / geschmackvoll | 0.81 | 0.65 | 0.35
konzeptlos / kunstfertig | 0.50 | 0.25 | 0.75
laienhaft / fachmännisch | 0.68 | 0.47 | 0.53
nicht elegant / elegant | 0.82 | 0.67 | 0.33
unvollkommen / vollkommen | 0.77 | 0.59 | 0.41

English translation of the four items selected for the scale: inferior / valuable, not presentable
/ presentable, tasteless / tasteful, not elegant / elegant.
2.3.6 Content Quality
Figure 6: Results of Principal Component Analysis and Factorial Analysis for Content Quality.
The Kaiser-Guttman criterion (retain factors with eigenvalues > 1) as well as the scree plot
indicate that a two-factor solution may fit the data better than a one-factor solution in the
PCA. Since the two factors could also be interpreted semantically, two scales were defined.
Proportion of Variance explained: 0.54 (Factor 1), 0.46 (Factor 2).
Fit based upon off diagonal values = 0.97 (values > 0.95 indicate a good fit).
Table 6: Loadings of the candidate items for Content Quality.

Original German Item | PC1 | PC2 | h2 | u2
veraltet / aktuell | 0.32 | 0.64 | 0.51 | 0.49
nicht informativ / informativ | 0.55 | 0.56 | 0.61 | 0.39
uninteressant / interessant | 0.21 | 0.68 | 0.50 | 0.50
schlecht aufbereitet / gut aufbereitet | 0.30 | 0.77 | 0.68 | 0.32
minderwertig / hochwertig | 0.10 | 0.78 | 0.62 | 0.38
unverständlich / gut verständlich | 0.58 | 0.58 | 0.67 | 0.33
nutzlos / nützlich | 0.68 | 0.33 | 0.57 | 0.43
unglaubwürdig / glaubwürdig | 0.90 | 0.18 | 0.84 | 0.16
unseriös / seriös | 0.89 | 0.20 | 0.83 | 0.17
ungenau / genau | 0.77 | 0.28 | 0.66 | 0.34

English translation of the items selected for the scale on PC2 (named Quality of Content): obsolete / up-to-date, not
interesting / interesting, poorly prepared / well prepared, incomprehensible / comprehensible.
English translation of the items selected for the scale on PC1 (named Trustworthiness of Content): useless / useful,
implausible / plausible, untrustworthy / trustworthy, inaccurate / accurate.
3 Scale Validation
In the following we report first validation results for the new scales. Of course, scale validation
will be continued in further studies.
3.1 Setup of the study
To evaluate the scale quality, the three product categories Web Shops, Video Platforms and
Programming Environments were selected. Per product category, two well-known products
were chosen (Web Shops: Otto.de, Zalando.de; Video Platforms: Netflix, Amazon Prime;
Programming Environments: Eclipse, Visual Studio).
Per product category, a specialized UX questionnaire was constructed containing the scales
that seem most important for products of this category (see Winter, Hinderks, Schrepp &
Thomaschewski, 2017 for details). The scales per category can be found in the results section
below. Both the new extension scales and existing UEQ scales were used.
Participants were recruited via e-mail campaigns and links posted on websites. Each
participant chose one product that he or she used regularly from one of the product
categories; thus, the number of ratings differs between products.
The following data were collected per questionnaire:
• Age
• Gender
• Per extension scale:
o The 4 items of the scale
o An overall rating concerning the importance of the scale for the evaluated
product.
• A rating for the overall satisfaction with the evaluated product.
Screen shots of the used HTML questionnaires can be found in the Appendix.
The importance ratings and the item ratings for a scale are used to calculate a UX KPI. The
calculation follows the procedure described in Hinderks, Schrepp, Mayo, Escalona &
Thomaschewski (2019).
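Conceptually, the KPI is an importance-weighted mean of the scale values. The exact procedure is defined in the cited paper; the sketch below shows only the weighted-mean idea, and the function name and the assumption of positive importance weights are ours.

```python
def ux_kpi(scale_means, importance):
    """Importance-weighted mean of scale values for one participant.

    scale_means: dict mapping scale name -> mean item value of the scale
    importance:  dict mapping scale name -> importance rating, assumed
                 positive so the weights sum to 1 after normalization
    """
    total = sum(importance.values())
    return sum(scale_means[s] * importance[s] / total for s in scale_means)

# Example: a scale rated three times as important gets three times the weight.
# ux_kpi({"A": 2.0, "B": 1.0}, {"A": 3, "B": 1}) -> 1.75
```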
3.2 Results
The following Table 7 shows some results concerning participation and some demographic
data of the participants.
Table 7: Overview of participation and demographic information for all studies. Time is the
average time in ms between starting the online questionnaire and clicking the submit button.
Clicks is the average number of clicks per participant. Scales is the number of scales used in
the study (the scales used in each study are described below).

Product | N | Av. Age | Gender | Time (ms) | Clicks | Scales
otto.de | 42 | 34 | 16 m, 25 f, 1 NA | 202899 | 54 | 8
zalando.de | 46 | 31 | 20 m, 24 f, 2 NA | 187803 | 53 | 8
Netflix | 73 | 31 | 42 m, 27 f, 4 NA | 211112 | 48 | 7
Amazon Prime | 57 | 32 | 36 m, 21 f | 259491 | 47 | 7
Eclipse | 14 | 36 | 7 m, 4 f, 3 NA | 368552 | 42 | 7
Visual Studio | 29 | 32 | 25 m, 1 f, 3 NA | 225006 | 50 | 7
Please note that per scale, the 4 items and the importance of the scale must be rated. Thus,
for 8 scales this requires 40 clicks. In addition, the overall satisfaction must be rated, and some
clicks are required to state age and gender.
Thus, filling out the corresponding questionnaires seems to require little effort from the
participants. They spent roughly 3 to 6 minutes answering the questions, and the selected
answers do not seem to have been changed often afterwards. This indicates that the terms
used are neither problematic nor difficult to understand.
3.2.1 Web-Shops
Results for Otto.de
Table 8: Scale means, standard deviations and confidence intervals per scale for Otto.de.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.30 | 1.43 | 1.19 | 42 | 0.36 | [0.94, 1.66]
Dependability | 1.58 | 1.18 | 1.08 | 42 | 0.33 | [1.26, 1.91]
Intuitive Use | 1.57 | 1.19 | 1.09 | 42 | 0.33 | [1.24, 1.89]
Visual Aesthetics | 0.89 | 2.01 | 1.41 | 42 | 0.43 | [0.46, 1.32]
Quality of Content | 1.35 | 1.28 | 1.13 | 42 | 0.34 | [1.00, 1.69]
Trustworthiness of Content | 1.33 | 1.32 | 1.15 | 42 | 0.35 | [0.98, 1.67]
Trust | 1.28 | 1.45 | 1.20 | 42 | 0.36 | [0.92, 1.64]
Value | 0.93 | 1.56 | 1.24 | 42 | 0.38 | [0.56, 1.31]
Figure 7: Scale means and confidence intervals per scale for Otto.de.
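The Confidence column in these tables is consistent with the half-width of a 95% normal-approximation confidence interval, 1.96·s/√N, and the Confidence Interval columns with mean ± this value. A small sketch (our reconstruction of the computation, not the authors' code):

```python
import math

def confidence_interval(mean, std_dev, n, z=1.96):
    """Half-width and bounds of a 95% normal-approximation interval."""
    half_width = z * std_dev / math.sqrt(n)
    return round(half_width, 2), (round(mean - half_width, 2),
                                  round(mean + half_width, 2))

# Reproduces the Attractiveness row of Table 8:
# confidence_interval(1.30, 1.19, 42) -> (0.36, (0.94, 1.66))
```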
Table 9: Importance ratings and confidence intervals for Otto.de.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.14 | 1.74 | 1.30 | 42 | 0.39 | [0.75, 1.54]
Dependability | 1.59 | 1.05 | 1.01 | 42 | 0.31 | [1.28, 1.89]
Intuitive Use | 1.86 | 1.25 | 1.10 | 42 | 0.33 | [1.52, 2.19]
Visual Aesthetics | 0.95 | 2.19 | 1.46 | 42 | 0.44 | [0.51, 1.39]
Quality of Content | 1.71 | 1.11 | 1.04 | 42 | 0.32 | [1.39, 2.02]
Trustworthiness of Content | 1.81 | 0.99 | 0.98 | 42 | 0.30 | [1.51, 2.11]
Trust | 2.02 | 1.07 | 1.02 | 42 | 0.31 | [1.71, 2.33]
Value | 0.67 | 1.50 | 1.21 | 42 | 0.37 | [0.30, 1.03]
Figure 8: Importance ratings and their confidence intervals for Otto.de.
Cronbach Alpha values for the scales:
• Attractiveness: 0.93
• Dependability: 0.82
• Intuitive Use: 0.94
• Visual Aesthetics: 0.95
• Quality of Content: 0.89
• Trustworthiness of Content: 0.86
• Trust: 0.90
• Value: 0.93
Results for Zalando.de
Table 10: Scale means, standard deviations and confidence intervals per scale for Zalando.de.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.68 | 1.20 | 1.09 | 46 | 0.32 | [1.36, 2.00]
Dependability | 2.02 | 0.79 | 0.89 | 46 | 0.26 | [1.76, 2.27]
Intuitive Use | 2.13 | 0.76 | 0.87 | 46 | 0.25 | [1.88, 2.38]
Visual Aesthetics | 1.47 | 1.68 | 1.29 | 46 | 0.37 | [1.09, 1.84]
Quality of Content | 1.91 | 0.93 | 0.96 | 46 | 0.28 | [1.63, 2.19]
Trustworthiness of Content | 1.73 | 1.05 | 1.02 | 46 | 0.30 | [1.43, 2.02]
Trust | 1.26 | 1.42 | 1.19 | 46 | 0.34 | [0.92, 1.60]
Value | 1.58 | 1.34 | 1.16 | 46 | 0.33 | [1.25, 1.91]
Figure 9: Scale means and confidence intervals per scale for Zalando.de.
Table 11: Importance ratings and confidence intervals for Zalando.de.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 0.93 | 1.93 | 1.37 | 46 | 0.40 | [0.54, 1.33]
Dependability | 1.89 | 0.81 | 0.89 | 46 | 0.26 | [1.63, 2.15]
Intuitive Use | 2.17 | 0.95 | 0.96 | 46 | 0.28 | [1.90, 2.45]
Visual Aesthetics | 1.35 | 2.10 | 1.43 | 46 | 0.41 | [0.93, 1.76]
Quality of Content | 1.87 | 1.09 | 1.03 | 46 | 0.30 | [1.57, 2.17]
Trustworthiness of Content | 2.04 | 0.93 | 0.95 | 46 | 0.28 | [1.77, 2.32]
Trust | 2.22 | 0.80 | 0.88 | 46 | 0.26 | [1.96, 2.47]
Value | 1.27 | 1.75 | 1.31 | 45 | 0.38 | [0.88, 1.65]
Figure 10: Importance ratings and their confidence intervals for Zalando.de.
Cronbach Alpha values for the scales:
• Attractiveness: 0.92
• Dependability: 0.85
• Intuitive Use: 0.90
• Visual Aesthetics: 0.95
• Quality of Content: 0.78
• Trustworthiness of Content: 0.81
• Trust: 0.93
• Value: 0.88
3.2.2 Video-Platforms
Results for Netflix
Table 12: Scale means, standard deviations and confidence intervals per scale for Netflix.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 2.13 | 1.14 | 1.06 | 73 | 0.24 | [1.88, 2.37]
Perspicuity | 2.04 | 1.41 | 1.19 | 73 | 0.27 | [1.77, 2.31]
Intuitive Use | 1.86 | 1.35 | 1.16 | 73 | 0.27 | [1.60, 2.13]
Visual Aesthetics | 1.58 | 1.37 | 1.17 | 73 | 0.27 | [1.32, 1.85]
Quality of Content | 1.83 | 1.51 | 1.23 | 73 | 0.28 | [1.55, 2.11]
Trustworthiness of Content | 1.48 | 1.26 | 1.12 | 73 | 0.26 | [1.22, 1.74]
Trust | 1.03 | 1.97 | 1.40 | 73 | 0.32 | [0.71, 1.35]

Figure 11: Scale means and confidence intervals per scale for Netflix.
Figure 11: Scale means and confidence intervals per scale for Netflix.
Table 13: Importance ratings and confidence intervals for Netflix.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.45 | 1.58 | 1.25 | 73 | 0.29 | [1.17, 1.74]
Perspicuity | 2.22 | 0.65 | 0.80 | 73 | 0.18 | [2.04, 2.40]
Intuitive Use | 1.96 | 1.11 | 1.05 | 73 | 0.24 | [1.72, 2.20]
Visual Aesthetics | 1.10 | 1.95 | 1.39 | 73 | 0.32 | [0.78, 1.41]
Quality of Content | 1.79 | 1.19 | 1.08 | 73 | 0.25 | [1.55, 2.04]
Trustworthiness of Content | 1.08 | 2.19 | 1.47 | 73 | 0.34 | [0.75, 1.42]
Trust | 1.85 | 1.06 | 1.02 | 73 | 0.23 | [1.61, 2.08]
Figure 12: Importance ratings and their confidence intervals for Netflix.
Cronbach Alpha values for the scales:
• Attractiveness: 0.95
• Perspicuity: 0.80
• Intuitive Use: 0.90
• Visual Aesthetics: 0.89
• Quality of Content: 0.84
• Trustworthiness of Content: 0.87
• Trust: 0.90
Results for Amazon Prime
Table 14: Scale means, standard deviations and confidence intervals per scale for Amazon Prime.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.61 | 1.30 | 1.14 | 57 | 0.29 | [1.31, 1.90]
Perspicuity | 1.62 | 1.99 | 1.41 | 57 | 0.37 | [1.26, 1.99]
Intuitive Use | 1.36 | 1.90 | 1.38 | 57 | 0.36 | [1.00, 1.71]
Visual Aesthetics | 1.01 | 1.64 | 1.28 | 57 | 0.33 | [0.68, 1.34]
Quality of Content | 1.49 | 1.63 | 1.27 | 57 | 0.33 | [1.16, 1.82]
Trustworthiness of Content | 1.46 | 1.47 | 1.21 | 57 | 0.31 | [1.15, 1.78]
Trust | 0.71 | 2.99 | 1.73 | 57 | 0.45 | [0.26, 1.16]
Figure 13: Scale means and confidence intervals per scale for Amazon Prime.
Table 15: Importance ratings and confidence intervals for Amazon Prime.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.61 | 0.99 | 0.99 | 57 | 0.26 | [1.36, 1.87]
Perspicuity | 2.27 | 0.71 | 0.83 | 57 | 0.22 | [2.05, 2.48]
Intuitive Use | 1.86 | 1.09 | 1.03 | 57 | 0.27 | [1.59, 2.13]
Visual Aesthetics | 1.11 | 1.88 | 1.36 | 57 | 0.35 | [0.75, 1.46]
Quality of Content | 1.63 | 1.27 | 1.12 | 57 | 0.29 | [1.34, 1.92]
Trustworthiness of Content | 1.49 | 1.29 | 1.13 | 57 | 0.29 | [1.20, 1.78]
Trust | 1.91 | 1.01 | 1.00 | 57 | 0.26 | [1.65, 2.17]
Figure 14: Importance ratings and their confidence intervals for Amazon Prime.
Cronbach Alpha values for the scales:
• Attractiveness: 0.90
• Perspicuity: 0.91
• Intuitive Use: 0.94
• Visual Aesthetics: 0.94
• Quality of Content: 0.82
• Trustworthiness of Content: 0.87
• Trust: 0.96
3.2.3 Programming Environments
Results for Eclipse
Table 16: Scale means, standard deviations and confidence intervals per scale for Eclipse.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 0.48 | 2.98 | 1.71 | 14 | 0.90 | [-0.41, 1.38]
Dependability | 0.84 | 3.30 | 1.80 | 14 | 0.94 | [-0.10, 1.78]
Perspicuity | 0.11 | 2.86 | 1.68 | 14 | 0.88 | [-0.77, 0.99]
Efficiency | 0.71 | 2.39 | 1.53 | 14 | 0.80 | [-0.09, 1.52]
Usefulness | 1.21 | 3.08 | 1.74 | 14 | 0.91 | [0.30, 2.13]
Personalization | 1.25 | 2.48 | 1.56 | 14 | 0.82 | [0.43, 2.07]
Value | 0.32 | 2.73 | 1.64 | 14 | 0.86 | [-0.54, 1.18]
Figure 15: Scale means and confidence intervals per scale for Eclipse.
Table 17: Importance ratings and confidence intervals for Eclipse.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.21 | 1.72 | 1.26 | 14 | 0.66 | [0.55, 1.88]
Dependability | 1.79 | 0.49 | 0.67 | 14 | 0.35 | [1.43, 2.14]
Perspicuity | 1.57 | 1.19 | 1.05 | 14 | 0.55 | [1.02, 2.12]
Efficiency | 1.57 | 1.34 | 1.12 | 14 | 0.58 | [0.99, 2.16]
Usefulness | 1.00 | 1.23 | 1.07 | 14 | 0.56 | [0.44, 1.56]
Personalization | 1.38 | 1.09 | 1.00 | 14 | 0.53 | [0.86, 1.91]
Value | 0.14 | 1.98 | 1.36 | 14 | 0.71 | [-0.57, 0.85]
Figure 16: Importance ratings and their confidence intervals for Eclipse.
Cronbach Alpha values for the scales:
• Attractiveness: 0.93
• Dependability: 0.97
• Perspicuity: 0.93
• Efficiency: 0.90
• Usefulness: 0.98
• Adaptability: 0.96
• Value: 0.93
Results for Visual Studio
Table 18: Scale means, standard deviations and confidence intervals per scale for Visual Studio.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.67 | 0.69 | 0.83 | 29 | 0.30 | [1.37, 1.97]
Dependability | 1.77 | 0.68 | 0.82 | 29 | 0.30 | [1.47, 2.07]
Perspicuity | 0.93 | 1.35 | 1.16 | 29 | 0.42 | [0.51, 1.35]
Efficiency | 1.44 | 1.05 | 1.02 | 29 | 0.37 | [1.07, 1.81]
Usefulness | 2.00 | 0.92 | 0.96 | 29 | 0.35 | [1.65, 2.35]
Personalization | 1.78 | 0.83 | 0.91 | 29 | 0.33 | [1.45, 2.11]
Value | 1.66 | 1.18 | 1.08 | 29 | 0.39 | [1.27, 2.06]
Figure 17: Scale means and confidence intervals per scale for Visual Studio.
Table 19: Importance ratings and confidence intervals for Visual Studio.

Scale | Mean | Variance | Std. dev. | N | Confidence | Confidence Interval
Attractiveness | 1.38 | 1.03 | 1.00 | 29 | 0.36 | [1.02, 1.74]
Dependability | 2.07 | 0.92 | 0.94 | 29 | 0.34 | [1.73, 2.41]
Perspicuity | 1.90 | 0.60 | 0.76 | 29 | 0.28 | [1.62, 2.17]
Efficiency | 1.93 | 0.57 | 0.74 | 29 | 0.27 | [1.66, 2.20]
Usefulness | 1.68 | 1.19 | 1.07 | 29 | 0.39 | [1.29, 2.07]
Personalization | 1.68 | 1.49 | 1.20 | 29 | 0.44 | [1.24, 2.11]
Value | 0.79 | 2.32 | 1.50 | 29 | 0.54 | [0.24, 1.33]
Figure 18: Importance ratings and their confidence intervals for Visual Studio.
Cronbach Alpha values for the scales:
• Attractiveness: 0.76
• Dependability: 0.83
• Perspicuity: 0.86
• Efficiency: 0.80
• Usefulness: 0.82
• Adaptability: 0.80
• Value: 0.79
3.2.4 Overall Results
The following Table 20 shows, for all studies, the mean and standard deviation of the overall
satisfaction rating and of the calculated KPI. In addition, the correlation between overall
satisfaction and the KPI is shown.
Table 20: Overall satisfaction and KPI for all products.

Product | Satisfaction Mean | Satisfaction Std. Dev. | KPI Mean | KPI Std. Dev. | Corr.
otto.de | 5.48 | 1.24 | 1.27 | 0.90 | 0.71
zalando.de | 5.65 | 0.91 | 1.70 | 0.69 | 0.66
Netflix | 6.06 | 0.99 | 1.73 | 0.74 | 0.77
Amazon Prime | 5.30 | 1.08 | 1.35 | 0.87 | 0.78
Eclipse | 4.21 | 1.74 | 0.40 | 1.37 | 0.83
Visual Studio | 5.55 | 0.97 | 1.59 | 0.57 | 0.71
Thus, the KPI is a good predictor of overall satisfaction (correlations between 0.66 and 0.83).
This holds for all investigated products and all combinations of scales used. It may therefore
be possible to establish a benchmark based on the KPI.
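The Corr column of Table 20 can be reproduced with a plain Pearson correlation between each participant's overall satisfaction rating and his or her KPI; a minimal sketch:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# pearson([1, 2, 3], [2, 4, 6]) -> 1.0
```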
List of Figures
Figure 1: Results of PCA and Factorial Analysis for Aesthetics
Figure 2: Results of PCA and Factorial Analysis for Adaptability
Figure 3: Results of PCA and Factorial Analysis for Usefulness
Figure 4: Results of PCA and Factorial Analysis for Intuitive Use
Figure 5: Results of PCA and Factorial Analysis for Value
Figure 6: Results of PCA and Factorial Analysis for Content Quality
Figure 7: Scale means and confidence intervals per scale for Otto.de
Figure 8: Importance ratings and their confidence intervals for Otto.de
Figure 9: Scale means and confidence intervals per scale for Zalando.de
Figure 10: Importance ratings and their confidence intervals for Zalando.de
Figure 11: Scale means and confidence intervals per scale for Netflix
Figure 12: Importance ratings and their confidence intervals for Netflix
Figure 13: Scale means and confidence intervals per scale for Amazon Prime
Figure 14: Importance ratings and their confidence intervals for Amazon Prime
Figure 15: Scale means and confidence intervals per scale for Eclipse
Figure 16: Importance ratings and their confidence intervals for Eclipse
Figure 17: Scale means and confidence intervals per scale for Visual Studio
Figure 18: Importance ratings and their confidence intervals for Visual Studio
List of Tables
Table 1: Loadings of the candidate items for Aesthetics
Table 2: Loadings of the candidate items for Adaptability
Table 3: Loadings of the candidate items for Usefulness
Table 4: Loadings of the candidate items for Intuitive Use
Table 5: Loadings of the candidate items for Value
Table 6: Loadings of the candidate items for Content Quality
Table 7: Overview of participation and demographic information for all studies
Table 8: Scale means, standard deviations and confidence intervals per scale for Otto.de
Table 9: Importance ratings and confidence intervals for Otto.de
Table 10: Scale means, standard deviations and confidence intervals per scale for Zalando.de
Table 11: Importance ratings and confidence intervals for Zalando.de
Table 12: Scale means, standard deviations and confidence intervals per scale for Netflix
Table 13: Importance ratings and confidence intervals for Netflix
Table 14: Scale means, standard deviations and confidence intervals per scale for Amazon Prime
Table 15: Importance ratings and confidence intervals for Amazon Prime
Table 16: Scale means, standard deviations and confidence intervals per scale for Eclipse
Table 17: Importance ratings and confidence intervals for Eclipse
Table 18: Scale means, standard deviations and confidence intervals per scale for Visual Studio
Table 19: Importance ratings and confidence intervals for Visual Studio
Table 20: Overall satisfaction and KPI for all products
Bibliography
Laugwitz, B., Schrepp, M. & Held, T. (2008). Construction and evaluation of a user experience
questionnaire. In: Holzinger, A. (Ed.): USAB 2008, LNCS 5298, pp. 63-76.

Boos, B. & Brau, H. (2017). Erweiterung des UEQ um die Dimensionen Akustik und Haptik. In:
Hess, S. & Fischer, H. (Eds.), Mensch und Computer 2017 – Usability Professionals.
Regensburg: Gesellschaft für Informatik e.V., pp. 321-327.

Hinderks, A. (2016). Modifikation des User Experience Questionnaire (UEQ) zur Verbesserung
der Reliabilität und Validität. Unpublished master's thesis, University of Applied Sciences
Emden/Leer.

Hinderks, A., Schrepp, M., Mayo, F. J. D., Escalona, M. J. & Thomaschewski, J. (2019).
Developing a UX KPI based on the user experience questionnaire. Computer Standards &
Interfaces.

Winter, D., Hinderks, A., Schrepp, M. & Thomaschewski, J. (2017). Welche UX Faktoren sind
für mein Produkt wichtig? In: Hess, S. & Fischer, H. (Eds.), Mensch und Computer 2017 –
Usability Professionals. Regensburg: Gesellschaft für Informatik e.V., pp. 191-200.

Revelle, W. R. (2017). psych: Procedures for personality and psychological research.
https://CRAN.R-project.org/package=psych
Appendix – Screenshots of the used questionnaires
The following pages show the HTML questionnaires used, as full-page screenshots. The pages
are shown in the original German version used for data collection.