Intl. Journal of Human–Computer Interaction, 31: 112–117, 2015
Copyright © Taylor & Francis Group, LLC
ISSN: 1044-7318 print / 1532-7590 online
DOI: 10.1080/10447318.2014.986634
A Slovene Translation of the System Usability Scale: The SUS-SI
Bojan Blažica¹ and James R. Lewis²
¹XLAB Research, Ljubljana, Slovenia
²IBM Corporation, Software Group, Boca Raton, Florida, USA

Address correspondence to Bojan Blažica, XLAB Research, Pot za Brdom 100, SI-1000 Ljubljana, Slovenia. E-mail: bojan.blazica@xlab.si
The System Usability Scale (SUS) is a widely adopted and
studied questionnaire for usability evaluation. It is technology
independent and has been used to evaluate the perceived usability
of a broad range of products, including hardware, software, and
websites. In this article we present a Slovene translation of the SUS
(the SUS-SI) along with the procedure used in its translation and
psychometric evaluation. The results indicated that the SUS-SI has
properties similar to the English version. Slovene usability prac-
titioners should be able to use the SUS-SI with confidence when
conducting user research.
1. INTRODUCTION
1.1. The System Usability Scale
The System Usability Scale (SUS) was created in the 1980s
by John Brooke at DEC (published in 1996). Since then,
usability practitioners have used it to evaluate the perceived
usability of different types of systems including websites, hard-
ware products, and consumer software. It has even been used
to assess systems based on technologies that did not exist when
it was developed (Bangor, Kortum, & Miller, 2008). Sauro and
Lewis (2009) reported that in a collection of 90 unpublished
usability studies, the SUS was the most commonly used stan-
dardized usability questionnaire, accounting for 43% of posttest
questionnaire usage. It has been cited in more than 1,200 pub-
lications and incorporated into commercial usability toolkits
(Brooke, 2013).
The SUS is a 10-item questionnaire (see Table 1) in which
respondents indicate their level of agreement with each item on
a scale from 1 (strongly disagree) to 5 (strongly agree). The odd-
numbered items have a positive tone and the even-numbered
items have a negative tone. To align the mixed-tone items, it
is necessary to transform the raw scores by subtracting 1 from
responses to odd items and subtracting the responses for even
numbers from 5, resulting in transformed scores that range from
0 (low perceived usability) to 4 (high perceived usability). The
final SUS score is the sum of the converted scores multiplied
by 2.5, producing scores that can range from 0 to 100. The
conversion of SUS scores to a scale that can range from 0 to
100 should make it easier for usability practitioners and product
managers to communicate (Brooke, 2013).
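To make the scoring procedure concrete, here is a minimal Python sketch; the function name and input checks are ours, not part of the published scale:

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten raw 1-5 responses.

    responses: list of ten integers, item 1 first.
    Odd-numbered items (positive tone) contribute response - 1;
    even-numbered items (negative tone) contribute 5 - response.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS requires ten responses on a 1-5 scale")
    total = 0
    for i, r in enumerate(responses, start=1):
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# A respondent who strongly agrees with every positive item and
# strongly disagrees with every negative item scores the maximum.
print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]))  # -> 100.0
```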
1.2. The Need for Translation
Standardized usability questionnaires such as the SUS are
a basic building block of usability research (Kirakowski &
Murphy, 2009). When questionnaires are available only in
English, they are useful only with people who are fluent in
English. Even in that case, cultural differences between native
English speakers and nonnative speakers may affect their valid-
ity (Finstad, 2006; van de Vijver & Leung, 2001). Thus, the
primary motivation for translating and validating these ques-
tionnaires is to extend their use to groups who are not fluent
in English.
To the best of our knowledge, there has been no Slovene
translation of a standardized usability questionnaire that has
included psychometric evaluation. Several research papers pro-
vide models for translating standardized usability question-
naires. A recent example is the translation of the Computer
System Usability Questionnaire into Turkish (Erdinç & Lewis,
2013; Lewis, 1995). For the SUS, Raita and Oulasvirta (2011)
reported the use of a Finnish translation, two recent German
translations are available (Lohmann & Schäffer, 2013; Rummel,
Ruegenhagen, & Reinhardt, 2013), and Göransson (2011) has
provided a Swedish translation. There have been three rela-
tively recent Slovene translations of the SUS (Kodžoman, 2012;
Pipan, 2011; Stojmenova, 2009).
The Slovene, Finnish, and Swedish translations were, how-
ever, ad hoc translations in the sense that they lacked any
psychometric evaluation (or at least did not report steps to
achieve validation). One of the German translations (Rummel
et al., 2013) reported validation with back-translation—a trans-
lation of a translated text back into the language of the
original text, made without reference to the original text.
Back-translation alone, however, does not provide evidence
that a translated questionnaire and the original have similar
psychometric properties.
TABLE 1
Items of the System Usability Scale (SUS) and Their Translation Into Slovene

English Version of SUS / Slovene Version (SUS-SI)

1. I think that I would like to use this system frequently. / Menim, da bi ta sistem rad pogosto uporabljal.
2. I found the system unnecessarily complex. / Sistem se mi je zdel po nepotrebnem zapleten.
3. I thought the system was easy to use. / Sistem se mi je zdel enostaven za uporabo.
4. I think that I would need the support of a technical person to be able to use this system. / Menim, da bi za uporabo tega sistema potreboval pomoč tehnika.
5. I found the various functions in this system were well integrated. / Različne funkcije tega sistema so se mi zdele dobro povezane v smiselno celoto.
6. I thought there was too much inconsistency in this system. / Sistem se mi je zdel preveč nekonsistenten.
7. I would imagine that most people would learn to use this system very quickly. / Menim, da bi se večina uporabnikov zelo hitro naučila uporabljati ta sistem.
8. I found the system very cumbersome/awkward to use. / Sistem se mi je zdel neroden za uporabo.
9. I felt very confident using the system. / Pri uporabi sistema sem bil zelo suveren.
10. I needed to learn a lot of things before I could get going with this system. / Preden sem osvojil uporabo tega sistema, sem se moral naučiti veliko stvari.

The anchors: strongly disagree 1 2 3 4 5 strongly agree / sploh se ne strinjam 1 2 3 4 5 se povsem strinjam
1.3. Previous Psychometric Evaluations of the SUS
The primary focus of psychometric evaluation is to deter-
mine a questionnaire’s reliability, validity (content, construct,
and concurrent), and sensitivity.
Reliability. Reliability was assessed using coefficient alpha
(Cronbach, 1951). Strictly speaking, coefficient alpha is a
measure of internal consistency, but it is the most widely
used method for estimating reliability (Sauro & Lewis, 2012).
Despite some criticisms against its use (Sijtsma, 2009), it has a
mathematical relationship to more direct estimates of reliabil-
ity (e.g., test–retest) in that it provides a lower bound estimate
of reliability. Thus, estimates of coefficient alpha provide a
conservative estimate of reliability. Furthermore, there are well-
established guidelines for acceptable values of coefficient alpha
in the development of standardized questionnaires, with an
acceptable range from .70 to .95 (Landauer, 1997; Lindgaard
& Kirakowski, 2013; Nunnally, 1978). The earliest report of the
reliability of the SUS was .85 (Lucey, 1991). More recent large-
sample evaluations indicate reliability just over .9 (Bangor et al.,
2008; Lewis & Sauro, 2009; Sauro & Lewis, 2011).
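For readers who want to reproduce this kind of estimate, the following is a minimal sketch of coefficient alpha computed from a respondents-by-items score matrix; the variable names are ours, and the computation is the standard one from Cronbach (1951):

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)
```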
Validity. Content validity results from the method used to
select items for a questionnaire. The initial item pool for the
SUS contained 50 items with content related to usability. From
that initial set, Brooke (1996) selected the 10 that most discrim-
inated between two systems, one known to be relatively easy to
use and one known to be more difficult.
Concurrent validity refers to the correlations between met-
rics collected at the same time and expected to have some
relationship. A typical minimum criterion for evidence of
concurrent validity is a correlation of .30 (Nunnally, 1978).
Bangor et al. (2008) reported a significant correlation (r = .81) between the SUS and a single 7-point rating of user friendliness. The SUS also correlates significantly (r = .62) with a 10-point rating of likelihood-to-recommend (LTR; Sauro & Lewis, 2012). Reported correlations between the Usability
Metric for User Experience and SUS are highly significant,
ranging from .90 to .97 (Finstad, 2010; Lewis, Utesch, & Maher,
2013).
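A concurrent-validity check of this kind reduces to a simple correlation. The sketch below uses invented paired ratings purely for illustration; scipy's pearsonr is one standard implementation, and the .30 criterion is Nunnally's:

```python
from scipy.stats import pearsonr

# Hypothetical paired ratings for eight respondents (illustration only).
sus       = [72.5, 85.0, 60.0, 90.0, 77.5, 55.0, 82.5, 67.5]
criterion = [7, 9, 5, 10, 8, 4, 9, 6]  # e.g., 0-10 likelihood-to-recommend

r, p = pearsonr(sus, criterion)
print(f"r = {r:.2f}, p = {p:.4f}")
# A typical minimum criterion for evidence of concurrent validity is r >= .30.
```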
A standardized questionnaire has construct validity when a
factor analysis shows that its items align as expected with its
hypothesized factors. The initial expectation was that the SUS
would have one underlying factor (Brooke, 2013), but recent
research has indicated that this is probably not the case. Lewis
and Sauro (2009), analyzing their own data and reanalyzing data
from Bangor et al. (2008), found two underlying factors (Items 4 and 10 aligned with a factor labeled Learnable; the remaining items aligned with a factor labeled Usable). An indepen-
dent evaluation using a different analytical method replicated
this finding (Borsci, Federici, & Lauriola, 2009). More recent
research has continued to indicate two underlying factors but
has not replicated exactly the same factor structure. Lewis et al.
(2013) essentially replicated the Usable/Learnable structure,
but Item 1 had a smaller than expected loading with Usable.
Sauro and Lewis (2011) found Items 6 and 8 aligning with Items
4 and 10.
Thus, the common finding across the various structural anal-
yses is that Items 4 and 10 consistently align on a second
factor. When other items align on that factor, they are also
even-numbered (negative-tone) items. A number of researchers
have noted the tendency for positive- and negative-tone items
to load on separate factors (Barnette, 2000; Davis, 1989; Pilotte & Gable, 1990; Sauro & Lewis, 2011; Schmitt & Stults, 1985; Schriesheim & Hill, 1981; Wong, Rindfleisch, & Burroughs, 2003).
Sensitivity. When a metric responds appropriately to
manipulation, that metric is sensitive. Metrics that are reli-
able and valid tend to be sensitive. The original item-selection
strategy for the SUS was to include the items that best discrim-
inated between test systems of known poor and good usability
(Brooke, 1996). Numerous other researchers have reported the
detection of significant differences using the SUS. For example,
Bangor et al. (2008) found the SUS to be sensitive to differ-
ences among types of interfaces and changes made to a product.
Kortum and Bangor (2013) reported widely varying mean SUS
scores for different types of everyday products.
Using a different approach, Tullis and Stetson (2004) con-
ducted an experiment to investigate the relative sensitivities of
five poststudy usability questionnaires (SUS, QUIS, Computer
System Usability Questionnaire, Words, Fidelity question-
naire). One hundred twenty-three Fidelity employees attempted
tasks at two financial websites in counterbalanced order, com-
pleting the same randomly assigned questionnaire after experi-
encing each site. There was a clear difference in the perceived
usability of the sites, regardless of questionnaire. Tullis and
Stetson then conducted Monte Carlo experiments with sam-
ple sizes varying from six to 14 to see which questionnaire
most frequently identified (via a significant t test) the more
usable website. The SUS was the fastest to converge, with 75%
agreement at a sample size of eight and 100% agreement at 12.
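Their resampling logic can be sketched as follows. This is our reconstruction of the general Monte Carlo approach, not Tullis and Stetson's code, and it assumes two arrays of per-participant SUS scores from the two sites:

```python
import numpy as np
from scipy.stats import ttest_ind

def agreement_rate(scores_a, scores_b, n, trials=1000, alpha=0.05, seed=0):
    """Fraction of random subsamples of size n in which a t test finds
    site A significantly better than site B (A being the truly better site)."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(trials):
        sub_a = rng.choice(scores_a, size=n, replace=False)
        sub_b = rng.choice(scores_b, size=n, replace=False)
        t, p = ttest_ind(sub_a, sub_b)
        if p < alpha and sub_a.mean() > sub_b.mean():
            hits += 1
    return hits / trials
```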
1.4. SUS Norms
By itself, a score (individual or average) has no meaning.
One way to provide meaning is through comparison (t test or F test), either against an established benchmark or com-
paring two sets of data (e.g., different products, different user
groups). Another approach, relatively rare in usability work, is
comparison with norms based on data collected from represen-
tative sets of users completing representative tasks. Comparison
with norms allows assessment of how good or bad a score is,
although one must always be cautious regarding the extent to
which a new sample matches the normative sample (Anastasi,
1976).
In recent years, a number of researchers have published data
from their use of the SUS suitable for the development of
norms. Sauro (2011), using data from 500 unpublished indus-
trial usability studies, found an average SUS score of 68. For a
curved grading scale based on this data, see Sauro and Lewis
(2012, p. 204, Table 8.6). Using that scale, scores between
65 and 71 receive a C (average). An A– ranges from 78.9 to
80.7. To get an A+, the mean SUS score needs to exceed 84.1.
They also provided a breakdown by product type (Table 8.7,
p. 205).
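As a concrete illustration, the lookup below encodes only the cut points quoted above; the published curved grading scale (Sauro & Lewis, 2012, Table 8.6) defines the full set of bands, so scores outside these ranges return None in this partial sketch:

```python
def sus_grade_partial(score):
    """Map a mean SUS score to the grade bands quoted in the text.

    Only the bands named above are encoded; the published scale
    defines the remaining bands (B range, D, F, plus/minus steps).
    """
    if score > 84.1:
        return "A+"
    if 78.9 <= score <= 80.7:
        return "A-"
    if 65 <= score <= 71:
        return "C"
    return None  # band not covered by this partial sketch

print(sus_grade_partial(68))  # -> "C" (the overall average SUS score)
```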
Kortum and Bangor (2013) used the SUS to survey the per-
ceived usability of 14 everyday products. Respondents did not
perform any tasks but rather rated the products based on past
experience. The sample sizes for the different products ranged
from 252 to 980. The lowest scoring product was Excel (M = 56.5, Sauro-Lewis grade of D), 95% confidence interval (CI) [55.3, 57.7]. The highest scoring was Google search (M = 93.4, Sauro-Lewis grade of A+), 95% CI [92.7, 94.1]. Also scoring relatively high was Gmail (M = 83.5, Sauro-Lewis grades ranging from A to A–), 95% CI [82.2, 84.8].
2. TRANSLATION OF THE SUS INTO SLOVENE
There were three stages in the translation process. First,
10 reviewers from the fields of computer and natural sciences
individually reviewed a draft translation. Second, the final trans-
lation incorporated their comments. The third stage was to
perform a back-translation. Three independent translators, with-
out reference to the original, translated the final draft back into
English. The translators were native Slovene speakers fluent in
English. For all 10 items, all three translators provided back-
translations with the same meaning as the original and, in some
cases, exactly the same wording. For example, Item 9, “I felt
very confident using the system,” was back-translated to “I was
very self-reliant when using the system,” “I felt very confident
using this system,” and “I felt confident when using the system.”
Table 1 shows the original English and final Slovene versions of
the items.
3. PSYCHOMETRIC EVALUATION OF THE SUS-SI
3.1. Method
Using the method of Kortum and Bangor (2013), we con-
ducted an online survey in which 182 respondents (114 male,
68 female) provided ratings of Gmail using the SUS-SI. The
survey was disseminated among native speakers from Slovenia.
Respondents also provided a standard rating of LTR using a
0-to-10 point scale (Sauro & Lewis, 2012). The participants’
ages ranged from 19 to 67 with an average of 29. With regard
to education, 106 were college graduates. Most respondents
(142) reported using Gmail at least once each day.
3.2. Reliability
The scale reliability, assessed using coefficient alpha, was
.81. This is a bit lower than the value typically reported for the
English version (.92) but is well over the minimum criterion of
.70 for acceptable reliability (Landauer, 1997; Nunnally, 1978).
3.3. Concurrent Validity
The correlation between the overall SUS score and LTR was a statistically significant .52, t(179) = 8.25, p < .0001, 95% CI [.41, .62]. This is significantly greater than the typical criterion of at least .30, and the upper limit of the interval matches the correlation of .62 reported by Sauro and Lewis (2012) for the standard SUS.
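For reference, the reported t statistic follows from the standard significance test of a correlation coefficient; with r = .52 and n - 2 = 179 degrees of freedom,

$$ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{.52\sqrt{179}}{\sqrt{1-.52^2}} \approx 8.1, $$

which agrees with the reported t(179) = 8.25 up to rounding of the correlation.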
FIG. 1. Comparison of the Slovene version of the System Usability Scale (SUS-SI) rating of Gmail with the System Usability Scale (SUS) rating from Kortum and Bangor (2013). Note. CI = confidence interval. [Figure shows the mean SUS score for Gmail, with 95% CI upper and lower limits, for each study on a scale from 75 to 100.]
3.4. Construct Validity
Table 2 shows the two-factor solution for the SUS-SI.
Consistent with the pattern reported in Sauro and Lewis (2011),
Items 1, 2, 3, 5, 7, and 9 aligned with the first factor, and Items
4, 6, 8, and 10 aligned with the second.
TABLE 2
Varimax-Rotated Two-Factor Solution for the Slovene Version of the System Usability Scale

Item    Factor 1    Factor 2
 1      .600*       .115
 2      .504*       .440
 3      .549*       .175
 4      .126        .617*
 5      .601*       .119
 6      .451        .571*
 7      .439*       .145
 8      .361        .722*
 9      .400*       .177
10      .081        .589*

Note. An asterisk denotes the dominating factor for each item in the questionnaire (shown in bold typeface in the published table).
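A varimax-rotated two-factor solution like this one can be reproduced with standard tooling. The sketch below uses the factor_analyzer package on a placeholder response matrix; it illustrates the general procedure and is not the analysis code used for Table 2:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer  # pip install factor_analyzer

# Placeholder data: replace with the real (respondents x 10 items)
# matrix of raw 1-5 SUS responses.
rng = np.random.default_rng(0)
X = rng.integers(1, 6, size=(182, 10)).astype(float)

fa = FactorAnalyzer(n_factors=2, rotation="varimax")
fa.fit(X)
loadings = fa.loadings_                     # 10 x 2 matrix, as in Table 2
dominant = np.abs(loadings).argmax(axis=1)  # dominant factor per item
print(np.round(loadings, 3))
```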
3.5. Sensitivity
The SUS-SI was sensitive to differences in frequency of use, F(3, 180) = 5.8, p = .001. Respondents who reported a greater frequency of use tended to provide higher SUS ratings, r(181) = .20, p = .01.
3.6. Normative Comparison
The overall mean SUS-SI was 81.7 with a standard devia-
tion of 13.5, 95% CI [79.7, 83.7]. This is close to the Gmail
value reported by Kortum and Bangor (2013). Their Gmail data,
collected using the English version of the SUS, had a mean of
83.5 (n = 605; SD = 15.9), 95% CI [82.2, 84.8]. As shown in
Figure 1, these confidence intervals overlap substantially, indi-
cating that the Gmail results for the SUS-SI corresponded with
the norm published by Kortum and Bangor.
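Both intervals follow from the usual normal-approximation confidence interval for a mean. For the SUS-SI sample,

$$ \bar{x} \pm 1.96\,\frac{s}{\sqrt{n}} = 81.7 \pm 1.96 \times \frac{13.5}{\sqrt{182}} \approx 81.7 \pm 2.0 = [79.7,\ 83.7], $$

and the same formula with M = 83.5, SD = 15.9, and n = 605 reproduces Kortum and Bangor's [82.2, 84.8].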
4. DISCUSSION
The primary goal of this research was to translate and val-
idate the SUS for use by speakers of Slovene (the SUS-SI).
The multistage translation process included the steps of initial
translation, expert review, and back-translation. Psychometric
evaluation of the SUS-SI indicated an acceptable level of relia-
bility. A strong correlation between the SUS-SI and a rating of
LTR provided evidence of concurrent validity. Factor analysis
consistent with a structure reported for the standard SUS indi-
cated appropriate construct validity. Consistent with expectation
given its reliability and validity, the SUS-SI was significantly
sensitive to differences in reported frequency of use. Finally,
the overall mean for the SUS-SI rating of Gmail was consis-
tent with published norms for the standard SUS. These results
indicate that Slovene usability practitioners should be able to
use the SUS-SI with confidence when conducting user research.
Future work with the SUS-SI should concentrate on two
areas. Researchers should extend this work to the evaluation
of different products, focusing on the products investigated
by Kortum and Bangor (2013) to see if the correspondence
between the SUS-SI and SUS for Gmail holds for other products
and product types. It would also be useful to conduct experi-
ments on systems of varying usability to check for consistency
of sensitivity between the SUS-SI and the SUS.
ORCID
Bojan Blažica http://orcid.org/0000-0003-4597-5947
FUNDING
This research was funded in part by the European Union,
European Social Fund, Operational Program for Human Resources Development for the Period 2007–2013. We thank all who contributed to the creation and validation of the SUS-SI by translating, back-translating, or participating in the survey.
REFERENCES
Anastasi, A. (1976). Psychological testing. New York, NY: Macmillan.
Bangor, A., Kortum, P. T., & Miller, J. T. (2008). An empirical evaluation of the System Usability Scale. International Journal of Human–Computer Interaction, 24, 574–594.
Barnette, J. J. (2000). Effects of stem and Likert response option reversals on survey internal consistency: If you feel the need, there is a better alternative to using those negatively worded stems. Educational and Psychological Measurement, 60, 361–370.
Borsci, S., Federici, S., & Lauriola, M. (2009). On the dimensionality of the System Usability Scale: A test of alternative measurement models. Cognitive Processing, 10, 193–197.
Brooke, J. (1996). SUS—A "quick and dirty" usability scale. In P. W. Jordan (Ed.), Usability evaluation in industry (pp. 189–194). London, UK: Taylor & Francis.
Brooke, J. (2013). SUS: A retrospective. Journal of Usability Studies, 8(2), 29–40.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334.
Davis, F. D. (1989). Perceived usefulness, perceived ease of use, and user acceptance of information technology. MIS Quarterly, 13, 319–339.
Erdinç, O., & Lewis, J. R. (2013). Psychometric evaluation of the T-CSUQ: The Turkish version of the Computer System Usability Questionnaire. International Journal of Human–Computer Interaction, 29, 319–326.
Finstad, K. (2006). The System Usability Scale and non-native English speakers. Journal of Usability Studies, 1, 185–188.
Finstad, K. (2010). The Usability Metric for User Experience. Interacting with Computers, 22, 323–327.
Göransson, B. (2011). SUS Svensk: System Usability Scale in Swedish. Available from http://rosenfeldmedia.com/books/survey-design/blog/sus_svensk_system_usability_sc/
Kirakowski, J., & Murphy, R. (2009). A comparison of current approaches to usability measurement. Proceedings of the Irish Human–Computer Interaction Conference 2009, TCD, pp. 13–17.
Kodžoman, J. (2012). Model ocenjevanja uporabnosti spletnih trgovin v tujem jeziku [The evaluation model for usability of Internet shops in foreign language] (Unpublished bachelor thesis). Univerza v Ljubljani, Ljubljana, Slovenia. Retrieved from http://eprints.fri.uni-lj.si/1678/1/Kod%C5%BEoman-1.pdf
Kortum, P. T., & Bangor, A. (2013). Usability ratings for everyday products measured with the System Usability Scale. International Journal of Human–Computer Interaction, 29, 67–76.
Landauer, T. K. (1997). Behavioral research methods in human–computer interaction. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human–computer interaction (2nd ed., pp. 203–227). Amsterdam, the Netherlands: Elsevier.
Lewis, J. R. (1995). IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use. International Journal of Human–Computer Interaction, 7, 57–78.
Lewis, J. R., & Sauro, J. (2009). The factor structure of the System Usability Scale. In M. Kurosu (Ed.), Human-centered design (pp. 94–103). Heidelberg, Germany: Springer-Verlag.
Lewis, J. R., Utesch, B. S., & Maher, D. E. (2013). UMUX-LITE—When there's no time for the SUS. In Proceedings of CHI 2013 (pp. 2099–2102). Paris, France: ACM.
Lindgaard, G., & Kirakowski, J. (2013). Introduction to the special issue: The tricky landscape of developing rating scales in HCI. Interacting with Computers, 25, 271–277.
Lohmann, K., & Schäffer, J. (2013). System Usability Scale (SUS)—An improved German translation of the questionnaire. Retrieved from http://minds.coremedia.com/2013/09/18/sus-scale-an-improved-german-translation-questionnaire/
Lucey, N. M. (1991). More than meets the I: User satisfaction of computer systems (Unpublished thesis for diploma in Applied Psychology). University College Cork, Cork, Ireland.
Nunnally, J. C. (1978). Psychometric theory. New York, NY: McGraw-Hill.
Pilotte, W. J., & Gable, R. K. (1990). The impact of positive and negative item stems on the validity of a computer anxiety scale. Educational and Psychological Measurement, 50, 603–610.
Pipan, M. (2011, March 17). Uporaba tehnologije sledenja oči pri ocenjevanju uporabnosti e-učnega okolja [The use of eye-tracking for usability evaluation of e-learning environments]. Paper presented at Konferenca o E-Izobraževanju. Retrieved from http://www.b2.eu/portals/0/E-izobrazevanje/konferenca/09-IJS-Matic-Pipan.pdf
Raita, E., & Oulasvirta, A. (2011). Too good to be bad: Favorable product expectations boost subjective usability ratings. Interacting with Computers, 23, 363–371.
Rummel, B., Ruegenhagen, E., & Reinhardt, W. (2013). System Usability Scale [German trans.]. Retrieved from http://www.sapdesignguild.org/resources/sus.asp
Sauro, J. (2011). A practical guide to the System Usability Scale (SUS): Background, benchmarks & best practices. Denver, CO: Measuring Usability LLC.
Sauro, J., & Lewis, J. R. (2009). Correlations among prototypical usability metrics: Evidence for the construct of usability. In Proceedings of CHI 2009 (pp. 1609–1618). Boston, MA: ACM.
Sauro, J., & Lewis, J. R. (2011). When designing usability questionnaires, does it hurt to be positive? In Proceedings of CHI 2011 (pp. 2215–2223). Vancouver, Canada: ACM.
Sauro, J., & Lewis, J. R. (2012). Quantifying the user experience: Practical statistics for user research. Burlington, MA: Morgan Kaufmann.
Schmitt, N., & Stults, D. M. (1985). Factors defined by negatively keyed items: The result of careless respondents? Applied Psychological Measurement, 9, 367–373.
Schriesheim, C. A., & Hill, K. D. (1981). Controlling acquiescence response bias by item reversals: The effect on questionnaire validity. Educational and Psychological Measurement, 41, 1101–1114.
Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach's alpha. Psychometrika, 74, 107–120.
Stojmenova, E. (2009). Ocenjevanje uporabnosti spletne aplikacije Web Communicator [Usability evaluation of web-based application Web Communicator] (Unpublished bachelor thesis). FERI. Available at http://dkum.uni-mb.si/IzpisGradiva.php?id=11082
Tullis, T. S., & Stetson, J. N. (2004). A comparison of questionnaires for assessing website usability. Paper presented at the Usability Professionals Association Annual Conference, Minneapolis, MN. Available from http://www.userfocus.co.uk/articles/satisfaction.html
van de Vijver, F. J. R., & Leung, K. (2001). Personality in cultural context: Methodological issues. Journal of Personality, 69, 1007–1031.
Wong, N., Rindfleisch, A., & Burroughs, J. (2003). Do reverse-worded items confound measures in cross-cultural consumer research? The case of the material values scale. Journal of Consumer Research, 30, 72–91.
ABOUT THE AUTHORS
Bojan Blažica is an electrical engineer with a Ph.D. in the fields of human-computer interaction and artificial intelligence. He is focused on the context awareness of natural
user interfaces in general and multitouch displays specifically
as well as usability evaluation and user experience design.
He is one of the initiators of the Slovenian HCI community
(http://www.hci.si).
James R. Lewis is a senior human factors engineer (at IBM
since 1981), focusing on the design/evaluation of speech appli-
cations. He has published influential papers in the areas of
usability testing and measurement. His books include Practical
Speech User Interface Design and (with Jeff Sauro) Quantifying
the User Experience.