LETTER TO THE EDITOR
On the dimensionality of the System Usability Scale:
a test of alternative measurement models
Simone Borsci · Stefano Federici · Marco Lauriola

S. Borsci (corresponding author), ECoNA, Interuniversity Centre for Research on Cognitive Processing in Natural and Artificial Systems, University of Rome ‘La Sapienza’, Rome, Italy. e-mail: simone.borsci@uniroma1.it; siomone.bo21@alice.it
S. Federici, Department of Human and Educational Sciences, University of Perugia, Perugia, Italy
M. Lauriola, Department of Psychology of Socialization and Development Processes, University of Rome ‘La Sapienza’, Rome, Italy

Received: 30 May 2009 / Revised: 12 June 2009 / Accepted: 15 June 2009 / Published online: 30 June 2009
© Marta Olivetti Belardinelli and Springer-Verlag 2009
Cogn Process (2009) 10:193–197 · DOI 10.1007/s10339-009-0268-9
Abstract The System Usability Scale (SUS), developed by Brooke (Usability evaluation in industry, Taylor & Francis, London, pp 189–194, 1996), has had great success among usability practitioners because it is a quick and easy-to-use measure for collecting users' usability evaluations of a system. Recently, Lewis and Sauro (Proceedings of the human computer interaction international conference (HCII 2009), San Diego CA, USA, 2009) proposed a two-factor structure, Usability (8 items) and Learnability (2 items), suggesting that practitioners might take advantage of these new factors to extract additional information from SUS data. To verify the dimensionality of the SUS' two-component structure, we estimated the parameters of the SUS structure and tested it with a structural equation model on a sample of 196 university users. Our data indicated that both the unidimensional model and the two-factor model with uncorrelated factors proposed by Lewis and Sauro had an unsatisfactory fit to the data. We therefore relaxed the hypothesis that Usability and Learnability are independent components of SUS ratings and tested a less restrictive model with correlated factors. This model not only yielded a good fit to the data, but was also significantly more appropriate for representing the structure of SUS ratings.
Keywords Questionnaire · Usability evaluation · System Usability Scale
Introduction
The System Usability Scale (SUS), developed in 1986 by Digital Equipment Corporation©, is a ten-item scale giving a global assessment of Usability, operationally defined as the subjective perception of interaction with a system (Brooke 1996). The SUS items were developed according to the three usability criteria defined by ISO 9241-11: (1) the ability of users to complete tasks using the system, and the quality of the output of those tasks (i.e., effectiveness); (2) the level of resources consumed in performing tasks (i.e., efficiency); and (3) users' subjective reactions to using the system (i.e., satisfaction).
Practitioners have considered the SUS unidimensional (Brooke 1996; Kirakowski 1994), since the scoring system of this scale results in a single summated rating of overall usability. Such a scoring procedure rests on the assumption that a single latent factor loads on all items. So far, this assumption has been tested with inconsistent results. Whereas Bangor et al. (2008) retrieved a single principal component of SUS items, Lewis and Sauro (2009) suggested a two-factor orthogonal structure, which practitioners may use to score the SUS on independent Usability and Learnability dimensions. This latter finding is deeply inconsistent with the unidimensional SUS scoring system, as items loading on independent Usability and Learnability factors cannot be summated according to classical test theory (Carmines and Zeller 1992). Furthermore, these factor analyses of the SUS were carried out with exploratory techniques; however,
these techniques lack the formal machinery needed to test which of the two proposed factor solutions best accounts for the collected data.
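The summated scoring rule referred to above is worth stating concretely. Below is a minimal sketch in Python of the standard SUS scoring procedure (Brooke 1996): odd-numbered (positively worded) items contribute their response minus 1, even-numbered (negatively worded) items contribute 5 minus their response, and the total is multiplied by 2.5 to give a 0–100 score. The function name and example responses are illustrative, not taken from the letter.

```python
def sus_score(responses):
    """Overall SUS score from ten 1-5 Likert responses (Brooke 1996).

    Odd items (1, 3, ..., 9) contribute response - 1; even items
    (2, 4, ..., 10) contribute 5 - response. The 0-40 total is
    scaled by 2.5 onto a 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item, r in enumerate(responses, start=1):
        if not 1 <= r <= 5:
            raise ValueError(f"item {item}: response {r} outside 1-5")
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return 2.5 * total

# Example: a generally positive response pattern.
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 2]))  # 80.0
```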
Unlike exploratory factor analysis, confirmatory factor analysis (CFA) is a theory-driven approach that requires a priori specification of the number of latent variables (i.e., the factors), of the relations between observed and latent variables (i.e., the factor loadings), and of the correlations among latent variables (Fabrigar et al. 1999). Once the model's parameters have been estimated, the hypothesized model is evaluated according to its ability to replicate the sample data. These features make the CFA approach the most accurate available methodology for comparing alternative factorial structures and deciding which is the best one.
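In standard CFA notation (implied by, though not written out in, the letter), all three models considered below are constrained versions of the same measurement equation and model-implied covariance matrix for the p = 10 observed SUS items:

x = \Lambda \xi + \delta, \qquad \Sigma(\theta) = \Lambda \Phi \Lambda^{\top} + \Theta_{\delta}

where \Lambda is the p \times m matrix of factor loadings, \xi the m latent factors with covariance matrix \Phi, and \Theta_{\delta} the diagonal matrix of unique-error variances. The competing models differ only in m and \Phi: one factor (m = 1); two factors with the off-diagonal element of \Phi fixed to zero (uncorrelated); and two factors with the factor correlation left free (correlated).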
Purpose
In the present study, we aim to compare three alternative factor models of the SUS items: the one-factor solution with an overall usability factor (overall SUS) resulting from Bangor et al. (2008) (Fig. 1a); the two-factor solution resulting from Lewis and Sauro (2009), with uncorrelated Usability and Learnability factors (Fig. 1b); and its less restrictive alternative assuming Usability and Learnability to be correlated factors (Fig. 1c).

Fig. 1 SUS models tested: one-factor (a), two uncorrelated factors (b), two correlated factors (c)
Methods
Procedure
One hundred and ninety-six Italian students of the University of Rome ‘La Sapienza’ (28 males, 168 females, mean age = 21) were asked to navigate a website (http://www.serviziocivile.it) in three consecutive sessions (all the students declared that they had no previous surfing experience with the website):

1. In the first, 20-min pre-experimental training session, the participants were asked to navigate the website freely in order to learn the features, graphic layouts, information structures and layout of the interface.
2. Afterwards, in the second, no-time-limit scenario-based navigation session, the participants were asked to navigate the website following four scenario targets.
3. Finally, in the third, usability evaluation session, the Italian version of the SUS was administered to the participants (Table 1).
Statistical analyses
All models were estimated by the maximum likelihood robust method, as the data were not normally distributed (Mardia's normalized coefficient = 10.72). This method provided us with the Satorra–Bentler scaled chi-square statistic (S–Bχ²), an adjusted measure of fit for non-normal data that is more accurate than the standard ML statistic (Satorra and Bentler 2001). Judged by the model's χ² alone, virtually any factor model can be rejected if the sample size is large enough; therefore, many authors (McDonald and Ho 2002; Widaman and Thompson 2003) have recommended supplementing the evaluation of the model's fit with more "practical" indices. The Comparative Fit Index (CFI; Bentler 1990) was purposefully designed to take sample size into account, as it compares the hypothesized model's χ² with the null model's χ². By convention (Hu and Bentler 1999), a CFI greater than 0.90 indicates an acceptable fit to the data, with values greater than 0.95 being strongly recommended. A second suggested index is the Root Mean Square Error of Approximation (RMSEA; Browne and Cudeck 1993). Like the CFI, the RMSEA is relatively insensitive to sample size, as it measures the difference between the reproduced covariance matrix and the population covariance matrix. Unlike the CFI, the RMSEA is a "badness-of-fit" index: a value of 0 indicates a perfect fit, and the greater the RMSEA, the worse the model's fit. By convention (Hu and Bentler 1999), an RMSEA below 0.05 corresponds to a "good" fit and an RMSEA below 0.08 to an "acceptable" fit.
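Stated explicitly (one standard formulation, following Bentler 1990 and Browne and Cudeck 1993; the letter itself does not spell out the formulas), with M indexing the hypothesized model, 0 the null model, and N the sample size:

\mathrm{CFI} = 1 - \frac{\max(\chi^2_M - df_M,\ 0)}{\max(\chi^2_0 - df_0,\ 0)}, \qquad \mathrm{RMSEA} = \sqrt{\frac{\max(\chi^2_M - df_M,\ 0)}{df_M\,(N - 1)}}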
Results
Table 2 shows that the S–Bχ² was statistically significant for all the models we tested, regardless of the number of factors and of whether the factors were correlated (Bentler 2004). The inspection of the CFI and RMSEA fit indices indicated, however, that the less restrictive model assuming Usability and Learnability as correlated factors (Fig. 1c) resulted in a good fit (i.e., CFI > 0.95 and RMSEA < 0.06), whereas the unidimensional factor model (Fig. 1a) proposed by Bangor et al. (2008) resulted only in an acceptable fit (i.e., CFI > 0.90 and RMSEA < 0.08). In contrast, the two-factor model proposed by Lewis and Sauro (2009) with uncorrelated factors (Fig. 1b) did not meet any of the recommended fit criteria.
Since both Bangor's and Lewis and Sauro's factor models are nested within the less restrictive and best-fitting model (i.e., the model with Usability and Learnability as correlated factors), we could formally compare the fit of each of the models proposed in the literature with the fit of the model in which they were nested. Nevertheless, given that we used the Satorra–Bentler scaled χ² measure for non-multivariate-normal data, we could not merely assess the χ² difference of two nested models. Rather, we assessed the scaled S–Bχ² difference according to the procedure devised by Satorra and Bentler (2001).
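In outline, the scaled difference test works as follows (per Satorra and Bentler 2001): with T_0, T_1 the ordinary ML chi-squares of the restricted and less restricted nested models, d_0, d_1 their degrees of freedom, and c_i = T_i / \bar{T}_i the scaling corrections recovered from the scaled statistics \bar{T}_i, the scaled difference

\bar{T}_d = \frac{T_0 - T_1}{c_d}, \qquad c_d = \frac{d_0 c_0 - d_1 c_1}{d_0 - d_1}

is referred to a \chi^2 distribution with d_0 - d_1 degrees of freedom.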
Table 1 Synoptical table of the English and Italian versions of the SUS

| Original English version | Italian version |
|---|---|
| 1. I think I would like to use this system frequently | 1. Penso che mi piacerebbe utilizzare questo sistema frequentemente |
| 2. I found the system unnecessarily complex | 2. Ho trovato il sistema complesso senza che ce ne fosse bisogno |
| 3. I thought the system was easy to use | 3. Ho trovato il sistema molto semplice da usare |
| 4. I think I would need the support of a technical person to be able to use this system | 4. Penso che avrei bisogno del supporto di una persona già in grado di utilizzare il sistema |
| 5. I found the various functions in this system were well integrated | 5. Ho trovato le varie funzionalità del sistema bene integrate |
| 6. I thought there was too much inconsistency in this system | 6. Ho trovato incoerenze tra le varie funzionalità del sistema |
| 7. I would imagine that most people would learn to use this system very quickly | 7. Penso che la maggior parte delle persone potrebbero imparare ad utilizzare il sistema facilmente |
| 8. I found the system very cumbersome to use | 8. Ho trovato il sistema molto macchinoso da utilizzare |
| 9. I felt very confident using the system | 9. Ho avuto molta confidenza con il sistema durante l'uso |
| 10. I needed to learn a lot of things before I could get going with this system | 10. Ho avuto bisogno di imparare molti processi prima di riuscire ad utilizzare al meglio il sistema |
Table 2 Exact and close fit confirmatory factor analysis statistics/indices, maximum likelihood estimation, for the System Usability Scale

| Model | S–Bχ² (df) | CFI | RMSEA | RMSEA CI |
|---|---|---|---|---|
| One-factor, overall usability | 76.50 (35) | 0.921 | 0.079 | 0.054–0.103 |
| Two-factor, usability and learnability, uncorrelated | 108.58 (35) | 0.857 | 0.105 | 0.083–0.127 |
| Two-factor, usability and learnability, correlated | 54.81 (34) | 0.959 | 0.057 | 0.026–0.083 |

All χ² measures were statistically significant at the 0.001 level
The first contrast, which compared the Lewis and Sauro (2009) model (Fig. 1b) with the less restrictive two-factor model with correlated factors (Fig. 1c), was statistically significant (ΔS–Bχ² = 30.17; df = 1; p < 0.001). Likewise, the second contrast, which compared the unidimensional model (Bangor et al. 2008) (Fig. 1a) with the less restrictive two-factor model with correlated factors (Fig. 1c), was also statistically significant (ΔS–Bχ² = 28.54; df = 1; p < 0.001). Based on the inspection of absolute and relative fit indices, as well as on the formal tests of χ² differences, we may conclude that the two-factor model with correlated factors outperformed both factor models proposed in the literature as an account of the measurement model of the SUS.
The inspection of the model parameters estimated for the best-fitting model (Table 3) indicated that all the SUS items loaded significantly on the appropriate factor, with factor loadings ranging from |0.44| to |0.74| for Usability and greater than 0.70 for Learnability. Accordingly, the factor reliability assessed by the ω coefficient¹ yielded fairly high values, 0.81 and 0.76 for the Usability and Learnability factors, respectively. The correlation between Usability and Learnability was positive and significant (r = 0.70), showing that the greater the perceived Usability, the greater the perceived Learnability.

Table 3 Maximum likelihood standardized solution for the two-factor model of the System Usability Scale

| Item | λ Usability | λ Learnability | Var e |
|---|---|---|---|
| Q1 | 0.440 | | 0.898 |
| Q2 | -0.737 | | 0.676 |
| Q3 | 0.750 | | 0.662 |
| Q4 | | 0.752 | 0.660 |
| Q5 | 0.629 | | 0.777 |
| Q6 | -0.578 | | 0.816 |
| Q7 | 0.670 | | 0.742 |
| Q8 | -0.600 | | 0.800 |
| Q9 | 0.681 | | 0.732 |
| Q10 | | 0.712 | 0.702 |

¹ ω = (Σλᵢ)² / [(Σλᵢ)² + Σ Var(eᵢ)], where the λᵢ are the standardized factor loadings for the factor and Var(eᵢ) is the error variance associated with each individual indicator variable (both reported in Table 3).
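As a concrete illustration of the ω formula in the footnote to Table 3, here is a minimal sketch in Python; the helper name and the two-item example values are hypothetical, chosen only to show the computation, and are not the factor solution reported above.

```python
def omega(loadings, error_variances):
    """McDonald's omega for a single factor:
    (sum |lambda_i|)^2 / ((sum |lambda_i|)^2 + sum Var(e_i)).
    Loadings enter by magnitude, i.e. after reflecting
    reverse-scored items."""
    s = sum(abs(lam) for lam in loadings)
    return s * s / (s * s + sum(error_variances))

# Hypothetical two-item factor with standardized loadings 0.80 and
# 0.75 (error variances 1 - lambda^2, rounded: 0.36 and 0.44):
print(round(omega([0.80, 0.75], [0.36, 0.44]), 2))  # 0.75
```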
Conclusions
Although the SUS is one of the most widely used questionnaires for evaluating the usability of systems, recent contributions have provided inconsistent results regarding the factorial structure of its items, which in turn has important consequences for determining the most appropriate scoring system of this scale for practitioners and researchers. The traditional unidimensional structure (Brooke 1996; Kirakowski 1994; Bangor et al. 2008) has been challenged by the more recent view of Lewis and Sauro (2009), which assumes Learnability and Usability to be independent factors. Based on a relatively large sample of users' evaluations of an existing website, we tested which of the two alternative models was the best for SUS ratings. Our data indicated that both of the proposed models had an unsatisfactory fit to the data, with the unidimensional model being too narrow to represent the contents of all SUS items and the two-factor model with uncorrelated factors being too restrictive in its psychometric assumptions. We therefore relaxed the hypothesis that Usability and Learnability are independent components of SUS ratings and tested a less restrictive model with correlated factors. This model not only yielded a good fit to the data, but was also significantly more appropriate for representing the structure of SUS ratings. Although the literature has reported greater reliability coefficients (e.g., >0.80) for the overall SUS scale, the reliability of the two Learnability and Usability factors was in keeping with the psychometric standards required for short scales (Carmines and Zeller 1992). Thus, we propose that future usability studies may evaluate systems according to the scoring rule suggested by Lewis and Sauro (2009), which is fully consistent with the bidimensional, best-fitting model we retrieved in this study. However, since we found a substantial correlation between the Usability and Learnability factors, future studies should clarify under which circumstances researchers may expect to obtain Usability scores dissociated from Learnability scores (e.g., systems with high Learnability but low Usability). In the present study, users evaluated a single system (i.e., the serviziocivile.it website), and this might have inflated the association between the two factors. Alternatively, our sample of users, which comprised college students, might be considered a sample with high computer skills compared with the general population, and this might also have inflated the factor correlation. Further studies of the SUS should therefore consider different combinations of systems and users to test the generality of the correlation between the two factors.
References
Bangor A, Kortum PT, Miller JT (2008) An empirical evaluation of the System Usability Scale. Int J Hum Comput Interact 24:574–594
Bentler PM (1990) Comparative fit indexes in structural models. Psychol Bull 107:238–246
Bentler PM (2004) EQS structural equations modeling software (Version 6.1) [Computer software]. Multivariate Software, Encino
Brooke J (1996) SUS: a 'quick and dirty' usability scale. In: Jordan PW, Thomas B, Weerdmeester BA, McClelland IL (eds) Usability evaluation in industry. Taylor & Francis, London, pp 189–194
Browne MW, Cudeck R (1993) Alternative ways of assessing model fit. In: Bollen KA, Long JS (eds) Testing structural equation models. Sage, Beverly Hills, pp 136–162
Carmines EG, Zeller RA (1992) Reliability and validity assessment. Sage, Beverly Hills
Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ (1999) Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods 4:272–299
Hu L, Bentler PM (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model 6:1–55
Kirakowski J (1994) The use of questionnaire methods for usability assessment (unpublished manuscript). http://sumi.ucc.ie/sumipapp.html
Lewis JR, Sauro J (2009) The factor structure of the System Usability Scale. In: Proceedings of the human computer interaction international conference (HCII 2009), San Diego CA, USA
McDonald RP, Ho MR (2002) Principles and practice in reporting structural equation analyses. Psychol Methods 7:64–82
Satorra A, Bentler PM (2001) A scaled difference chi-square test statistic for moment structure analysis. Psychometrika 66:507–514
Widaman KF, Thompson JS (2003) On specifying the null model for incremental fit indices in structural equation modeling. Psychol Methods 8:16–37