LETTER TO THE EDITOR
On the dimensionality of the System Usability Scale:
a test of alternative measurement models
Simone Borsci · Stefano Federici · Marco Lauriola
Received: 30 May 2009 / Revised: 12 June 2009 / Accepted: 15 June 2009 / Published online: 30 June 2009
© Marta Olivetti Belardinelli and Springer-Verlag 2009

S. Borsci (corresponding author)
ECoNA, Interuniversity Centre for Research on Cognitive Processing in Natural and Artificial Systems, University of Rome "La Sapienza", Rome, Italy
e-mail: simone.borsci@uniroma1.it; siomone.bo21@alice.it

S. Federici
Department of Human and Educational Sciences, University of Perugia, Perugia, Italy

M. Lauriola
Department of Psychology of Socialization and Development Processes, University of Rome "La Sapienza", Rome, Italy
Abstract  The System Usability Scale (SUS), developed by Brooke (Usability evaluation in industry, Taylor & Francis, London, pp 189–194, 1996), has enjoyed great success among usability practitioners because it is a quick and easy-to-use measure for collecting users' usability evaluations of a system. Recently, Lewis and Sauro (Proceedings of the human computer interaction international conference (HCII 2009), San Diego CA, USA, 2009) have proposed a two-factor structure consisting of Usability (8 items) and Learnability (2 items), suggesting that practitioners might take advantage of these new factors to extract additional information from SUS data. In order to verify the dimensionality of the SUS' two-component structure, we estimated the parameters of, and tested with a structural equation model, the SUS structure on a sample of 196 university users. Our data indicated that both the unidimensional model and the two-factor model with uncorrelated factors proposed by Lewis and Sauro (2009) had an unsatisfactory fit to the data. We therefore relaxed the hypothesis that Usability and Learnability are independent components of SUS ratings and tested a less restrictive model with correlated factors. This model not only yielded a good fit to the data, but was also significantly more appropriate for representing the structure of SUS ratings.
Keywords  Questionnaire · Usability evaluation · System Usability Scale
Introduction
The System Usability Scale (SUS), developed in 1986 by Digital Equipment Corporation©, is a ten-item scale giving a global assessment of Usability, operationally defined as the subjective perception of interaction with a system (Brooke 1996). The SUS items were developed according to the three usability criteria defined by ISO 9241-11: (1) the ability of users to complete tasks using the system, and the quality of the output of those tasks (i.e., effectiveness); (2) the level of resources consumed in performing tasks (i.e., efficiency); and (3) the users' subjective reactions to using the system (i.e., satisfaction).
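As background for the scoring discussion that follows, the conventional SUS scoring rule (Brooke 1996) converts each 1–5 response so that odd-numbered, positively worded items contribute their score minus 1 and even-numbered, negatively worded items contribute 5 minus their score; the sum of these contributions is then multiplied by 2.5 to give an overall score between 0 and 100. A minimal Python sketch (the function name and the example responses are illustrative, not taken from the article):

```python
def sus_score(responses):
    """Compute the overall 0-100 SUS score from ten 1-5 item responses.

    Odd-numbered items (1, 3, 5, 7, 9) are positively worded and contribute
    (response - 1); even-numbered items (2, 4, 6, 8, 10) are negatively worded
    and contribute (5 - response). The summed contributions (0-40) are then
    multiplied by 2.5 to reach the conventional 0-100 range.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # i is 0-based, so even i = odd-numbered item
        for i, r in enumerate(responses)
    ]
    return 2.5 * sum(contributions)

# Example with hypothetical responses:
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 1]))  # 82.5
```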
Practitioners have considered the SUS to be unidimensional (Brooke 1996; Kirakowski 1994), since the scoring system of this scale results in a single summated rating of overall usability. Such a scoring procedure rests on the assumption that a single latent factor loads on all items. So far, this assumption has been tested with inconsistent results. Whereas Bangor et al. (2008) retrieved a single principal component of SUS items, Lewis and Sauro (2009) suggested a two-factor orthogonal structure, which practitioners may use to score the SUS on independent Usability and Learnability dimensions. This latter finding is inconsistent with the unidimensional SUS scoring system, because items loading on independent Usability and Learnability factors cannot be summated according to classical test theory (Carmines and Zeller 1992). Furthermore, the factor analyses of the SUS reported so far have relied on exploratory techniques, which lack the formal machinery needed to test which of the two proposed factor solutions best accounts for the collected data.
Unlike exploratory factor analysis, confirmatory factor analysis (CFA) is a theory-driven approach that requires a priori specification of the number of latent variables (i.e., the factors), of the relations between observed and latent variables (i.e., the factor loadings), and of the correlations among latent variables (Fabrigar et al. 1999). Once the model's parameters have been estimated, the hypothesized model is evaluated according to its ability to reproduce the sample data. These features make CFA the most accurate methodology currently available for comparing alternative factorial structures and deciding which one is best.
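To make the idea of reproducing the sample data concrete: under a CFA model, the implied correlation matrix of the items is Σ = Λ Φ Λ' + Θ, where Λ contains the factor loadings, Φ the factor correlations, and Θ the residual variances; fit indices quantify how far this implied matrix is from the observed one. A minimal numpy sketch with invented loadings (none of these numbers are the article's estimates):

```python
import numpy as np

# Illustrative loadings for four items on two factors (simple structure);
# the numbers are made up for demonstration and are not the article's estimates.
Lambda = np.array([
    [0.7, 0.0],
    [0.6, 0.0],
    [0.0, 0.8],
    [0.0, 0.7],
])
Phi = np.array([[1.0, 0.5],
                [0.5, 1.0]])  # factor correlation; set the off-diagonal to 0.0 for uncorrelated factors
Theta = np.diag(1.0 - (Lambda ** 2).sum(axis=1))  # residuals chosen so item variances equal 1

implied = Lambda @ Phi @ Lambda.T + Theta  # model-implied correlation matrix
print(np.round(implied, 3))
```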
Purpose
In the present study, we aim to compare three alternative factor models of the SUS items: the one-factor solution with an overall usability factor (overall SUS) resulting from Bangor et al. (2008) (Fig. 1a); the two-factor solution resulting from Lewis and Sauro (2009), with uncorrelated Usability and Learnability factors (Fig. 1b); and its less restrictive alternative, which assumes Usability and Learnability to be correlated factors (Fig. 1c).
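For illustration, the three competing models can be written in the lavaan-style model syntax shared by several SEM packages (for example, the R package lavaan or the Python package semopy); the item-to-factor assignment follows Lewis and Sauro (2009), with items 4 and 10 on Learnability. This is only a sketch of the model specifications, not the code used in the study (which fitted the models with EQS; Bentler 2004), and exact syntax may differ slightly across packages:

```python
# Model syntax sketches in lavaan-style notation (illustrative only).

one_factor = """
OverallSUS =~ Q1 + Q2 + Q3 + Q4 + Q5 + Q6 + Q7 + Q8 + Q9 + Q10
"""

two_factor_uncorrelated = """
Usability    =~ Q1 + Q2 + Q3 + Q5 + Q6 + Q7 + Q8 + Q9
Learnability =~ Q4 + Q10
Usability ~~ 0*Learnability   # factor covariance fixed to zero
"""

two_factor_correlated = """
Usability    =~ Q1 + Q2 + Q3 + Q5 + Q6 + Q7 + Q8 + Q9
Learnability =~ Q4 + Q10
Usability ~~ Learnability     # factor covariance freely estimated
"""
```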
Methods
Procedure
One hundred and ninety-six Italian students of the University of Rome "La Sapienza" (28 males, 168 females, mean age = 21) were asked to navigate a website (http://www.serviziocivile.it) in three consecutive sections (all the students declared that they had no previous browsing experience with the website):

1. In the first section, a 20-min pre-experimental training, the participants were asked to navigate the website freely in order to learn the features, graphic layouts, and information structures of the interface.

2. Afterwards, in the second section, a scenario-based navigation with no time limit, the participants were asked to navigate the website following four scenario targets.

3. Finally, in the third section, the usability evaluation, the Italian version of the SUS was administered to the participants (Table 1).
Statistical analyses
All models were estimated by the maximum likelihood (ML) robust method, as the data were not normally distributed (Mardia's normalized coefficient = 10.72). This method provided us with the Satorra–Bentler scaled chi-square statistic (S–B χ²), an adjusted measure of fit for non-normal data that is more accurate than the standard ML statistic (Satorra and Bentler 2001). On the basis of the model's χ² alone, virtually any factor model can be rejected if the sample size is large enough; therefore, many authors (McDonald and Ho 2002; Widaman and Thompson 2003) have recommended supplementing the evaluation of a model's fit with some more "practical" indices.
Fig. 1  SUS models tested: one-factor (a), two uncorrelated factors (b), two correlated factors (c)

The so-called Comparative Fit Index (CFI; Bentler 1990) was purposefully designed to take sample size into account, as it compares the hypothesized model's χ² with the null model's χ². By convention (Hu and Bentler 2004), a CFI greater than 0.90 indicates an acceptable fit to the data, with values greater than 0.95 being strongly recommended. A second suggested index is the Root Mean Square Error of Approximation (RMSEA; Browne and Cudeck 1993). Like the CFI, the RMSEA is relatively insensitive to sample size, as it measures the difference between the reproduced covariance matrix and the population covariance matrix. Unlike the CFI, the RMSEA is a "badness of fit" index: a value of 0 indicates perfect fit, and the greater the RMSEA, the worse the model's fit. By convention (Hu and Bentler 2004), an RMSEA less than 0.05 corresponds to a "good" fit and an RMSEA less than 0.08 to an "acceptable" fit.
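As a concrete illustration of how these two indices are derived from chi-square statistics, the sketch below implements the standard formulas in Python. The null-model chi-square used in the example is a made-up placeholder (the article does not report it); the model chi-square, degrees of freedom, and sample size are taken from the correlated-factors row of Table 2.

```python
import math

def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative Fit Index (Bentler 1990): one minus the ratio of the tested
    model's non-centrality to the null (independence) model's non-centrality."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 if d_null == 0 else 1.0 - d_model / d_null

def rmsea(chi2_model, df_model, n):
    """Root Mean Square Error of Approximation (Browne and Cudeck 1993):
    square root of the non-centrality per degree of freedom, scaled by sample size."""
    return math.sqrt(max(chi2_model - df_model, 0.0) / (df_model * (n - 1)))

# chi2_null = 600.0 is a placeholder; 54.81, 34 df and N = 196 come from Table 2 and the sample.
print(round(cfi(54.81, 34, chi2_null=600.0, df_null=45), 3))  # illustrative CFI
print(round(rmsea(54.81, 34, n=196), 3))                      # ~0.056, near the reported 0.057
```

Note that EQS computes robust (Satorra–Bentler scaled) versions of these indices, so values obtained from these textbook formulas can differ slightly from those reported in Table 2.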
Results
Table 2 shows that the S–B χ² was statistically significant for all the models we tested, regardless of the number of factors and of whether the factors were correlated or not (Bentler 2004). Inspection of the CFI and RMSEA fit indexes indicated, however, that the less restrictive model assuming Usability and Learnability as correlated factors (Fig. 1c) resulted in a good fit (i.e., CFI > 0.95 and RMSEA < 0.06), whereas the unidimensional factor model (Fig. 1a) proposed by Bangor et al. (2008) resulted only in an acceptable fit (i.e., CFI > 0.90 and RMSEA < 0.08). By contrast, the two-factor model proposed by Lewis and Sauro (2009) with uncorrelated factors (Fig. 1b) did not meet any of the recommended fit criteria.

Since both Bangor's and Lewis and Sauro's factor models are nested within the less restrictive and best fitting model (i.e., the model with Usability and Learnability as correlated factors), we could formally compare the fit of each model proposed in the literature with the fit of the model in which it is nested. Nevertheless, given that we used the Satorra–Bentler scaled χ² measure for non-multivariate-normal data, we could not simply take the χ² difference of two nested models. Rather, we assessed the scaled S–B χ² difference according to the procedure devised by Satorra and Bentler (2001).
Table 1  Synoptical table of the English and Italian versions of the SUS

1. English: I think I would like to use this system frequently
   Italian: Penso che mi piacerebbe utilizzare questo sistema frequentemente
2. English: I found the system unnecessarily complex
   Italian: Ho trovato il sistema complesso senza che ce ne fosse bisogno
3. English: I thought the system was easy to use
   Italian: Ho trovato il sistema molto semplice da usare
4. English: I think I would need the support of a technical person to be able to use this system
   Italian: Penso che avrei bisogno del supporto di una persona già in grado di utilizzare il sistema
5. English: I found the various functions in this system were well integrated
   Italian: Ho trovato le varie funzionalità del sistema bene integrate
6. English: I thought there was too much inconsistency in this system
   Italian: Ho trovato incoerenze tra le varie funzionalità del sistema
7. English: I would imagine that most people would learn to use this system very quickly
   Italian: Penso che la maggior parte delle persone potrebbero imparare ad utilizzare il sistema facilmente
8. English: I found the system very cumbersome to use
   Italian: Ho trovato il sistema molto macchinoso da utilizzare
9. English: I felt very confident using the system
   Italian: Ho avuto molta confidenza con il sistema durante l'uso
10. English: I needed to learn a lot of things before I could get going with this system
    Italian: Ho avuto bisogno di imparare molti processi prima di riuscire ad utilizzare al meglio il sistema
Table 2  Exact and close fit confirmatory factor analysis statistics/indices (maximum likelihood estimation) for the System Usability Scale

Model                                                   S–B χ² (df)    CFI     RMSEA   RMSEA CI
One factor, overall Usability                           76.50 (35)     0.921   0.079   0.054–0.103
Two factors, Usability and Learnability, uncorrelated   108.58 (35)    0.857   0.105   0.083–0.127
Two factors, Usability and Learnability, correlated     54.81 (34)     0.959   0.057   0.026–0.083

All χ² measures were statistically significant at the 0.001 level
The first contrast, which compared the Lewis and Sauro (2009) model (Fig. 1b) with the less restrictive two-factor model with correlated factors (Fig. 1c), was statistically significant (ΔS–B χ² = 30.17; df = 1; p < 0.001). Likewise, the second contrast, which compared the unidimensional model (Bangor et al. 2008) (Fig. 1a) with the less restrictive two-factor model with correlated factors (Fig. 1c), was also statistically significant (ΔS–B χ² = 28.54; df = 1; p < 0.001). Based on the inspection of absolute and relative fit indexes, as well as on the results of formal tests of χ² differences, we may conclude that the two-factor model with correlated factors outperformed both of the factor models proposed in the literature to account for the measurement model of the SUS.
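For readers who wish to reproduce this kind of contrast, the Satorra–Bentler (2001) procedure divides the ordinary ML chi-square difference by a pooled scaling correction computed from the two models' correction factors. A hedged Python sketch follows; the inputs in the example are placeholders, because the article reports only the scaled statistics, not the underlying ML chi-squares or correction factors:

```python
def sb_scaled_chi2_diff(t0_ml, df0, c0, t1_ml, df1, c1):
    """Satorra-Bentler (2001) scaled chi-square difference test.

    t0_ml, t1_ml : ordinary ML chi-squares of the restricted and less
                   restricted (comparison) models
    df0, df1     : their degrees of freedom
    c0, c1       : their scaling correction factors (ML chi-square divided
                   by the corresponding Satorra-Bentler scaled chi-square)
    Returns the scaled difference statistic and its degrees of freedom.
    """
    df_diff = df0 - df1
    c_diff = (df0 * c0 - df1 * c1) / df_diff  # pooled scaling correction
    return (t0_ml - t1_ml) / c_diff, df_diff

# Placeholder numbers for illustration only (not the article's values):
stat, df = sb_scaled_chi2_diff(t0_ml=120.0, df0=35, c0=1.10,
                               t1_ml=60.0, df1=34, c1=1.09)
print(round(stat, 2), df)
```

In practice, the correction factors c0 and c1 are taken from the SEM software output of each fitted model.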
Inspection of the model parameters estimated for the best fitting model (Table 3) indicated that all the SUS items loaded significantly on the appropriate factor, with factor loadings ranging from |0.44| to |0.74| for Usability and greater than 0.70 for Learnability. Accordingly, the factor reliability assessed by the ω coefficient (see the note to Table 3) yielded fairly high values, 0.81 and 0.76 for the Usability and Learnability factors, respectively. The correlation between Usability and Learnability was positive and significant (r = 0.70), showing that the greater the perceived Usability, the greater the perceived Learnability.
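For completeness, the composite reliability ω defined in the note to Table 3 can be computed directly from standardized loadings and residual variances. The sketch below is a generic illustration, not the article's computation: absolute loadings are used here so that the negatively worded SUS items (which carry negative standardized loadings) do not cancel out, and the example numbers are invented.

```python
def omega(loadings, residual_variances):
    """Composite reliability: omega = (sum |lambda_i|)^2 /
    ((sum |lambda_i|)^2 + sum Var(e_i)).

    Absolute values are used because negatively worded items have negative
    standardized loadings; their error variances are unaffected.
    """
    s = sum(abs(l) for l in loadings)
    return s ** 2 / (s ** 2 + sum(residual_variances))

# Hypothetical two-item example (not the article's estimates):
print(round(omega([0.75, 0.71], [0.44, 0.50]), 2))
```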
Conclusions
Although the SUS is one of the most widely used questionnaires for evaluating the usability of systems, recent contributions have provided inconsistent results regarding the factorial structure of its items, which in turn has important consequences for determining the most appropriate scoring system of this scale for practitioners and researchers. The traditional unidimensional structure (Brooke 1996; Kirakowski 1994; Bangor et al. 2008) has been challenged by the more recent view of Lewis and Sauro (2009), which assumes Learnability and Usability to be independent factors. Based on a relatively large sample of users' evaluations of an existing website, we tested which of the two alternative models was the best for SUS ratings. Our data indicated that both of the proposed models had an unsatisfactory fit to the data, the unidimensional model being too narrow to represent the contents of all SUS items, and the two-factor model with uncorrelated factors being too restrictive in its psychometric assumptions. We therefore relaxed the hypothesis that Usability and Learnability are independent components of SUS ratings and tested a less restrictive model with correlated factors. This model not only yielded a good fit to the data, but was also significantly more appropriate for representing the structure of SUS ratings. Although the literature has reported greater reliability coefficients (e.g., > 0.80) for the overall SUS scale, the reliability of the two Learnability and Usability factors was in keeping with the psychometric standards required for short scales (Carmines and Zeller 1992). We therefore propose that future usability studies may evaluate systems according to the scoring rule suggested by Lewis and Sauro (2009), which is fully consistent with the bidimensional, best fitting model we retrieved in this study. However, since we found a substantial correlation between the Usability and Learnability factors, future studies should clarify under which circumstances researchers may expect to obtain Usability scores dissociated from Learnability scores (e.g., systems with high Learnability but low Usability). In the present study, users evaluated a single system (i.e., the serviziocivile.it website), and this might have inflated the association between the two factors. Alternatively, our sample of users, which comprised college students, might be considered a sample with high computer skills compared with the general population, and this might also have inflated the factor correlation. Other studies of the SUS should, then, consider different combinations of systems and users to test the generality of the correlation between the two factors.
Table 3  Maximum likelihood standardized solution for the two-factor model of the System Usability Scale

Item    λ (Usability)   λ (Learnability)   Var(e)
Q1        0.440               –            0.898
Q2       −0.737               –            0.676
Q3        0.750               –            0.662
Q4          –               0.752          0.660
Q5        0.629               –            0.777
Q6       −0.578               –            0.816
Q7        0.670               –            0.742
Q8       −0.600               –            0.800
Q9        0.681               –            0.732
Q10         –               0.712          0.702

Note 1: ω = (Σλ_i)² / [(Σλ_i)² + ΣVar(e_i)], where λ_i are the standardized factor loadings on the factor and Var(e_i) the error variances associated with the individual indicator variables (both reported in this table).

References

Bangor A, Kortum PT, Miller JT (2008) An empirical evaluation of the system usability scale. Int J Hum Comp Interact 24:574–594
Bentler PM (1990) Comparative fit indexes in structural models. Psychol Bull 107:238–246
Bentler PM (2004) EQS structural equations modeling software (Version 6.1) (Computer software). Multivariate Software, Encino
Brooke J (1996) SUS: a 'quick and dirty' usability scale. In: Jordan PW, Thomas B, Weerdmeester BA, McClelland IL (eds) Usability evaluation in industry. Taylor & Francis, London, pp 189–194
Browne MW, Cudeck R (1993) Alternative ways of assessing model fit. In: Bollen KA, Long JS (eds) Testing structural equation models. Sage, Beverly Hills, pp 136–162
Carmines EG, Zeller RA (1992) Reliability and validity assessment. SAGE, Beverly Hills
Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ (1999) Evaluating the use of exploratory factor analysis in psychological research. Psychol Meth 4:272–299
Hu L, Bentler PM (2004) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Struct Equ Model 6:1–55
Kirakowski J (1994) The use of questionnaire methods for usability assessment (unpublished manuscript). http://sumi.ucc.ie/sumipapp.html
Lewis JR, Sauro J (2009) The factor structure of the system usability scale. In: Proceedings of the human computer interaction international conference (HCII 2009), San Diego CA, USA
McDonald RP, Ho MR (2002) Principles and practice in reporting structural equation analyses. Psychol Meth 7:64–82
Satorra A, Bentler PM (2001) A scaled difference chi-square test statistic for moment structure analysis. Psychometrika 66:507–514
Widaman KF, Thompson JS (2003) On specifying the null model for incremental fit indices in structural equation modeling. Psychol Meth 8:16–37