PSYCHOMETRIC EVALUATION OF AN AFTER-SCENARIO QUESTIONNAIRE FOR COMPUTER USABILITY STUDIES: THE ASQ

JAMES R. LEWIS
Abstract: A three-item after-scenario questionnaire was used in three related usability tests in different areas of the United States. The studies had eight scenarios in common. After participants finished a scenario, they completed the After-Scenario Questionnaire (the ASQ). A factor analysis of the responses to the ASQ items revealed that an eight-factor solution explained 94 percent of the variability of the 24 (eight scenarios by three items per scenario) items. The varimax-rotated factor pattern showed that these eight factors were clearly associated with the eight scenarios. The benefit of this research to system designers is that this three-item questionnaire has acceptable psychometric properties of reliability, sensitivity, and concurrent validity, and may be used with confidence in other, similar usability studies.
INTRODUCTION

When developing computer systems, it is important to develop a method for measuring user satisfaction with existing systems or prototypes of future systems. The purpose of this report is to describe a psychometric evaluation of a questionnaire which has been used to assess user satisfaction during participation in scenario-based usability studies. This questionnaire is named the After-Scenario Questionnaire (ASQ), since it is administered after each scenario.
Psychometric instruments to evaluate computer-user satisfaction are not new (see Ives, Olson, and Baroudi, 1983, for a review). Lewis (1990) has recently reported favorable psychometric properties of the Post-Study System Usability Questionnaire (PSSUQ). Two other new instruments, the Questionnaire for User Interface Satisfaction (QUIS) (Chin, Diehl, and Norman, 1988) and the Computer User Satisfaction Inventory (CUSI) (Kirakowski and Corbett, 1988), also seem to have good reliability and validity characteristics. None of the above instruments was developed specifically for use during a usability study, although the PSSUQ is intended for use after a usability study. The ASQ was developed to be used immediately following scenario completion in scenario-based usability studies, where a scenario is a collection of related tasks.
The purpose of this report is to discuss the steps by which the ASQ was developed and evaluated. The questionnaire items were treated as constituent items for summative, or Likert, scales. The methods for developing such summative scales are explained in detail in Nunnally (1978) and McIver and Carmines (1981). The topics to be covered in this report are item construction, item selection, exploratory factor analysis of the items, estimates of scale reliability, assessment of scale sensitivity, and an estimate of concurrent validity for the ASQ.
ITEM CONSTRUCTION

The items are 7-point graphic scales, anchored at the end points with the terms "Strongly agree" for 1 and "Strongly disagree" for 7, and a Not Applicable (N/A) point outside the scale, as shown in Figure 1. For a discussion of important properties of rating scales, see Nunnally (1978, pp. 594-602).

SIGCHI Bulletin January 1991, Volume 23, Number 1

FIGURE 1. The After-Scenario Questionnaire (ASQ)

For each of the questions below, circle the answer of your choice.
1. Overall, I am satisfied with the ease of completing the tasks in this scenario.

   strongly agree <----------------> strongly disagree        not applicable
        1    2    3    4    5    6    7                            N/A

   Comments:

2. Overall, I am satisfied with the amount of time it took to complete the tasks in this scenario.

   strongly agree <----------------> strongly disagree        not applicable
        1    2    3    4    5    6    7                            N/A

   Comments:

3. Overall, I am satisfied with the support information (on-line help, messages, documentation) when completing the tasks.

   strongly agree <----------------> strongly disagree        not applicable
        1    2    3    4    5    6    7                            N/A

   Comments:
ITEM SELECTION

The three items were selected on the basis of their content regarding hypothesized constituents of usability. Characteristics such as ease of task completion, time required to complete tasks, and satisfaction with support information (on-line help, system messages, documentation) would be expected to influence a user's perception of system usability.
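The three items yield three 7-point ratings per scenario, and later sections of this report sum them into a single scenario scale. The sketch below (Python, purely illustrative and not part of the original study) shows one way such scoring could be implemented; dropping any scenario that contains an N/A response is an assumption on my part, motivated by the study's use of only complete responses in its analyses.

```python
def asq_scenario_score(item_responses):
    """Sum three ASQ item ratings (1 = strongly agree ... 7 = strongly
    disagree) into a single scenario scale score; lower is better.

    None stands for the N/A point outside the scale.  Returning None
    for any scenario with an N/A item is an assumption, not a rule
    stated in the paper.
    """
    if len(item_responses) != 3:
        raise ValueError("the ASQ has exactly three items")
    if any(r is None for r in item_responses):
        return None  # incomplete scenario: no scale score
    if not all(1 <= r <= 7 for r in item_responses):
        raise ValueError("ratings must fall on the 7-point scale")
    return sum(item_responses)

# A participant who mostly agreed with all three statements:
print(asq_scenario_score([2, 1, 2]))     # -> 5
# A participant who marked item 3 as N/A:
print(asq_scenario_score([2, 1, None]))  # -> None
```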
PSYCHOMETRIC EVALUATION

Scenario-based usability studies were conducted to evaluate the usability characteristics of three office application systems. One important characteristic of usability was user satisfaction, measured using the ASQ after each scenario.
Participants

Forty-eight employees of temporary help agencies participated in the studies: 15 hired in Hawthorne, New York; 15 hired in Boca Raton, Florida; and 18 hired in Southbury, Connecticut. Each set of participants consisted of one-third clerical/secretarial workers with no mouse experience (SECNO), one-third business professionals with no mouse experience (BPNO), and one-third business professionals with mouse experience (BPMS). All participants had at least three months of experience using some type of computer system. They had no programming training or experience, and had no (or very limited) knowledge of disk operating systems.
Materials and Apparatus

Three office systems (hereafter referred to as System I, System II, and System III) were put together by installing a word processing application, a mail application, a calendar application, and a spreadsheet application on three different platforms that allowed a certain amount of integration among the applications. All three platforms allowed windowing and used a mouse as a pointing device. The systems differed in details of implementation, but were generally similar. The three word processing and spreadsheet applications were quite similar, but the mail and calendar applications differed substantially. Packages were prepared for each participant, consisting of a background questionnaire, 10 scenarios (with each scenario followed by the ASQ), and an overall system rating questionnaire (the PSSUQ). Eight of the ten scenarios were common among the usability studies of the three systems, and are listed in Table 1.
TABLE 1. Scenario Descriptions

Scenario                 Component Tasks
Mail (M1A)               Open, reply to, and delete a note.
Mail (M1B)               Open, reply to, and delete a note.
Mail (M2)                Open a note, forward with reply, save and print the note.
Address (A1)             Create, change, and delete address entries.
File Management (F1)     Rename, copy, and delete a file.
Editor (E1)              Create and save a short document.
Editor (E2)              Locate and edit a document, open a note, copy text from the note into the document, save and print the document.
Decision Support (D1)    Create a small spreadsheet, open a document, copy the spreadsheet into the document, save and print the document, save the spreadsheet.
Procedure

Participants began with a brief tour of the lab and a description of the study's purpose and the events of the day, and then completed the background questionnaire. Participants using System I began by working on an interactive tutorial shipped with the system, while the other participants were given a brief demonstration of how to move, point, and select with a mouse; how to open the icons for each product; and how to maximize and minimize windows. After this system-exploration period (usually about one hour), participants performed the scenarios, completing the ASQ as each scenario was finished. As the participant performed the scenario, an observer logged the participant's activities. If the participant completed the scenario without assistance and without unrecoverable errors, the scenario was recorded as successfully completed. Otherwise, it was recorded as unsuccessfully completed. After all scenarios had been completed (or at the end of the day, if some scenarios still had not been done), participants rated the system using the PSSUQ. It usually took a participant a full day (eight hours) to complete the study.

At the end of the three studies, the responses to the PSSUQ, the ASQ, and the scenario completion data were entered into a data base. From this data base, an exploratory factor analysis, reliability analyses, a sensitivity analysis, and some validity analyses were conducted.
RESULTS

Exploratory Factor Analysis and Reliability Analyses

Factor analysis is a statistical procedure which examines the correlations among variables to test for, or discover, clusters of variables (Nunnally, 1978). Since summated (Likert) scales are more reliable than single-item scales (Nunnally, 1978) and it is easier to present and interpret a smaller number of scores, an exploratory factor analysis was conducted on the responses from the ASQ. The intention of this exploration was to discover if there was a statistical basis for combining the three items into a single scale. The SAS (TM) procedure FACTOR was used to perform a principal factor analysis with a varimax rotation (SAS Institute, 1985).
Due to the way the data were collected, either three- or eight-factor solutions were anticipated. The expected three-factor solution would have shown grouping by item, and the expected eight-factor solution would show grouping by scenario. The scree plot did not support a three-factor solution, since the third and fourth eigenvalues were almost equal in value. There was a definite break after the eighth eigenvalue. The breaks after the fifth and sixth eigenvalues were suggestive, but were not as easily interpreted as the eight-factor solution. The seventh and eighth eigenvalues were less than one, but the eigenvalues-less-than-one criterion is only a rough guideline (Cliff, 1987). Therefore, the eight-factor solution seemed to be most appropriate. The solution was varimax rotated, with the rotated factor pattern shown in Table 2.
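The proportion of variance figure used to characterize a factor solution (94 percent for the eight-factor solution here) is simply the sum of the retained factors' eigenvalues over the sum of all eigenvalues. A minimal Python sketch, using hypothetical eigenvalues since the study's actual values were not published:

```python
def variance_explained(eigenvalues, k):
    """Proportion of total variance accounted for by the first k
    factors: the sum of their eigenvalues over the sum of all."""
    eig = sorted(eigenvalues, reverse=True)
    return sum(eig[:k]) / sum(eig)

# Hypothetical eigenvalues for 24 items (not the study's), constructed
# so that the first eight factors carry 94 percent of the variance:
eig = [8.0, 3.2, 2.6, 2.4, 2.0, 1.8, 1.4, 1.16] + [0.09] * 16
print(round(variance_explained(eig, 8), 2))   # -> 0.94
```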
The ITEM column in Table 2 shows the scenario and the ASQ item number. Using a selection criterion of .6 for the factor loadings, the eight factors were clearly associated with the eight common scenarios. The eight factors explained almost all (94 percent) of the total variance. The reliabilities of the scales created by summing the three items into scenario scales were assessed with coefficient alpha (Nunnally, 1978), a measure of internal consistency. All the coefficient alphas exceeded .9, indicating that the scales are acceptably reliable. Coefficient alphas this large are surprising since each scale was based on only three items, and reliability is largely a function of the number of items (Nunnally, 1978).
TABLE 2. Varimax Rotated Factor Pattern

ITEM    FAC1   FAC2   FAC3   FAC4   FAC5   FAC6   FAC7   FAC8
M1A1   -0.06   0.15  -0.00   0.20   0.80   0.43   0.22   0.07
M1A2    0.02   0.35   0.05   0.10   0.73   0.42   0.05   0.25
M1A3    0.27   0.22   0.16   0.23   0.76   0.17   0.16   0.27
M1B1    0.30   0.04   0.07   0.15   0.34   0.83   0.11   0.12
M1B2    0.37   0.08   0.02   0.11   0.26   0.82   0.05   0.20
M1B3    0.52  -0.01   0.04   0.12   0.39   0.64   0.15   0.26
M21     0.88   0.12   0.10   0.16   0.12   0.15   0.22  -0.15
M22     0.89   0.13   0.04   0.23   0.01   0.24   0.12   0.08
M23     0.87   0.02   0.00   0.26   0.01   0.23   0.07  -0.14
A11    -0.04   0.14   0.88   0.15  -0.12  -0.01   0.25   0.14
A12     0.01   0.06   0.86   0.10   0.10  -0.08   0.13   0.33
A13     0.21   0.02   0.85  -0.01   0.21   0.20   0.14   0.13
F11     0.07   0.91   0.13   0.09   0.16  -0.03   0.23   0.06
F12     0.14   0.93   0.07   0.07   0.07   0.07   0.10   0.15
F13     0.01   0.87   0.00   0.18   0.13   0.08   0.07   0.07
E11     0.10   0.24   0.23   0.15   0.11   0.21   0.87  -0.00
E12     0.15   0.24   0.18   0.15   0.05  -0.02   0.90   0.06
E13     0.38  -0.04   0.22   0.00   0.44   0.09   0.68   0.09
E21     0.21   0.26   0.15   0.80   0.08   0.28   0.23   0.08
E22     0.28   0.24   0.07   0.83   0.07  -0.02   0.09   0.19
E23     0.28  -0.03   0.06   0.84   0.25   0.12   0.06   0.19
D11    -0.14   0.19   0.36   0.22   0.07   0.15   0.09   0.79
D12    -0.14   0.25   0.15   0.37   0.11   0.12   0.00   0.82
D13     0.09  -0.02   0.27  -0.02   0.36   0.20   0.02   0.76
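Coefficient alpha for a short summated scale can be computed directly from the item variances and the variance of the total scores. The Python sketch below is illustrative only; the data are hypothetical, not the study's.

```python
from statistics import pvariance

def coefficient_alpha(item_scores):
    """Cronbach's coefficient alpha for a k-item summated scale:
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals).
    item_scores holds one list per item, one rating per participant.
    """
    k = len(item_scores)
    n = len(item_scores[0])
    totals = [sum(item[p] for item in item_scores) for p in range(n)]
    item_var = sum(pvariance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Three hypothetical, highly consistent items from six participants:
items = [[1, 2, 2, 5, 6, 7],
         [2, 2, 3, 5, 6, 6],
         [1, 3, 2, 6, 7, 7]]
print(round(coefficient_alpha(items), 2))   # -> 0.98
```

When items move together, the variance of the totals greatly exceeds the sum of the item variances, pushing alpha toward 1; that is the pattern behind alphas above .9 even with only three items.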
Sensitivity Study

Having determined that the three questionnaire items could reasonably be summed into a scale, it was important to determine if the scale was sensitive enough to detect differences as a function of the independent variables of interest. Specifically, could these scores discriminate among the different systems, user groups, or scenarios examined in the three usability studies?

An ANOVA was conducted on the scale scores. Of the 48 participants, 27 completed all the items on the ASQ. Only these data were used in the ANOVA. The main effect of Scenario was highly significant (F(7,126)=8.92, p<.0001). The Scenario by System interaction was also significant (F(14,126)=1.75, p=.05). These results suggest that the ASQ scale score was a reasonably sensitive measure.
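For readers unfamiliar with the F statistic behind this analysis, the sketch below computes the F ratio for a simple one-way, between-groups design. The study itself used a more complex repeated-measures ANOVA, so this is an illustration of the statistic, not a reproduction of the analysis, and the data are hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """F ratio for a one-way ANOVA: between-group mean square divided
    by within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical summed ASQ scores for three scenarios (lower = better);
# the third scenario was clearly harder:
scores = [[4, 5, 3, 4], [5, 4, 4, 5], [12, 14, 11, 13]]
print(round(one_way_f(scores), 1))   # -> 102.4
```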
Concurrent Validity

For each case in which a scenario was attempted and all ASQ questions were answered, the point-biserial correlation between the summed item scores for the scenario and a 0/1 coding to indicate scenario failure/success was -.40 (n=48, p<.01). This result shows a tendency for participants who successfully completed a scenario to give lower (more favorable) ratings.
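The point-biserial correlation used here is simply the Pearson correlation between a dichotomous (0/1) variable and a continuous one. A self-contained Python sketch with hypothetical data; the sign convention matches the report, with successful scenarios pairing with lower (more favorable) ASQ sums:

```python
from statistics import mean, pstdev

def point_biserial(binary, scores):
    """Pearson correlation between a 0/1 variable (scenario
    failure/success) and a continuous variable (summed ASQ score)."""
    mb, ms = mean(binary), mean(scores)
    cov = mean((b - mb) * (s - ms) for b, s in zip(binary, scores))
    return cov / (pstdev(binary) * pstdev(scores))

# Hypothetical cases: 1 = success, 0 = failure; lower ASQ sum = happier
success = [1, 1, 1, 1, 0, 0]
asq_sum = [4, 5, 6, 7, 14, 16]
print(round(point_biserial(success, asq_sum), 2))   # -> -0.97
```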
LIMITS TO GENERALIZATION

These findings must be considered preliminary, since the sample size for the factor analysis is smaller than is usually recommended. Nunnally (1978) recommends a minimum of five participants per item, which would be 120 participants for this analysis. On the other hand, the factor structure is very strong and clear.

This instrument was designed to assess the attitude of participants following scenario completion in a formal usability study. The correlations among items, resultant factors, and validity coefficients may all be influenced by the situation in which the data were collected. While it would be interesting to use the ASQ in a less formal setting, such as a mailed questionnaire or as an instrument in a field study, the results presented in this report cannot be used as justification for such use. If such a study is planned, it is important to examine the instruments discussed in the introduction to determine if a more suitable instrument already exists.
CONCLUSIONS

The psychometric evaluation of this questionnaire shows that the three items can reasonably be condensed into a single scale through summation. This condensation should allow easier interpretation and reporting of results when usability studies use the ASQ. The ASQ seems to be sensitive enough to be a useful measure for usability studies. The ASQ also seems to have reasonable concurrent validity when correlated with scenario completion data. These results should be considered preliminary, since the sample size for the factor analysis was smaller than is generally recommended. On the other hand, the factors and the content of the items summed into scales seem clear. Others who conduct usability studies are encouraged to use this questionnaire unless there is a strong reason to do otherwise.
REFERENCES

Chin, J.P., Diehl, V.A., and Norman, K. (1988). Development of an instrument measuring user satisfaction of the human-computer interface. In Proceedings of the CHI'88 Conference on Human Factors in Computing Systems (pp. 213-218). New York, NY: Association for Computing Machinery.

Cliff, N. (1987). Analyzing multivariate data. San Diego, CA: Harcourt Brace Jovanovich.

Ives, B., Olson, M.H., and Baroudi, J.J. (1983). The measurement of user satisfaction. Communications of the ACM, 26, 785-793.

Kirakowski, J. and Corbett, M. (1988). The Computer User Satisfaction Inventory (CUSI): Manual and scoring key. Cork, Ireland: University College of Cork, Human Factors Research Group.

Lewis, J.R. (1990). A psychometric evaluation of a post-study system usability questionnaire: The PSSUQ (IBM Tech. Report 54.535). Boca Raton, FL: IBM Corp.

McIver, J.P. and Carmines, E.G. (1981). Unidimensional scaling. Sage University Paper series on Quantitative Applications in the Social Sciences, series no. 07-024. Beverly Hills, CA: Sage Publications.

Nunnally, J.C. (1978). Psychometric theory. New York, NY: McGraw-Hill.

SAS Institute. (1985). SAS user's guide: Statistics, version 5 edition. Cary, NC: SAS Institute.
ABOUT THE AUTHOR

James R. Lewis

James R. Lewis is a staff human factors engineer for IBM. He earned his Master's Degree in Engineering Psychology at New Mexico State University. His current interests include methods of usability engineering and usability measurement.