Measuring Usability with the USE Questionnaire

By Arnold M. Lund
There are a variety of issues that tend to recur in the life of a user interface designer. I recall the first
time I was asked to write performance requirements for a user interface. How should I go about
deciding on an acceptable level of errors or an acceptable speed of accomplishing a standard task? How
do I know whether I have improved an interface design enough? Of the many problems that need fixing,
which ones should take priority? How do I even know whether improving the user interface of a product
is going to have an impact on sales? At one company, we sold usability so successfully that one of the
business units declared it wanted to label each product with a “usability seal of approval.” How
would one go about determining when to award such a seal?
Over the years I have worked with colleagues at Ameritech (where the work began), U.S. WEST
Advanced Technologies, and most recently Sapient to create a tool that has helped in dealing with some
of these questions. The tool that we developed is called the USE Questionnaire. USE stands for
Usefulness, Satisfaction, and Ease of use. These are the three dimensions that emerged most strongly in
the early development of the USE Questionnaire. For many applications, Usability appears to consist of
Usefulness and Ease of Use, and Usefulness and Ease of Use are correlated. Each factor in turn drives
user satisfaction and frequency of use. Users appear to have a good sense of what is usable and what is
not, and can apply their internal metrics across domains.
General Background
Subjective reactions to the usability of a product or application tend to be neglected in favor of
performance measures, and yet these subjective metrics often capture the aspects of the user
experience most closely tied to user behavior and purchase decisions. While some tools exist
for assessing software usability, they typically are proprietary (and may only be available for a fee).
More importantly, they do not do a good job of assessing usability across domains. When re-engineering
began at Ameritech, it became important to be able to set benchmarks for product usability and to be
able to measure progress against those benchmarks. It also was critical to ensure resources were being
used as efficiently as possible, and so tools to help select the most cost-effective methodology and the
ability to prioritize design problems to be fixed by developers became important. Finally, it became clear
that we could eliminate all the design problems and still end up with a product that would fail in the marketplace.
It was with this environment as a background that a series of studies began at Ameritech. The first one
was headed by Amy Schwartz, and was a collaboration of human factors, market research in our largest
marketing organization, and a researcher from the University of Michigan business school. [Published in
Lund, A. M. (2001). Measuring usability with the USE questionnaire. Usability Interface, 8(2), 3-6. For
more detail, contact Arnie Lund.] Building on that research, I decided to develop a short questionnaire
that could be used to measure the most
important dimensions of usability for users, and to measure those dimensions across domains. Ideally it
should work for software, hardware, services, and user support materials. It should allow meaningful
comparisons of products in different domains, even though testing of the products happened at
different times and perhaps under different circumstances. In the best of all worlds, the items would
have a certain amount of face validity for both users and practitioners, and it would be possible to
imagine the aspects of the design that might influence ratings of the items. It would not be intended to
be a diagnostic tool, but rather would treat the dimensions of usability as dependent variables.
Subsequent research would assess how various aspects of a given category of design would impact
usability ratings.
The early studies at Ameritech suggested that a viable questionnaire could be created. Interestingly, the
results of those early studies were consistent with studies conducted in the MIS and technology
diffusion areas, which also had identified the importance of and the relationship between Usefulness,
Satisfaction, and Ease of Use. Furthermore, the rich research tradition in these other areas provides
theory that may be extended to explain the relationships. This is an area that provides a link between
academic research and practice, and it is informed by several disciplines. Some work has already been
published suggesting that at least one publicly available tool drawn from earlier research can be applied
effectively to software interfaces.
How It Developed
The first step in identifying potential items for the questionnaire was to collect a large pool of items to
test. The items were collected from previous internal studies, from the literature, and from
brainstorming. The list was then massaged to eliminate or reword items that could not be applied across
the hardware, software, documentation, and service domains. One goal was to make the items as simply
worded as possible, and as general as possible. As rounds of testing progressed, standard psychometric
techniques were used to weed out additional items that appeared to be too idiosyncratic or to improve
items through ongoing tweaking of the wording. In general, the items contributing to each scale were of
approximately equal weight, Cronbach's alphas were very high, and for the most part the items
appeared to tap slightly different aspects of the dimensions being measured.
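Readers who want to run the same internal-consistency check can compute Cronbach's alpha directly from the standard formula. The sketch below is not the original study's code; it uses only the standard library and entirely hypothetical 7-point ratings.

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha for one scale.

    item_scores: list of per-respondent rating lists, one rating per item
    (e.g. 7-point Likert responses). Uses sample variance (n - 1).
    """
    k = len(item_scores[0])            # number of items in the scale
    n = len(item_scores)               # number of respondents

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Variance of each item across respondents
    item_vars = [var([resp[i] for resp in item_scores]) for i in range(k)]
    # Variance of each respondent's total score
    total_var = var([sum(resp) for resp in item_scores])
    # alpha = k/(k-1) * (1 - sum of item variances / total-score variance)
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: four respondents on a three-item scale
ratings = [[7, 6, 7], [5, 5, 6], [6, 6, 6], [2, 3, 2]]
print(round(cronbach_alpha(ratings), 3))  # prints 0.972
```

Values near 1.0, as reported for the USE scales, indicate that the items within a scale move together across respondents.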
The questionnaires were constructed as seven-point Likert rating scales. Users were asked to rate
agreement with the statements, ranging from strongly disagree to strongly agree. Various forms of the
questionnaires were used to evaluate user attitudes towards a variety of consumer products. Factor
analyses following each study suggested that users were evaluating the products primarily using three
dimensions, Usefulness, Satisfaction, and Ease of Use. Evidence of other dimensions was found, but
these three served to most effectively discriminate between interfaces. Partial correlations calculated
using scales derived for these dimensions suggested that Ease of Use and Usefulness influence one
another, such that improvements in Ease of Use improve ratings of Usefulness and vice versa. While
both drive Satisfaction, Usefulness is relatively less important when the systems are internal systems
that users are required to use. Users are more variable in their Usefulness ratings when they have had
only limited exposure to a product. As expected from the literature, Satisfaction was strongly related to
the usage (actual or predicted). For internal systems, the items contributing to Ease of Use for other
products actually could be separated into two factors, Ease of Learning and Ease of Use (which were
obviously highly correlated). The items that appeared across tests for the three factors plus Ease of
Learning are listed below. The items in italics loaded relatively less strongly on the factors.
Usefulness
It helps me be more effective.
It helps me be more productive.
It is useful.
It gives me more control over the activities in my life.
It makes the things I want to accomplish easier to get done.
It saves me time when I use it.
It meets my needs.
It does everything I would expect it to do.
Ease of Use
It is easy to use.
It is simple to use.
It is user friendly.
It requires the fewest steps possible to accomplish what I want to do with it.
It is flexible.
Using it is effortless.
I can use it without written instructions.
I don't notice any inconsistencies as I use it.
Both occasional and regular users would like it.
I can recover from mistakes quickly and easily.
I can use it successfully every time.
Ease of Learning
I learned to use it quickly.
I easily remember how to use it.
It is easy to learn to use it.
I quickly became skillful with it.
Satisfaction
I am satisfied with it.
I would recommend it to a friend.
It is fun to use.
It works the way I want it to work.
It is wonderful.
I feel I need to have it.
It is pleasant to use.
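As a practical note, responses on these items are typically summarized per dimension. The article does not prescribe a scoring procedure, but a common approach is to average each respondent's 7-point ratings within a subscale; the sketch below uses abbreviated, hypothetical item keys and a trimmed item set purely for illustration.

```python
# Hypothetical item-to-subscale mapping (abbreviated keys, not the full item set)
SUBSCALES = {
    "Usefulness": ["effective", "productive", "useful"],
    "Ease of Use": ["easy", "simple", "user_friendly"],
    "Ease of Learning": ["learn_quickly", "remember", "skillful"],
    "Satisfaction": ["satisfied", "recommend", "pleasant"],
}

def score_use(responses):
    """Average 7-point agreement ratings within each subscale.

    responses: dict mapping item key -> rating (1 = strongly disagree,
    7 = strongly agree). Returns a dict of subscale means.
    """
    return {
        name: sum(responses[item] for item in items) / len(items)
        for name, items in SUBSCALES.items()
    }

# One hypothetical respondent
one_user = {
    "effective": 6, "productive": 7, "useful": 6,
    "easy": 5, "simple": 6, "user_friendly": 5,
    "learn_quickly": 7, "remember": 6, "skillful": 6,
    "satisfied": 6, "recommend": 5, "pleasant": 6,
}
print(score_use(one_user))
```

Subscale means computed this way can then be compared across products or across test rounds, which is exactly the benchmarking use the questionnaire was built for.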
Work to refine the items and the scales continues. There is some evidence that for websites and certain
consumer products there is an additional dimension of fun or aesthetics associated with making a
product compelling. For the dependent variables of primary interest, however, these items appear to be
reasonably robust. A short form of the questionnaire is easily constructed by using the three or four
most heavily weighted items for each factor.
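The partial correlations mentioned earlier, relating Ease of Use and Usefulness while holding a third scale constant, follow the standard first-order formula. The sketch below, with hypothetical per-user subscale means, is one way such an analysis could be reproduced; it is not the original study's code.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def partial_corr(xs, ys, zs):
    """Correlation of x and y with the influence of z partialled out."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz**2) * (1 - ryz**2))

# Hypothetical per-user subscale means (one value per respondent)
ease = [6.2, 5.1, 4.8, 6.6, 3.9]
useful = [6.0, 5.4, 4.5, 6.8, 4.1]
satisfaction = [5.5, 5.8, 4.0, 6.5, 4.5]
print(round(partial_corr(ease, useful, satisfaction), 3))
```

A positive partial correlation here would mirror the article's finding that Ease of Use and Usefulness influence one another above and beyond their shared relationship with Satisfaction.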
While the questionnaire has been used successfully by many companies around the world, and as part
of several dissertation projects, the development of the questionnaire is still not over. For the reasons
cited, this is an excellent starting place. The norms I have developed over the years have been useful in
determining when I have achieved sufficient usability to enable success in the market. To truly develop a
standardized instrument, however, the items should be taken through a complete psychometric
instrument development process. A study I have been hoping to run is one that simultaneously uses the
USE Questionnaire and other questionnaires, such as SUMI or QUIS, to evaluate the same applications. Once a
publicly available (i.e., free) standardized questionnaire exists that applies across domains, a variety of
interesting lines of research become possible. The USE Questionnaire should continue to be useful as it
stands, but I hope the best is yet to come.