Conference Paper

Reaping the Benefits of Modern Usability Evaluation: The Simon Story

Authors:
  • MeasuringU

Abstract

Simon (TM-Bellsouth Corp.) is a commercially available personal communicator (PC) combining features of a PDA (personal digital assistant) with a full suite of communications features. This paper describes the involvement of human factors engineering in the development of Simon, and summarizes the various approaches to usability evaluation employed during its development. Simon has received a considerable amount of praise from the industry and won several industry awards, with recognition both for its innovative engineering and its usability.
... Bailey, 1993; R. W. Bailey, Allan, & Raiello, 1992; Gould et al., 1987; Høegh & Jensen, 2008; Lewis, 1996; Marshall et al., 1990). Published cost-benefit analyses (Bias & Mayhew, 1994) have demonstrated the value of usability engineering processes that include usability testing, with cost-benefit ratios ranging from 1:2 for smaller projects to 1:100 for larger projects (Karat, 1997). ...
... Published cost-benefit analyses (Bias & Mayhew, 1994) have demonstrated the value of usability engineering processes that include usability testing, with cost-benefit ratios ranging from 1:2 for smaller projects to 1:100 for larger projects (Karat, 1997). For example, consider the results of one case study (Lewis, 1996) and one experiment (G. Bailey, 1993). ...
... Bailey, 1993). Lewis (1996) published a case study of the development of the Simon, a personal communicator now widely considered to be the first commercially available smartphone. The development team (including the usability engineers) defined a set of tasks to use to develop competitive benchmarks and for iterative formative usability testing. ...
Article
The philosopher of science J. W. Grove (1989) once wrote, “There is, of course, nothing strange or scandalous about divisions of opinion among scientists. This is a condition for scientific progress” (p. 133). Over the past 30 years, usability, both as a practice and as an emerging science, has had its share of controversies. It has inherited some from its early roots in experimental psychology, measurement, and statistics. Others have emerged as the field of usability has matured and extended into user-centered design and user experience. In many ways, a field of inquiry is shaped by its controversies. This article reviews some of the persistent controversies in the field of usability, starting with their history, then assessing their current status from the perspective of a pragmatic practitioner. Put another way: Over the past three decades, what are some of the key lessons we have learned, and what remains to be learned? Some of the key lessons learned are:
  • When discussing usability, it is important to distinguish between the goals and practices of summative and formative usability.
  • There is compelling rational and empirical support for the practice of iterative formative usability testing; it appears to be effective in improving both objective and perceived usability.
  • When conducting usability studies, practitioners should use one of the currently available standardized usability questionnaires.
  • Because “magic number” rules of thumb for sample size requirements for usability tests are optimal only under very specific conditions, practitioners should use the tools that are available to guide sample size estimation rather than relying on “magic numbers.”
... The results of these studies (Kessner et al., 2001; Molich et al., 1998, 2004) stand in stark contrast to the published studies in which iterative usability tests (sometimes in combination with other UCD methods) have led to significantly improved products (Al-Awar et al., 1981; Bailey, 1993; Bailey et al., 1992; Gould et al., 1987; Kelley, 1984; Kennedy, 1982; Lewis, 1982; Lewis, 1996b; Ruthford and Ramey, 2000). For example, in a paper describing their experiences in product development, Marshall et al. (1990) stated, "Human factors work can be reliable -- different human factors engineers, using different human factors techniques at different stages of a product's development, identified many of the same potential usability defects" (p. ...
... Due to its generalizability, practitioners can confidently use the PSSUQ when evaluating different types of products and at different times during the development process. The PSSUQ can be especially useful in competitive evaluations (for an example, see Lewis, 1996b) or when tracking changes in usability as a function of design changes made during development. Practitioners and researchers are free to use the PSSUQ and CSUQ (no license fees), but anyone using them should cite the source. ...
...
  • Vague goal analyses that lead to variability in task scenarios
  • Vague evaluation procedures
  • Vague problem criteria that lead to acceptance of anything as a usability problem
Developing a better understanding of why these studies produced their results, which are so at odds with the apparent success of usability testing (Al-Awar et al., 1981; Bailey, 1993; Bailey et al., 1992; Gould et al., 1987; Kelley, 1984; Kennedy, 1982; Lewis, 1982; Lewis, 1996b; Marshall et al., 1990; Ruthford and Ramey, 2000), should be one of the top usability research efforts of the coming decade. An improved understanding might provide guidance about how or whether practitioners should change the way they conduct usability tests. ...
Chapter
Usability testing is an essential skill for usability practitioners -- professionals whose primary goal is to provide guidance to product developers for the purpose of improving the ease of use of their products. It is by no means the only skill with which usability practitioners must have proficiency, but it is an important one. Surveys of experienced usability practitioners have indicated that usability testing is a very frequently used method, second only to the use of iterative design. One goal of this chapter is to provide an introduction to the practice of usability testing. This includes some discussion of the concept of usability and the history of usability testing, various goals of usability testing, and running usability tests. A second goal is to cover more advanced topics, such as sample size estimation for usability tests, computation of confidence intervals, and the use of standardized usability questionnaires.
... Most of the studies (90%) were investigations of speech recognition systems (IBM and non-IBM systems), with an emphasis on speech dictation. The other studies were investigations of a personal communicator (Lewis, 1996) and a pen computing device. The PSSUQ database created from the questionnaires completed for this study had 210 entries from participants of widely varying backgrounds, computer experience, and age. ...
... Practitioners should be cautious about using the PSSUQ to compare the attitudes of different cultural groups. The PSSUQ can be especially useful in competitive evaluations (see Lewis, 1996) or when tracking changes in usability as a function of design changes made during development (either within a version or across versions). ...
... It is somewhat sobering to realize that Casio was able to do this in products commercially available 25 years ago -- the same year that the very first Apple Macintosh computer was released! Another important example is what I believe to be the world's first smartphone: the Simon [7], shown in Figure 3. This was developed jointly by IBM and Bell South, and first shown in 1993. ...
Article
Touch screens have a 40+ year history. Multi-touch, and some of the gestures associated with it, are over 25 years old. This paper aspires to provide some perspective on the roots of these technologies, and to share some future-relevant insights from those experiences. Since the scope of the article does not permit a comprehensive survey, emphasis has been given to projects and insights that are relevant but less well known.
... The results of these studies are in stark contrast to earlier studies in which usability problem discovery was reported to be reliable (Lewis, 1996; Marshall, Brendon, & Prail, 1990). The widespread use of usability problem discovery methods indicates that practitioners believe they are reliable. ...
Chapter
The cumulative binomial probability formula (given appropriate adjustment of p when estimated from small samples) provides a quick and robust means of estimating problem discovery rates (p). This estimate can be used to estimate usability test sample size requirements (for studies that are underway) and to evaluate usability test sample size adequacy (for studies that have already been conducted). Further research is needed to answer remaining questions about when usability testing is reliable, valid, and useful.
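The formula referred to above is usually written as P = 1 - (1 - p)^n, the likelihood that a problem with per-participant occurrence probability p is seen at least once among n participants. The following is a minimal sketch of how it can be used for sample size estimation. The function names are hypothetical, and the small-sample adjustment of p mentioned in the abstract is not implemented here; the sketch assumes p has already been adjusted.

```python
import math


def discovery_likelihood(p: float, n: int) -> float:
    """Likelihood of observing a problem at least once in n participants.

    p is the (already adjusted) probability that a single participant
    encounters the problem; this is the cumulative binomial probability
    of one or more occurrences: 1 - (1 - p)^n.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def required_sample_size(p: float, goal: float) -> int:
    """Smallest n for which discovery_likelihood(p, n) reaches the goal.

    Obtained by solving 1 - (1 - p)^n >= goal for n.
    """
    if not 0.0 < p < 1.0 or not 0.0 < goal < 1.0:
        raise ValueError("p and goal must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - goal) / math.log(1.0 - p))


if __name__ == "__main__":
    # Illustrative values only; p = 0.31 and the 80% goal are assumptions.
    n = required_sample_size(p=0.31, goal=0.80)
    print(n, round(discovery_likelihood(0.31, n), 3))
```

The same two functions cover both uses named in the abstract: `required_sample_size` for planning a study, and `discovery_likelihood` for judging the adequacy of a sample size already used.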
... One solution to onscreen keyboard emulation is a predictive keyboard -- one that only displays a most-likely subset of the set of keyboard keys. This was one of the keyboard input solutions provided in the Simon™ personal communicator (Lewis, 1996), developed by IBM for BellSouth Corp. IBM Research developed the fundamental algorithms for the predictive keyboard, and the development team built the user interface for its deployment in Simon. ...
Conference Paper
Predictive keyboards are software keyboards that, to conserve screen real estate, display a subset of the full set of alphabetic keys at any one time, predicting the letters to display on the basis of tables of letters' transitional probabilities. Using different types of test texts (names, words, random strings), we evaluated the influence of various manipulations on the efficiency of letter selection with a predictive keyboard. The results of this study indicated that, across tested text types (names, words, random strings), (1) there was little benefit gained from adding a fourth transitional table modeling the likelihood of a letter's use as a function of a space and two preceding letters, (2) there was a potential benefit from increasing the number of displayed letters from six to eight, and (3) for a personal communicator device, an adaptive strategy would probably be less effective than using multiple sets of letter probability tables.
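The general idea of choosing which letters to display from tables of transitional probabilities can be sketched as follows. This is an illustration only, not the algorithm used in Simon: the function names, the tiny training list, and the back-off from longer to shorter contexts are all assumptions made for the example.

```python
import string
from collections import Counter, defaultdict


def build_tables(words, max_context=2):
    """Count next-letter frequencies for contexts of 0..max_context characters.

    tables[k][context] is a Counter of the letters that followed `context`
    (a string of length k). A leading space marks the start of each word.
    """
    tables = [defaultdict(Counter) for _ in range(max_context + 1)]
    for word in words:
        text = " " + word.lower()
        for i in range(1, len(text)):
            for k in range(max_context + 1):
                if i - k >= 0:
                    tables[k][text[i - k:i]][text[i]] += 1
    return tables


def predict_keys(tables, typed, n_keys=6):
    """Return the n_keys letters to display after the user has typed `typed`.

    Assumed strategy: rank letters by the longest-context table first,
    then back off to shorter contexts, then fill alphabetically.
    """
    history = " " + typed.lower()
    ranked = []
    for k in range(len(tables) - 1, -1, -1):
        if k > len(history):
            continue
        context = history[-k:] if k else ""
        for letter, _count in tables[k].get(context, Counter()).most_common():
            if letter not in ranked:
                ranked.append(letter)
        if len(ranked) >= n_keys:
            break
    for letter in string.ascii_lowercase:
        if letter not in ranked:
            ranked.append(letter)
    return ranked[:n_keys]


if __name__ == "__main__":
    # Hypothetical training words; a real table would come from a large corpus.
    tables = build_tables(["simon", "simple", "signal", "smart"])
    print(predict_keys(tables, "si", n_keys=6))
```

Changing `n_keys` from six to eight corresponds to the display-size manipulation described in the abstract, and `max_context` controls how many transitional tables are built.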
... In my own experience though, when I have conducted a standard scenario-based, problem-discovery usability evaluation with one observer watching multiple participants complete tasks with an interface and have done so in an iterative fashion, the measurements across iterations consistently indicate a substantial and statistically reliable improvement in usability. This leads me to believe that, despite the potential existence of a substantial evaluator effect, the application of usability evaluation methods (at least, methods that involve the observation of participants performing tasks with a product under development) can result in improved usability (e.g., see Lewis, 1996). An important task for future research in the evaluator effect will be to reconcile this effect with the apparent reality of usability improvement achieved through iterative application of usability evaluation methods. ...
Article
In this introduction to the special issue of the International Journal of Human-Computer Interaction, I discuss some current topics in usability evaluation and indicate how the contributions to the issue relate to these topics. The contributions cover a wide range of topics in usability evaluation, including a discussion of usability science, how to evaluate usability evaluation methods, the effect and control of certain biases in the selection of evaluative tasks, a lack of reliability in problem detection across evaluators, how to adjust estimates of problem-discovery rates computed from small samples, and the effects of perception of hedonic and ergonomic quality on user ratings of a product's appeal.
Article
This paper demonstrates the learning process of typing by tracing the development of eye and finger movement strategies over time. We conducted a controlled experiment in which users typed with Qwerty and randomized keyboards on a smartphone, allowing us to induce and analyze users’ behavioral strategies with different amounts of accumulated typing experience. We demonstrate how strategies, such as speed-accuracy trade-offs and gaze deployment between different regions of the typing interface, depend on the amount of experience. The results suggest that, in addition to motor learning, the development of performance in mobile typing is attributable to the adaptation of visual attention and eye-hand coordination. In particular, the development of better location memory for the keyboard layout shapes the strategies. The findings shed light on how visuomotor control strategies develop during learning to type.
Article
The growing attention and prominence afforded to analytics presents a genuine challenge for the operational research community. Many in the community have recognised this growth and sought to align themselves with analytics. For instance, the US operational research society INFORMS now offers analytics-related conferences, certification and a magazine. However, as shown in this research, the volume of analytics-orientated studies in journals associated with operational research is comparatively low. This paper addresses this paradox by seeking to better understand what analytics is, and how operational research is related to it. To do so, literature from a range of academic disciplines is analysed, in what is conceived as concurrent histories in the shared tradition of a management paradigm spread over the last 100 years. The findings of this analysis reveal new insights as to how operational research exists within an ecosystem shared with several other disciplines, and how interactions and ripple effects diffuse knowledge and ideas between them. Whilst this ecosystem is developed and evolved through interdisciplinary collaborations, individual disciplines are cast into competition for the attention of the same business users. These findings are further explored by discussing the implications for operational research, as well as considering what directions future research may take to maximise the potential value of these relationships.