Article

Shaking the usability tree: why usability is not a dead end, and a constructive way forward

Authors:

Abstract

A recent contribution to the ongoing debate concerning the concept of usability and its measures proposed that usability has reached a dead end, i.e. that it is a construct unable to provide stable results and to unify scientific knowledge. Extensive commentaries rejected the conclusion that researchers need to look for alternative constructs to measure the quality of interaction. Nevertheless, several practitioners involved in this international debate asked for a constructive way to move usability practice forward. In fact, two key issues of the usability field were identified in this debate: (i) knowledge fragmentation in the scientific community, and (ii) the unstable relationship among the usability metrics. We recognise both the importance and impact of these key issues, although, in line with others, we may not agree with the conclusion that usability is a dead end. In light of the international debate, this work discusses the strengths and weaknesses of the usability construct and its application. Our discussion focuses on identifying alternative explanations for the issues and suggesting mitigation strategies, which may be considered the starting point for moving the usability field forward. However, action by the scientific community will be needed to implement these mitigation strategies and to harmonise usability practice.


... The term 'usability' must be regarded as a set of concepts taken together to reflect the usability characteristics, such as performance, effectiveness, ease of learning, satisfaction, efficiency, and memorability of a software system, to obtain a complete understanding. However, the usability factor, as well as the associated criteria and metrics, are not consistently defined across different models or standards [13,40,41,68]. ...
... The formal umbrella construct of usability was defined in 1998 by the International Organization for Standardization (ISO), specifically in ISO 9241-11 (1998), as 'the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use' [9,13,58]. Meanwhile, the Institute of Electrical and Electronics Engineers [53] describes usability as 'the ease with which a user can learn to operate, prepare inputs for and interpret outputs of a system or component.' ...
... Umbrella structures are usually broad and limited to serve the scope of a scientific community adequately [13]. The uncertainty surrounding the usability construct is a barrier to acquiring and establishing disciplinary knowledge accurately. ...
Article
Context A plethora of models are available for open-source software (OSS) usability evaluation. However, these models lack consensus between scholars as well as standard bodies on a specific set of usability evaluation criteria. Retaining irrelevant criteria and omitting essential ones will mislead the direction of the usability evaluation. Objective This study introduces a three-step method to develop a usability evaluation model in the context of OSS. Method The fuzzy Delphi method has been employed to unify the usability evaluation criteria in the context of OSS. The first step in the method is the usability criteria analysis, which involves redefining and restructuring all collected usability criteria reported in the literature. The second step is fuzzy Delphi analysis, which includes designing and validating the fuzzy Delphi instrument and using the fuzzy Delphi method to analyse the fuzziness consensus of experts' opinions on the usability evaluation criteria. The third step is the proposal of the OSS usability evaluation model. Results A total of 124 usability criteria were identified, redefined, and restructured by grouping criteria with related meanings. The groupings generated 11 main criteria; the findings of the fuzzy Delphi narrowed down the criteria to only seven. The final set of criteria was sent back to the panellists for reconsideration of their responses. The panellists verified that these criteria are suitable for evaluating the usability of OSS. Discussion The empirical analysis confirmed that the proposed evaluation model is acceptable for assessing the usability of OSS. Therefore, this model can be used as a reference metric for OSS usability evaluation, which will have a practical benefit for the community in public and private organisations by helping decision-makers to select the best OSS package amongst the alternatives.
... The International Organization for Standardization defines usability as "the effectiveness, efficiency, and satisfaction with which specified users achieve specified goals in particular environments" [3]. In this manner usability is a broad concept that can also include the acceptability of, or satisfaction with, a device, while the WHO lists the evaluation of the usability and feasibility of a device being the first steps that should be undertaken when assessing any new digital health intervention [1,4]. It has been suggested that for wearable devices to be accepted, they must be easy to wear, easy to use, affordable, contain relevant functionality and be aesthetically pleasing [5][6][7][8]. ...
... However, it nonetheless demonstrated that researchers are attempting to assess usability, but that the quality of these evaluations is, for the most part, not fit for purpose [74,76], and in some cases amounts to little more than a 'tick the box' exercise. Poorly communicated, incomparable results have been previously highlighted as a key issue in this domain [4]. Indeed, despite the relative modernity of this concept, usability is consistently highlighted as a necessary step in technological developments, and is included in both WHO and ISO guidance [1,3]. ...
... One of the criticisms of usability is that, as an umbrella term existing in a multi-disciplinary space, it spans too many factors and means different things for different people [4,77]. Usability has been shown to span a range of concepts including comfort, safety, durability, reliability, aesthetics and engagement [76], all of which were covered in the questionnaires and interviews in this review. ...
Article
Full-text available
Background The World Health Organisation’s global strategy for digital health emphasises the importance of patient involvement. Understanding the usability and acceptability of wearable devices is a core component of this. However, usability assessments to date have focused predominantly on healthy adults. There is a need to understand the patient perspective of wearable devices in participants with chronic health conditions. Methods A systematic review was conducted to identify any study design that included a usability assessment of wearable devices to measure mobility, through gait and physical activity, within five cohorts with chronic conditions (Parkinson’s disease [PD], multiple sclerosis [MS], congestive heart failure [CHF], chronic obstructive pulmonary disorder [COPD], and proximal femoral fracture [PFF]). Results Thirty-seven studies were identified. Substantial heterogeneity in the quality of reporting, the methods used to assess usability, the devices used, and the aims of the studies precluded any meaningful comparisons. Questionnaires were used in the majority of studies (70.3%; n = 26) with a reliance on intervention specific measures (n = 16; 61.5%). For those who used interviews (n = 17; 45.9%), no topic guides were provided, while methods of analysis were not reported in over a third of studies (n = 6; 35.3%). Conclusion Usability of wearable devices is a poorly measured and reported variable in chronic health conditions. Although the heterogeneity in how these devices are implemented implies acceptance, the patient voice should not be assumed. In the absence of being able to make specific usability conclusions, the results of this review instead recommend that future research should: (1) Conduct usability assessments as standard, irrespective of the cohort under investigation or the type of study undertaken. (2) Adhere to basic reporting standards (e.g. COREQ) including the basic details of the study. Full copies of any questionnaires and interview guides should be supplied through supplemental files. (3) Utilise mixed methods research to gather a more comprehensive understanding of usability than either qualitative or quantitative research alone will provide. (4) Use previously validated questionnaires alongside any intervention specific measures.
... Systematic analyses over the past 10 years have consistently raised an alarm [16][17][18][19][20][21] by reporting a lack of a framework and a standardized method in usability studies. Although the definition of usability is still an important debate [22], the usability standard has been updated [15] to emphasize that usability is a result of interaction rather than a property of a product [23], which is also defined by its context of use [15], and it includes the following four components: goals and tasks, resources, environment, and users. These components influence usability results (composed by effectiveness, efficiency, and satisfaction), and it is therefore necessary to know how these components specifically influence usability results. ...
... However, very few collect user skills such as health literacy and device knowledge [29,48]. According to Borsci et al [22] and Grebin et al [49], the lack of attention to human factors is one of the reasons for the slow adoption of medical innovations. These authors proposed to better understand the factors that influence these decision-making processes in order to better understand the resilience abilities of individuals. ...
Article
Full-text available
Background: Studies on the usability of health care devices are becoming more common, although usability standards are not necessarily specified and followed. Yet, there is little knowledge about the impact of the context of use on the usability outcome. It is specified in the usability standard (ISO 9241-11, 2018) of a device that it may be affected by its context of use and especially by the characteristics of its users. Among these, prior health knowledge (ie, knowledge about human body functioning) is crucial. However, no study has shown that prior health knowledge influences the usability of medical devices. Objective: Our study aimed to fill this gap by analyzing the relationship between the usability of two home medical devices (soon to be used in the context of ambulatory surgery) and prior health knowledge through an experimental approach. Methods: For assessing the usability of two home medical devices (blood pressure monitor and pulse oximeter), user tests were conducted among 149 students. A mixed-methods approach (subjective vs objective) using a variety of standard instruments was adopted (direct observation, video analysis, and questionnaires). Participants completed a questionnaire to show the extent of their previous health knowledge and then operated both devices randomly. Efficiency (ie, handling time) and effectiveness (ie, number of handling errors) measures were collected by video analysis. Satisfaction measures were collected by a questionnaire (system usability scale [SUS]). The qualitative observational data were coded using inductive analysis by two independent researchers specialized in cognitive psychology and cognitive ergonomics. Correlational analyses and clusters were performed to test how usability relates to sociodemographic characteristics and prior health knowledge. Results: The results indicated a lack of usability for both devices. 
Regarding the blood pressure monitor (137 participants), users made approximately 0.77 errors (SD 1.49), and the mean SUS score was 72.4 (SD 21.07), which is considered “satisfactory.” The pulse oximeter (147 participants) appeared easier to use, but participants made more errors (mean 0.99, SD 0.92), and the mean SUS score was 71.52 (SD 17.29), which is considered “satisfactory.” The results showed a low negative and significant correlation only between the effectiveness of the two devices and previous knowledge (blood pressure monitor: r=−0.191, P=.03; pulse oximeter: r=−0.263, P=.001). More subtly, we experimentally identified the existence of a threshold level (χ²(2, 146)=10.9, P=.004) for health knowledge to correctly use the pulse oximeter, but this was missing for the blood pressure monitor. Conclusions: This study has the following two contributions: (1) a theoretical interest highlighting the importance of user characteristics including prior health knowledge on usability outcomes and (2) an applied interest to provide recommendations to designers and medical staff.
... These two ISO standards define the key factors of interaction quality: (i) effectiveness, efficiency and satisfaction in a specific context of use (ISO 9241-11); and (ii) control (where possible) of expectations over time concerning use, satisfaction, perceived level of acceptability, trust, usefulness and all those factors that ultimately push users to adopt and keep using a tool (ISO 9241-210). Although these standards have not yet been updated to meet the specific needs of chatbots and conversational agents, the two aspects of usability and UX are essential to the perceived quality of interaction [21]. Until a framework has been developed and broad consensus reached on assessment criteria, practitioners may benefit from the assessment of chatbots against these ISO standards, as they allow for an evaluation of the interactive output of these applications. ...
... (iii) Although dedicated tools and methods for assessing the quality of interaction with chatbots are lacking, reliable methods and measures to assess interaction are available [17,19,21,37], and these should be adopted and used to enable the generation of comparable evidence regarding the quality of conversational agents. ...
Conference Paper
Full-text available
People with disabilities or special needs can benefit from AI-based conversational agents, which are used in competence training and well-being management. Assessment of the quality of interactions with these chatbots is key to being able to reduce dissatisfaction with them and to understand their potential long-term benefits. This will in turn help to increase adherence to their use, thereby improving the quality of life of the large population of end-users that they are able to serve. We systematically reviewed the literature on methods of assessing the perceived quality of interactions with chatbots, and identified only 15 of 192 papers on this topic that included people with disabilities or special needs in their assessments. The results also highlighted the lack of a shared theoretical framework for assessing the perceived quality of interactions with chatbots. Systematic procedures based on reliable and valid methodologies continue to be needed in this field. The current lack of reliable tools and systematic methods for assessing chatbots for people with disabilities and special needs is concerning, and may lead to unreliable systems entering the market with disruptive consequences for users. Three major conclusions can be drawn from this systematic analysis: (i) researchers should adopt consolidated and comparable methodologies to rule out risks in use; (ii) the constructs of satisfaction and acceptability are different, and should be measured separately; (iii) dedicated tools and methods for assessing the quality of interaction with chatbots should be developed and used to enable the generation of comparable evidence.
Preprint
BACKGROUND Studies on the usability of health care devices are quietly becoming more common even though usability standards are not necessarily specified and followed. Yet there is little knowledge about the impact of the context of use on the usability outcome, although the usability standard (ISO 9241-11, 2018) specifies that the usability of a medical device may be affected by its context of use and especially by the characteristics of its users. Among these, prior health knowledge is crucial. However, no study has so far shown that prior health knowledge influences the usability of a medical device. OBJECTIVE Our study was designed to fill this gap by analyzing the relationship between the usability of home medical devices and prior health knowledge through an experimental approach. METHODS In order to assess the usability of two devices (blood pressure monitor and pulse oximeter), user tests were conducted with 149 students. A mixed-methods (subjective vs. objective) approach using a variety of standard instruments (direct observation, video analysis, questionnaires) was used. Participants completed a questionnaire to show the extent of their previous health knowledge and then operated both devices randomly. Efficiency and effectiveness measures were collected by video analysis. Satisfaction measures were collected by questionnaire (System Usability Scale). The qualitative observational data were coded using inductive analysis by two independent researchers. Correlational analyses and clusters were performed to test how usability relates to sociodemographic characteristics and prior health knowledge. RESULTS The results indicate a lack of usability for both devices. For the blood pressure monitor (N = 137 participants), users made approximately 0.77 errors (SD = 1.49) and the mean System Usability Scale (SUS) score was 72.4 (SD = 21.07), which is considered "satisfactory". The pulse oximeter (N = 147) appears to be easier to use, but participants made more errors (M = 0.99, SD = 0.92). The mean SUS score was 71.52 (SD = 17.29), which indicates a "satisfactory" score. Results also showed a low negative and significant correlation only between the effectiveness of the two devices (blood pressure monitor: r = -0.191, P = 0.026; and pulse oximeter: r = -0.263, P = 0.001) and previous knowledge. More subtly, the authors experimentally identified the existence of a threshold level (χ² = 10.89, P = .004) of health knowledge needed to correctly use the pulse oximeter, which is missing for the blood pressure monitor. CONCLUSIONS Thus, this study has two contributions: (1) a theoretical interest highlighting the importance of user characteristics, including prior health knowledge, for usability outcomes and (2) an applied interest to provide recommendations to designers and medical staff.
... Experts generally agree that the interactive experience of a user (user experience, UX) is affected by the perceived usability and aesthetics of an interface, and the extent to which user needs are met in a specific context of use [1]. UX includes the dimensions of usability [2] and concurrently attempts to enlarge the assessment's factors with a focus on cognitive, aesthetics, and qualitative aspects of interaction measured throughout time [3]. Accordingly, to fully model the perceived experience of a user, practitioners should include a set of repeated objective and subjective measures in their evaluation protocols to enable satisfaction and benefit analysis as a "subjective sum of the interactive experience" [4]. ...
Chapter
To fully model the perceived experience of a user, practitioners should include a set of repeated objective and subjective measures in their evaluation protocols to enable satisfaction and benefit analysis as a “subjective sum of the interactive experience.” It is also well known that if the UX of a product is assessed at the end of the design process, product changes are much more expensive than if the same evaluation were conducted throughout the development process. In this study, we aim to present how these concepts of UX and UCD inform the process of selecting and assigning assistive technologies (ATs) for people with disabilities (PWD) according to the Matching Person and Technology (MPT) model and assessments. To make technology the solution to the PWD’s needs, the MPT was developed as an international, evidence-based measurement tool to assess the best match between person and technology, where the user remains the main actor throughout the selection, adaptation, and assignment process (user-driven model). The MPT model and tools assume that the characteristics of the person, environment, and technology should be considered as interacting when selecting the most appropriate AT for a particular person’s use. It has demonstrated good qualitative and quantitative psychometric properties for measuring UX, realization of benefit and satisfaction and, therefore, it is a useful resource to help prevent the needs and preferences of the users from going unmet and can reduce early technology abandonment and the consequent waste of money and energy.
... It was deduced that the design and the technology flow should be evaluated by end-users to highlight how people will adapt to using the application. Borsci et al. [54] highlighted how usability could be a starting point to evaluate technology and to further enhance a system for efficient and actual usage among individuals. Moreover, the study of Russ and Salem [53] also indicated how measurement scales should be implemented to effectively measure the usability of a system, wherein this study adapted the SUS. ...
Article
Full-text available
Thai Chana is one of the mobile applications for COVID-19 disease-control tracking, especially among the Thais. The purpose of this study was to determine factors affecting the perceived usability of Thai Chana by integrating protection motivation theory, the extended technology acceptance model, and the system usability scale. In all, 800 Thais participated and filled an online questionnaire with 56 questions during the early COVID-19 omicron period (15 December 2021 to 14 January 2022). Structural equation modeling (SEM) showed that the understanding of COVID-19 has significant effects on perceived severity and perceived vulnerability, which subsequently leads to perceived usefulness. In addition, perceived usefulness and perceived ease of use have significant direct effects on attitude, which subsequently leads to the intention to use, actual use, and perceived usability. This study is one of the first studies that have analyzed the mobile application for COVID-19 disease-control tracking. The significant and substantial findings can be used for a theoretical foundation, particularly in designing a new mobile application for disease-control tracking worldwide. Finally, protection motivation theory, the extended technology acceptance model, and the system usability scale can be used for evaluating other disease-control tracking mobile applications worldwide.
... It has been argued that usability should be considered from the perspective of the system domain [17]. eHealth applications are designed to inform about, prevent, diagnose, treat, or monitor health conditions. ...
Article
Background Usability tests can be either formative (where the aim is to detect usability problems) or summative (where the aim is to benchmark usability). There are ample formative methods that consider user characteristics and contexts (ie, cognitive walkthroughs, interviews, and verbal protocols). This is especially valuable for eHealth applications, as health conditions can influence user-system interactions. However, most summative usability tests do not consider eHealth-specific factors that could potentially affect the usability of a system. One of the reasons for this is the lack of fine-grained frameworks or models of usability factors that are unique to the eHealth domain. Objective In this study, we aim to develop an ontology of usability problems, specifically for eHealth applications, with patients as primary end users. Methods We analyzed 8 data sets containing the results of 8 formative usability tests for eHealth applications. These data sets contained 400 usability problems that could be used for analysis. Both inductive and deductive coding were used to create an ontology from 6 data sets, and 2 data sets were used to validate the framework by assessing the intercoder agreement. Results We identified 8 main categories of usability factors, including basic system performance, task-technology fit, accessibility, interface design, navigation and structure, information and terminology, guidance and support, and satisfaction. These 8 categories contained a total of 21 factors: 14 general usability factors and 7 eHealth-specific factors. Cohen κ was calculated for 2 data sets on both the category and factor levels, and all Cohen κ values were between 0.62 and 0.67, which is acceptable. Descriptive analysis revealed that approximately 69.5% (278/400) of the usability problems can be considered as general usability factors and 30.5% (122/400) as eHealth-specific usability factors. 
Conclusions Our ontology provides a detailed overview of the usability factors for eHealth applications. Current usability benchmarking instruments include only a subset of the factors that emerged from our study and are therefore not fully suited for summative evaluations of eHealth applications. Our findings support the development of new usability benchmarking tools for the eHealth domain.
... However, usability measurements are often inconsistent in practice, including the measurement of emotions (Borsci, Federici, Malizia, & De Filippis, 2019). For example, studies on user experience have mainly focused on positive emotions (Hassenzahl & Tractinsky, 2006). ...
Article
Users’ experiences in mental health assessment are multifaceted, including their emotional experiences. Yet, studies of mobile apps for psychiatric assessment have centered on diagnostic accuracy and perceived usability, with little consideration of the impact of user emotional experiences. In this study, we focused on users’ perceived usability and emotions and compared the user experience of a paper-and-pencil and an app-based collection of mental health screening questionnaires: EarlyDetect. The System Usability Scale (SUS) and modality-directed emotion questionnaires were administered using paper-and-pencil or iPad. Modality was assigned pseudo-randomly on patients’ first visit at a referral-based mental health clinic. We found that patients assigned to the iPad app reported a significantly higher SUS score than patients assigned to paper-and-pencil, qualified by a modality-by-gender interaction where modality effects were significant for men but not for women. Moreover, enjoyment was positively linked to perceived usability, whereas boredom, frustration, and anxiety were negatively linked to usability. Our findings illustrate the added value of studying user experience applied to psychiatric assessments, where both emotions and gender-specific user experience should be taken into consideration. We further discuss the implications for psychiatric assessments via app versus traditional data collection.
... To ensure well-specified usability of the final product, usability itself must be seen as an ongoing series of actions [14]. Usability feedback is currently often difficult to implement in practice, but it is still considered important in theoretical research [5]. In our work, we confirm the latter thesis from a practical perspective. ...
Conference Paper
On the value of usability tests for the development of social media marketing software. Social networks such as Facebook, Twitter or LinkedIn opened the door for a new form of online marketing: social media marketing. In order to use social networks efficiently for marketing purposes and reach (potential) customers, marketers rely on social media marketing software (SMMS). These web-based applications support companies or individuals with publishing, engaging, promoting or listening on social media networks. Usability and user experience are among the most important quality factors for a competitive SMMS. In practice, design updates often rely only on user interface (UI) experts, as usability tests can be time-intensive and costly. Based on the use case of the social media management tool Onlim (www.onlim.com), the extent to which usability tests can detect user experience issues and suggest improvements was studied by conducting a usability lab. The data of 20 participants in the usability lab was used for an in-depth analysis. The analysis identified fifteen usability problems, of which five are system problems and ten are operational problems. Overall, only 40% of the problems were resolved through the implementation of a new UI design that did not take usability test outcomes into account; 60% of the problems remain in the new interface or are only partly solved. Therefore, the value of usability tests is demonstrated for the development of SMMS.
... These two ISO standards define the key factors of interaction quality: (i) effectiveness, efficiency and satisfaction in a specific context of use (ISO 9241-11); and (ii) the control (where possible) over time of expectations concerning use, satisfaction, perceived level of acceptability, trust, usefulness and all those factors that ultimately push users to adopt and keep using a tool (ISO 9241-210). Although these standards have not yet been adapted to accommodate the specific needs of chatbots and conversational agents, these two aspects, usability and UX, are essential to the perceived quality of interaction [23]. Until a framework has been developed and broad consensus on the assessment criteria established, practitioners may benefit from assessing chatbots against these ISO standards; this would allow them to compare the interactive performance of these applications. ...
Article
Introduction: People with disabilities or special needs can benefit from AI-based conversational agents (i.e., chatbots) that are used for competence training and well-being management. Assessing the quality of interactions with these chatbots is key to being able to reduce dissatisfaction with them and to understanding their potential long-term benefit. This in turn will help to increase adherence to their use, thereby improving the quality of life of the large population of end-users that they are able to serve. Methods: Following Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) methodology, we systematically reviewed the literature on methods of assessing the perceived quality of interactions with chatbots using the Scopus and Web of Science electronic databases. The keywords chatbot*, conversational agent*, special needs, and disability were combined using the Boolean operators (AND/OR). Results: The review revealed that only 15 of 192 papers on this topic included people with disabilities or special needs in their assessments. The results also highlighted the lack of a shared theoretical framework for assessing the perceived quality of interactions with chatbots. Conclusion: Systematic procedures based on reliable and valid methodologies continue to be needed in this field. The current lack of reliable tools and systematic methods to assess chatbots for people with disabilities and special needs is concerning, and ultimately, it may also lead to unreliable systems entering the market with disruptive consequences for people.
Implications for rehabilitation
• Chatbots applied in rehabilitation are mainly tested in terms of clinical effectiveness and validity, with minimal focus on measuring the quality of the interaction.
• The usability and interactive properties of chatbots applied in rehabilitation are not comparable, as each tool is measured in a different way.
• The lack of a common framework to assess chatbots exposes people with disabilities and special needs to the risk of using unreliable tools.
... More recent usability definitions also include concepts such as learnability, encouragement for use, error minimization, accessibility to a wide range of individuals, and ease of maintenance [46,47]. Although usability may be applied differently across contexts, for the purposes of this paper, we focus on the perceived quality of the user experience interacting with the 2wT system [48], considering both client and provider input. For acceptability, we focused on the usefulness of the system and user perceptions of the importance of system functions. ...
Article
Background Voluntary medical male circumcision (MC) is safe and effective. Nevertheless, MC programs require multiple post-operative visits. In Zimbabwe, a randomized control trial (RCT) found that post-operative two-way texting (2wT) between clients and MC providers instead of in-person reviews reduced provider workload and safeguarded patient safety. A critical component of the RCT assessed usability and acceptability of 2wT among providers and clients. These findings inform scale-up of the 2wT approach to post-operative follow-up. Methods The RCT assigned 362 adult MC clients with cell phones into 2wT; these men responded to 13 automated daily texts supported by interactive texting or in-person follow-up, when needed. A subset of 100 texting clients filled a self-administered usability survey on day 14. 2wT acceptability was ascertained via 2wT response rates. Among 2wT providers, eight key informant interviews focused on 2wT acceptability and usability. Influences of wage and age on response rates and client-reported potential AEs were explored using linear and logistic regression models, respectively. Results Clients felt confident, comfortable, satisfied, and well-supported with 2wT-based follow-up; few noted texting challenges or concerns about healing. Clients felt 2wT saved them time and money. Response rates (92%) suggested 2wT acceptability. Both clients and providers felt 2wT was highly usable. Providers noted 2wT saved them time, empowered clients to engage in their healing, and closed gaps in MC service quality. For scale, providers reinforced good post-operative counseling on AEs and texting instructions. Wage and age did not influence text response rates or potential AE texts. Conclusion Results strongly suggest that 2wT is highly usable and acceptable for providers and patients. Men with concerns solicited provider guidance and reassurance offered via text. Providers noted that men engaged proactively in their healing. 
2wT between providers and patients should be expanded for MC and considered for other short-term care contexts. The trial is registered on ClinicalTrials.gov, trial NCT03119337, and was activated on April 18, 2017. https://clinicaltrials.gov/ct2/show/NCT03119337
... Although the meaning of usability is under debate (e.g. [10]), usability can be seen as the perceived 'ease of use', 'user-friendliness' or 'quality of use' of a system, interface or product. In the international standard definition, usability is described as the extent to which a product can be used by specified users to achieve designated goals with effectiveness, efficiency, and satisfaction in a specified context of use [11]. ...
Article
Background: The System Usability Scale (SUS) is used to measure usability of internet-based Cognitive Behavioural Therapy (iCBT). However, whether the SUS is a valid instrument to measure usability in this context is unclear. The aim of this study is to assess the factor structure of the SUS, measuring usability of iCBT for depression in a sample of professionals. In addition, the psychometric properties (reliability, convergent validity) of the SUS were tested. Methods: A sample of 242 professionals using iCBT for depression from 6 European countries completed the SUS. Confirmatory Factor Analysis (CFA) was conducted to test whether a one-factor, two-factor, tone-model or bi-factor model would fit the data best. Reliability was assessed using complementary statistical indices (e.g. omega). To assess convergent validity, the SUS total score was correlated with an adapted Client Satisfaction Questionnaire (CSQ-3). Results: CFA supported the one-factor, two-factor and tone-model, but the bi-factor model fitted the data best (Comparative Fit Index = 0.992, Tucker Lewis Index = 0.985, Root Mean Square Error of Approximation = 0.055, Standardized Root Mean Square Residual = 0.042; respectively χ²diff (9) = 69.82, p < 0.001; χ²diff (8) = 33.04, p < 0.001). Reliability of the SUS was good (ω = 0.91). The total SUS score correlated moderately with the CSQ-3 (CSQ1 rs = .49, p < 0.001; CSQ2 rs = .46, p < 0.001; CSQ3 rs = .38, p < 0.001), indicating convergent validity. Conclusions: Although the SUS seems to have a multidimensional structure, the best model showed that the total sumscore of the SUS appears to be a valid and interpretable measure to assess the usability of internet-based interventions when used by professionals in mental healthcare.
... For example, Lewis (2018a) pointed out that discrepancies in reported correlations among typical usability measurements could be accounted for by differences in the scopes of literature reviews, and subscales of multidimensional questionnaires designed to assess different aspects of usability and user experience were typically correlated rather than uncorrelated, strongly suggesting the presence of a strong underlying and unifying factor presumed to be perceived usability. Borsci, Federici, Malizia, and De Filippis (2019) drew upon numerous responses to Tractinsky's paper to propose a constructive way to move forward with usability practice, calling upon the user experience community to implement mitigation strategies such as avoiding unnecessary fragmentation of knowledge while still permitting flexible application of usability/user experience standards via a meta-standard of usability and adoption of common guidelines to report usability data for cross-disciplinary communication. Tractinsky's (2018) indictment of the usefulness of the construct of usability but endorsement of the construct of acceptance spurred us to investigate the statistical relationships among various measures of perceived usability and the components of the TAM. ...
Article
In response to recent criticism of the usefulness of the construct of usability, we investigated the relationships between measures of perceived usability and the components of a modified version of the Technology Acceptance Model (mTAM) – Perceived Usefulness (PU) and Perceived Ease-of-Use (PEU). In three surveys, respondents used SUS, UMUX-LITE and mTAM to rate their actual (as opposed to expected) experience with three software products. As expected, the correlations between PEU and other measures of perceived usability tended to be significantly stronger than those with PU. Additional findings support the use of the UMUX-LITE as a compact measure of perceived usability that has a strong relationship to the mTAM and strong correspondence with concurrently collected SUS scores. The main theoretical result of this research was a set of regression results providing evidence that the PEU component of the mTAM appears to be another measure of the construct of perceived usability, connecting the TAM to the construct of perceived usability through the mTAM and providing evidence against the claim that the construct of usability is a theoretical dead end.
Article
Background Currently, most usability benchmarking tools used within the eHealth domain are based on re-classifications of old usability frameworks or generic usability surveys. This makes them outdated and not well suited for the eHealth domain. Recently, a new ontology of usability factors was developed for the eHealth domain. It consists of eight categories: Basic System Performance (BSP), Task-Technology Fit (TTF), Accessibility (ACC), Interface Design (ID), Navigation & Structure (NS), Information & Terminology (IT), Guidance & Support (GS) and Satisfaction (SAT). Objective The goal of this study is to develop a new usability benchmarking tool for eHealth, the eHealth UsaBility Benchmarking Instrument (HUBBI), that is based on a new ontology of usability factors for eHealth. Methods First, a large item pool was generated containing 66 items. Then, an online usability test was conducted, using the case study of a Dutch website for general health advice. Participants had to perform three tasks on the website, after which they completed the HUBBI. Using Partial Least Squares Structural Equation Modelling (PLS-SEM), we identified the items that assess each factor best and that, together, make up the HUBBI. Results A total of 148 persons participated. Our selection of items resulted in a shortened version of the HUBBI, containing 18 items. The category Accessibility is not included in the final version, due to the wide range of eHealth services and their heterogeneous populations. As a result, the role of Accessibility varies from one service to another, which is a problem for a uniform benchmarking tool. Conclusions The HUBBI is a new and comprehensive usability benchmarking tool for the eHealth domain. It assesses usability on seven domains (BSP, TTF, ID, NS, IT, GS, SAT) in which a score per domain is generated. This can help eHealth developers to quickly determine which areas of the eHealth system’s usability need to be optimized.
Article
Objectives Current health technology assessment (HTA) methods guidelines for medical devices may benefit from contributions by biomedical and clinical engineers. Our study aims to: (i) review and identify gaps in the current HTA guidelines on medical devices, (ii) propose recommendations to optimize the impact of HTA for medical devices, and (iii) reach a consensus among biomedical engineers on these recommendations. Methods A gray literature search of HTA agency Web sites for assessment methods guidelines on devices was conducted. The International Federation of Medical and Biological Engineers (IFMBE) then convened a structured focus group, with experts from different fields, to identify potential gaps in the current HTA guidelines, and to develop recommendations to fill these perceived gaps. The thirty recommendations generated from the focus group were circulated in a Delphi survey to eighty-five biomedical and clinical engineers. Results Thirty-two panelists, from seventeen countries, participated in the Delphi survey. The responses showed a strong agreement on twenty-seven of thirty recommendations. Some uncertainties remain about the methods to accurately assess the effectiveness and safety, and interoperability of a medical device with other devices or within the clinical setting. Conclusions As medical devices differ from drug therapies, current HTA methods may not accurately reflect the conclusions of their assessment. Recommendations informed by the focus group discussions and Delphi survey responses aimed to address the perceived gaps, and to provide a more integrated approach in medical device assessments in combining engineering with other perspectives, such as clinical, economic, patient, human factors, ethical, and environmental.
Book
Features
• Proposes an international evidence-based ideal model of the assistive technology assessment based on experimental research and experiences in assistive products service delivery
• Brings together in one handbook all the assessment tools needed in an assistive technology service delivery center
• Describes the professional profiles, skills, and interactions of the multidisciplinary and integrated team members involved in the assessment process
• Identifies the needed role of professionals of psychotechnology and assessment
• Reviews all forms of technologies, including recent technologies such as brain–computer interfaces, robotics, and exoskeletons
• Comes with supplemental material containing the Matching Person and Technology tools in multiple languages
Summary Assistive Technology Assessment Handbook, Second Edition, proposes an international ideal model for the assistive technology assessment process, outlining how this model can be applied in practice to re-conceptualize the phases of an assistive technology delivery system according to the biopsychosocial model of disability. The model provides reference guidelines for evidence-based practice, guiding both public and private centers that wish to compare, evaluate, and improve their ability to match a person with the correct technology model. This second edition also offers a contribution to the Global Cooperation on Assistive Technology (GATE) initiative, whose activities are strongly focused on the assistive products service delivery model. Organized into three parts, the handbook: gives readers a toolkit for performing assessments; describes the roles of the assessment team members, among them the new profession of psychotechnologist; and reviews technologies for rehabilitation and independent living, including brain–computer interfaces, exoskeletons, and technologies for music therapy. Edited by Stefano Federici and Marcia J. Scherer, this cross-cultural handbook includes contributions from leading experts across five continents, offering a framework for future practice and research.
Article
Aims (1) To model the process of use and usability of pH strips; (2) to identify, through simulation studies, the likelihood of misreading pH strips, and to assess professionals’ acceptance, trust and perceived usability of pH strips. Methods This study was undertaken in four phases and used a mixed-method approach (an audit, a semi-structured interview, a survey and a simulation study). The three-month audit covered 24 patients; the semi-structured interview was performed with 19 health professionals and informed the process of use of pH strips. A survey of 134 professionals and novices explored the likelihood of misinterpreting pH strips. Standardised questionnaires were used to assess professionals’ perceived usability, trust and acceptance of pH strip use in a simulated study. Results The audit found that in 45.7% of the cases aspiration could not be achieved, and that 54% of the NG-tube insertions required x-ray confirmation. None of those interviewed had received formal training on pH strip use. In the simulated study, participants made up to 11.15% errors in reading the strips, with important implications for decision making regarding NG tube placement. No difference was identified between professionals and novices in their likelihood of misinterpreting the pH value of the strips. Whilst the overall experience of usage is poor (47.3%), health professionals gave a positive level of trust in both the interview (62.6%) and the survey (68.7%) and acceptance (interview group 65.1%, survey group 74.7%). They also reported anxiety in the use of strips (interview group 29.7%, survey group 49.7%). Conclusions Significant errors occur when using pH strips in a simulated study. Manufacturers should consider developing new pH strips, specifically designed for bedside use, that are more usable and less likely to be misread.
Article
Aim To provide a quantitative assessment of cataract theatre lists focusing on productivity and staffing levels/tasks using time and motion studies. Methods National Health Service (NHS) cataract theatre lists were prospectively observed in five different institutions (four NHS hospitals and one private hospital). The individual tasks of every member of staff, and their timings, were recorded. Multiple linear regression analyses were performed to investigate possible associations between individual timings and tasks. Results 140 operations were studied over 18 theatre sessions. The median number of scheduled cataract operations was 7 (range: 5–14). The average duration of an operation was 10.3 min (SD 4.11 min). The average time to complete one case including patient turnaround was 19.97 min (SD 8.77 min). The proportion of the surgeons’ time occupied on total duties or operating ranged from 65.2% to 76.1% and from 42.4% to 56.7%, respectively. The correlation between surgical time and patient time in theatre was R²=0.95. A multiple linear regression model found a significant association (F(3,111)=32.86, P<0.001) with R²=0.47 between the duration of one operation and the number of allied healthcare professionals (AHPs), the number of AHP key tasks and the time taken to perform these key tasks by the AHPs. Conclusions Significant variability in the number of cases performed and the efficiency of patient flow was found between different institutions. Time and motion studies identified requirements for high-volume models and factors relating to performance. Supporting the surgeon with sufficient AHPs and tasks performed by AHPs could improve surgical efficiency up to approximately double productivity over conventional theatre models.
Article
Purpose: This systematic review examines research and practical applications of the World Health Organization Disability Assessment Schedule (WHODAS 2.0) as a basis for establishing specific criteria for evaluating relevant international scientific literature. The aims were to establish the extent of international dissemination and use of WHODAS 2.0 and analyze psychometric research on its various translations and adaptations. In particular, we wanted to highlight which psychometric features have been investigated, focusing on the factor structure, reliability, and validity of this instrument. Method: Following Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) methodology, we conducted a search for publications focused on “whodas” using the ProQuest, PubMed, and Google Scholar electronic databases. Results: We identified 810 studies from 94 countries published between 1999 and 2015. WHODAS 2.0 has been translated into 47 languages and dialects and used in 27 areas of research (40% in psychiatry). Conclusions: The growing number of studies indicates increasing interest in the WHODAS 2.0 for assessing individual functioning and disability in different settings and individual health conditions. The WHODAS 2.0 shows strong correlations with several other measures of activity limitations, probably due to the fact that it shares the same disability latent variable with them. Implications for Rehabilitation WHODAS 2.0 seems to be a valid, reliable self-report instrument for the assessment of disability. The increasing interest in use of the WHODAS 2.0 extends to rehabilitation and life sciences rather than being limited to psychiatry. WHODAS 2.0 is suitable for assessing health status and disability in a variety of settings and populations. A critical issue for rehabilitation is that a single “minimal clinically important difference” score for the WHODAS 2.0 has not yet been established.
Conference Paper
Several new and revised ISO standards will be published in 2016/17 that define the basic terms and concepts of usability (ISO 9241-11), give guidance on processes and outcomes of human-centred design (ISO 9241-220), provide examples of measures that can be used in usability evaluation (ISO/IEC 25022 and 25023) and define what should be included in usability evaluation reports for usability tests, inspections and surveys (ISO/IEC 25066). The paper explains some of the new content and how it can be used.
Article
Background: This study was an extension of research which began in the Umbria region in 2009. Aim: To investigate the extent to which assistive technology (AT) has been abandoned by users of the Italian National Health Service (ULHS) and the reasons for this. Design: Observational study. Setting: Users who received a hearing device (HD) or mobility device (MD) from the ULHS between 2010 and 2013. Population: 749 out of 3,791 ULHS users contacted via telephone completed the interview: 330 (44.06%) had a HD and 419 (55.94%) a MD. Methods: Data were collected using a specially developed telephone interview questionnaire including the Italian version of the Quebec User Evaluation of Satisfaction with AT (QUEST 2.0) and Assistive Technology Use Follow-up Survey (ATUFS). Results: 134 users (17.9%) were no longer using their assigned AT device within seven months of issue and 40% of this group reported that they had never used the device. Duration of use (for how long the AT device was used before abandonment) and satisfaction with service delivery did not predict AT abandonment. People who received a HD were more likely to abandon their device (22.4%) than those who received a MD (14.4%). Conclusions: Abandonment may be due to assignment of inappropriate devices or failure to meet user needs and expectations. These findings are consistent with previous data collected by Federici and Borsci in 2009. Utility of AT in use, reasons for abandonment, and importance of device and service satisfaction for the use or non-use of an AT are presented and discussed. Clinical rehabilitation impact: AT abandonment surveys provide useful information for modelling the AT assessment and delivery process. The study confirms the relevance of a person-centred approach for a successful AT assessment and delivery process.
Article
This paper reports a study which demonstrates the advantages of using virtual-reality-based systems for training automotive assembly tasks. Sixty participants were randomly assigned to one of the following three training experiences to learn a car service procedure: (1) observational training through video instruction; (2) an experiential virtual training and trial in a CAVE; and (3) an experiential virtual training and trial through a portable 3D interactive table. Results show that, after the training, virtually trained participants remembered the correct execution of the steps significantly better (p < .05) than video-trained participants. No significant differences were identified between the experiential groups in terms of either post-training performance or proficiency, despite differences in the interaction devices. The relevance of the outcomes for the automotive field and for the designers of virtual training applications is discussed, particularly the finding that virtual training experienced through a portable device such as the interactive table can be as effective as training performed in a CAVE. This suggests the possibility for automotive industries to invest in advanced portable hardware to effectively deliver long-distance training programs for car service operators located all over the world.
Article
Usability comprises the aspects effectiveness, efficiency, and satisfaction. The correlations between these aspects are not well understood for complex tasks. We present data from an experiment where 87 subjects solved 20 information retrieval tasks concerning programming problems. The correlation between efficiency, as indicated by task completion time, and effectiveness, as indicated by quality of solution, was negligible. Generally, the correlations among the usability aspects depend in a complex way on the application domain, the user's experience, and the use context. Going through three years of CHI Proceedings, we find that 11 out of 19 experimental studies involving complex tasks account for only one or two aspects of usability. When these studies make claims concerning overall usability, they rely on risky assumptions about correlations between usability aspects. Unless domain-specific studies suggest otherwise, effectiveness, efficiency, and satisfaction should be considered independent aspects of usability and all be included in usability testing.
Article
Point-of-care in vitro diagnostics (POC-IVD) are increasingly becoming widespread as an acceptable means of providing rapid diagnostic results to facilitate decision-making in many clinical pathways. Evidence on utility, usability and cost-effectiveness is currently provided in a fragmented and detached manner that is fraught with methodological challenges, given the disruptive effect these tests have on the clinical pathway. The Point-of-care Key Evidence Tool (POCKET) checklist aims to provide an integrated evidence-based framework that incorporates all required evidence to guide the evaluation of POC-IVD to meet the needs of policy and decision-makers in the National Health Service (NHS). A multimethod approach will be applied in order to develop the POCKET. A thorough literature review has formed the basis of a robust Delphi process and validation study. Semistructured interviews are being undertaken with POC-IVD stakeholders, including industry, regulators, commissioners, clinicians and patients, to understand what evidence is required to facilitate decision-making. Emergent themes will be translated into a series of statements to form a survey questionnaire that aims to reach a consensus in each stakeholder group on what needs to be included in the tool. Results will be presented to a workshop to discuss the statements brought forward and the optimal format for the tool. Once assembled, the tool will be field-tested through case studies to ensure validity and usability and inform refinement, if required. The final version will be published online with a call for comments. Limitations include unpredictable sample representation, development of a compromise position rather than consensus, and absence of blinding in the validation exercise. The Imperial College Joint Research Compliance Office and the Imperial College Hospitals NHS Trust R&D department have approved the protocol.
The checklist tool will be disseminated through a PhD thesis, a website, peer-reviewed publication, academic conferences and formal presentations. Published by the BMJ Publishing Group Limited.
Article
Nowadays, practitioners extensively apply quick and reliable scales of user satisfaction as part of their user experience (UX) analyses to obtain well-founded measures of user satisfaction within time and budget constraints. However, in the human-computer interaction (HCI) literature the relationship between the outcomes of standardized satisfaction scales and the amount of product usage has been only marginally explored. The few studies that have investigated this relationship have typically shown that users who have interacted more with a product have higher satisfaction. The purpose of this paper was to systematically analyze the variation in outcomes of three standardized user satisfaction scales (SUS, UMUX and UMUX-LITE) when completed by users who had spent different amounts of time with a website. In two studies, the amount of interaction was manipulated to assess its effect on user satisfaction. Measurements of the three scales were strongly correlated and their outcomes were significantly affected by the amount of interaction time. Notably, the SUS acted as a unidimensional scale when administered to people who had less product experience, but was bidimensional when administered to users with more experience. We replicated previous findings of similar magnitudes for the SUS and UMUX-LITE (after adjustment), but did not observe the previously reported similarities of magnitude for the SUS and the UMUX. Our results strongly encourage further research to analyze the relationships of the three scales with levels of product exposure. We also provide recommendations for practitioners and researchers in the use of the questionnaires.
Article
A replication is an attempt to confirm an earlier study's findings. It is often claimed that research in Human-Computer Interaction (HCI) contains too few replications. To investigate this claim we examined four publication outlets (891 papers) and found 3% attempting replication of an earlier result. The replications typically confirmed earlier findings, but treated replication as a confirm/not-confirm decision, rarely analyzing effect sizes or comparing in depth to the replicated paper. When asked, most authors agreed that their studies were replications, but rarely planned them as such. Many non-replication studies could have corroborated earlier work if they had analyzed data differently or used minimal effort to collect extra data. We discuss what these results mean to HCI, including how reporting of studies could be improved and how conferences/journals may change author instructions to get more replications.
Article
The focus of this paper is interaction design research aimed at supporting interaction design practice. The main argument is that this kind of interaction design research has not (always) been successful, and that the reason for this is that it has not been guided by a sufficient understanding of the nature of design practice. Based on a comparison between the notion of complexity in science and in design, it is argued that science is not the best place to look for approaches and methods on how to approach design complexity. Instead, the case is made that any attempt by interaction design research to produce outcomes aimed at supporting design practice must be grounded in a fundamental understanding of the nature of design practice. Such an understanding can be developed into a well-grounded and rich set of rigorous and disciplined design methods and techniques, appropriate to the needs and desires of practicing designers.
Book
The book we propose is not only a classic handbook or a practical guide for evaluation practitioners that presents and discusses one or a set of evaluation techniques for assessing different aspects of interaction. Our proposal is, first, a new theoretical perspective on human-computer interaction evaluation that aims to integrate multiple techniques in a multistep evaluation process to obtain a complete assessment of interaction. Our theoretical perspective is supported by historical and experimental argumentation. Second, by merging a user-centred perspective with the idea of user experience and with the growing need for disabled users' participation in the evaluation and improvement of HCI, our book proposes a reconceptualization of web, social and portable technologies in a new category, the “psychotechnologies”, with specific properties. The integrated methodology of interaction evaluation is proposed as a framework for practitioners to evaluate all aspects of the interaction, from accessibility (i.e. the most objective point of view) to satisfaction (i.e. the most subjective point of view). The evaluation techniques we analyse and the evaluation tools we propose in the book are supported by experimental examples and are related to their application in the integrated methodology. Our goal is not only to present the correct application of the techniques, but also to promote a standard evaluation process in which both disabled and non-disabled people are involved in the assessment.
Article
The rise and fall of organizational effectiveness, an "umbrella construct" once at the forefront of organizational theory, is traced through four life-cycle stages: emerging excitement, the validity challenge, "tidying up with typologies," and construct collapse. Although the study of effectiveness has declined, research on its component elements continues to thrive. Using the effectiveness story as an exemplar, we develop a more general model of this process for all umbrella constructs, defined here as broad concepts used to encompass and account for a diverse set of phenomena. This life-cycle model, driven largely by a dialectic between researchers with a broad perspective ("umbrella advocates") and those with a narrower one ("validity police"), leaves open the possibility that some umbrella constructs may ultimately be made coherent or remain permanently controversial rather than collapse, as effectiveness has done. We propose that umbrella constructs will arise most frequently in academic fields without a theoretical consensus, will inevitably have their validity seriously challenged, will have a shorter life than their constituent elements, and will be more vulnerable to validity challenges when they lack support from practitioners. This model's implications for the future direction of such current umbrella constructs as organizational learning, culture, strategy, and performance are also explored and elaborated. Ironically, some evidence suggests that studies around the construct of organizational "performance" have arisen to replace the nearly identical, but fallen umbrella construct of organizational effectiveness. (Sociology of Organization Science; Paradigms; Theory Development; Organization Theory; Umbrella Constructs) The question for organizational science is whether the field can strike an appropriate balance between theoretical tyranny and an anything-goes attitude. (Pfeffer 1993, p. 616)
Conference Paper
The replication of, or perhaps the replicability of, research is often considered to be a cornerstone of scientific progress. Yet unlike many other disciplines, like medicine, physics, or mathematics, we have almost no drive and barely any reason to consider replicating the work of other HCI researchers. Our community is driven to publish novel results in novel spaces using novel designs, and to keep up with evolving technology. The aim of this workshop is to trial a new venue that embodies the plans made in previous SIGs and panels, such that we can begin to give people an outlet to publish experiences of attempting to replicate HCI research, and challenge or confirm its findings.
Article
Although 'user experience' (UX) has become a fashionable term in human-computer interaction over the past 15 years, the practical application of this (multidimensional) concept requires further advances. First, measurement models of UX are essential: they allow the concept to be measured accurately and, thereby, can aid the evaluation of interactive computer systems. Second, structural models of UX are needed: they establish the structural (antecedent-consequent or cause-and-effect) relations between its components and of these components to characteristics of users and computer systems; consequently, they can inform the design of interactive computer systems. As a proposed agenda for research and practice, we discuss various issues that need to be considered in developing and applying both types of model. We anticipate the further fruitful application of the concept of UX in terms of its measurement models and structural models.
Article
Over the last decade, 'user experience' (UX) became a buzzword in the field of human-computer interaction (HCI) and interaction design. As technology matured, interactive products became not only more useful and usable, but also fashionable, fascinating things to desire. Driven by the impression that a narrow focus on interactive products as tools does not capture the variety and emerging aspects of technology use, practitioners and researchers alike seem to readily embrace the notion of UX as a viable alternative to traditional HCI. And, indeed, the term promises change and a fresh look, without being too specific about its definite meaning. The present introduction to the special issue on 'Empirical studies of the user experience' attempts to give a provisional answer to the question of what is meant by 'the user experience'. It provides a cursory sketch of UX and of what we think UX research will look like in the future. It is not so much meant as a forecast of the future, but as a proposal, a stimulus for further UX research.
Article
In recent years, HCI has been influenced by a movement known as user experience (UX), which denotes new ways of understanding and studying the quality in use of interactive products. UX emphasizes hedonic qualities of use and, much more broadly, experience. Hedonic qualities concern, for instance, aesthetics, fun, and identification that people experience during interaction. This focus requires new approaches for designing and evaluating interactive products because existing methods are unable – it is claimed – to capture experience. Thus, many researchers in the field of UX state that they methodologically break new ground or study new facets of interactive products’ use. In this article we discuss whether this is really the case. To what extent is UX research novel, and to what extent does it build on usability evaluation or traditional HCI research? To answer these questions, we reviewed 51 publications from 2005 to 2009, reporting a total of 66 empirical studies of UX.
Article
This chapter introduces a range of evaluation methods that assist developers in the creation of interactive electronic products, services and environments (eSystems) that are both easy and pleasant to use for the target audience. The target audience might be the broadest range of people, including people with disabilities and older people or it might be a highly specific audience, such as university students studying biology. The chapter will introduce the concepts of accessibility, usability and user experience as the criteria against which developers should be evaluating their eSystems, and the iterative user-centred design lifecycle as the framework within which the development and evaluation of these eSystems can take place. Then a range of methods for evaluating accessibility, usability and user experience will be outlined, with information about their appropriate use and strengths and weaknesses.
Chapter
There is little room in today’s educational climate for technologies that do not either accelerate or greatly increase learning (Roblyer, 2005). While 3-D environments, like their game cousins, are motivating and engaging to students (Jenkins, Squire, & Tan, 2003; Tuzun, 2004), there are other educationally sound mechanisms that fit into current time and learning constraints that also achieve the same or better learning outcomes for students. The fact that students spend a lot of time playing games does not mean that the games are based on a sound, efficient and effective instructional design. An examination of several documented games and environments used for learning indicates that many learning games do not demonstrate a sound, efficient educational or instructional design (Dondlinger, 2007).
Chapter
Developers work to create eSystems that are easy and straightforward for people to use. Terms such as user friendly and easy to use often indicate these characteristics, but the overall technical term for them is usability. The ISO 9241 standard on Ergonomics of Human System Interaction (Part 11, 1998) defines usability as: The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use.
Article
Aims To test a hypothesis that cataract operating room (OR) productivity can be improved with a femtosecond laser (FL) using a hub-and-spoke model and whether any increase in productivity can offset additional costs relating to the FL. Methods 400 eyes of 400 patients were enrolled in a randomised-controlled trial comparing FL-assisted cataract surgery (FLACS) with conventional phacoemulsification surgery (CPS). 299 of 400 operations were performed on designated high-volume theatre lists (FLACS=134, CPS=165), where a hub-and-spoke FLACS model (1×FL, 2×ORs=2:1) was compared with independent CPS theatre lists. Details of operative timings and OR utilisation were recorded. Differences in productivity between hub-and-spoke FLACS and CPS sessions were compared using an economic model including testing hypothetical 3:1 and 4:1 models. Results The duration of the operation itself was 12.04±4.89 min for FLACS compared with CPS of 14.54±6.1 min (P<0.001). Total patient time in the OR was reduced from 23.39±6.89 min with CPS to 20.34±5.82 min with FLACS (P<0.001)(reduction of 3.05 min per case). There was no difference in OR turnaround time between the models. Average number of patients treated per theatre list was 9 for FLACS and 8 for CPS. OR utilisation was 92.08% for FLACS and 95.83% for CPS (P<0.001). Using a previously established economic model, the FLACS service cost £144.60 more than CPS per case. This difference would be £131 and £125 for 3:1 and 4:1 models, respectively. Conclusion The FLACS hub-and-spoke model was significantly faster than CPS, with patients spending less time in the OR. This enabled an improvement in productivity, but insufficient to meaningfully offset the additional costs relating to FLACS.
Article
This commentary discusses “The Usability Construct: A Dead End?” That article debates the conceptual qualities of ‘usability’ as a key concept in HCI. It provides two main conclusions: (1) the usefulness of the concept of ‘usability’ to HCI theories has been limited and (2) a notion of ‘construct’ is useful for identifying weaknesses of the concept of ‘usability’ and thereby provide reasons for its limited usefulness. The article is interesting and provides an in-depth analysis of the concept of usability. Unfortunately, the use of this analysis is not convincing. Instead, redefinition of ‘usability’ should be based on a constructive approach.
Article
“Usability” is a construct conceived by the human–computer interaction (HCI) community to denote a desired quality of interactive systems and products. Despite its prominence and intensive use in HCI research, the usefulness of the usability construct to HCI theories and to our understanding of HCI has been meager. In this article I propose and discuss two reasons for this state of affairs. The first is that usability is an umbrella construct. Umbrella constructs are prevalent in scientific fields that are broad, diverse, and lack a unifying research paradigm. Accordingly, umbrella constructs, such as usability, tend to be vague and loose, characteristics that challenge our ability to accumulate and communicate knowledge and to capture real-world phenomena. The second reason involves the nature of the relations between the usability construct and its measures, a topic rarely discussed in HCI research. There appears to be a mismatch between how the HCI community has (implicitly) conceptualized these relations and how it has empirically examined them. The relations have been conceptualized according to a formative measurement model but have mostly been tested according to a reflective measurement model. The trouble is that representing the usability construct by the reflective model appears inappropriate, and representing it by the formative model involves considerable difficulties. Possible ways of addressing these issues are discussed, each with its advantages and drawbacks. I conclude that for scientific research on this subject to progress, the usability construct ought to be unbundled and replaced by well-defined constructs. The issues discussed in this article are relevant to other HCI umbrella concepts and constructs such as user experience.
Article
Use of in-vitro point of care devices - intended as tests performed out of laboratories and near patient - is increasing in clinical environments. International standards indicate that interaction assessment should not end after the product release, yet human factors methods are frequently not included in clinical and empirical studies of these devices. Whilst the literature confirms some advantages of bed-side tests compared to those in laboratories there is a lack of knowledge of the risks associated with their use. This article provides a review of approaches applied by clinical researchers to model the use of in-vitro testing. Results suggest that only a few studies have explored human factor approaches. Furthermore, when researchers investigated people-device interaction these were predominantly limited to qualitative and not standardised approaches. The methodological failings and limitations of these studies, identified by us, demonstrate the growing need to integrate human factors methods in the medical field.
Article
Many usability practitioners conduct most of their usability evaluations to improve a product during its design and development. We call these "formative" evaluations to distinguish them from "summative" (validation) usability tests at the end of development. https://uxpajournal.org/towards-the-design-of-effective-formative-test-reports/
Article
The debate on the effectiveness of virtual and mixed reality (VR/MR) tools for training professionals and operators is long-running, with prominent contributions arguing that there are several shortfalls in the experimental approaches and assessment criteria reported within the literature. In the automotive context, although car-makers were pioneers in the use of VR/MR tools for supporting designers, researchers started only recently to explore the effectiveness of VR/MR systems as a means for driving external operators of service centres to acquire the procedural skills necessary for car maintenance processes. In fact, from 463 journal articles on VR/MR tools for training published in the last thirty years, we identified only eight articles in which researchers experimentally tested the effectiveness of VR/MR tools for training service operators’ skills. To survey the current findings and the deficiencies of these eight studies, we use two main drivers: (i) a well-known framework of organizational training programmes, and (ii) a list of eleven evaluation criteria widely applied by researchers of different fields for assessing the effectiveness of training carried out with VR/MR systems. The analysis that we present allows us to: (i) identify a trend among automotive researchers of focusing their analysis only on car service operators’ performance in terms of time and errors, leaving unexplored important pre- and post-training aspects that could affect the effectiveness of VR/MR tools to deliver training contents – e.g., people skills, previous experience, cybersickness, presence and engagement, usability and satisfaction; and (ii) outline the future challenges for designing and assessing VR/MR tools for training car service operators.
Article
It is important for practitioners to conceptualize and tailor a prototype in tune with the users’ expectations in the early stages of the design life cycle so that modifications of the product design in advanced phases are kept to a minimum. According to user preference studies, the aesthetics and the usability of a system play an important role in the user's appraisal and selection of a product. However, user preferences are just a part of the equation. The fact that a user prefers one product over the other does not mean that he or she would necessarily buy it. To understand the factors affecting the user's assessment of a product before the actual use of the product and the user's intention to purchase the product, we conducted a study, reported in this article. Our study, a modification of a well-known protocol, considers the users’ preferences among six simulated smartphones, each with a different combination of attributes. A sample consisting of 365 participants was involved in our analysis. Our results confirm that the main basis for the users’ pre-use preferences is the aesthetics of the product, whereas our results suggest that the main basis for the user's intention to purchase is the expected usability of the product. Moreover, our analysis reveals that the personal characteristics of the users have different effects on both the users’ preferences and their intention to purchase a product. These results suggest that designers should carefully balance the aesthetics and usability features of a prototype in tune with the users’ expectations. If the conceptualization of a product is done properly, the redesign cycles after usability testing can be reduced, speeding up the process for releasing the product on the market.
Article
The philosopher of science J. W. Grove (1989) once wrote, “There is, of course, nothing strange or scandalous about divisions of opinion among scientists. This is a condition for scientific progress” (p. 133). Over the past 30 years, usability, both as a practice and as an emerging science, has had its share of controversies. It has inherited some from its early roots in experimental psychology, measurement, and statistics. Others have emerged as the field of usability has matured and extended into user-centered design and user experience. In many ways, a field of inquiry is shaped by its controversies. This article reviews some of the persistent controversies in the field of usability, starting with their history, then assessing their current status from the perspective of a pragmatic practitioner. Put another way: Over the past three decades, what are some of the key lessons we have learned, and what remains to be learned? Some of the key lessons learned are:
• When discussing usability, it is important to distinguish between the goals and practices of summative and formative usability.
• There is compelling rational and empirical support for the practice of iterative formative usability testing: it appears to be effective in improving both objective and perceived usability.
• When conducting usability studies, practitioners should use one of the currently available standardized usability questionnaires.
• Because “magic number” rules of thumb for sample size requirements for usability tests are optimal only under very specific conditions, practitioners should use the tools that are available to guide sample size estimation rather than relying on “magic numbers.”
Article
How to measure usability is an important question in HCI research and user interface evaluation. We review current practice in measuring usability by categorizing and discussing usability measures from 180 studies published in core HCI journals and proceedings. The discussion distinguishes several problems with the measures, including whether they actually measure usability, if they cover usability broadly, how they are reasoned about, and if they meet recommendations on how to measure usability. In many studies, the choice of and reasoning about usability measures fall short of a valid and reliable account of usability as quality-in-use of the user interface being studied. Based on the review, we discuss challenges for studies of usability and for research into how to measure usability. The challenges are to distinguish and empirically compare subjective and objective measures of usability; to focus on developing and employing measures of learning and retention; to study long-term use and usability; to extend measures of satisfaction beyond post-use questionnaires; to validate and standardize the host of subjective satisfaction questionnaires used; to study correlations between usability measures as a means for validation; and to use both micro and macro tasks and corresponding measures of usability. In conclusion, we argue that increased attention to the problems identified and challenges discussed may strengthen studies of usability and usability research.
Article
Over two decades of research have been conducted using mobile devices for health-related behaviors, yet many of these studies lack rigor. There are few evaluation frameworks for assessing the usability of mHealth, which is critical as the use of this technology proliferates. As the development of interventions using mobile technology increases, future work in this domain necessitates the use of a rigorous usability evaluation framework. We used two exemplars to assess the appropriateness of the Health IT Usability Evaluation Model (Health-ITUEM) for evaluating the usability of mHealth technology. In the first exemplar, we conducted 6 focus group sessions to explore adolescents' use of mobile technology for meeting their health information needs. In the second exemplar, we conducted 4 focus group sessions following an Ecological Momentary Assessment study in which 60 adolescents were given a smartphone with pre-installed health-related applications (apps). We coded the focus group data using the 9 concepts of the Health-ITUEM: Error prevention, Completeness, Memorability, Information needs, Flexibility/Customizability, Learnability, Performance speed, Competency, Other outcomes. To develop a finer granularity of analysis, the nine concepts were broken into positive, negative, and neutral codes. A total of 27 codes were created. Two raters (R1 & R2) initially coded all text and a third rater (R3) reconciled coding discordance between raters R1 and R2. A total of 133 codes were applied to Exemplar 1. In Exemplar 2 there were a total of 286 codes applied to 195 excerpts. Performance speed, Other outcomes, and Information needs were among the most frequently occurring codes. Our two exemplars demonstrated the appropriateness and usefulness of the Health-ITUEM in evaluating mobile health technology. Further assessment of this framework with other study populations should consider whether Memorability and Error prevention are necessary to include when evaluating mHealth technology.
Article
Public service motivation refers to the type of motivation to perform behavior that relates typically to the public sector, such as altruism or public interest. The concept was originally developed within an American context by James Perry and other academics. However, one can distinguish a similar, if not identical, concept in several European countries. This article makes a comparison between two European cases of public service motivation: the United Kingdom and Germany. The results show that although there are similarities between them, there are also marked differences. Our findings provide the basis for further research to explore the phenomenon in a cross-cultural and international context.
Article
In The Psychology of Attitudes, we provided an abstract - or umbrella - definition of attitude as "a psychological tendency that is expressed by evaluating a particular entity with some degree of favor or disfavor" (Eagly & Chaiken, 1993, p. 1). This definition encompasses the key features of attitudes - namely, tendency, entity (or attitude object), and evaluation. This conception of attitude distinguishes between the inner tendency that is attitude and the evaluative responses that express attitudes. Our definition invites psychologists to specify the nature of attitudes by proposing theories that provide metaphors for the constituents of the inner tendency that is attitude. We advocate theoretical metaphors that endow attitudes with structural qualities.
Comments on the article by W. F. Whyte, D. J. Greenwood, and P. Lazes (see record 1989-38467-001) by using the action science perspective to point out certain practical limitations and conceptual gaps in their description of the Xerox participatory action research intervention. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
The conventional assumption that quality is an attribute of a product is misleading, as the attributes required for quality will depend on how the product is used. Quality of use is therefore defined as the extent to which a product satisfies stated and implied needs when used under stated conditions. Quality of use can be used to measure usability as the extent to which specific goals can be achieved with effectiveness, efficiency and satisfaction by specified users carrying out specified tasks in specified environments. Practical and reliable methods of measuring quality of use have been developed by the MUSiC project. These provide criteria for usability which can be incorporated into a quality system. A description is given of the MUSiC methods for specifying the context of use and measuring effectiveness, efficiency, and satisfaction.
Article
The increased complexity of medical technology makes usability an important selection criterion when new equipment is purchased. However, this requires an understanding of what usability is in a medical technology context and what usability evaluation methods are suitable. A questionnaire was used to investigate what users of medical technology regard as the largest component of usability. The component ‘difficult to make errors’ was regarded as being 30% of overall usability. The components ‘easy to learn’, ‘efficient to use’, ‘easy to remember’ made up 20% each of overall usability. Satisfaction only made up 10% of overall usability. Four common methods, hierarchical task analysis, cognitive walkthrough, heuristic evaluation and usability tests were evaluated according to thoroughness, validity, reliability, cost effectiveness and clarity. Usability tests are recommended to be the primary method in usability evaluations at hospitals, as they fulfil the criteria and address the ‘difficult to make errors’ aspect of overall usability. Hierarchical task analysis and cognitive walkthrough fulfil some criteria. Cognitive walkthrough also addresses the ‘difficult to make errors’ aspect. Relevance to industry: There is an increasing awareness of the need for higher usability of medical technology. This requires an understanding of what usability is and what usability evaluation methods are suitable, both in the design process and when medical technology is purchased at hospitals.