Article

Measuring Perceived Usability: The CSUQ, SUS, and UMUX

Authors:
  • MeasuringU

Abstract

The primary purpose of this research was to investigate the relationship between two widely used questionnaires designed to measure perceived usability: the Computer System Usability Questionnaire (CSUQ) and the System Usability Scale (SUS). The correlation between concurrently collected CSUQ and SUS scores was 0.76 (over 50% shared variance). After converting CSUQ scores to a 0–100-point scale (to match the range of the SUS scores), there was a small but statistically significant difference between CSUQ and SUS means. Although this difference (just under 2 scale points out of a possible 100) was statistically significant, it did not appear to be practically significant. Although usability practitioners should be cautious pending additional independent replication, it appears that CSUQ scores, after conversion to a 0–100-point scale, can be interpreted with the Sauro–Lewis curved grading scale. As a secondary research goal, investigation of variations of the Usability Metric for User Experience (UMUX) replicated previous findings that the regression-adjusted version of the UMUX-LITE (UMUX-LITEr) had the closest correspondence with concurrently collected SUS scores. Thus, even though these three standardized questionnaires were independently developed and have different item content and formats, they largely appear to be measuring the same thing, presumably, perceived usability.
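The two score conversions mentioned in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's own code: it assumes the 16-item CSUQ with 7-point items on which 1 is the most favorable response, and the commonly cited regression adjustment for the UMUX-LITE (0.65 × UMUX-LITE + 22.9) with two 7-point items on which 7 is the most favorable response. The function names are hypothetical, and both formulas should be checked against the full text before use.

```python
def csuq_to_100(item_scores):
    """Convert CSUQ item responses to a 0-100 scale (higher = better).

    Assumption: items are scored 1-7 with 1 as the most favorable response,
    so the item mean is reversed and stretched to the SUS range.
    """
    if not item_scores or any(not 1 <= s <= 7 for s in item_scores):
        raise ValueError("expected one or more responses between 1 and 7")
    mean = sum(item_scores) / len(item_scores)
    return 100 - (mean - 1) * (100 / 6)


def umux_lite(item1, item2):
    """Unadjusted UMUX-LITE score on a 0-100 scale from two 7-point items."""
    for item in (item1, item2):
        if not 1 <= item <= 7:
            raise ValueError("expected responses between 1 and 7")
    return (item1 + item2 - 2) * (100 / 12)


def umux_lite_r(item1, item2):
    """Regression-adjusted UMUX-LITE (UMUX-LITEr); coefficients are assumed."""
    return 0.65 * umux_lite(item1, item2) + 22.9


if __name__ == "__main__":
    print(csuq_to_100([2] * 16))  # favorable responses give a high score
    print(umux_lite_r(6, 6))
```

Because of the regression adjustment, UMUX-LITEr values in this sketch cannot fall below 22.9 or rise above roughly 87.9. For the correlation reported in the abstract, 0.76 squared is about 0.58, which is the "over 50% shared variance" figure.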


... Respondents were given the SUS questionnaire, which consists of 10 questions answered on a scale of 1 to 5: 1 is strongly disagree, 2 is disagree, 3 is neutral, 4 is agree, and 5 is strongly agree. After the final SUS score was obtained, the usability value was derived and compared with the interpretation criteria table of the Sauro-Lewis Curved Grading Scale [10] and the SUS measurement scale [11].
Respondent, responses to items 1-10, SUS score:
... 4 3 3 4 4 3 4 3 4 3 57.50
R3 3 3 3 3 3 3 3 3 3 3 50
R4 5 5 3 5 3 4 3 4 4 5 37.50
R5 3 4 3 4 3 2 4 3 4 4 50
R6 3 3 3 5 5 1 5 1 4 3 67.50
R7 3 3 3 3 3 3 3 3 3 3 50
R8 5 5 1 1 1 5 2 2 3 ...
R19 3 3 3 3 3 3 3 3 3 3 50
R20 5 4 4 4 4 3 3 3 3 3 55
R21 4 3 3 5 3 5 5 5 3 3 42.50
R22 5 2 5 5 4 4 3 3 4 5 55
R23 3 3 3 3 3 2 2 2 3 5 47.50
R24 3 4 4 4 4 2 3 4 3 3 50
R25 4 5 5 5 5 5 5 5 5 5 5 50
R35 4 3 4 3 3 3 4 2 4 3 62.50
R36 5 1 5 1 1 1 5 1 5 1 90
R37 3 5 2 5 3 5 3 5 3 5 22.50
R38 4 3 4 2 3 4 4 4 4 4 55
R39 2 4 3 4 4 4 3 3 3 4 40
R40 3 3 3 3 3 3 3 3 3 3 50
R41 2 1 1 1 1 1 1 1 1 1 52.50
R42 4 3 3 3 3 3 4 2 4 3 60
R43 4 2 5 2 3 3 4 2 4 3 70
R44 4 3 3 4 3 2 4 2 4 4 57.50
R45 3 2 2 2 3 2 2 2 3 ...
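The scores in the respondent rows above are consistent with the standard SUS scoring rule, which can be sketched as follows. This is an illustrative sketch rather than the cited authors' code, and the function name is hypothetical: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten responses on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for position, response in enumerate(responses, start=1):
        if position % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


if __name__ == "__main__":
    # Rows R3, R4, and R36 from the excerpt above.
    print(sus_score([3, 3, 3, 3, 3, 3, 3, 3, 3, 3]))
    print(sus_score([5, 5, 3, 5, 3, 4, 3, 4, 4, 5]))
    print(sus_score([5, 1, 5, 1, 1, 1, 5, 1, 5, 1]))
```

Note that a respondent who answers 3 to every item lands at exactly 50, which is why the all-neutral rows in the excerpt share that score.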
... the author made a comparison with the interpretation criteria table of the Sauro-Lewis Curved Grading Scale [10] and the SUS measurement scale [11]. With a final SUS score of 52.61, compared against Table 2.2, the "SiCantik" Information System website of the Investment and One-Stop Integrated Services Office (Dinas Penanaman Modal dan Pelayanan Terpadu Satu Pintu, DPMPTSP) of the Medan City Government received a grade of D with an OK rating and fell into the Marginal category [10]. The author then compared the final SUS score with Figure 2.1, and the website again received an OK rating [11]. ...
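Looking up a grade on the Sauro-Lewis curved grading scale amounts to a simple range lookup, sketched below. The cutoffs are an assumption based on the commonly published version of the scale and should be verified against the cited source [10]; the names `CGS_LOWER_BOUNDS` and `curved_grade` are hypothetical.

```python
# Lower bound of each grade band, highest first (assumed cutoffs; verify
# against the published Sauro-Lewis curved grading scale before relying on them).
CGS_LOWER_BOUNDS = [
    (84.1, "A+"), (80.8, "A"), (78.9, "A-"),
    (77.2, "B+"), (74.1, "B"), (72.6, "B-"),
    (71.1, "C+"), (65.0, "C"), (62.7, "C-"),
    (51.7, "D"), (0.0, "F"),
]


def curved_grade(score):
    """Return the curved-grading-scale letter grade for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, grade in CGS_LOWER_BOUNDS:
        if score >= lower_bound:
            return grade
    return "F"  # unreachable for valid input; kept as a safe default


if __name__ == "__main__":
    print(curved_grade(52.61))  # the SiCantik score reported above
```

Under these assumed cutoffs, the reported score of 52.61 falls in the D band, which matches the grade stated in the excerpt.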
... Although the aforementioned methods can be adapted to child users in order to measure their user experiences, they may not be entirely applicable to longitudinal studies of perceived usability. This also makes it difficult to compare the usability results with standardized questionnaires because perceived usability is often measured using standardized questionnaires (Assila et al., 2016), which are then used to generate norms and usability grades (CGS) from a large amount of test data (Lewis, 2018). ...
... The test results were divided into three phases and one retrospective test, corresponding to a total of ten measurements. Because we recalibrated UMUX-Lite into a five-point scale, we normalized the total score to 100 to fit the CGS (Lewis, 2018). The basic statistics are shown in Figure 5. ...
... On the basis of our experimental measurements, Cronbach's alpha coefficient was 0.845 for Test (i) and 0.874 for Test (ii). These values are slightly higher than those reported in analogous studies (Lewis, 2018) and in some translation studies (Wang & Wang, 2022), and, importantly, they exceed the commonly accepted threshold of 0.7. This confirms that the UMUX-Lite scale using cartoon graphics is reliable and, furthermore, that cartoon patterns can fit the semantic degrees of the Likert scale. ...
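For readers unfamiliar with the reliability coefficient mentioned above, the standard Cronbach's alpha formula can be sketched as follows. This is a generic illustration, not the cited authors' analysis: the function name and the sample data are hypothetical, and the sketch uses sample variances from Python's standard library.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("each respondent needs the same number (>= 2) of items")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical two-item responses on a five-point scale.
    sample = [[4, 5], [3, 3], [5, 5], [2, 3], [4, 4]]
    print(round(cronbach_alpha(sample), 3))
```

When every item moves in lockstep across respondents, alpha reaches 1; values above the 0.7 threshold cited in the excerpt are conventionally read as acceptable internal consistency.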
Article
Full-text available
This study examines longitudinal changes in children's perceived usability based on two aspects. First, we developed a child-friendly usability questionnaire, which used cartoons to express the questionnaire response options. This approach provides an easy-to-understand five-point scale and a filling process using magnetic blocks, which together lead to highly reliable results. Additionally, we designed a longitudinal study to investigate the children's perceived usability according to two measurement methods (immediate and retrospective). The children's usability increased with longer durations of usage (i.e., increased repetitions of exercises). The short-term retrospective assessments depended on the most recent experience, whereas the long-term retrospective assessments were generally more positive.
... After determining the exogenous and endogenous variables, the researcher formulated three hypotheses to be tested in this study, namely: and satisfaction (S) [22], [23]. It has also been shown to measure perceived usability [35]. The indicators of the user retention (R) variable are increasing purchases as tenure grows, customer referrals, and premium prices [20]. ...
... The seven-point Likert scale used in [21] and [22] has been effective in avoiding habitual response patterns among respondents. The CSUQ also uses seven-point Likert scales to indicate higher satisfaction, and it switches the labels for "strongly agree" and "strongly disagree" [35]. ...
Article
Full-text available
BukuWarung is known as a bookkeeping application for MSMEs in Indonesia. As its features developed, disappointed users complained about the application's usability in Play Store reviews, on social media, and in research interviews, and these problems prompted users to consider abandoning the application. This motivated the author to assess how strongly usability, in terms of Effectiveness, Efficiency, and Satisfaction, affects user retention. The study used a partial least squares structural equation modeling method. A total of 248 user samples was obtained using simple random sampling and voluntary response techniques. The research used the CSUQ together with questions on user retention. Data were tested using Microsoft Excel and SmartPLS version 3.3.3. According to the measurement model, 7 of the 22 variables were deleted. The study shows that Effectiveness, Efficiency, and Satisfaction have a positive and significant effect of up to 74.8% on user retention. However, the effect size and relative influence on user retention are weak. The user retention indicators affected by usability are increasing purchases as tenure grows, customer referrals, and premium prices. The implication for BukuWarung is that it should conduct usability testing on each feature.
... In the third iteration, the analysis stage corresponds to the analysis of the data collected during the interviews in which the high-fidelity prototype was evaluated. The tables containing the notes, recordings, and problem identifications for the SmartMoving high-fidelity prototype can be found at the following URL 14. In summary, below we present the list of problems found by the interviewees, their suggestions, and the respective solutions. ...
... Once the tour was finished, the evaluator gave each user a questionnaire about the usability of the application they had used in the test. The CSUQ [13,14] questionnaire was used. The questions can be found at the following URL 15. ...
Article
Full-text available
The state of sidewalks in Asunción, Paraguay is far from optimal. There are many problems, such as obstructions, needed repairs, a lack of ramps, and uneven surfaces, among others. In addition, the Municipality of Asunción has no automated mechanisms for tracking the current status of sidewalks. In this work we propose SmartMoving, a mobile application that collects information on the state of the sidewalks with the help of citizens and recommends pedestrian paths with fewer obstacles. The application can be especially useful for people with reduced mobility, as well as for the Municipality of Asunción. This type of application is based on citizen participation, since it receives data from citizens, and it therefore requires a particularly friendly user experience adapted to users and their daily context. Hence the need for a participatory, user-centered design as the basis for developing the application. In this paper we present the SmartMoving application and the user-centered design process followed for its development, which involved users with reduced mobility.
... Lewis [43] investigated the relationship between two widely used questionnaires for measuring perceived usability: the computer system usability questionnaire (CSUQ) and the system usability scale (SUS). Both concepts include subjective and objective components. ...
... The focus is on perceived ease of use or satisfaction and efficiency, such as the time and effort needed to use the digital tool. The research by Lewis [43] showed that independently developed standardized questionnaires with different item content and formats largely measure the same construct, perceived ease of use. Since its introduction, the SUS questionnaire developed by Brooke [34] has been the most popular questionnaire for usability studies of technology-based applications. ...
Article
Full-text available
Digitization offers new perspectives for educational research to identify the effects of visualizations regarding cognitive processing. In addition, new types of data can be generated, expanding the possibilities for visualizing cognitive processes and understanding human learning. Digital twins are already used in Industry 4.0, as an additional visualization to a real object, for data mining and data analysis for process optimization. The increasing integration of digital twins in the industrial sector requires the formulation of corresponding educational goals to ensure high-quality and future-oriented education. Therefore, future generations must be introduced to technologies from industry during their education. In this paper, an intelligent photometric measurement system called SmaEPho with a digital twin for science, technology, engineering, and mathematics (STEM) learning is presented. In addition to its function as a photometric measurement device, an intelligent sensor technology allows for data generation on the user’s usage behavior. The digital twin reflects and visualizes these data in real-time. This enables a variety of new didactic and methodological approaches in teaching. A first study evaluating the hardware and tracking components of SmaEPho shows that the deviation accuracy of the measurement system is sufficient for experimental applications in schools. Another study with n=52 students confirmed the excellent usability of the SmaEPho hardware platform. These research results lay the foundation for a variety of future research questions on data analysis and machine learning algorithms with the aim of increasing the quality of education. The use of intelligent digital twins as an element of digitization in educational contexts offers the extended possibility of identifying cognitive processing steps using this technology.
... The CSUQ (Lewis, 2018) is a validated measure that examines the usability of computer systems and software. The CSUQ consists of 16 seven-point Likert items (strongly disagree to strongly agree) and two additional questions that ask participants to name the three most negative and the three most positive aspects of their experience. Field notes ...
... Scores of 68 are considered to represent above average usability. CSUQ scores were obtained by using a formula outlined by Lewis (2018) which converts the results to a 100-point scale to match the SUS. Data from the Adjectival Ease of Use Scale and exit tickets were input into a spreadsheet to calculate descriptive statistics. ...
Article
The purpose of this paper is to describe the formative design, development, and evaluation of a three-dimensional collaborative virtual learning environment (3D CVLE) called the Museum of Instructional Design. The 3D CVLE was designed to support the classroom activities of doctoral students enrolled in an instructional design and technology program with an emphasis on providing synchronous discourse and applied design opportunities. The development of the MID was led by an iterative three-phased learner experience design process based on the Successive Approximation Model that included (1) preparation, (2) iterative design, and (3) iterative development. The findings from this paper will provide insight into how formative learner experience design processes can lead to the development of a 3D CVLE.
... System Usability Scale (SUS): Participants also completed the SUS questionnaire after visit 3; it is an additional Likert-scale tool that is commonly used to evaluate usability [24]. It consists of 10 items with 5 response options, ranging from strongly agree to strongly ...
... Similarly, SUS scores >68 are considered to indicate above-average usability. Using the conversion described for the SUS [24], an SUS score of 79 translates to a percentile rank of 80%, indicating a very positive usability experience. Furthermore, patients chose to continue voluntary exercise for longer with the KneeBright game than with the traditional EMG-BF exercises. ...
Article
Context: A novel virtual game system Knee Biofeedback Rehabilitation Interface for game-based home therapy (KneeBright) was developed for strength training using integrated electromyography biofeedback of the quadriceps muscle to control the game. The study aimed to compare the KneeBright and electromyography biofeedback interface among patients with knee osteoarthritis. Design: Controlled before and after design. Methods: Nineteen patients with knee osteoarthritis took part in this laboratory-based study. Exercise sessions took place on 2 separate days. During session 1, participants used a conventional electromyography biofeedback system while performing 3 sets of lower body exercises with emphasis on maximal muscle activation, endurance, and precision. During session 2, participants used the KneeBright game to match the exercise sets in the first session. For both sessions, knee extension torque during the isometric muscle activation exercises and time to voluntary additional exercise were recorded. Patient engagement was assessed using the technology acceptance model and System Usability Score questionnaires. Results: The peak knee extension torque produced during the control exercise session and the KneeBright exercise session were positively correlated. Knee extension torque generated during KneeBright game exercise sessions was increased by an average of 25% compared to the control sessions (2.14 vs 1.77 N·m/kg, P = .02). The mean technology acceptance model score for the KneeBright system was 3.4/5 and the mean System Usability Score was 79, both indicating positive patient engagement. Conclusions: Patients using the KneeBright game produced greater knee torque than patients using the conventional system, had positive levels of engagement, and exercised longer with the KneeBright game.
... After completing the WPV reporting tasks, participants were asked to assess the perceived usability of the WPV report using the post-study system usability questionnaire (PSSUQ) [1]. PSSUQ is the second most commonly used post-study questionnaire for measuring perceived usability and contains three sub-constructs: system quality, information quality, and interface quality [7]. Results of this assessment were compared to the Human-Computer Interaction (HCI) recommended standards as a baseline for evaluation. ...
... The mean PSSUQ score was 2.88 (SD 0.94; recommended HCI standard <2.82) (Table 2), and the subscale scores were 2.5 (0.89) for system usefulness, 2.89 (0.89) for information quality, and 3.5 (1.54) for interface quality [7]. ...
Chapter
Full-text available
A majority of healthcare workers (HCWs) experience workplace violence (WPV) but most WPV events go unreported. Underreporting of WPV is well documented in the literature as a barrier to identifying underlying causes and to evaluating the effectiveness of WPV interventions. Previous studies suggest that WPV reporting data is fragmentary, unreliable, and inconsistent. Also, WPV reporting systems are suboptimally designed making it difficult for healthcare workers to report WPV incidents. This study aims to assess the usability of an electronic WPV report in a large academic medical center and the perceived cognitive workload (CWL) and performance of HCWs associated with reporting WPV events. Findings from this study suggest that our institutional WPV report has suboptimal perceived usability and suboptimal perceived cognitive workload. Further, participants with training reported lower error rates in comparison to participants without training on performance.
... On the usability scale, the mean was 75.79 (SD = 13.59), which according to the Sauro-Lewis curved grading scale represents an acceptable product (grade B) [25]. Parents reported that the intervention was user-friendly (M = 9.15, SD = 1.11). Satisfaction with the intervention was high (M = 25.42, ...
Article
Full-text available
Attention-deficit/hyperactivity disorder (ADHD) is one of the most common mental health problems in childhood. Although evidence-based treatments exist, with behavioral parent training programs being the gold standard in the care of children with ADHD, a significant percentage of parents of children with ADHD do not access such interventions. Internet-delivered interventions are effective for a range of mental health problems; however, there is limited research on the efficacy of such interventions in the treatment of ADHD. Objective: The aim of this study is to present the development and feasibility of an Internet-delivered intervention for parents of children with ADHD. Methods: The intervention was based on Behavioral Parent Training and Rational Emotive Behavior Therapy. Participants were mental health specialists (N = 16) and parents of children diagnosed with ADHD (N = 24). Results: Our results indicated high usability and parental satisfaction with the intervention. Conclusion: An Internet-delivered intervention addressed to parents of children diagnosed with ADHD is a promising approach. Future research should investigate the efficacy of this Internet-delivered intervention in a randomized controlled trial.
... which corresponds to grade C on the Sauro-Lewis curved grading scale [43]. Notably, the intervention arm had a higher mean SUS score, 70.7 (95% CI, 54.8-86.6), than the control arm, 65 (95% CI, 53.4-76.6). ...
Article
Full-text available
Kidney transplant (KT) recipients who are not actively engaged in their care and lack self-management skills have poor transplant outcomes, which are disproportionately observed among Black KT recipients. This pilot study aimed to determine whether the MyKidneyCoach app, an mHealth intervention that provides self-management monitoring and coaching, improved patient activation, engagement, and nutritional behaviors in a diverse KT population. Methods: This was a randomized, age-stratified, parallel-group, attention-control, pilot study in post-KT patients. Participants were randomized into the attention-control arm with access to MyKidneyCoach for education and self-management (n = 9) or the intervention arm with additional tailored nurse coaching (n = 7). Feasibility, acceptability, and clinical outcomes were assessed. Results: The acceptability of MyKidneyCoach by System Usability Scale was 67.5 (95% confidence interval [CI], 59.1-75.9). The completion rate based on actively using MyKidneyCoach was 81% (95% CI, 57%-93%), and the study retention rate was 73%. The patient activation measure increased significantly overall, by a mean of 11 points (95% CI, 3.2-18.8). Additionally, after 3 mo, Black patients (n = 7) had higher nutrition self-efficacy scores of 80.5 (95% CI, 74.4-86.7) compared with 75.6 (95% CI, 71.1-80.1) in non-Black patients (n = 9) but lower patient activation measure scores of 69.3 (95% CI, 56.3-82.3) compared with 71.8 (95% CI, 62.5-81) in non-Black patients. Conclusions: MyKidneyCoach was easy to use and readily accepted with low attrition, and improvements were demonstrated in patient-reported outcomes. Both Black and non-Black participants using MyKidneyCoach showed improvement in self-management competencies; thus, this intervention may help reduce healthcare inequities in KT.
... Additionally, the UX and UI of the NextLand Online Store were evaluated through a survey conducted among 20 workshop participants (12 representing end users, 7 representing EO-SPs, and 1 representing Nova SBE). Building on the UX and UI literature [24,56,57,77,90], we developed an exploratory survey with a total of 44 questions (scale: 1 = strongly disagree to 5 = strongly agree) spread across 16 topics (see Table 1). Although not statistically representative, these exploratory results provide essential guidelines for identifying key UX and UI aspects critical to improving the Online Store for its current stakeholders. ...
Article
Full-text available
European Community (EC) Horizon-funded projects and Earth Observation-based Consortia aim to create sustainable value for Space, Land, and Oceans. They typically focus on addressing Sustainable Development Goals (SDGs). Many of these projects (e.g. Commercialization and Innovation Actions) have an ambitious challenge to ensure that partners share core competencies to simultaneously achieve technological and commercial success and sustainability after the end of the EC funds. To achieve this ambitious challenge, Horizon projects must have a proper governance model and a systematized process that can manage the existing paradoxical tensions involving numerous European partners and their respective agendas and stakeholders. This article presents the VCW-Value Creation Wheel (Lages in J Bus Res 69: 4849–4855, 2016), as a framework that has its roots back in 1995 and has been used since 2015 in the context of numerous Space Business, Earth Observation, and European Community (EC) projects, to address complex problems and paradoxical tensions. In this article, we discuss six of these paradoxical tensions that large Horizon Consortia face in commercialization, namely when managing innovation ecosystems, co-creating, taking digitalization, decision-making, tech-transfer, and sustainability actions. We discuss and evaluate how alliance partners could find the optimal balance between (1) cooperation, competition, and coopetition perspectives; (2) financial, environmental, and social value creation; (3) tech-push and market-pull orientations; (4) global and local market solutions; (5) functionality driven and human-centered design (UX/UI); (6) centralized and decentralized online store approaches. We discuss these challenges within the case of the EC H2020 NextLand project answering the call for greening the economy in line with the Sustainable Development Goals (SDGs). 
We analyze NextLand Online Store, and its Business and Innovation Ecosystem while considering the input of its different stakeholders, such as NextLand’s commercial team, service providers, users, advisors, EC referees, and internal and external stakeholders. Preliminary insights from a twin project in the field of Blue Economy (EC H2020 NextOcean), are also used to support our arguments. Partners, referees, and EC officers should address the tensions mentioned in this article during the referee and approval processes in the pre-grant and post-grant agreement stages. Moreover, we propose using the Value Creation Wheel (VCW) method and the VCW meta-framework as a systematized process that allows us to co-create and manage the innovation ecosystem while engaging all the stakeholders and presenting solutions to address these tensions. The article concludes with theoretical implications and limitations, managerial and public policy implications, and lessons for Horizon Europe, earth observation, remote sensing, and space business projects.
... Step 5, system testing: test the system for defects and then fully improve the developed system through a function test, a usability test, and an application test (Scherr et al., 2018; Lewis, 2018; Ahmad & Hussaini, 2021). ...
Article
Full-text available
Fermented fish is a well-known element of local food knowledge in the northeast region of Thailand and is now growing into a national food brand. The purpose of this research is to present the importance of farmer training in order to 1) study the types of raw fish used to produce fermented fish, 2) develop a database system on the types of raw fish used to produce local fermented fish, and 3) encourage knowledge transfer activities using the database system for people in the community. Fifty people participated voluntarily in this study. The research instruments were an interview form, the database system, a performance evaluation form, and a satisfaction questionnaire. The statistics used for data analysis were percentage, mean, and standard deviation. The results showed that 1) the raw fish used to produce fermented fish could be divided into 2 groups according to morphology, 28 types of scaled fish and 13 types of leather fish, for a total of 41 types; 2) the database system has 3 components (general users, member users, and administrator) and can display, search, add, edit, and delete information about fish and users, with an overall system efficiency of 97.92%; and 3) overall satisfaction with the knowledge transfer on using the database system was at a high level.
... But do not have enough data to support the analysis. SUS contains 10 mixed-tone items, with half of the items (odd-numbered) positive and the other half (even-numbered) negative, all with response scales from 1 (strongly disagree) to 5 (strongly agree) (Lewis, 2018a). The questionnaire follows the SUS method and consists of ten standard statements for the assessment of SUS scores (Lewis, 2018b). ...
Article
Full-text available
The quality and feasibility of a mobile application can be measured by evaluating its usability with suitable testing methods. Therefore, to improve the performance of mobile applications, it is necessary to measure the level of user satisfaction with and acceptance of the application. This study aims to improve the quality of the MyTelkomsel mobile application by measuring the level of satisfaction and acceptance of end users and by evaluating the relationship between user satisfaction and user gender using the System Usability Scale (SUS) method. Data were collected by distributing questionnaires to 46 Telkomsel users, 22 male and 24 female. The results showed an average SUS score of 71.96 across all users. In terms of user acceptance, the application falls in the Acceptable category; on the grade scale it falls in category C; and its rating falls in the good category. The SUS percentile rank score likewise places the MyTelkomsel mobile application in class C. There are no significant differences by gender: the average SUS score is 71.93 for male users and 71.98 for female users, so gender does not affect the level of satisfaction with and acceptance of the application. End-user satisfaction with the MyTelkomsel mobile application is therefore good, although there may still be problems with its use.
... In this paper, the testing phase uses the SUS method. The SUS method was chosen because it is widely used to check the usability of products [15,16]. The SUS has good validity, sensitivity, and reliability. ...
... To quantify the usability as perceived by the participants, a system usability scale (SUS) was used. The SUS is composed of ten items on a five-point Likert scale [37][38][39]. To account for possible bias due to lack of attention when completing the questionnaire, the scale alternates between positive and negative items. ...
To support the increasing number of older people, new (assistive) technologies are constantly being developed. For these technologies to be used successfully, future users need to be trained. Due to demographic change, this will become difficult in the future, as the resources for training will no longer be available. In this respect, coaching robots could have great potential to support younger seniors in particular. However, there is little evidence in the literature about the perceptions and potential impact of this technology on the well-being of older people. This paper provides insights into the use of a robot coach (robo-coach) to train younger seniors in the use of a new technology. The study was carried out in Austria in autumn 2020, involving 34 participants equally distributed among employees in their last three years of service and retirees in their first three years of retirement (23 female; 11 male). The aim was to assess participants' expectations and perceptions by examining the perceived ease of use and user experience of the robot in providing assistance during a learning session. The findings reveal a positive impression of the participants and promising results for using the robot as a coaching assistant in daily tasks.
... Usability testing usually employs the System Usability Scale (SUS), a questionnaire used to assess usability as judged by users [15]. The SUS is the most popular instrument for usability testing [16]-[18]. ...
Article
Full-text available
To determine whether software, more specifically smartphone-based software, is acceptable to users, usability testing is carried out. This study aims to create a questionnaire for System Usability Scale (SUS) testing that is shown to be valid and reliable. The questionnaire was tested using Expert Review and the Product-Moment Coefficient for validity, and Cronbach's alpha for reliability. Based on the tests, 10 questionnaire items for SUS testing were obtained; all items were declared valid by Expert Review and the Product-Moment Coefficient, and reliable with a Cronbach's alpha score of 0.778086452. Several related studies have created questionnaires for SUS testing in Indonesian and in English, each with 10 questions. This study offers another questionnaire option, adapted from previous research and tested for validity and reliability, for evaluating software with the SUS test.
... Brooke [13]. To ensure the validity of the collected data and to avoid the effects of different operating systems on the data mentioned by Lewis [14] in his study, all test users used the Windows 10 operating system with a screen resolution of 1080p. There were also no additional instructions for the test users. ...
Conference Paper
Full-text available
A holistic determination and improvement of the quality of the indoor environment includes, in addition to the “classic” parameters such as air temperature and humidity, other influencing variables such as air quality, noise, and lighting conditions (brightness, color temperature). Since Covid-19, air quality has come back into focus. The interaction of these factors in their entirety has an effect on people and significantly determines their well-being and performance. This paper presents the implementation of a monitoring system for these indoor comfort variables (temperature, humidity, wind speed, CO2, VOC, lighting, and noise) based on the Arduino microcontroller ecosystem and corresponding sensor technology. This setup is complemented by the development of a graphical user interface (GUI) with an interactive feedback system. Via touchscreen or the accompanying app for desktop PCs, users can monitor real-time measurements and change settings such as the model of thermal comfort, algorithm parameters that are used to predict comfort indices, database connection, application programming interface, or the language of the software. Feedback can be augmented by using system notifications, color notifications in the GUI, and changeable animated images according to user preferences. Furthermore, user tests were conducted to investigate the system usability and to explore the differences between these two interaction possibilities. During the user testing phase (N = 4), two questionnaires based on the usability metric for user experience lite (UMUX-LITE) and the system usability scale (SUS) proved the high usability of this monitoring system. Additionally, it was found that users increasingly preferred to use the touchscreen as the testing phase progressed.
... We selected a set of standard questionnaires, i.e., task-difficulty-rating and user-experience [41] to analyze the user experience. We further employed state-of-the-art instruments, which are the System Usability Scale (SUS) [42] and Computer System Usability Questionnaire (CSUQ) [43] in subjective usability evaluation. In addition to this, we also recorded and analyzed users' exploration activities. ...
Article
Full-text available
Nowadays, exponential growth in online production and extensive perceptual power of visual contents (i.e., images) complicate the users' information needs. The research has shown that users are interested in satisfying their visual information needs by accessing the image objects. However, the exploration of images via existing search engines is challenging. Mainly, existing search engines employ linear lists or grid layouts, sorted in descending order of relevancy to the user's query to present the image results, which hinders image exploration via multiple information modalities associated with them. Furthermore, results at lower-ranking positions are cumbersome to reach. This research proposed a Search User Interface (SUI) approach to instantiate the non-linear reachability of the image results by enabling interactive exploration and visualization options. We represent the results in a cluster-graph data model, where the nodes represent images and the edges are multimodal similarity relationships. The results in clusters are reachable via multimodal similarity relationships. We instantiated the proposed approach over a real dataset of images and evaluated it via multiple types of usability tests and behavioral analysis techniques. The usability testing reveals good satisfaction (76.83%) and usability (83.73%) scores.
... An application with a poor user experience can make the application difficult to use, so that users switch to another application (Tantri Fajarini, Ayu Wirdiani and Arya Dharmaadi, 2020). Usability evaluation is a way of measuring whether an application can be used easily by users in line with their needs (Situmorang, 2019). Usability evaluation is carried out by involving users in testing the application and prototype (Gupta, 2015 (Lewis, 2018a). The SUS questionnaire was also used because it is widely used in international research to measure usability (Lewis, 2018b). ...
Article
Warga Bali is a local application that makes it easier for people to find information about the Balinese calendar. The application is still under development, so it must pay attention to the user experience, especially usability. The usability evaluation in this study measured effectiveness, efficiency and user satisfaction with 20 respondents selected by simple random sampling from Hindu communities living in Bali Province. The study used the Usability Testing method, combining the Performance Measurement technique to calculate effectiveness and efficiency with the Think Aloud technique to evaluate the Warga Bali interface based on the input and problems that respondents verbalized while carrying out task scenarios. Data were collected from the task scenarios and from the respondents' suggestions for improvement. Respondents then filled out the SUS (System Usability Scale) questionnaire, which was used to measure user satisfaction. The results for respondents in the adolescent and adult categories show effectiveness of 96% and 93%, which is quite high; average efficiency of 30 seconds and 38 seconds; and SUS user satisfaction scores of 46 and 51, which are still below the standard SUS score of 68 points. The Think Aloud analysis produced 49 recommendations for improving the Warga Bali application. Further research can evaluate the improved designs. The results can be used as a reference by the developers of the Warga Bali application when improving its usability.
... There was a need to test whether the learning application developed could be used with a real sample. The instrument used for the pilot study was adapted from the third edition of the Post Study Usability and User Satisfaction Test (PSSUQ) by Lewis (2018). The pilot study was given to 30 pupils to establish the reliability of the test. Cronbach's alpha for the PSSUQ instrument was 0.95. ...
Article
Full-text available
Applying technology in the classroom, together with knowledge and skills, has a positive impact on teaching and learning. Hence, the Ministry of Education developed the Malaysia Education Blueprint 2013-2025, which emphasizes the use of technology and innovation to improve pupils' achievement. Pupils in primary school face difficulties with basic concepts, reasoning and problem solving in geometry. Furthermore, most teachers explain three-dimensional shapes using drawings on whiteboards, static images in books and verbal explanations. In this regard, the LearnGeoAR application, which uses Augmented Reality technology, was developed for pupils. This study used Design and Development Research consisting of three main phases, with the ADDIE instructional design model as the framework for developing the application. The percentage score of content validity for the LearnGeoAR application was 94.3. This study used interviews with three experts to confirm the need for developing this application and the Post-Study System Usability Questionnaire (PSSUQ) to obtain feedback on usability and satisfaction from 30 grade 2 pupils after they used the application. It is hoped that the AR technology can improve the effectiveness of learning and increase pupils' motivation and creative thinking skills in solving problems, especially in geometry.
... To refine our understanding of the elements of the proposed situations that affect pupils' motivation, and in particular the role of the ergonomic characteristics of the resources used, we added several questions about the usability of these resources (Tricot et al., 2003). For this, we drew on the System Usability Scale (SUS; Lewis, 2018) and the Design-Oriented Evaluation of Perceived Usability questionnaire (DEEP; Yang et al., 2012). ...
Experiment Findings
Full-text available
One of the ambitions of the Silva Numerica project is to produce a useful, usable and acceptable VLE aimed at learning about forests and their sustainable management at different levels of education: from the 6th grade to post-baccalaureate vocational training. In this perspective, the examination of the conceptualisations of forestry professionals relating to forest management in a sustainable development perspective, associated with that of training reference systems and a bibliographical analysis, showed that several concepts relating to bio-ecological processes are central to the understanding of the dynamics of a forest. These concepts could be "progressively" learned from secondary school to high school (STAV technology stream) and on to vocational training (BTS in forest management). On this basis, a didactic experiment was developed with secondary school and high school teachers to examine the potential of using Silva Numerica to contribute to the learning of such "transversal" concepts. It consisted in comparing the learning of the concepts of environment, competition and/or biodiversity for 4th and 1st year STAV students, participating in teaching sessions using or not using Silva Numerica, led by a teacher or as part of a self-training scenario. A second part of the experiment shows the impact of these different scenarios of use of Silva Numerica, compared with a scenario using a video resource, on the regulation of students' motivation. It is based on the model developed by the theory of self-determination (Deci and Ryan, 1971, 1975, 1985, 1991) and reworked by Vallerand et al. (1989).
... Various questionnaires were developed to be administered to users after their experience with a system or application to assess the usability they perceived (Assila et al. 2016;Lewis 2018;Hajesmaeel-Gohari and Bahaadinbeigy 2021). ...
Article
Full-text available
Augmented Reality (AR) has become an increasingly used technology to support and enhance the enjoyment of cultural heritage. Particularly relevant is its importance for digital storytelling: by framing a portion of a fresco or painting with a smartphone, an AR mobile application can provide contextually relevant information, also in the form of multimedia content, that can help the user to understand the story and meaning behind the images. In this type of application, human factors are of fundamental importance for the effectiveness of the narrative: a mobile AR application must avoid distracting the user’s attention from the content in order to encourage a good level of concentration and immersion. The case study presented in this paper deals with a mobile AR application developed to guide visitors in the interpretation of the frescoes inside the Basilica of Saint Catherina of Alexandria in Galatina. The aim of the study is the analysis of the relations among usability, user experience and mental workload factors in AR-based digital storytelling.
... Finally, some studies have recently begun to consider how to interpret the CSUQ score [26,27], as has been done for the SUS [5]. These studies aim to map a qualifying adjective to a score obtained from a scale, so that the score can readily assign a quality to a system (good, very good, bad, awful, etc.). ...
... It is an easy-to-use instrument effective at small sample sizes (typically seen in usability tests) [39] with 16 questions that uses a 7-point psychometric Likert scale, from strongly agree (1) to strongly disagree (7), to measure human attitude [40] and assess perceived usability. The CSUQ produces four scores (in this case, higher is better), one overall and three subscales [41], as follows: ...
Preprint
Full-text available
The outcomes of clinical research depend directly on the correct definition of the research protocol, the data collection strategy and the data management plan. Furthermore, researchers often need to work within challenging contexts, such as in Tuberculosis services, where human and technological resources for research may be scarce. The use of Electronic Data Capture systems, such as REDCap and KoBotoolbox, can help to mitigate such risks and to enable a reliable environment to conduct health research and promote results dissemination and data reusability. The proposed solution was based on needs pinpointed by researchers, considering the lack of a comprehensive solution for conducting research in low-resource environments. The REDbox framework was built to enhance collection, management, sharing and availability of data in tuberculosis research, while providing a better user experience. The relevance of this article lies in the innovative approach to support TB research by combining existing technologies and tailored supporting features. REDbox was implemented as a valuable asset in nationwide cross-institutional Tuberculosis research projects in Brazil.
... The current 16-item version of the CSUQ was used in this study. The 16 CSUQ items are divided into three subscales and one overall item: system usefulness (items 1-6), information quality (items 7-12) and interface quality (items 13-15), plus overall satisfaction (item 16). The overall values of reliability and validity calculated for the questionnaires were 0.97 and 0.76, respectively [33][34][35]. In addition, two questions (17 and 18) were added, asking respondents to list what they liked most and least about the system software. ...
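As a rough sketch of how the item grouping described above might be turned into scores: the subscale item ranges come from the snippet, while the function names, the use of plain item means, the "overall" score taken as the mean of all 16 items, and the linear rescaling to 0-100 (with 1, the "strongly agree" end, mapped to 100 and 7 mapped to 0) are assumptions made for illustration rather than the cited study's procedure.

```python
# Item ranges (1-based) as listed in the snippet above.
SUBSCALES = {
    "system_usefulness": range(1, 7),     # items 1-6
    "information_quality": range(7, 13),  # items 7-12
    "interface_quality": range(13, 16),   # items 13-15
}


def csuq_scores(items):
    """Return mean scores per subscale for one respondent's 16 answers (1-7)."""
    if len(items) != 16 or any(not 1 <= x <= 7 for x in items):
        raise ValueError("expected 16 responses on a 1-7 scale")
    scores = {
        name: sum(items[i - 1] for i in idx) / len(idx)
        for name, idx in SUBSCALES.items()
    }
    # Assumption: overall score is the mean of all 16 items.
    scores["overall"] = sum(items) / 16
    return scores


def to_0_100(mean_score):
    """Assumed linear rescaling of a 1-7 mean so that 1 -> 100 and 7 -> 0."""
    return (7 - mean_score) / 6 * 100


# Hypothetical respondent who answers 2 on every item.
example = csuq_scores([2] * 16)
example_0_100 = to_0_100(example["overall"])
```

Rescaling in this direction puts the result on the same "higher is better" footing as a SUS score, which is what makes a side-by-side comparison of the two questionnaires possible.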
Article
Full-text available
Despite the efforts of emerging technologies in the healthcare system, there is still a slower rate of acceleration in prehospital settings compared with the hospitals in digital transformation adaptation. The acknowledgment that digital transformation is significant to healthcare is reflected in planning for the future of digital healthcare. Thus, this study aimed to measure the usability of the electronic patient care report (ePCR) system among emergency medical services (EMS) staff who work in prehospital settings. A descriptive cross-sectional correlation study was used. Two hundred fifty EMS staff who are working in the prehospital setting at Saudi Red Crescent Authority in the Kingdom of Saudi Arabia were surveyed, and the response rate was 79.2% (198). An adapted tool of the Computer System Usability Questionnaire survey was used to collect data. The data were coded numerically and subjected to descriptive and inferential statistical analysis including Pearson’s correlation coefficient using the statistical software (SPSS 21). The majority of the participants rate their ePCR system as “useable” at a high level with a score of 3.41 (SD = 1.021). The overall mean of the ePCR system’s three subscales: system usefulness, information quality, interface quality, and overall satisfaction were 3.39 (SD = 1.152), 3.30 (SD = 1.052), 3.57 (SD = 1.064), and 3.37 (SD = 1.239), respectively. The least liked aspect of ePCR system software was information quality 81 (40.9%). Furthermore, there was a significant correlation between the age of EMS staff and the usability of the ePCR system (r = −0.150, ). The results suggest that healthcare institutions’ policy and decision-makers pay close attention to performing standardized training for the staff on their ePCR system before going to the field to increase efficiency and productivity. 
Furthermore, the users in this study identified other system features that, if included, could have enhanced usability, and improved functions and capabilities of the design to meet the EMS staff’s expectations.
... However, this requires a professional analyst because gaining relevant feedback for a specific product is challenging for someone unfamiliar with the product and the UX domain [28,29]. Previous research has proposed methods for quantitatively evaluating the usability of a specific product, such as the System Usability Scale (SUS) [46][47][48]; the think-aloud protocol for usability assessment in product design and development, which provides results close to what respondents experience [14], [49][50][51][52]; the Questionnaire for User Interaction Satisfaction (QUIS) [53,54]; the Post Study System Usability Questionnaire (PSSUQ) [55]; and other appropriate assessment methods. ...
Article
Full-text available
The worldwide expansion of internet technologies and the World Wide Web (WWW) has brought a sharp rise in the popularity and adoption of Web Applications (WA). Current technological advancement has allowed web applications to become more innovative and practical in managing born-digital content. This requires developers to continue to expand their assessment repertoire to provide valuable and actionable feature coverage. This study demonstrates User Experience Assessment (UXA) as part of the Re-CRUD console framework formative assessment. The Re-CRUD console framework is a code automation tool for web application development containing integrated records management features that help information professionals manage digital content effectively. The assessment's primary goal was to get detailed feedback from information professionals on the Re-CRUD feature coverage to make Re-CRUD more pleasant for developers and more content friendly. We conducted contextual discussions using the think-aloud protocol and usability testing with experts in WA development and information professionals. The findings revealed a positive review of Re-CRUD feature coverage and the code generation procedure but a less favourable review of the authentication policy and audit trail. The feedback is used to improve Re-CRUD feature coverage and increase code automation productivity.
... Usability measurement can be done using the Computer System Usability Questionnaire (CSUQ) [12], System Usability Scale (SUS) [ ranging from 1 to 5. SUS can measure several characteristics of an application, namely Easy to learn, Inconsistencies, Easy to memorize and Satisfaction [13]. ...
Article
Full-text available
Indonesia is an agricultural country that produces more rice commodities than secondary crops. Many people who work as farmers choose the land to plant rice. Farmers experience several obstacles in determining the correct planting time to improve the rice harvest quality. A planting calendar is a method used by farmers to determine the scheduling of planting for one year. The rice planting calendar works based on rainfall and climate patterns. With the help of the latest technology, determining the rice planting calendar can be done quickly. The utilization of computer technology and algorithms such as Artificial Neural Network is helpful for forecasting rainfall using time series data accurately in the following month. The planting calendar is connected to data from the Meteorology, Climatology and Geophysics Agency (BMKG) from each station in each region. The rice planting calendar is made on a mobile basis with the aim of providing convenience for users in their hands. This cropping calendar application was developed using the Scrum method. The application development stages consist of sprint planning, first sprint, second sprint, third sprint and usability testing. The results of the development of the sprint went well. After completing the story, it was continued with the usability testing stage using the System Usability Scale (SUS). The SUS test was given to 20 respondents who had criteria including farmers and landowners. The results of SUS on the rice planting calendar application got a score of 72.75, which was categorized as Good.
... They comprise two officers from the Education Technology Division, three Primary school mathematics master trainers, two SISC+ officers for Primary school Mathematics, a Mathematics and STEM lecturer from the Institute of Teacher Education Malaysia (IPGM), and two national icons for innovation and technology. Moreover, the summative assessment was conducted with ten pupils using Post Study Usability and User Satisfaction Test (PSSUQ) questionnaire (Lewis, 2018). This questionnaire evaluated the P-JMat applications from various dimensions such as design, functionality, ease of use, learning ability, satisfaction, future use and error, and reliability. ...
Article
Full-text available
Game-based learning (GBL) has received increasing attention in recent years as it could help improve pupils’ motivation, self-efficacy, and achievement. Technological innovations like learning analytics (LA) and GBL offer pedagogical support for teachers. As a learning approach, GBL could support pupils’ learning significantly more than conventional approaches. Therefore, there is a need to raise teachers’ level of knowledge of the impact of GBL. In the meantime, LA could be used to collect, analyze, and report data on the impact of GBL on pupils’ learning performance. In this light, GBL applications have been developed to facilitate the use of LA for teaching and learning. This paper describes the design of GBL with LA integration for teaching mathematics in primary schools. It documents the construction of the GBL and LA app, which is grounded on the Dick, Carey, and Carey Model and the theory of constructivism. In addition, the cognitive load theory was applied to ensure that the application accommodates pupils’ cognitive load. This study also validated the design of the GBL, and it was found to be relevant and engaging. Keywords: Game-based learning, mathematics, analytics, technology, education
... In line with [9], we gave this questionnaire to the participants before and after performing the evaluation, to contrast their expectations of how the API should work against their impressions of how it actually works. Regarding the CSUQ, this is a questionnaire of 19 questions, all of them answered with a 7-point Likert scale, about the usability of the product being tested [39]. Altogether, users complete three questionnaires: one to check their expectations, one to gather their impressions of the tool (using the cognitive dimensions framework), and finally one to measure the usability of the tool. ...
Article
Full-text available
Multimodal emotion detection has been one of the main lines of research in the field of Affective Computing (AC) in recent years. Multimodal detectors aggregate information coming from different channels or modalities to determine what emotion users are expressing with a higher degree of accuracy. However, despite the benefits offered by this kind of detectors, their presence in real implementations is still scarce for various reasons. In this paper, we propose a technology-agnostic framework, HERA, to facilitate the creation of multimodal emotion detectors, offering a tool characterized by its modularity and the interface-based programming approach adopted in its development. HERA (Heterogeneous Emotional Results Aggregator) offers an architecture to integrate different emotion detection services and aggregate their heterogeneous results to produce a final result using a common format. This proposal constitutes a step forward in the development of multimodal detectors, providing an architecture to manage different detectors and fuse the results produced by them in a sensible way. We assessed the validity of the proposal by testing the system with several developers with no previous knowledge about affective technology and emotion detection. The assessment was performed applying the Computer System Usability Questionnaire and the Twelve Cognitive Dimensions Questionnaire, used by The Visual Studio Usability group at Microsoft, obtaining positive results and important feedback for future versions of the system.
... The System Usability Scale was also used to evaluate the UI/UX of the prototype developed here [19]. The SUS is widely used in usability evaluation, including an evaluation of the Apple and Windows computer operating systems [20]. With this folk dance mobile application, built with the UCD method and with its prototype evaluated using the SUS, users (in this case students) are expected to gain insight into and knowledge of the folk dances found in Indonesia. ...
Article
Full-text available
Indonesia has many dances in each region. Traditional dance and folk dance existed in Indonesia before the development of contemporary dance. As part of the local culture of each area, dance is included in local content at the elementary to high school level. Curriculum changes disrupted local cultural education some time ago. The lack of interactive learning media has also had an effect. The purpose of this study is to develop a learning pattern for folk dance as a local culture in Indonesia through an interactive mobile application. In addition, this research is intended to help preserve the folk dance arts of each region and introduce them to students in Indonesia. Gamification can be an alternative for developing folk dance learning. Conventional learning media usually lack the innovation needed to attract students' interest in studying local culture, especially folk dance as a local content subject. The application teaches the history, regional origins and movements of folk dances. The result is a game about folk dance in Indonesia. There are several levels that must be passed to complete the game. Each season, three players receive a reward. The rewards are designed around the prizes preferred by elementary, junior high and high school students. Based on the results of the System Usability Scale evaluation, the prototype scored 86.25% and was considered to have met the usability criteria.
... Lewis (1995) describes a method for the evaluation of subjective usability using Likert scale questionnaires. One such questionnaire is the Computer System Usability Questionnaire (CSUQ), which is used to evaluate perceived user satisfaction with computer systems (Lewis, 2018a;Lewis, 2018b). ...
Article
Full-text available
The main objective of perceived usability studies is to develop better quality software that is both efficient and effective. Retrospective usability studies in the literature are rich in data that can be used by systems developers to achieve that purpose. However, some developers still fail to make use of users' past experiences, and as a result some systems continue to include persistent flaws following updates. To address this problem, a conceptual framework of evaluation was developed from the usability evaluation literature. This research proposes a Perceived Usability Evaluation Framework to be consulted by evaluators during maintenance and system updates. To validate this framework, the researchers used evidence empirically collected from the Public Authority for Applied Education and Training (PAAET) online registration system. Results indicate that the framework provides a promising structure which can be followed by researchers, practitioners and systems developers when synthesizing patterns of dissatisfaction from previous system usability evaluations, and these syntheses can in turn guide future system updates.
... The odd-numbered questions have a positive tone, while the tone of the even-numbered items is negative. The SUS is in the public domain, with no license fee required for its use [395]. According to Brooke (1996), participants should complete the SUS after using the decision support system. ...
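The alternating item tone described above is what drives SUS scoring. A minimal sketch, following the standard procedure from Brooke (1996): each odd-numbered item contributes its response minus 1, each even-numbered item contributes 5 minus its response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name `sus_score` is an illustrative choice.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten responses on a 1-5 scale.

    responses[0] is item 1, responses[1] is item 2, and so on.
    """
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Positively worded (odd) item: higher agreement is better.
            total += response - 1
        else:
            # Negatively worded (even) item: lower agreement is better.
            total += 5 - response
    return total * 2.5


# A respondent who answers "neutral" (3) throughout lands at the midpoint.
neutral = sus_score([3] * 10)
```

Because of the reversal on even-numbered items, a respondent who gives the same answer to every item always ends up at 50, whichever answer they pick.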
Thesis
Small and medium enterprises (SMEs) are an essential engine of a country's economy. During the last decades, SMEs have adopted and implemented digital technologies. Adopting technological trends such as the Internet of Things (IoT) is challenging for SMEs, considering their financial and technological limitations. This thesis aims to guide decision-making on IoT adoption in small and medium enterprises by defining a Decision Support System (DSS) based on the Technology-Organization-Environment (TOE) model. The decision support system assists decision-makers by indicating their technology readiness level and providing recommendations that support the digital transformation process. This research uses a mixed qualitative and quantitative approach to understand the factors influencing IoT adoption in SMEs. The development of the doctoral thesis focused on providing a conceptual framework, based on the TOE framework, for understanding the factors that affect IoT adoption in small and medium enterprises. Based on the conceptual framework Ready4IoT in SMEs, a decision support system named DSS Ready4IoT was developed that, according to the System Usability Scale (SUS), has high usability. The purpose of DSS Ready4IoT is to produce measurements that capture the particularities of SMEs in the trading sector in order to understand the adoption of digital technologies. Likewise, the Ready4IoT report can guide policymakers in designing ICT policies focused on SMEs. In this way, technology readiness in SMEs can increase in step with the evolution of the ICT sector, creating an agile economy that adapts to technological change and promotes innovation.
Conference Paper
Full-text available
Models are used to refer to prototypes or to a reference point that acts as a "pattern" for recreating objects. In education, models serve to find solutions to recurring problems in teaching and learning experiences. Models therefore facilitate the creation of a pattern in solutions and form a common language for creating new models. In the field of designing personal learning pathways, a model for their formalization would help teachers find solutions to frequent problems in their design. The aim of this contribution is to report preliminary results on university students' perceptions of the incorporation of personal learning pathways with the ACDGE model. This work is a progress report on the doctoral thesis DISEÑO DE UN MODELO PARA LA FORMALIZACIÓN DE ITINERARIOS PERSONALES DE APRENDIZAJE.
Conference Paper
Full-text available
Learner-centered trends in education are, in general terms, designed so that students can choose when and how to learn and, in turn, construct their own definition of learning. One strategy that supports this trend is the personal learning pathway. Pathways are powerful organizers of the topics and concepts to be learned, the learning objects to be used, and the assessment to be carried out. They consist of learning sequences that students themselves choose according to their individual characteristics (needs, learning style, etc.). This contribution aims to report the preliminary results of an expert-judgment validation study applied to a model for managing personal learning pathways, within the framework of the doctoral thesis DISEÑO DE UN MODELO PARA LA FORMALIZACIÓN DE ITINERARIOS PERSONALES DE APRENDIZAJE.
Conference Paper
Full-text available
This review seeks to identify whether, in higher education, there is a link between teaching strategies based on agile methodologies and student agency, and whether these strategies are enriched with technology. Although Agile was created for software project management, it has spread to other fields such as education because it encourages active student participation. Agile could thus facilitate agency by acting on responsibility for learning. From the documents retrieved in the search, abstracts were read and concepts related to agency and to technological resources were sought. Few documents contain the concept of agency. However, they do include factors related to it and concepts related to digital tools. Although a full reading is needed, it appears that technology-enriched agile methodologies can improve agency by placing the student at the center of focus.
Chapter
Recently, due to the coronavirus pandemic, we have been experiencing a revolution in which education has shifted to a "physical plus digital," or "phygital," multimodal form. This paper analyses students' behavioral intention toward phygital learning, meaning how students use the online learning platform (e.g. Moodle), collaboration application (e.g. Microsoft Teams), chat application (e.g. WeChat), and device (e.g. smartphone, laptop) of a course. The evaluation uses the Semantic Differential Technique to distinguish attitudes toward using computers and smartphones. The Usage Questionnaire is followed by the System Usability Scale (SUS), a Human Computer Interaction (HCI) based approach, and the Technology Acceptance Model (TAM), an Information Systems (IS) based approach. A sample of 68 participants completed the survey questionnaire measuring perceived usefulness (PU), perceived ease of use (PEOU), and attitudes towards usage (ATU). By using both instruments together in one study for the purpose of usability evaluation, this work attempts to streamline and unify the process of usability evaluation. Results obtained from a large-scale survey of university students show the attitudes towards usage of phygital learning. Moreover, this work considers whether the digital divide (mobile vs. web environment) has any effect on perceived usability. Results show that multimodal education could reduce the stress of learning. Keywords: Phygital learning; Usability evaluation; System Usability Scale (SUS); Technology Acceptance Model (TAM)
Article
We propose a new method for generating explanations with Artificial Intelligence (AI) and a tool to test its expressive power within a user interface. In order to bridge the gap between philosophy and human-computer interfaces, we show a new approach for the generation of interactive explanations based on a sophisticated pipeline of AI algorithms for structuring natural language documents into knowledge graphs, answering questions effectively and satisfactorily. With this work, we aim to prove that the philosophical theory of explanations presented by Achinstein can be actually adapted for being implemented into a concrete software application, as an interactive and illocutionary process of answering questions. Specifically, our contribution is an approach to frame illocution in a computer-friendly way, to achieve user-centrality with statistical question answering. Indeed, we frame the illocution of an explanatory process as that mechanism responsible for anticipating the needs of the explainee in the form of unposed, implicit, archetypal questions, hence improving the user-centrality of the underlying explanatory process. Therefore, we hypothesise that if an explanatory process is an illocutionary act of providing content-giving answers to questions, and illocution is as we defined it, the more explicit and implicit questions can be answered by an explanatory tool, the more usable (as per ISO 9241-210) its explanations. We tested our hypothesis with a user-study involving more than 60 participants, on two XAI-based systems, one for credit approval (finance) and one for heart disease prediction (healthcare). The results showed that increasing the illocutionary power of an explanatory tool can produce statistically significant improvements (hence with a P value lower than .05) on effectiveness. 
This, combined with a visible alignment between the increments in effectiveness and satisfaction, suggests that our understanding of illocution can be correct, giving evidence in favour of our theory.
Article
Full-text available
Visual aesthetics is a success criterion for mobile apps. Despite considerable research on graphical user interface (GUI) assessments, there is a lack of studies investigating the reliability and validity of scale types on visual aesthetics as a unidimensional construct. In this study, 208 subjects were divided into four groups, each using a different rating scale and the VisAWI-S questionnaire as the gold standard, to assess the visual aesthetics of nine mobile GUIs. All scales showed excellent inter-rater reliability and good agreement. Seven-point scales resulted in slightly higher intra-rater reliability than those with five points, but agreement was lower using five-point Likert scales. All scales were shown to be valid compared with VisAWI-S and presented strong pairwise correlations. Results indicate that any of these scales is suitable for assessing mobile GUI visual aesthetics reliably and validly as long as response quality is analyzed. This work supports the adoption of single-item questionnaires, which reduce effort and time, especially in large-scale assessment designs.
Article
Purpose: Usage of learning management systems (LMSs) has become widespread with the disruption of face-to-face education after the COVID-19 pandemic. Several software products, usually called LMSs, enable and support distance education. However, selecting a suitable LMS is a complex multiple criteria decision making (MCDM) problem that requires consideration of many criteria and inputs from different parties such as students, academicians, and education managers. Usability evaluation is one of the critical steps in deciding which LMS to adopt. There are several studies on the usability evaluation of LMSs in the literature, but the use of MCDM methods and real-life case studies is very rare. Based on this motivation, a perceived usability evaluation of SAKAI-LMS, which is in use at an academic department, is performed by employing the axiomatic design procedure (ADP). This paper aims to discuss the aforementioned issues. Design/methodology/approach: ADP is considered a suitable MCDM method for perceived usability evaluation because it allows an easy approach to data fusion and to setting performance targets for decision makers. A questionnaire is developed to collect data from three types of system users about predetermined usability criteria and their importance. After detailed statistical analyses and weighting of the criteria via the analytic hierarchy process (AHP), ADP is carried out to evaluate the usability of the LMS. Findings: The proposed ADP-based approach is easy to apply in practical circumstances and is able to quantify the perceived usability of LMSs. Research limitations/implications: The proposed approach provides an easy and practical evaluation of the perceived usability of LMSs for decision makers who are responsible for implementing them. The novel and practical MCDM-based perceived usability approach developed in this study has been verified through a real-life case study at an academic department. The perceived usability results therefore reflect only the views of this focus group and are not generalizable. Originality/value: For the first time in the literature, a comprehensive ADP-based MCDM approach is proposed, based on analyses of the related literature and information gathered from the system users.
Article
Within the last decade, the Usability Metric for User Experience (UMUX) has been translated into many languages. This study aimed to create a Turkish adaptation of the UMUX, based on item translations made from the Arabic, Chinese, English, Italian, and Slovene versions, which vary in the content of their items as well as in how they are processed by machine translation systems and understood by human translators. Based on the different translations, 45 Turkish UMUX variants are assessed psychometrically as a formative construct, using criteria based on PLS measurement models for formative constructs and Rasch analysis. The results are benchmarked against data collected concurrently via the UMUX, SUS, and CSUQ in Arabic and English or in Turkish and English. Results show that all 45 Turkish variants of the UMUX have strong psychometric qualities as a formative construct, as do the UMUX and the Arabic UMUX. While there is evidence that usability can be measured as a formative construct, differences between item sets suggest that the UMUX may not be a complete measure that includes all aspects of usability as defined in ISO 9241-11. Our results also suggest that UMUX scores are sensitive to the native language of participants, since the mean scores differed significantly between the native-language and English UMUX versions, which were answered concurrently to assess the same software.
Article
The global digitization drive, accelerated by the 2020 health pandemic brought renewed research interest in the use and usability of learning management systems (LMSs). The purpose of this study was to propose validated usability guidelines for an LMS in an open distance and electronic learning (ODeL) context based on lecturers’ views of students’ needs. A set of usability requirements was abstracted from the literature and used as the basis for a heuristic evaluation (HE) of the institution’s LMS. The results were triangulated with the results of three other usability evaluation methods including usability testing with eye tracking, a post-test system usability scale (SUS) questionnaire and interviews. The primary contribution is the validated usability requirements for ODeL LMSs. A secondary contribution is the triangulation approach, which allowed a comparison of the usability evaluation methods’ results and confirmed HE as an effective and efficient evaluation method for LMSs.
Article
Full-text available
Rice is the main food commodity in Indonesia, with consumption in 2021 of 31.9 million tons, an increase of 351.71 thousand tons over the previous year. However, this increase in production volume has not been matched by an improvement in rice quality. According to BPS data for 2021, imports of quality rice amounted to 41,800 tons. This is because there is still demand for quality rice as a staple food. To achieve rice self-sufficiency in Central Java Province and produce good-quality rice, monitoring must be carried out from upstream (farmers) to downstream (rice distributors). The Integrated Rice Quality Control application was designed to address this problem. Developed with the Feature-Driven Development (FDD) software development method, the application can help ensure that rice quality control is well maintained. The system was designed with an approach intended to make it easy for its users. The application design was also shown to be acceptable to its users: SUS testing yielded a mean score of 70.95, which places the design in the good category and indicates that it can be implemented.
Article
Sense of control is increasingly used as a measure of quality in human-computer interaction. Control has been investigated mainly at a high level, using subjective questionnaire data, but also at a low level, using objective data on participants’ sense of agency. However, it remains unclear how differences in higher level, experienced control reflect lower level sense of control. We study that link in two experiments. In the first one we measure the low-level sense of agency with button, touchpad, and on-skin input. The results show a higher sense of agency with on-skin input. In the second experiment, participants played a simple game controlled with the same three inputs. We find that on-skin input results in both increased sense and experience of control compared to touchpad input. However, the corresponding difference is not found between on-skin and button input, whereas the button performed better in the experiment task. These results suggest that other factors of user experience spill over to the experienced control at rates that overcome differences in the sense of control. We discuss the implications for using subjective measures about the sense of control in evaluating qualities of interaction.
Article
We introduce VIREO, a web-based software tool for graphical authoring of vibrotactile feedback for mobile and wearable applications. VIREO enables flexible specification of vibrotactile patterns with model-based and free-draw input, and is compatible with devices that run JavaScript, either natively or in a web browser. We demonstrate VIREO with applications developed for smartphones, smartwatches, armbands, and smartglasses, and we present the results of a usability evaluation study with sixteen participants, all coders with varying levels of programming experience. We discuss our contributions in the context of the results of a Systematic Literature Review conducted on the topic of software tools, editors, and platforms developed in the scientific community for authoring vibrotactile feedback. Given that one finding of our review is the limited availability of such contributions, we release VIREO as a free resource on the web for researchers and practitioners to author and integrate vibrotactile feedback in mobile and wearable applications.
Chapter
The Accessibility Requirements Tool for Information and Communication Technologies (FRATIC) was developed within the work of a doctoral project, at the University of Trás-os-Montes and Alto Douro, and may be used at various stages of public procurement processes as well as projects and developments that include ICT products and services. This tool helps to consult, determine and assess the accessibility requirements for ICT products and services in European Standard EN 301 549 that supports the legislation in the field of public procurement for the countries of the European Union – Directive 2014/24/EU. This study focuses on the standardized usability and accessibility features evaluation of the FRATIC prototype, based on ISO 9241-11 metrics, other usability and accessibility evaluation criteria, as well as various standardized measurement tools and methods – such as Single Ease Question (SEQ) and System Usability Scale (SUS) – after conducting usability tests and interviews with 25 experts in the fields of accessibility, assistive technologies, and public procurement.
Article
Full-text available
In 2009, we published a paper in which we showed how three independent sources of data indicated that, rather than being a unidimensional measure of perceived usability, the System Usability Scale apparently had two factors: Usability (all items except 4 and 10) and Learnability (Items 4 and 10). In that paper, we called for other researchers to report attempts to replicate that finding. The published research since 2009 has consistently failed to replicate that factor structure. In this paper, we report an analysis of over 9,000 completed SUS questionnaires that shows that the SUS is indeed bidimensional, but not in any interesting or useful way. A comparison of the fit of three confirmatory factor analyses showed that a model in which the SUS's positive-tone (odd-numbered) and negative-tone (even-numbered) items were aligned with two factors had a better fit than a unidimensional model (all items on one factor) or the Usability/Learnability model we published in 2009. Because a distinction based on item tone is of little practical or theoretical interest, we recommend that user experience practitioners and researchers treat the SUS as a unidimensional measure of perceived usability, and no longer routinely compute Usability and Learnability subscales.
Chapter
Full-text available
Covers the basics of usability testing plus some statistical topics (sample size estimation, confidence intervals, and standardized usability questionnaires).
Article
Full-text available
Usability Metric for User Experience (UMUX) and its shorter form variant UMUX-LITE are recent additions to standardized usability questionnaires. UMUX aims to measure perceived usability by employing fewer items that are in closer conformance with the ISO 9241 definition of usability, while UMUX-LITE conforms to the technology acceptance model (TAM). UMUX has been criticized regarding its reliability, validity, and sensitivity, but these criticisms are mostly based on reported findings associated with the data collected by the developer of the questionnaire. Our study re-evaluates the UMUX and UMUX-LITE scales using psychometric methods with data sets acquired through two usability evaluation studies: an online word processor evaluation survey (n = 405) and a web-based mind map software evaluation survey for three applications (n = 151). Data sets yielded similar results for indicators of reliability. Both UMUX and UMUX-LITE items were sensitive to the software when the scores for the evaluated software were not very close, but we could not detect a significant difference between the software when the scores were closer. UMUX and UMUX-LITE items were also sensitive to users’ level of experience with the software evaluated in this study. Neither of the scales was found to be sensitive to the participants’ age, gender, or whether they were native English speakers. The scales significantly correlated with the System Usability Scale (SUS) and the Computer System Usability Questionnaire (CSUQ), indicating their concurrent validity. The parallel analysis of principal components of UMUX pointed out a single latent variable, which was confirmed through a factor analysis, that showed the data fits better to a single-dimension factor structure.
Article
Full-text available
The use of applications on mobile devices has reached historic levels. Using the System Usability Scale (SUS), data were collected on the usability of applications used on two kinds of mobile platforms—phones and tablets—across two general classes of operating systems, iOS and Android. Over 4 experiments, 3,575 users rated the usability of 10 applications that had been selected based on their popularity, as well as 5 additional applications that users had identified as using frequently. The average SUS rating for the top 10 apps across all platforms was 77.7, with a nearly 20-point spread (67.7–87.4) between the highest and lowest rated apps. Overall, applications on phone platforms were judged to be more usable than applications on the tablet platforms. Practitioners can use the information in this article to make better design decisions and benchmark their progress against a known universe of apps for their specific mobile platform.
Article
Full-text available
Nowadays, practitioners extensively apply quick and reliable scales of user satisfaction as part of their user experience (UX) analyses to obtain well-founded measures of user satisfaction within time and budget constraints. However, in the human-computer interaction (HCI) literature the relationship between the outcomes of standardized satisfaction scales and the amount of product usage has been only marginally explored. The few studies that have investigated this relationship have typically shown that users who have interacted more with a product have higher satisfaction. The purpose of this paper was to systematically analyze the variation in outcomes of three standardized user satisfaction scales (SUS, UMUX and UMUX-LITE) when completed by users who had spent different amounts of time with a website. In two studies, the amount of interaction was manipulated to assess its effect on user satisfaction. Measurements of the three scales were strongly correlated and their outcomes were significantly affected by the amount of interaction time. Notably, the SUS acted as a unidimensional scale when administered to people who had less product experience, but was bidimensional when administered to users with more experience. We replicated previous findings of similar magnitudes for the SUS and UMUX-LITE (after adjustment), but did not observe the previously reported similarities of magnitude for the SUS and the UMUX. Our results strongly encourage further research to analyze the relationships of the three scales with levels of product exposure. We also provide recommendations for practitioners and researchers in the use of the questionnaires.
Conference Paper
Full-text available
In recent years, the notion of a non-instrumental, hedonic quality of interactive products has received growing interest. Based on a review of 151 publications, we summarize more than ten years of research on the hedonic to provide an overview of definitions, assessment tools, antecedents, consequences, and correlates. We highlight a number of contributions, such as introducing experiential value to the practice of technology design and a better prediction of overall quality judgments and product acceptance. In addition, we suggest a number of areas for future research, such as providing richer, more nuanced models and tools for quantitative and qualitative analysis, more research on the consequences of using hedonic products, and a better understanding of when the hedonic plays a role and when it does not.
Conference Paper
Full-text available
In this paper we present the UMUX-LITE, a two-item questionnaire based on the Usability Metric for User Experience (UMUX) [6]. The UMUX-LITE items are "This system's capabilities meet my requirements" and "This system is easy to use." Data from two independent surveys demonstrated adequate psychometric quality of the questionnaire. Estimates of reliability were .82 and .83, excellent for a two-item instrument. Concurrent validity was also high, with significant correlation with the SUS (.81, .81) and with likelihood-to-recommend (LTR) scores (.74, .73). The scores were sensitive to respondents' frequency-of-use. UMUX-LITE score means were slightly lower than those for the SUS, but easily adjusted using linear regression to match the SUS scores. Due to its parsimony (two items), reliability, validity, structural basis (usefulness and usability) and, after applying the corrective regression formula, its correspondence to SUS scores, the UMUX-LITE appears to be a promising alternative to the SUS when it is not desirable to use a 10-item instrument.
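The scoring and regression adjustment mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes 7-point items (1 = strongly disagree, 7 = strongly agree). The abstract does not state the regression coefficients; the slope 0.65 and intercept 22.9 used below are the values commonly cited for the regression-adjusted score (UMUX-LITEr) and should be treated as an assumption to verify against the paper. The function names are hypothetical.

```python
def umux_lite(item1, item2):
    """Unadjusted UMUX-LITE score on a 0-100 scale.

    item1: "This system's capabilities meet my requirements" (1-7)
    item2: "This system is easy to use" (1-7)
    """
    for value in (item1, item2):
        if not 1 <= value <= 7:
            raise ValueError("responses must be between 1 and 7")
    # Each item contributes 0-6 points; 12 points map to 100.
    return (item1 + item2 - 2) * 100 / 12


def umux_lite_r(item1, item2, slope=0.65, intercept=22.9):
    """Regression-adjusted UMUX-LITE (UMUX-LITEr).

    slope and intercept are assumed values, not taken from the abstract.
    """
    return slope * umux_lite(item1, item2) + intercept


print(umux_lite(4, 4))  # 50.0, the midpoint of the unadjusted scale
```

With these assumed coefficients the adjusted score cannot fall below 22.9 or rise much above 87.9, a compressed range that follows directly from the linear adjustment.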
Article
Full-text available
The study investigated whether the change of response order in a Likert-type scale altered participant responses and scale characteristics. Response order is the order in which options of a Likert-type scale are offered. The sample included 490 college students and 368 junior high school students. Scale means with different response orders were compared. Structural equation modeling was used to test the invariance of interitem correlations, covariances, and factor structure across scale formats and educational levels. The results indicated that response order had no substantial influence on participant responses and scale characteristics. Motivating participants and avoiding ambiguous items may minimize possible effects of scale format on participant responses and scale properties.
Article
Full-text available
This study is a part of a research effort to develop the Questionnaire for User Interface Satisfaction (QUIS). Participants, 150 PC user group members, rated familiar software products. Two pairs of software categories were compared: 1) software that was liked and disliked, and 2) a standard command line system (CLS) and a menu driven application (MDA). The reliability of the questionnaire was high, Cronbach's alpha=.94. The overall reaction ratings yielded significantly higher ratings for liked software and MDA over disliked software and a CLS, respectively. Frequent and sophisticated PC users rated MDA more satisfying, powerful and flexible than CLS. Future applications of the QUIS on computers are discussed.
Article
Full-text available
The present study argued that the meaning of verbal labels of a Likert-type response scale was affected by the presentation order of the scale labels. It was proposed that subjects tended to choose the first alternative acceptable to them from among the ordered response categories so that a primacy effect was predicted. Findings supported the hypothesis. In addition, this response-order effect interfered with the threshold values, with factor structures estimated by factor analysis based on polychoric correlations, and with the item and person parameters estimated by the graded response model. Practical implications of the response-order effects were discussed.
Conference Paper
Full-text available
Usability evaluators used an 18-item, post-study questionnaire in three related usability tests. I conducted an exploratory factor analysis to investigate statistical justification to combine items into subscales. The factor analysis indicated that three factors accounted for 87 percent of the total variance. Coefficient alpha analyses showed that the reliability of the overall summative scale was .97, and ranged from .91 to .96 for the three subscales. In the sensitivity analyses, the overall scale and all three subscales detected significant differences among the user groups; and one subscale indicated a significant system effect. Correlation analyses support the validity of the scales. The overall scale correlated highly with the sum of the After-Scenario Questionnaire ratings that participants gave after each scenario. The overall scale also correlated moderately with the percentage of successful scenario completion. These results are consistent with the hypothesis that these alternative measurements tap into a common underlying construct. This construct is probably usability, based on the content of the questionnaire items and the measurement context.
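Coefficient alpha, the reliability statistic reported in this abstract and in several others on this page, can be computed from a respondents-by-items table. The sketch below uses the standard formula alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), with sample variances. The function name `coefficient_alpha` and the toy data are illustrative assumptions, not data from any of the studies listed here.

```python
from statistics import variance


def coefficient_alpha(rows):
    """Cronbach's coefficient alpha for a list of respondent rows.

    Each row holds one respondent's scores on the same k items.
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("every row needs the same number (>= 2) of items")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents, two perfectly consistent items.
print(coefficient_alpha([[1, 1], [2, 2], [3, 3]]))  # 1.0
```

Alpha depends on both the inter-item correlations and the number of items, which is one reason a value in the low .80s is described elsewhere on this page as excellent for a two-item instrument.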
Conference Paper
Full-text available
Correlations between prototypical usability metrics from 90 distinct usability tests were strong when measured at the task-level (r between .44 and .60). Using test-level satisfaction ratings instead of task-level ratings attenuated the correlations (r between .16 and .24). The method of aggregating data from a usability test had a significant effect on the magnitude of the resulting correlations. The results of principal components and factor analyses on the prototypical usability metrics provided evidence for an underlying construct of general usability with objective and subjective factors.
Conference Paper
Full-text available
When designing questionnaires there is a tradition of including items with both positive and negative wording to minimize acquiescence and extreme response biases. Two disadvantages of this approach are respondents accidentally agreeing with negative items (mistakes) and researchers forgetting to reverse the scales (miscoding). The original System Usability Scale (SUS) and an all positively worded version were administered in two experiments (n=161 and n=213) across eleven websites. There was no evidence for differences in the response biases between the different versions. A review of 27 SUS datasets found 3 (11%) were miscoded by researchers and 21 out of 158 questionnaires (13%) contained mistakes from users. We found no evidence that the purported advantages of including negative and positive items in usability questionnaires outweigh the disadvantages of mistakes and miscoding. It is recommended that researchers using the standard SUS verify the proper coding of scores and include procedural steps to ensure error-free completion of the SUS by users. Researchers can use the all positive version with confidence because respondents are less likely to make mistakes when responding, researchers are less likely to make errors in coding, and the scores will be similar to the standard SUS.
Conference Paper
Since its introduction in 1986, the 10-item System Usability Scale (SUS) has been assumed to be unidimensional. Factor analysis of two independent SUS data sets reveals that the SUS actually has two factors: Usability (8 items) and Learnability (2 items). These new scales have reasonable reliability (coefficient alpha of .91 and .70, respectively). They correlate highly with the overall SUS (r = .985 and .784, respectively) and correlate significantly with one another (r = .664), but at a low enough level to use as separate scales. A sensitivity analysis using data from 19 tests had a significant Test by Scale interaction, providing additional evidence of the differential utility of the new scales. Practitioners can continue to use the current SUS as is, but, at no extra cost, can also take advantage of these new scales to extract additional information from their SUS data.
Article
Factor analysis of Post Study System Usability Questionnaire (PSSUQ) data from 5 years of usability studies (with a heavy emphasis on speech dictation systems) indicated a 3-factor structure consistent with that initially described 10 years ago: factors for System Usefulness, Information Quality, and Interface Quality. Estimated reliabilities (ranging from .83 to .96) were also consistent with earlier estimates. Analyses of variance indicated that variables such as the study, developer, stage of development, type of product, and type of evaluation significantly affected PSSUQ scores. Other variables, such as gender and completeness of responses to the questionnaire, did not. Norms derived from these data correlated strongly with norms derived from the original PSSUQ data. The similarity of psychometric properties between the original and these PSSUQ data, despite the passage of time and differences in the types of systems studied, provides evidence of significant generalizability for the questionnaire, supporting its use by practitioners for measuring participant satisfaction with the usability of tested systems.
Article
The Usability Metric for User Experience (UMUX) is a four-item Likert scale used for the subjective assessment of an application’s perceived usability. It is designed to provide results similar to those obtained with the 10-item System Usability Scale, and is organized around the ISO 9241–11 definition of usability. A pilot version was assembled from candidate items, which was then tested alongside the System Usability Scale during usability testing. It was shown that the two scales correlate well, are reliable, and both align on one underlying usability factor. In addition, the Usability Metric for User Experience is compact enough to serve as a usability module in a broader user experience metric.
Article
This paper describes recent research in subjective usability measurement at IBM. The focus of the research was the application of psychometric methods to the development and evaluation of questionnaires that measure user satisfaction with system usability. The primary goals of this paper are to (1) discuss the psychometric characteristics of four IBM questionnaires that measure user satisfaction with computer system usability, and (2) provide the questionnaires, with administration and scoring instructions. Usability practitioners can use these questionnaires with confidence to help them measure users' satisfaction with the usability of computer systems.
Article
Valid measurement scales for predicting user acceptance of computers are in short supply. Most subjective measures used in practice are unvalidated, and their relationship to system usage is unknown. The present research develops and validates new scales for two specific variables, perceived usefulness and perceived ease of use, which are hypothesized to be fundamental determinants of user acceptance. Definitions for these two variables were used to develop scale items that were pretested for content validity and then tested for reliability and construct validity in two studies involving a total of 152 users and four application programs. The measures were refined and streamlined, resulting in two six-item scales with reliabilities of .98 for usefulness and .94 for ease of use. The scales exhibited high convergent, discriminant, and factorial validity. Perceived usefulness was significantly correlated with both self-reported current usage (r=.63, Study 1) and self-predicted future usage (r =.85, Study 2). Perceived ease of use was also significantly correlated with current usage (r=.45, Study 1) and future usage (r=.59, Study 2). In both studies, usefulness had a significantly greater correlation with usage behavior than did ease of use. Regression analyses suggest that perceived ease of use may actually be a causal antecedent to perceived usefulness, as opposed to a parallel, direct determinant of system usage. Implications are drawn for future research on user acceptance.
Article
The System Usability Scale (SUS), developed by Brooke (Usability evaluation in industry, Taylor & Francis, London, pp 189-194, 1996), has had great success among usability practitioners because it is a quick and easy-to-use measure for collecting users' usability evaluations of a system. Recently, Lewis and Sauro (Proceedings of the human computer interaction international conference (HCII 2009), San Diego CA, USA, 2009) proposed a two-factor structure, Usability (8 items) and Learnability (2 items), suggesting that practitioners might take advantage of these new factors to extract additional information from SUS data. To verify the dimensionality of the SUS's two-component structure, we estimated the parameters and tested the SUS structure with a structural equation model on a sample of 196 university users. Our data indicated that both the unidimensional model and the two-factor model with uncorrelated factors proposed by Lewis and Sauro (Proceedings of the human computer interaction international conference (HCII 2009), San Diego CA, USA, 2009) fit the data unsatisfactorily. We thus relaxed the hypothesis that Usability and Learnability are independent components of SUS ratings and tested a less restrictive model with correlated factors. This model not only yielded a good fit to the data but also represented the structure of SUS ratings significantly better.
Article
“Usability” is a construct conceived by the human–computer interaction (HCI) community to denote a desired quality of interactive systems and products. Despite its prominence and intensive use in HCI research, the usefulness of the usability construct to HCI theories and to our understanding of HCI has been meager. In this article I propose and discuss two reasons for this state of affairs. The first is that usability is an umbrella construct. Umbrella constructs are prevalent in scientific fields that are broad, diverse, and lack a unifying research paradigm. Accordingly, umbrella constructs, such as usability, tend to be vague and loose, characteristics that challenge our ability to accumulate and communicate knowledge and to capture real-world phenomena. The second reason involves the nature of the relations between the usability construct and its measures, a topic rarely discussed in HCI research. There appears to be a mismatch between how the HCI community has (implicitly) conceptualized these relations and how it has empirically examined them. The relations have been conceptualized according to a formative measurement model but have mostly been tested according to a reflective measurement model. The trouble is that representing the usability construct by the reflective model appears inappropriate, and representing it by the formative model involves considerable difficulties. Possible ways of addressing these issues are discussed, each with its advantages and drawbacks. I conclude that for scientific research on this subject to progress, the usability construct ought to be unbundled and replaced by well-defined constructs. The issues discussed in this article are relevant to other HCI umbrella concepts and constructs such as user experience.
Article
This article describes the psychometric properties of the Emotional Metric Outcomes (EMO) questionnaire and the System Usability Scale (SUS) using data collected as part of a large-sample unmoderated usability study (n = 471). The EMO is a concise multifactor standardized questionnaire that provides an assessment of transaction-driven personal and relationship emotional outcomes, both positive and negative. The SUS is a well-known standardized usability questionnaire designed to assess perceived usability. In previous research, psychometric evaluation using data from a series of online surveys showed that the EMO and its component scales had high reliability and concurrent validity with loyalty and overall experience metrics but did not find the expected four-factor structure. Previous structural analyses of the SUS have had mixed results. Analysis of the EMO data from the usability study revealed the expected four-factor structure. The factor structure of the SUS appeared to be driven by item tone. The estimated reliability of the SUS (.90) was consistent with previous estimates. The EMO and its subscales were also quite reliable, with the estimates of reliability for the various EMO scales ranging from .86 to .96. Regression analysis using SUS, EMO, and Effort as predictors revealed different key drivers for the outcome metrics of Satisfaction and Likelihood-to-Recommend. The key recommendations are to include the EMO as part of the battery of poststudy standardized questionnaires, along with the SUS (or similar questionnaire), but to be cautious in reporting SUS subscales such as Usable and Learnable.
Article
The purpose of this research was to investigate various measurements of perceived usability, in particular, to assess (a) whether a regression formula developed previously to bring Usability Metric for User Experience LITE (UMUX-LITE) scores into correspondence with System Usability Scale (SUS) scores would continue to do so accurately with an independent set of data; (b) whether additional items covering concepts such as findability, reliability, responsiveness, perceived use by others, effectiveness, and visual appeal would be redundant with the construct of perceived usability or would align with other potential constructs; and (c) the dimensionality of the SUS as a function of self-reported frequency of use and expertise. Given the broad use of and emerging interpretative norms for the SUS, it was encouraging that the regression equation for the UMUX-LITE worked well with this independent set of data, although there is still a need to investigate its efficacy with a broader set of products and methods. Results from a series of principal components analyses indicated that most of the additional concepts, such as findability, familiarity, efficiency, control, and visual appeal covered the same statistical ground as the other more standard metrics for perceived usability. Two of the other items (Reliable and Responsive) made up a reliable construct named System Quality. None of the structural analyses of the SUS as a function of frequency of use or self-reported expertise produced the expected components, indicating the need for additional research in this area and a need to be cautious when using the Usable and Learnable components described in previous research.
Article
The Usability Metric for User Experience (UMUX) is a four-item Likert scale aimed at replicating the psychometric properties of the System Usability Scale (SUS) in a more compact form. As part of a special issue of the journal Interacting with Computers, the UMUX is being examined in terms of purpose, reliability, validity and structure. This response to commentaries addresses concerns with these issues through updated archival research, deeper analysis on the original data and some updated results with an average-scoring system. The new results show the UMUX performs as expected for a wide range of systems and consists of one underlying usability factor.
Article
The philosopher of science J. W. Grove (1989) once wrote, “There is, of course, nothing strange or scandalous about divisions of opinion among scientists. This is a condition for scientific progress” (p. 133). Over the past 30 years, usability, both as a practice and as an emerging science, has had its share of controversies. It has inherited some from its early roots in experimental psychology, measurement, and statistics. Others have emerged as the field of usability has matured and extended into user-centered design and user experience. In many ways, a field of inquiry is shaped by its controversies. This article reviews some of the persistent controversies in the field of usability, starting with their history, then assessing their current status from the perspective of a pragmatic practitioner. Put another way: Over the past three decades, what are some of the key lessons we have learned, and what remains to be learned? Some of the key lessons learned are:
  • When discussing usability, it is important to distinguish between the goals and practices of summative and formative usability.
  • There is compelling rational and empirical support for the practice of iterative formative usability testing; it appears to be effective in improving both objective and perceived usability.
  • When conducting usability studies, practitioners should use one of the currently available standardized usability questionnaires.
  • Because “magic number” rules of thumb for sample size requirements for usability tests are optimal only under very specific conditions, practitioners should use the tools that are available to guide sample size estimation rather than relying on “magic numbers.”
Article
In this paper we describe the development of a standardized computer satisfaction usability questionnaire for use with speakers of the Turkish language, the Turkish Computer System Usability Questionnaire, Short Version (T-CSUQ-SV). This new questionnaire, based on the English-language CSUQ, underwent careful translation and transformation through comprehensive psychometric evaluation. The results of the psychometric evaluation revealed an acceptable level of reliability, appropriate construct validity, and sensitivity to manipulation, indicating that Turkish usability practitioners should be able to use the T-CSUQ-SV with confidence when conducting user research.
Article
This chapter discusses the user's experience and evolution of usability engineering. Usability engineering starts with a commitment to action in the world. It seeks to capture user experience within a context situated in user work and in a form useful for engineering. Usability engineering provides operationally defined criteria so that usability objectives can be used to drive an efficient and productive engineering effort, and it can lead to production of systems that are experienced by users as usable and that serve as a basis for the next generation of systems. When a system is built and delivered to users, interaction with it would affect user experience and would shift the background against which users evaluate that system in comparison with other systems. Therefore, as systems are built that provide new functionality with new levels of usability, the expectations of users would shift so that the whole cycle should begin again.
Article
The System Usability Scale (SUS) was administered verbally to native English and non-native English speakers for several internally deployed applications. It was found that a significant proportion of non-native English speakers failed to understand the word "cumbersome" in Item 8 of the SUS (that is, "I found the system to be very cumbersome to use.") This finding has implications for reliability and validity when the questionnaire is distributed electronically in multinational usability efforts.
Article
The System Usability Scale (SUS) is an inexpensive, yet effective tool for assessing the usability of a product, including Web sites, cell phones, interactive voice response systems, TV applications, and more. It provides an easy-to-understand score from 0 (negative) to 100 (positive). While a 100-point scale is intuitive in many respects and allows for relative judgments, information describing how the numeric score translates into an absolute judgment of usability is not known. To help answer that question, a seven-point adjective-anchored Likert scale was added as an eleventh question to nearly 1,000 SUS surveys. Results show that the Likert scale scores correlate extremely well with the SUS scores (r=0.822). The addition of the adjective rating scale to the SUS may help practitioners interpret individual SUS scores and aid in explaining the results to non-human factors professionals.
Article
The article addresses some concerns about how coefficient alpha is reported and used. It also shows that alpha is not a measure of homogeneity or unidimensionality. This fact and the finding that test length is related to reliability may cause significant misinterpretations of measures when alpha is used as evidence that a measure is unidimensional. For multidimensional measures, use of alpha as the basis for corrections for attenuation causes overestimates of true correlation. Satisfactory levels of alpha depend on test use and interpretation. Even relatively low (e.g., .50) levels of criterion reliability do not seriously attenuate validity coefficients. When reporting intercorrelations among measures that should be discriminable, it is important to present observed correlations, appropriate measures of reliability, and correlations corrected for unreliability.
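For readers unfamiliar with the statistic, coefficient alpha can be computed from the item variances and the variance of the summed scale. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of totals); the function name `cronbach_alpha` and the small data sets are hypothetical and are not taken from the article.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents' item scores.

    rows: one list per respondent, each with the same k >= 2 item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 items and equal-length rows")

    columns = list(zip(*rows))                      # item-wise scores
    sum_item_var = sum(variance(col) for col in columns)
    total_var = variance([sum(r) for r in rows])    # variance of scale totals
    return k / (k - 1) * (1 - sum_item_var / total_var)


if __name__ == "__main__":
    # Hypothetical data: two items answered identically by three respondents.
    print(cronbach_alpha([[1, 1], [2, 2], [3, 3]]))
    # Hypothetical data: the same two items with less consistent answers.
    print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

Because the formula depends only on variances and the number of items k, a high value does not by itself show that the items measure a single dimension, which is the caution the abstract raises.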
Article
Popular statistical software packages do not have the proper procedures for determining the number of components in factor and principal components analyses. Parallel analysis and Velicer’s minimum average partial (MAP) test are validated procedures, recommended widely by statisticians. However, many researchers continue to use alternative, simpler, but flawed procedures, such as the eigenvalues-greater-than-one rule. Use of the proper procedures might be increased if these procedures could be conducted within familiar software environments. This paper describes brief and efficient programs for using SPSS and SAS to conduct parallel analyses and the MAP test.
Article
This article presents nearly 10 years' worth of System Usability Scale (SUS) data collected on numerous products in all phases of the development lifecycle. The SUS, developed by Brooke (1996), reflected a strong need in the usability community for a tool that could quickly and easily collect a user's subjective rating of a product's usability. The data in this study indicate that the SUS fulfills that need. Results from the analysis of this large number of SUS scores show that the SUS is a highly robust and versatile tool for usability professionals. The article presents these results and discusses their implications, describes nontraditional uses of the SUS, explains a proposed modification to the SUS to provide an adjective rating that correlates with a given score, and provides details of what constitutes an acceptable SUS score.
The system usability scale: Beyond standard usability testing
  • R A Grier
  • A Bangor
  • P T Kortum
  • S C Peres
Grier, R. A., Bangor, A., Kortum, P. T., & Peres, S. C. (2013). The system usability scale: Beyond standard usability testing. In Proceedings of the Human Factors and Ergonomics Society (pp. 187-191), Santa Monica, CA: Human Factors and Ergonomics Society.
The computer user satisfaction inventory (CUSI): Manual and scoring key
  • J Kirakowski
  • A Dillon
Kirakowski, J., & Dillon, A. (1988). The computer user satisfaction inventory (CUSI): Manual and scoring key. Cork, Ireland: Human Factors Research Group, University College of Cork.
A comparison of questionnaires for assessing website usability
  • T S Tullis
  • J N Stetson
Tullis, T. S., & Stetson, J. N. (2004). A comparison of questionnaires for assessing website usability. Paper presented at the Usability Professionals Association Annual Conference, UPA, Minneapolis, MN. Retrieved September 13, 2017, from https://www.researchgate.net/publication/228609327_A_Comparison_of_Questionnaires_for_Assessing_Website_Usability
Effects of response order on Likert-type scales
  • L Weng
  • C Cheng
Weng, L., & Cheng, C. (2000). Effects of response order on Likert-type scales. Educational and Psychological Measurement, 60(6), 908-924.