Conference Paper

Testing the usability of a personalized system: Comparing the use of interviews, questionnaires and thinking-aloud

Abstract and Figures

Personalized systems present each user with tailored content or output. Testing the usability of such a system must take into account both the usability issues specific to personalization and the suitability of the personalized output. In this study, we evaluated a personalized search engine to compare the use of interviews, questionnaires and concurrent thinking-aloud for this purpose. The interview and the questionnaire are the best methods to elicit comments on usability issues. Concurrent thinking-aloud turned out to be the best method to elicit comments on the perceived relevance of search results. When testing the usability of a personalized system, it is wise to combine concurrent thinking-aloud with either the interview or the questionnaire.
They show that there are significant differences among the methods for all usability issues except predictability and comprehensibility. For these two topics, the methods did not differ significantly in the number of comments they elicited. For the remaining five usability issues, we conducted post-hoc analyses by means of Bonferroni tests to ascertain which groups differed. As before, we applied a 5% significance level. The results of these tests can be found in table 3. We found that for the remaining five usability issues the interview always elicited more comments than thinking-aloud. In the case of controllability, the interview also elicited more comments than the questionnaire. The questionnaire itself gathered more comments than thinking-aloud on the topics of controllability, unobtrusiveness and privacy.
Appreciation and perceived relevance
Besides being assigned to a usability issue, a comment could also be classified as a statement on appreciation of personalization or on the perceived relevance of search results. Table 5 shows the numbers of comments that questionnaires, interviews and thinking-aloud elicited on these topics. Table 4 includes the results of the ANOVA analyses conducted to identify differences in the number of comments each method elicited on these topics. They show that these numbers differ for both topics. Post-hoc analyses using Bonferroni tests and a 5% significance level show that, for the topic of appreciation of personalization, the questionnaire and the interview elicit more comments than thinking-aloud.
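The analysis described above (a one-way ANOVA per topic, followed by Bonferroni-corrected pairwise comparisons) can be sketched roughly as follows. This is a minimal illustration and not the authors' analysis: the comment counts are invented, the function names are hypothetical, and a conventional 5% significance level is assumed. The sketch computes only the F statistic and the Bonferroni-adjusted threshold; obtaining p-values requires an F distribution, for which a statistics package would normally be used.

```python
from itertools import combinations
from statistics import mean

# Hypothetical comment counts per participant for one usability issue.
# These numbers are invented for illustration; they are NOT the study's data.
comments = {
    "interview": [4, 5, 3, 6, 4],
    "questionnaire": [3, 4, 2, 4, 3],
    "thinking_aloud": [1, 0, 2, 1, 1],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def bonferroni_alpha(alpha, n_groups):
    """Per-comparison significance level when testing all pairs of groups."""
    n_comparisons = n_groups * (n_groups - 1) // 2
    return alpha / n_comparisons


f_stat, df_b, df_w = one_way_anova_f(list(comments.values()))
pairs = list(combinations(comments, 2))  # the pairwise post-hoc comparisons
alpha_per_pair = bonferroni_alpha(0.05, len(comments))

print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
print(f"{len(pairs)} pairwise comparisons, each tested at alpha = {alpha_per_pair:.4f}")
```

With three methods there are three pairwise comparisons, so each post-hoc test is held to one third of the overall significance level. That is why a difference can be visible in the raw counts and still fail the Bonferroni test.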
… 
... the experiments and to verbalize their thoughts and experiences while interacting with the system. The observer has to observe and record their comments during the experiments. The author will select students from the university to participate in the test. A study by Henderson found that thinking aloud identifies more usability issues. [13], [16] ...
... Questionnaires are a qualitative approach for gathering user data and may be more useful and cheaper than other approaches. To design the evaluation criteria for the questionnaire, the author took guidelines from different authors [12], [13], [50]. ...
... The selection of questions and criteria is adopted from different authors. [12], [13], [50] ...
... The second threat to construct validity is that collecting feedback online through questionnaires may introduce some bias. Face-to-face interviews help with more accurate screening, e.g., capturing verbal and non-verbal [61], [62]. In contrast, the data collected with an online questionnaire could be inaccurate and misleading. ...
Article
Crowdsourcing is gaining popularity in both the academic and industrial communities. Organizations are adopting this technological advance and increasingly crowdsourcing their tasks to unknown individuals. However, crowdsourcing remains largely unexplored in the context of competitive crowdsource software development (CCSD). Too little is presently known about what motivates developers to participate in crowdsourcing software development competitions and, most importantly, what kinds of developers are more likely to participate. Such open questions remain to be explored. To this end, in this paper, we present the results of an empirical study conducted to investigate what motivates software developers to participate in CCSD, and what inhibits them from participating in such competitions. An online questionnaire was sent to more than 300 crowdsource software participants, of whom 113 returned valid responses. It was also sent to more than 150 industry practitioners, of whom 75 returned valid responses. The results suggest that monetary rewards are not significantly important in motivating software developers to participate in CCSD. Instead, learning, social contacts, and peer recognition are more important. Besides the survey, we also analyzed historical data collected from one of the most popular software crowdsourcing platforms. The analysis reveals that the Pareto principle holds for CCSD as well: 0.9% of the participants win 86% of the competitions. The results support the premise that the CCSD market is still at an early stage. Most professional software engineers do not participate seriously in crowdsource software development. Therefore, many crowdsourced tasks, especially complex ones, may fail to receive any satisfying submission. These findings are valuable for crowdsourcing platforms and for companies that want to outsource their software development tasks to CCSD platforms.
... For the evaluation of the MRSA-net website, a combination of the think-aloud protocol, interviews and a written questionnaire was subsequently chosen. Following the literature (Van Velsen, Van der Geest & Klaassen, 2007; Nielsen, 2000; Krahmer & Ummelen, 2004), this combination was used because these methods generate the most statements regarding the various usability aspects. Moreover, the think-aloud protocol can bring out the user's reasoning about their search behavior. ...
Conference Paper
The exponential increase in the availability of scientific papers, institutional reports and research monographs in digital contexts (i.e., in digital repositories, archives or scientific social networks) has led to the advancement of manual, semi-automatic and automatic methods to analyze these texts in the digital environment. These techniques cover a heterogeneous range, from manual expert analysis using computer methods (usually annotation systems) to the application of natural language processing algorithms or discourse analysis techniques, which are able to identify cognitive relationships between text elements, e.g., causal structures or contrasting arguments. This advancement is more evident in humanities research contexts, where most of the knowledge generated is expressed in textual formats. However, how is the use of these techniques affecting the analysis of humanities texts conducted by researchers? Is it possible to measure the quality of the textual analysis? What kind of cognitive structures are identified in the text using these methods? This paper presents an empirical study conducted with humanities researchers, with the goal of obtaining a better understanding of how these professionals analyze texts in digital contexts using semi-automatic discourse analysis techniques. The paper also proposes a method, based on Thinking Aloud protocols, for designing experiments and evaluating cognitive aspects of software use, such as digital textual analysis, with humanities professionals. Finally, the paper discusses how empirical studies and the Thinking Aloud method constitute a solid basis for better understanding the relationship between expert textual analysis in the humanities and its conduct using software methods.
Article
Full-text available
Physiotherapy calls for individualized, gradual and often lengthy care. The repetition of exercises, combined with the long treatment period to which the patient is subjected and the daily gains, which are generally minimal, are the main causes of patient demotivation and the consequent dropout from physiotherapy clinics (MENDONÇA & GUERRA, 2004). In this context, the use of Interactive Technologies (ITs) in health care has been proposed as a solution to stimulate greater patient engagement in the rehabilitation process by promoting a richer and more motivating rehabilitation environment (KESHNER, 2004; LITTMAN, 1999; SVEISTRUP, 2004). IT-based tools aim to facilitate the interaction between the user (i.e., patients) and the computer (TORI, 2005). Virtual reality (VR), one of today's relevant ITs, has been the subject of research on its use in rehabilitation programs. VR gives patients the opportunity to engage in virtual environments that resemble the real world (DE BRUIN et al., 2010). Several studies propose the ...
Development and improvement of a computational system (Ikapp) to support motor rehabilitation. Abstract: The use of Interactive Technologies (ITs) in health care, particularly in motor rehabilitation, has been a clinical alternative intended to encourage greater patient engagement in a recovery process that is sometimes strenuous. The present study describes a technological tool, Ikapp, to support motor rehabilitation, which seeks to expand the possibilities of the commercial devices already used in clinical practice. Sixty (60) volunteers were invited to interact with the setup and game interfaces of Ikapp in order to examine its functionality, degree of acceptance, demands and limitations for further improvement. The results showed high levels of participant satisfaction. Furthermore, the results demonstrated that Ikapp is a tool that adds therapeutic value to playfulness and motivation, from the participants' perspective. Keywords: Rehabilitation. User-computer interaction. Technology.
Article
In chapter 1, I introduced the concept of personalization and showed how tailored electronic communication is the product of centuries of evolution. Personalization involves gearing communication towards an individual’s characteristics, preferences and context. User-Centered Design (UCD) was proposed as a means to achieve a good fit between personalized communication and the individual user. This means that design of personalization should include an initial focus on users and their tasks, studies should be conducted that focus on actual user behavior and perceptions, and finally, an iterative design approach should be applied. In this way, problematic issues related to specific, personalized usability issues, such as privacy or a need for control, can be prevented. Chapter 2 addressed an early stage in the UCD process of personalization to determine the role of trust in the organization providing personalization, trust in the technology, and perceived controllability in relation to the intention of potential users to use online content personalization. Using an online questionnaire, 1,141 participants were shown four common approaches to online content personalization and a non-personalized baseline condition with respect to a fictive municipality. We assessed participant perceptions of the aforementioned factors and determined their influence on the intention to use the different approaches to online content personalization. Trust in the organization appeared to play no role in the decision to use online content personalization. Trust in the technology had a moderate effect on the intention to use, while perceived controllability was overall the most important antecedent. When designing online content personalization, it is therefore most important to provide users with the option to control personalization. Next, users should be assured that they are interacting with an organization in a secure electronic environment.
The requirements engineering phase was the focus of chapter 3. In that chapter, we proposed a user-centered approach to requirements engineering for personalized e-Government services and demonstrated its value by means of a case study. The approach utilized interviews and formulated requirements by focusing on concrete and measurable criteria, low-fidelity prototyping, and evaluating by means of a citizen walkthrough. The case study reaffirmed the importance of applying an iterative approach to design, as the translation of user input into system design may not align with the original characteristics, preferences and contexts of the user. Furthermore, using a citizen walkthrough, the proposed approach succeeded in making personalization understandable to participants, which is an important objective for evaluating personalization. Finally, the case study demonstrated that a multidisciplinary design team is a crucial aspect of creating personalized e-Government services. In chapter 4, we reviewed literature that focused on user-centered evaluation of personalization (i.e., evaluations that include an assessment of subjective criteria or the identification of usability problems). The findings indicate that current user-centered evaluations, as reported in the scientific literature, are not well-aligned with the principles of UCD. Questionnaires appeared to be exceedingly popular, while methods that have been found to identify usability problems well, such as thinking-aloud techniques, are only used sparingly. Specific usability issues for personalization are only rarely a topic of investigation. In the last few years, however, an increasing number of publications have reported on evaluations that focus on acceptance, iterative design or system trust. This trend suggests that personalization researchers are becoming aware of the added value of user-centered evaluations and are starting to make it part of their common research practice.
Chapter 5 reported a comparison of the usefulness of three methods (i.e., interviews, questionnaires with open-ended questions and concurrent thinking-aloud techniques) for identifying usability issues in personalized systems. Thinking-aloud was the only method that uncovered all critical and serious problems related to personalization as well as usability problems not related to personalization. Furthermore, it was also the method that best elicited participant feedback on the perceived quality of personalized output. Comments on the specific usability issues for personalization were elicited best by the questionnaire. Therefore, when evaluating a personalized system in order to obtain input for redesign purposes, we recommend a combination of thinking-aloud techniques and questionnaires with open-ended questions that address specific usability issues in personalization.
Conference Paper
Full-text available
Background: Usability evaluation methods have become critical in the Web domain to ensure the success of Web applications. Aim: Since a large number of proposals have been presented during the last few years, a question arises: Which usability evaluation methods have proven to be the most effective in the Web domain? Method: This paper presents a systematic review that was motivated by previous results obtained from a systematic mapping study in the Web usability evaluation field. Results: A total of 18 studies were selected from an initial set of 206 in order to extract, code, and synthesize empirical data concerning the effectiveness of usability evaluation methods for the Web. Conclusions: We detected a need for more empirical studies and more standardized effectiveness measures for comparing usability evaluation methods. Our results suggest several evaluation methods that may help researchers and practitioners perform effective Web usability evaluations.
Article
Full-text available
Purpose: This article discusses how personalization will affect technical communication practitioners' everyday work, and indicates to researchers which knowledge gaps scientific research needs to fill. Method: After describing exactly how personalization works, we demonstrate that the technique is very similar to the approach to personalization applied in ancient rhetoric. Next, we describe how the history of the concept of "the audience," and of how it has been analyzed and approached, has led to the tactic of electronically tailoring communication to individuals. We propose User-Centered Design as an approach that can help the designer get to know the individual user, thereby increasing the fit between personalized systems and users' needs, wishes, and contexts. Results: We discuss how the User-Centered Design approach needs to be adjusted to cope with the demands personalization places on it. Furthermore, we consider the technical communicator's role in this design process. Conclusion: Technical communicators need to devise and lead user studies that inform and evaluate each step of the personalization process. Researchers need to focus their efforts on studies that aid the design of personalized systems, such as discerning in which situations personalization adds value, and identifying the factors that influence the acceptance of personalization.
Chapter
Full-text available
Empirical studies with adaptive systems offer many advantages and opportunities. Nevertheless, there is still a lack of evaluation studies. This chapter lists several problems and pitfalls that arise when evaluating an adaptive system, and provides guidelines and recommendations for workarounds or even avoidance of these problems. Among other things the following issues are covered: relating evaluation studies to the development cycle; saving resources; specifying control conditions, sample, and criteria; asking users for adaptivity effects; reporting results. An overview of existing evaluation frameworks shows which of these problems have been addressed and in which way.
Article
Full-text available
Adaptive and adaptable systems provide tailored output to various users in various contexts. While adaptive systems base their output on implicit inferences, adaptable systems use explicitly provided information. Since the presentation or output of these systems is adapted, standard user-centered evaluation methods do not produce results that can be easily generalized. This calls for a reflection on the appropriateness of standard evaluation methods for user-centered evaluations of these systems. We have conducted a literature review to create an overview of the methods that have been used. When reviewing the empirical evaluation studies we have, among other things, focused on the variables measured and the implementation of results in the (re)design process. The goal of our review has been to compose a framework for user-centered evaluation. In the next phase of the project, we intend to test some of the most valid and feasible methods with an adaptive or adaptable system.
Article
Full-text available
Despite the increasing popularity of electronic commerce, there appears to be little evidence of the methodical evaluation of the usability of commercial web sites. The usability of a web site defines how well and how easily a visitor, without formal training, can interact with the site. This paper reports the results of a research project, which applies a systematic qualitative technique known as protocol analysis or think aloud method, to examine the usability of a commercial web site. About 15 usability principles and 3 evaluation parameters (content, navigation and interactivity) were used as a framework to analyze the verbal protocols of a sample of users interacting with a greeting card web site. The protocols provided evidence of usability problems caused by crowded content, poor navigation and cumbersome interactivity. These results underscore the importance of two crucial usability goals for commercial web sites: clear path to products and transparency of the ordering process.
Article
The aim was to examine four prominent user-based computer software usability evaluation methods. Four evaluation methods (logged data, questionnaire, interview, and verbal protocol analysis) were used to evaluate three different business software types (spreadsheet, word processor, and database) using a between-groups design, involving 148 individuals of mixed age and gender. Comparisons were made to examine the efficiency of each evaluation method in terms of its ability to highlight usability problems both between and within the evaluation strategy. Here, the verbal protocol analysis was found to be most efficient. The possibility of further efficiency gains by using two evaluation methods was also examined, where it was found that no statistically significant improvement was obtained over the verbal protocol analysis used by itself. Ways in which the utility of the methods may be enhanced were also discussed.
Chapter
Designing interactive computer systems to be efficient and easy to use is important so that people in our society may realize the potential benefits of computer-based tools .... Although modern cognitive psychology contains a wealth of knowledge of human ...
Article
Without prior searching instruction, undergraduate novices wrote structured self-reports during their first session on a Web search engine. Users chose their own topics and followed written instructions that prompted them to describe thoughts and feelings during specified stages of the search: pre-search formulation; search statement formulation; search strategy; and evaluation of results. The sentences in the self-reports were numbered and then coded according to their affective or cognitive function. The affective sentences reveal how users set goals and limit the scope of the cognitive operations. Search acts appear to be governed by an affective filter that organizes incoming information and provides criteria for ranking cognitive relevance to search goal. The cognitive sentences reveal a variety of operations in executing searches. Following the search, students made self-ratings on self-confidence as searchers and satisfaction with the search experience, with explanations of their ratings. Content analysis identified reasons users have for rating self-confidence, stress level, satisfaction, usefulness, and success with future searches.
Article
We have evaluated an adaptive hypermedia system, PUSH, and compared it to a non-adaptive variant of the same system. Based on an inferred information-seeking task, PUSH chooses what to show and what to hide in a page using a stretchtext technique, thus attempting to avoid information overload. We studied how successful the subjects were in retrieving the most relevant information, and found that the subjects' solutions were influenced by the choice made by the adaptive system. We also studied how much the adaptivity reduced the number of actions needed, and found that subjects made substantially fewer actions in the adaptive case. A third measurement was the subjects' subjective preferences for the adaptive or the non-adaptive system, where we found that the subjects clearly preferred the adaptive system. It seems as if the adaptive system requires fewer decisions on behalf of the subject, thereby reducing the cognitive load.