
Preliminary Results of a Systematic Review: Quality Assessment of Conversational Agents (Chatbots) for People with Disabilities or Special Needs



Maria Laura de Filippis, Stefano Federici, Maria Laura Mele, Simone Borsci, Marco Bracalenti,
Giancarlo Gaudino, Antonello Cocco, Massimo Amendola, and Emilio Simonetti
Department of Philosophy, Social and Human Sciences and Education, University of Perugia, Perugia, Italy
Department of Cognitive Psychology and Ergonomics, Faculty of BMS, University of Twente, Enschede, The Netherlands
Department of Surgery and Cancer, Faculty of Medicine, NIHR London IVD, Imperial College, London, UK
Design Research Group, School of Creative Arts, Hertfordshire University, Hatfield, UK
DGTCSI-ISCTI, Directorate General for Management and Information and Communications Technology, Superior Institute of Communication and Information Technologies, Ministry of Economic Development, Rome, Italy
Department of Public Service, Prime Minister's Office, Rome, Italy
Abstract. People with disabilities or special needs can benefit from AI-based
conversational agents, which are used in competence training and well-being
management. Assessment of the quality of interactions with these chatbots is
key to being able to reduce dissatisfaction with them and to understand their
potential long-term benefits. This will in turn help to increase adherence to their
use, thereby improving the quality of life of the large population of end-users
that they are able to serve. We systematically reviewed the literature on methods
of assessing the perceived quality of interactions with chatbots, and identified
only 15 of 192 papers on this topic that included people with disabilities or
special needs in their assessments. The results also highlighted the lack of a
shared theoretical framework for assessing the perceived quality of interactions
with chatbots. Systematic procedures based on reliable and valid methodologies
continue to be needed in this field. The current lack of reliable tools and
systematic methods for assessing chatbots for people with disabilities and special
needs is concerning, and may lead to unreliable systems entering the market
with disruptive consequences for users. Three major conclusions can be drawn
from this systematic analysis: (i) researchers should adopt consolidated and
comparable methodologies to rule out risks in use; (ii) the constructs of
satisfaction and acceptability are different, and should be measured separately;
(iii) dedicated tools and methods for assessing the quality of interaction with
chatbots should be developed and used to enable the generation of comparable
evidence.
Keywords: Chatbots · Conversational agents · People with disability · People
with special needs · Usability · Quality of interaction
© Springer Nature Switzerland AG 2020
K. Miesenberger et al. (Eds.): ICCHP 2020, LNCS 12376, pp. 250–257, 2020.
1 Introduction
Chatbots are intelligent conversational software agents which can interact with people
using natural language text-based dialogue [1]. They are extensively used to support
interpersonal services, decision making, and training in various domains [2–5]. There is
a broad consensus on the effectiveness of these AI agents, particularly in the field of
health, where they can promote recovery, adherence to treatment, and training [6,7] for
the preparation of different competencies and the maintenance of well-being [3,8,9].
In view of this, an evaluation of the perceived quality of engagement with chatbots is
key to being able to reduce dissatisfaction, facilitate their possible long-term benefits,
increase loyalty and thus improve the quality of life of the large population of end-users
that they are able to serve. Chatbots are interaction systems, and irrespective of their
domain of application, their output in terms of the quality of interaction should be
planned and measured in conjunction with their users, rather than by applying a
system-centric approach [1]. A recent review by Abd-Alrazaq and colleagues [6] found
that in the field of mental health, researchers typically only test chatbots in a
randomized controlled trial. The efficiency of interaction is seldom assessed, and when
it is, this is generally done by looking at non-standardized aspects of interaction and
qualitative measurements that do not require comparisons to be made. This unreliable method of
testing the quality of interaction of these devices or applications through a wide and
varied range of variables is endemic in all fields that use chatbots, and makes it difficult
to compare the results of these studies [1,10,11]. While some qualitative guidelines
and tools have emerged [1,12], it is still hard to find agreement on which factors should
be tested. As argued by Park and Humphry [13], the implementation of these innovative
systems should be based on a common framework for assessing the perceived
interaction quality, in order to prevent chatbots from being regarded by their end-users
as merely another source of social alienation, and being discarded in the same way as
any other unreliable assistive technology [14,15]. A common framework and guidelines
on how to determine the perceived quality of chatbot interaction are therefore
required. From a systems perspective, a subjective experience of consistency arises
from the interaction between the user and the program in specific conditions and
contexts. Subjective experience cannot be measured merely by assuming that optimal
system performance, as perceived by the user, is the same as a good user
experience [16]. The need to quantify the objective and subjective dimensions of
experience in a reliable and comparable manner is a lesson that has been learned by
those in the field of human-computer interaction, but has yet to be learned in the field of
chatbots, as outlined by Lewis [17] and Bendig and colleagues [18]. Chatbot developers
are forced to rely on the umbrella framework provided by the International
Organization for Standardization (ISO) 9241-11 [19] for assessing usability, and ISO
9241-210 [20] for assessing user experience (UX), due to the absence of a common
assessment framework that specifies comparable evaluation criteria. These two ISO
standards define the key factors of interaction quality: (i) effectiveness, efficiency and
satisfaction in a specific context of use (ISO 9241-11); and (ii) control (where possible)
of expectations over time concerning use, satisfaction, perceived level of acceptability,
trust, usefulness and all those factors that ultimately push users to adopt and keep using
a tool (ISO 9241-210). Although these standards have not yet been updated to meet the
specific needs of chatbots and conversational agents, the two aspects of usability and
UX are essential to the perceived quality of interaction [21]. Until a framework has
been developed and broad consensus reached on assessment criteria, practitioners may
benefit from the assessment of chatbots against these ISO standards, as they allow for
an evaluation of the interactive output of these applications. This paper examines how
aspects of perceived interaction quality are assessed in studies of AI-based agents that
support people with disabilities or special needs. Our systematic literature review was
conducted in accordance with the PRISMA reporting checklist.
2 Methods
A systematic review was carried out of journal articles investigating the relationship
between chatbots and people with disabilities or special needs over the last 10 years. To
determine whether and how the quality of interaction with chatbots was evaluated in
line with ISO standards of usability (ISO 9241-11) and UX (ISO 9241-210), this
review sought to answer the following research questions:
R1. How are the key factors of usability (efficiency, effectiveness, and satisfaction)
measured and reported in evaluations of chatbots for people with disabilities or
special needs?
R2. How are factors relating to UX measured and reported in assessments of chatbots?
We included in our review studies that: (i) referred to chatbots or conversational
interfaces/agents for people with disabilities or special needs in the title, abstract,
keywords or main text; (ii) included empirical findings and discussions of theories (or
frameworks) of factors that could contribute to the perceived quality of interaction with
chatbots, with a focus on people with various types of disability.
We excluded records that did not include at least one group of end-users with a
disability in either the testing or the design of the interaction, and studies that focused
on: (i) testing emotion recognition during the interaction exchange, or assessing
applications for detecting the development of disability conditions or disease;
(ii) chatbots supporting people with alcoholism, anxiety, depression or traumatic disorders;
(iii) the assessment of end-user compliance with clinical treatment, or assessment
of the clinical effectiveness of using AI agents as an alternative to standard (or
other) forms of care without considering the interaction exchange with the chatbot; and
(iv) the ethical and legal implications of interacting with AI-based digital tools.
Records were retrieved from Scopus and the Web of Science using the Boolean
operators (AND/OR) to combine the following keywords: chatbot*, conversational
agent*, special needs, disability*. We searched only for English language articles.
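The keyword combination described above can be sketched in a few lines of Python. The paper lists the keywords and the operators but not the exact search string, so the grouping used here (agent terms joined by OR, population terms joined by OR, the two groups joined by AND) is an assumption, as are the names `build_query`, `AGENT_TERMS` and `POPULATION_TERMS`. Database-specific syntax (field codes, phrase and wildcard handling) is deliberately left out.

```python
def build_query(*term_groups):
    """Join synonyms with OR inside each group, then join the groups with AND.

    Hypothetical helper: the grouping is an assumption, not the authors'
    documented search string.
    """
    def group(terms):
        # Quote multi-word terms so they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        return "(" + " OR ".join(quoted) + ")"

    return " AND ".join(group(terms) for terms in term_groups)


# Keywords as listed in the Methods section.
AGENT_TERMS = ["chatbot*", "conversational agent*"]
POPULATION_TERMS = ["special needs", "disability*"]

query = build_query(AGENT_TERMS, POPULATION_TERMS)
print(query)
# (chatbot* OR "conversational agent*") AND ("special needs" OR disability*)
```

Keeping the two concept groups as separate lists makes it easy to report the exact terms used and to rerun the same search on another database.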
3 Results
A total of 147 items were retrieved from Scopus and Web of Science. A further 53
records were added based on a previous review of chatbots in mental health in [6].
After removing eight duplicates, a scan of the remaining 192 records by title and
abstract was performed by two authors (MLDF, SB). Articles that defined their scope
as including the assessment of interactions between chatbots and conversational agents
and people with various types of intellectual disabilities or special needs were retained.
The full text of 68 records was then scanned to look for articles mentioning methods
and factors for assessing the interactions of people with disabilities or special needs
with chatbots. The nal list consisted of 15 documents [3,8,9,2233], 80% of which
had already been discussed in previous work by Abd-Alrazaq et al. [6] for different
Of the 15 records that matched our criteria, 80% examined AI agents in terms of
supporting people with autism and (mild to severe) mental disabilities, while the other
20% focused on the testing of applications to support the general health or training of
people with a wide range of disabilities. The main goal of 66.6% of the applications
was to support health and rehabilitation, while the remaining studies focused on
solutions to support learning and training for people with disabilities. In terms of their
approach to assessment, 46.7% of the studies used surveys or questionnaires, 26.7%
applied a quasi-experimental procedure, and the remaining 26.7% tested chatbots using
randomized controlled trials (i.e., testing the use of the agent versus standard practice
with a between-subjects design) that assessed several aspects relating to the quality of the
interaction. Factors relating to usability (i.e., effectiveness, efficiency, and satisfaction)
were partly assessed, with 80% of the studies reporting measures of effectiveness,
26.7% measures of efficiency and 20% measures of satisfaction. In terms of UX,
acceptability was the most frequently reported measure (26.7% of the cases) while a
few other factors (e.g., engagement, safety, helpfulness, etc.) were measured using
various approaches.
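The record flow and the percentages reported in this section can be checked with simple arithmetic. The following Python sketch uses only figures stated above; the variable names and the helper `share_to_count` are hypothetical, and converting a percentage back to a number of studies assumes that every share is computed over the 15 included studies.

```python
# Record flow as reported in the Results section.
retrieved = 147   # items from Scopus and Web of Science
added = 53        # records taken from the earlier review [6]
duplicates = 8    # duplicates removed before screening

screened = retrieved + added - duplicates  # records scanned by title and abstract
full_text = 68                             # records scanned in full
included = 15                              # documents in the final list

print(f"screened: {screened}, full text: {full_text}, included: {included}")


def share_to_count(percent, total=included):
    """Convert a reported percentage into a whole number of studies.

    Assumes the percentage refers to `total` studies (here the 15 included).
    """
    return round(percent / 100 * total)


# Assessment approaches reported for the 15 included studies.
approaches = {
    "surveys or questionnaires": 46.7,
    "quasi-experimental procedure": 26.7,
    "randomized controlled trial": 26.7,
}
for label, percent in approaches.items():
    print(f"{label}: {share_to_count(percent)} of {included}")

# Usability factors reported across the included studies.
usability = {"effectiveness": 80, "efficiency": 26.7, "satisfaction": 20}
for factor, percent in usability.items():
    print(f"{factor}: {share_to_count(percent)} of {included}")
```

Under this assumption the three assessment approaches correspond to 7, 4 and 4 studies, which together account for all 15 included documents.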
4 Discussion
The results suggest that the main focus of studies of chatbots for people with dis-
abilities or special needs is the effectiveness of such apps compared with standard
practice, in terms of supporting adherence to treatment. The results can be summarized
in accordance with our research questions as follows:
R1. A total of 80% of the studies [3,8,9,23,25,27,30–33] tested the effectiveness of
chatbots according to the ISO standard [19], i.e., the ability of the app to perform
correctly, allowing the users to achieve their goals. Only 26.7% of the studies
[9,25,26,32] also investigated efficiency, by measuring performance in terms of
time or factors relating to the resources invested by participants to achieve their
goals. Only 20% [9,22,23] referred to an intention to gather data on user
satisfaction in a structured way, and only one study [23] used a validated scale
(e.g., the System Usability Scale, or user metrics of UX [34]). In another, practitioners
adapted a standardized questionnaire without clarifying the changes to
the items [22], and a qualitative scale was used in a further study [9].
R2. Acceptability was identified as an assessment factor in 26.7% of the studies [9,22,
24,25]. Despite the popularity of the technology acceptance model [35,36],
acceptability was measured in a variety of ways (e.g., lack of complaints [25]) or
treated as a measure of satisfaction [24]. A total of 53% of the studies used various
factors to assess the quality of interaction, such as the overall experience, safety,
acceptability, engagement, intention to use, ease of use, helpfulness, enjoyment,
and appearance. Most used non-standardized questionnaires to assess the quality of
interaction. Even when a factor such as safety was identified as a reasonable form
of quality control, in compliance with ISO standards for medical devices [37] and
risk analysis [38], the method of its measurement in these studies was questionable,
i.e., assessing a product to be safe based on a lack of adverse events [9].
5 Conclusion
The results of the present study suggest that informal and untested measures of quality
are often employed when it comes to evaluating user interactions with AI agents. This
is particularly relevant in the domain of health and well-being, where researchers set
out to measure the clinical validity of tools intended to support people with disabilities
or special needs. The risk is that shortcomings in these methods could significantly
compromise the quality of chatbot usage, ultimately leading to the abandonment of
applications that could otherwise have a positive impact on their end-users. Three
major findings can be identified from this systematic analysis. (i) Researchers tend to
consider a lack of complaints as an indirect measure of the safety and acceptability of
tools. However, safety and acceptability should be assessed with consolidated and
comparable methodologies to rule out risks in use [37–39]. (ii) Satisfaction, understood as
a usability metric, is a different construct from acceptability, and these two constructs
should be measured separately with available standardized questionnaires [39,40].
(iii) Although dedicated tools and methods for assessing the quality of interaction with
chatbots are lacking, reliable methods and measures to assess interaction are available
[17,19,21,37], and these should be adopted and used to enable the generation of
comparable evidence regarding the quality of conversational agents.
References
1. Radziwill, N.M., Benton, M.C.: Evaluating quality of chatbots and intelligent conversational
agents. arXiv preprint arXiv:1704.04579 (2017)
2. Ammari, T., Kaye, J., Tsai, J.Y., Bentley, F.: Music, search, and IoT: how people (really) use
voice assistants. ACM Trans. Comput.-Hum. Interact. 26 (2019)
3. Beaudry, J., Consigli, A., Clark, C., Robinson, K.J.: Getting ready for adult healthcare:
designing a chatbot to coach adolescents with special health needs through the transitions of
care. J. Pediatr. Nurs. 49, 85–91 (2019)
4. Costa, S., Brunete, A., Bae, B.C., Mavridis, N.: Emotional storytelling using virtual and
robotic agents. Int. J. Hum. Robot. 15 (2018)
5. D'Mello, S., Graesser, A.: AutoTutor and affective AutoTutor: learning by talking with
cognitively and emotionally intelligent computers that talk back. ACM Trans. Interact. Intell.
Syst. 2 (2012)
6. Abd-Alrazaq, A.A., Alajlani, M., Alalwan, A.A., Bewick, B.M., Gardner, P., Househ, M.:
An overview of the features of chatbots in mental health: a scoping review. Int. J. Med. Inf.
132 (2019)
7. Fadhil, A., Wang, Y., Reiterer, H.: Assistive conversational agent for health coaching: a
validation study. Methods Inf. Med. 58, 009–023 (2019)
8. Burke, S.L., et al.: Using virtual interactive training agents (ViTA) with adults with autism
and other developmental disabilities. J. Autism Dev. Disord. 48(3), 905–912 (2017). https://
9. Ellis, T., Latham, N.K., DeAngelis, T.R., Thomas, C.A., Saint-Hilaire, M., Bickmore, T.W.:
Feasibility of a virtual exercise coach to promote walking in community-dwelling persons
with Parkinson disease. Am. J. Phys. Med. Rehabil. 92, 472–485 (2013)
10. Balaji, D., Borsci, S.: Assessing user satisfaction with information chatbots: a preliminary
investigation. University of Twente, University of Twente repository (2019)
11. Tariverdiyeva, G., Borsci, S.: Chatbots' perceived usability in information retrieval tasks: an
exploratory analysis. University of Twente, University of Twente repository (2019)
12. IBM
13. Park, S., Humphry, J.: Exclusion by design: intersections of social, digital and data
exclusion. Inf. Commun. Soc. 22, 934–953 (2019)
14. Federici, S., Borsci, S.: Providing assistive technology in Italy: the perceived delivery
process quality as affecting abandonment. Disabil. Rehabil. Assist. Technol. 11, 22–31
15. Scherer, M.J., Federici, S.: Why people use and don't use technologies: introduction to the
special issue on assistive technologies for cognition/cognitive support technologies.
NeuroRehabilitation 37, 315–319 (2015)
16. Bevan, N.: Measuring usability as quality of use. Softw. Qual. J. 4, 115–130 (1995). https://
17. Lewis, J.R.: Usability: lessons learned … and yet to be learned. Int. J. Hum.-Comput.
Interact. 30, 663–684 (2014)
18. Bendig, E., Erb, B., Schulze-Thuesing, L., Baumeister, H.: The next generation: chatbots in
clinical psychology and psychotherapy to foster mental health – a scoping review.
Verhaltenstherapie (2019)
19. ISO: ISO 9241-11:2018 Ergonomic Requirements for Office Work with Visual Display
Terminals – Part 11: Guidance on Usability. CEN, Brussels (2018)
20. ISO: ISO 9241-210:2010 Ergonomics of Human-System Interaction – Part 210:
Human-Centred Design for Interactive Systems. CEN, Brussels (2010)
21. Borsci, S., Federici, S., Malizia, A., De Filippis, M.L.: Shaking the usability tree: why
usability is not a dead end, and a constructive way forward. Behav. Inform. Technol. 38,
519–532 (2019)
22. Ali, M.R., et al.: A virtual conversational agent for teens with autism: experimental results
and design lessons. arXiv preprint arXiv:1811.03046 (2018)
23. Cameron, G., et al.: Assessing the usability of a chatbot for mental health care. In:
Bodrunova, S.S., et al. (eds.) INSCI 2018. LNCS, vol. 11551, pp. 121–132. Springer, Cham
24. Konstantinidis, E.I., Hitoglou-Antoniadou, M., Luneski, A., Bamidis, P.D., Nikolaidou, M.M.:
Using affective avatars and rich multimedia content for education of children with autism. In:
2nd International Conference on PErvasive Technologies Related to Assistive Environments
(PETRA 2009), pp. 1–6 (2009)
25. Lahiri, U., Bekele, E., Dohrmann, E., Warren, Z., Sarkar, N.: Design of a virtual reality
based adaptive response technology for children with autism. IEEE Trans. Neural Syst.
Rehabil. Eng. 21, 55–64 (2013)
26. Ly, K.H., Ly, A.-M., Andersson, G.: A fully automated conversational agent for promoting
mental well-being: a pilot RCT using mixed methods. Internet Interv. 10, 39–46 (2017)
27. Milne, M., Luerssen, M.H., Lewis, T.W., Leibbrandt, R.E., Powers, D.M.W.: Development
of a virtual agent based social tutor for children with autism spectrum disorders. In:
International Joint Conference on Neural Networks (IJCNN 2010), pp. 1–9 (2010). https://
28. Razavi, S.Z., Ali, M.R., Smith, T.H., Schubert, L.K., Hoque, M.: The LISSA virtual human
and ASD teens: an overview of initial experiments. In: Traum, D., Swartout, W.,
Khooshabeh, P., Kopp, S., Scherer, S., Leuski, A. (eds.) IVA 2016. LNCS (LNAI), vol.
10011, pp. 460–463. Springer, Cham (2016)
29. Smith, M.J., et al.: Job offers to individuals with severe mental illness after participation in
virtual reality job interview training. Psychiatr. Serv. 66, 1173–1179 (2015)
30. Smith, M.J., et al.: Virtual reality job interview training for individuals with psychiatric
disabilities. J. Nerv. Mental Dis. 202, 659–667 (2014)
31. Tanaka, H., Negoro, H., Iwasaka, H., Nakamura, S.: Embodied conversational agents for
multimodal automated social skills training in people with autism spectrum disorders.
PLoS ONE 12, e0182151 (2017)
32. Wargnier, P., Benveniste, S., Jouvelot, P., Rigaud, A.-S.: Usability assessment of interaction
management support in Louise, an ECA-based user interface for elders with cognitive
impairment. Technol. Disabil. 30, 105–126 (2018)
33. Smith, M.J., et al.: Virtual reality job interview training in adults with autism spectrum
disorder. J. Autism Dev. Disord. 44(10), 2450–2463 (2014)
34. Borsci, S., Federici, S., Bacci, S., Gnaldi, M., Bartolucci, F.: Assessing user satisfaction in
the era of user experience: comparison of the SUS, UMUX and UMUX-LITE as a function
of product experience. Int. J. Hum.-Comput. Interact. 31, 484–495 (2015)
35. Venkatesh, V., Morris, M.G., Davis, G.B., Davis, F.D.: User acceptance of information
technology: toward a unified view. MIS Q.: Manag. Inf. Syst. 27, 425–478 (2003)
36. Federici, S., Tiberio, L., Scherer, M.J.: Ambient assistive technology for people with
dementia: an answer to the epidemiologic transition. In: Combs, D. (ed.) New Research on
Assistive Technologies: Uses and Limitations, pp. 1–30. Nova Publishers, New York
37. IEC: IEC 62366-1:2015 Medical Devices – Part 1: Application of Usability Engineering to
Medical Devices. CEN, Brussels (2015)
38. ISO: ISO 14971:2007 Medical Devices – Application of Risk Management to Medical
Devices. CEN, Brussels (2007)
39. Borsci, S., Federici, S., Mele, M.L., Conti, M.: Short scales of satisfaction assessment: a
proxy to involve disabled users in the usability testing of websites. In: Kurosu, M. (ed.) HCI
2015. LNCS, vol. 9171, pp. 35–42. Springer, Cham (2015)
40. Borsci, S., Buckle, P., Walne, S.: Is the lite version of the usability metric for user experience
(UMUX-LITE) a reliable tool to support rapid assessment of new healthcare technology?
Appl. Ergon. 84, 103007 (2020)
Preliminary Results of a Systematic Review 257
... This highlights the importance of exploring the role of other individual differences to better understand when, and for whom, chatbots are most beneficial. Without consideration of demographic and individual differences, the use of chatbots in DMHIs remains limited and introduces concerns regarding risks, safety, and effectiveness [126,127]. ...
Full-text available
Introduction Increasing demand for mental health services and the expanding capabilities of artificial intelligence (AI) in recent years has driven the development of digital mental health interventions (DMHIs). To date, AI-based chatbots have been integrated into DMHIs to support diagnostics and screening, symptom management and behavior change, and content delivery. Areas covered We summarize the current landscape of DMHIs, with a focus on AI-based chatbots. Happify Health’s AI chatbot, Anna, serves as a case study for discussion of potential challenges and how these might be addressed, and demonstrates the promise of chatbots as effective, usable, and adoptable within DMHIs. Finally, we discuss ways in which future research can advance the field, addressing topics including perceptions of AI, the impact of individual differences, and implications for privacy and ethics. Expert opinion Our discussion concludes with a speculative viewpoint on the future of AI in DMHIs, including the use of chatbots, the evolution of AI, dynamic mental health systems, hyper-personalization, and human-like intervention delivery.
... "WoeBot" [15] a chatbot designed based on some known CBT techniques, specifically psychoeducation for stress coping, and proved to effectively reduce depressive symptoms. Similarly, "Wysa" [16] provides evidencebased therapies for mental issues and LISSA [17] for developing the social skills of people who have autism. ...
Full-text available
BACKGROUND Apps and web-based chatbots can provide valuable and meaningful support to healthcare workers in assessing and guiding management of various health problems particularly when human resources are scarce. Despite poor adherence to such apps, chatbots can be cost-effective and efficient on-demand virtual assistants for various mental health conditions, including anxiety and depression. OBJECTIVE This study aims to review the features of chatbots currently available for individuals with suspected anxiety or depression. METHODS ACM digital library, IEEE, Google Scholar, Embase, Medline, and PsychINFO were the six bibliographic databases searched for conducting the review. We conducted backward and forward reference list checking of included studies. Study selection and data extraction were performed by two reviewers independently; two other individual reviewers justified cross-checking of extracted data. We utilized a narrative approach for synthesizing the data. RESULTS The initial search returned a total of 917 citations. A total of 32 studies remained on filtering the publications, which formed the final dataset for this scoping review. While most of the studies were from conference proceedings (69%, n=22), the remainder were either journal articles (16%, n=5), reports (9%, n=3), or book chapters (6%, n=2). Of the studies that developed an actual chatbot, 16% (n=7) were web based and 63% (n=20) stand-alone in the form of an app. The remainder were available on both platforms or were only conceptual ideas. About half of the reviewed chatbots had functionality targeting both anxiety and depression (56%, n=18), whereas 38% (n=12) targeted only depression, 3% (n=1) anxiety and the remaining addressed other mental health issues along with anxiety and depression like public speaking anxiety, stress, lack of motivation, negative emotion, nervousness. 
Input modality of most of the chatbots was written (84%, n=27), followed by spoken (25%, n=8) and visual imaging (9%, n=3). Despite the fact of increasing popularity of embodiment techniques in chatbots such as avatars were rarely used in these studies only 34% (n=11) CONCLUSIONS Recent research shows that mental health chatbots could be of benefit in helping patients with anxiety and depression and provide valuable support to mental healthcare workers, particularly when resources are scarce. They often provide virtual assistance where medical professionals are inaccessible or users need anonymous real-time personal virtual assistance. Their role in mental health care is expected to increase following the COVID-19 pandemic and its impact on mental health and wellbeing of the world population.
Full-text available
Background: Chatbots are systems that are able to converse and interact with human users using spoken, written, and visual languages. Chatbots have the potential to be useful tools for individuals with mental disorders, especially those who are reluctant to seek mental health advice due to stigmatization. While numerous studies have been conducted about using chatbots for mental health, there is a need to systematically bring this evidence together in order to inform mental health providers and potential users about the main features of chatbots and their potential uses, and to inform future research about the main gaps of the previous literature. Objective: We aimed to provide an overview of the features of chatbots used by individuals for their mental health as reported in the empirical literature. Methods: Seven bibliographic databases (Medline, Embase, PsycINFO, Cochrane Central Register of Controlled Trials, IEEE Xplore, ACM Digital Library, and Google Scholar) were used in our search. In addition, backward and forward reference list checking of the included studies and relevant reviews was conducted. Study selection and data extraction were carried out by two reviewers independently. Extracted data were synthesised using a narrative approach. Chatbots were classified according to their purposes, platforms, response generation, dialogue initiative, input and output modalities, embodiment, and targeted disorders. Results: Of 1039 citations retrieved, 53 unique studies were included in this review. The included studies assessed 41 different chatbots. Common uses of chatbots were: therapy (n = 17), training (n = 12), and screening (n = 10). Chatbots in most studies were rule-based (n = 49) and implemented in stand-alone software (n = 37). In 46 studies, chatbots controlled and led the conversations. 
While the most frequently used input modality was written language only (n = 26), the most frequently used output modality was a combination of written, spoken and visual languages (n = 28). In the majority of studies, chatbots included virtual representations (n = 44). The most common focus of chatbots was depression (n = 16) or autism (n = 10). Conclusion: Research regarding chatbots in mental health is nascent. There are numerous chatbots that are used for various mental disorders and purposes. Healthcare providers should compare chatbots found in this review to help guide potential users to the most appropriate chatbot to support their mental health needs. More reviews are needed to summarise the evidence regarding the effectiveness and acceptability of chatbots in mental health.
Full-text available
As smart technologies such as artificial intelligence (AI), automation and Internet of Things (IoT) are increasingly embedded into commercial and government services, we are faced with new challenges in digital inclusion to ensure that existing inequalities are not reinforced and new gaps that are created can be addressed. Digital exclusion is often compounded by existing social disadvantage, and new systems run the risk of creating new barriers and harms. Adopting a case study approach, this paper examines the exclusionary practices embedded in the design and implementation of social welfare services in Australia. We examined Centrelink’s automated Online Compliance Intervention system (‘Robodebt’) and the National Disability Insurance Agency’s intelligent avatar interface ‘Nadia’. The two cases show how the introduction of automated systems can reinforce the punitive policies of an existing service regime at the design stage and how innovative AI systems that have the potential to enhance user participation and inclusion can be hindered at implementation so that digital benefits are left unrealised.
Voice has become a widespread and commercially viable interaction mechanism with the introduction of voice assistants (VAs), such as Amazon’s Alexa, Apple’s Siri, Google Assistant, and Microsoft’s Cortana. Despite their prevalence, we do not have a detailed understanding of how these technologies are used in domestic spaces. To understand how people use VAs, we conducted interviews with 19 users, and analyzed the log files of 82 Amazon Alexa devices, totaling 193,665 commands, and 88 Google Home Devices, totaling 65,499 commands. In our analysis, we identified music, search, and IoT usage as the command categories most used by VA users. We explored how VAs are used in the home, investigated the role of VAs as scaffolding for Internet of Things device control, and characterized emergent issues of privacy for VA users. We conclude with implications for the design of VAs and for future research studies of VAs.
BACKGROUND: Embodied conversational agents (ECA) are possible enablers of assistive technologies, in particular for older adults with cognitive impairment. Yet, dedicated interaction management techniques addressing the specificities of this population are needed. OBJECTIVES: We assess whether the interaction management framework of the LOUISE (Lovely User Interface for Servicing Elders) ECA has the potential to overcome the user interface constraints linked to cognitive impairment. METHODS: LOUISE supports key target-specific features: personalization; attention management; context reminders; image and video displays; a conversation manager for task-oriented interactions; and the foundations for a domain-specific XML-based language for task-oriented assistive scenarios. LOUISE's usability and acceptance were evaluated at the Broca geriatric hospital in Paris with a group of 14 older adults with either mild cognitive impairment (MCI) or Alzheimer's disease (AD) through four simple but realistic assistive scenarios: drinking, taking medicine, measuring blood pressure and choosing the lunch menu. RESULTS: Most of our participants were able to interact with the ECA, succeeded in completing the proposed tasks and enjoyed our design. CONCLUSION: The field usability evaluation of LOUISE's interaction management framework suggests that this suite of interaction techniques can be effective in enabling interfaces for users with MCI or AD.
Objective: To ascertain the reliability of a standardised, short-scale measure of satisfaction in the use of new healthcare technology, i.e., the LITE version of the Usability Metric for User Experience (UMUX-LITE). Whilst previous studies have demonstrated the reliability of UMUX-LITE, and its relationship with measures of likelihood to recommend a product, such as the Net Promoter Score (NPS), in other sectors, no such testing has been undertaken with healthcare technology. Materials and methods: Six point-of-care products at different stages of development were assessed by 120 healthcare professionals. UMUX-LITE was used to gather their satisfaction in use, and NPS to declare their intention to promote the product. Inferential statistics were used to: i) ascertain the reliability of UMUX-LITE, and ii) assess the relationship between UMUX-LITE and NPS at different stages of product development. Results: UMUX-LITE showed an acceptable reliability (α = 0.7) and a strong positive correlation with NPS (r = 0.455, p < .001). This is similar to findings in other fields of application. The level of product development did not affect the UMUX-LITE scores, while the stage of development was a significant predictor (R² = 0.49) of the intention to promote. Discussion and conclusion: Practitioners may apply UMUX-LITE alone, or in combination with the NPS, to complement interview and 'homemade' scales to investigate the quality of new products at different stages of development. This shortened scale is appropriate for use in the context of healthcare in which busy professionals have a minimal amount of time to support innovation.
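The abstract above reports UMUX-LITE scores, an NPS, and a Cronbach's α without showing how they are computed. The sketch below illustrates the commonly published scoring rules (two 7-point items rescaled to 0-100 for UMUX-LITE, percentage of promoters minus percentage of detractors for NPS, and the standard α formula). The function names and sample inputs are assumptions made for illustration; they are not taken from the study, which may have used a different scoring variant.

```python
from statistics import variance


def umux_lite(item1, item2):
    """Raw UMUX-LITE score (0-100) from two items rated 1-7.

    Assumes the commonly published rule: ((i1 - 1) + (i2 - 1)) * 100 / 12.
    """
    for value in (item1, item2):
        if not 1 <= value <= 7:
            raise ValueError("UMUX-LITE items are rated from 1 to 7")
    return ((item1 - 1) + (item2 - 1)) * 100 / 12


def net_promoter_score(ratings):
    """NPS from 0-10 likelihood-to-recommend ratings.

    Promoters rate 9-10, detractors rate 0-6; the result ranges from -100 to 100.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of responses per questionnaire item."""
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

For example, a respondent who answers 4 on both items obtains a raw UMUX-LITE score of 50, the midpoint of the scale.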
Background and Purpose: The present age of digitalization brings with it progress and new possibilities for health care in general and clinical psychology/psychotherapy in particular. Internet- and mobile-based interventions (IMIs) have often been evaluated. Chatbots are a fully automated version of IMIs. Chatbots are automated computer programs that are able to hold, e.g., a script-based conversation with a human being. Chatbots could contribute to the extension of health care services. The aim of this review is to conceptualize the scope and to work out the current state of the art of chatbots fostering mental health. Methods: The present article is a scoping review on chatbots in clinical psychology and psychotherapy. Studies that utilized chatbots to foster mental health were included. Results: The technology of chatbots is still experimental in nature. Most studies are pilot studies. The field lacks high-quality evidence derived from randomized controlled studies. Results with regard to practicability, feasibility, and acceptance of chatbots to foster mental health are promising but not yet directly transferable to psychotherapeutic contexts. Discussion: The rapidly increasing research on chatbots in the field of clinical psychology and psychotherapy requires corrective measures. Issues like effectiveness, sustainability, and especially safety and subsequent tests of technology are elements that should be instituted as a corrective for future funding programs of chatbots in clinical psychology and psychotherapy.
Objective: Poor lifestyle represents a health risk factor and is the leading cause of morbidity and chronic conditions. The impact of poor lifestyle can be significantly altered by individuals' behavioral modification. Although there are abundant lifestyle promotion applications and tools, they are still limited in providing tailored social support that goes beyond their predefined functionalities. In addition, virtual coaching approaches are still unable to handle users' emotional needs. Our approach presents a human–virtual agent mediated system that leverages the conversational agent to handle menial caregiver tasks by engaging users (e.g., patients) in a conversation with the conversational agent. The dialog used a natural conversation to interact with users, delivered by the conversational agent and handled with a finite state machine automaton. Our research differs from existing approaches that replace a human coach with a fully automated assistant for user support. The methodology allows users to interact with the technology and access health-related interventions. To assist physicians, the conversational agent gives weighting to users' adherence, based on previously defined conditions. Materials and Methods: This article describes the design and validation of CoachAI, a conversational agent-assisted health coaching system to support health intervention delivery to individuals or groups. CoachAI instantiates a text-based health care conversational agent system that bridges the remote human coach and the users. Results: We will discuss our approach and highlight the outcome of a 1-month validation study on physical activity, healthy diet, and stress coping. The study validates technology aspects of our human–virtual agent mediated health coaching system. We present the intervention settings and findings from the study. In addition, we present some user-experience validation results gathered during or after the experimentation.
Conclusions: The study provided a set of dimensions to consider when building a human–conversational agent powered health intervention tool. The results provided interesting insights into using a human–conversational agent mediated approach in health coaching systems. The findings revealed that users who were highly engaged were also more adherent to conversational-agent activities. This research made key contributions to the literature on techniques for providing social, yet tailored health coaching support: (1) identifying habitual patterns to understand user preferences; (2) the role of a conversational agent in delivering health promoting microactivities; (3) building the technology while adhering to individuals' daily messaging routine; and (4) a socio-technical system that fits with the role of conversational agent as an assistive component. Future Work: Future improvements will consider building the activity recommender based on users' interaction data and integrating users' dietary pattern and emotional wellbeing into the initial user clustering by leveraging information and communication technology approaches (e.g., machine learning). We will integrate a sentiment analysis capability to gather further data about individuals and report these data to the caregiver.
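The abstract above states that the dialog is handled with a finite state machine. A minimal sketch of that general technique follows, assuming a tiny check-in conversation. Every state name, prompt, and transition here is hypothetical and chosen only for illustration; none of it is taken from CoachAI.

```python
# Hypothetical transition table: (current state, classified user reply) -> next state.
# The label "any" matches whatever the user says.
TRANSITIONS = {
    ("greet", "any"): "ask_activity",
    ("ask_activity", "yes"): "praise",
    ("ask_activity", "no"): "encourage",
    ("praise", "any"): "done",
    ("encourage", "any"): "done",
}

# Hypothetical agent utterance for each state.
PROMPTS = {
    "greet": "Hi! Ready for today's check-in?",
    "ask_activity": "Did you complete your walk today? (yes/no)",
    "praise": "Great work, keep it up!",
    "encourage": "No problem, let's try a shorter walk tomorrow.",
    "done": "Talk to you tomorrow.",
}


def classify(text):
    """Map free text to a coarse label; a real system would use richer parsing."""
    answer = text.strip().lower()
    if answer in {"yes", "y"}:
        return "yes"
    if answer in {"no", "n"}:
        return "no"
    return "other"


def step(state, user_text):
    """Return the next state; stay in place when no transition matches."""
    label = classify(user_text)
    if (state, label) in TRANSITIONS:
        return TRANSITIONS[(state, label)]
    return TRANSITIONS.get((state, "any"), state)


def run_dialog(user_turns, start="greet"):
    """Feed a list of user replies through the machine and list the visited states."""
    state = start
    visited = [state]
    for text in user_turns:
        state = step(state, text)
        visited.append(state)
    return visited
```

Staying in the same state on an unrecognised reply lets the agent repeat its question, which is one simple way to handle input the table does not cover.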
The aim of this paper is to assess the usability of a chatbot for mental health care within a social enterprise. Chatbots are becoming more prevalent in our daily lives, as we can now use them to book flights, manage savings, and check the weather. Chatbots are increasingly being used in mental health care, with the emergence of “virtual therapists”. In this study, the usability of a chatbot named iHelpr has been assessed. iHelpr has been developed to provide guided self-assessment and tips for the following areas: stress, anxiety, depression, sleep, and self-esteem. This study used a questionnaire developed by Chatbottest, and the System Usability Scale to assess the usability of iHelpr. The participants in this study enjoyed interacting with the chatbot, and found it easy to use. However, the study highlighted areas that need major improvements, such as Error Management and Intelligence. A list of recommendations has been developed to improve the usability of the iHelpr chatbot.
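The System Usability Scale mentioned above yields a single 0-100 score from ten items rated 1 to 5. The sketch below shows the standard scoring rule (odd-numbered items contribute the rating minus 1, even-numbered items contribute 5 minus the rating, and the sum is multiplied by 2.5). The function name and sample responses are assumptions for illustration and do not come from the iHelpr study.

```python
def sus_score(responses):
    """System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings in questionnaire order, each 1-5.
    """
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    total = 0
    for position, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError("SUS items are rated from 1 to 5")
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if position % 2 == 1 else (5 - rating)
    return total * 2.5
```

A respondent who answers 3 on every item scores 50, while the most favourable pattern (5 on odd items, 1 on even items) scores 100.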
A recent contribution to the ongoing debate concerning the concept of usability and its measures proposed that usability has reached a dead end, i.e. that it is a construct unable to provide stable results and to unify scientific knowledge. Extensive commentaries rejected the conclusion that researchers need to look for alternative constructs to measure the quality of interaction. Nevertheless, several practitioners involved in this international debate asked for a constructive way to move usability practice forward. In fact, two key issues of the usability field were identified in this debate: (i) knowledge fragmentation in the scientific community, and (ii) the unstable relationship among the usability metrics. We recognise both the importance and impact of these key issues, although, in line with others, we do not agree with the conclusion that usability is a dead end. In light of the international debate, this work discusses the strengths and weaknesses of the usability construct and its application. Our discussion focuses on identifying alternative explanations for the issues and suggesting mitigation strategies, which may be considered the starting point for moving the usability field forward. However, action by the scientific community will be needed to implement these mitigation strategies and to harmonise usability practice.