Bio-behavioral and Self-Report User Experience Evaluation of a
Usability Assessment Platform (UTAssistant)
Stefano Federici1, Maria Laura Mele1, Marco Bracalenti1, Arianna Buttafuoco1, Rosa Lanzilotti2,
Giuseppe Desolda2
1Department of Philosophy, Social and Human Sciences and Education, University of Perugia, Piazza Ermini 1, Perugia,
Italy
stefano.federici@unipg.it, {marialaura.mele, marco.bracalenti, arianna.buttafioco}@gmail.com
2Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
{rosa.lanzilotti, giuseppe.desolda}@uniba.it
Keywords: User Experience, UX Bio-behavioral Methods, Semi-automatic Usability Assessment, Usability Assessment Tools.
Abstract: This work presents the user experience (UX) assessment of UTAssistant, a web-based platform for the semi-automatic usability evaluation of websites, primarily addressed to workers in public administration (PA). The study is part (Phase 1) of a multiple assessment methodology consisting of four phases in total: (1) UX in laboratory conditions; (2) usability evaluation in remote online conditions; (3) usability evaluation in workplace conditions; and (4) heuristic evaluation. In Phase 1, a UX study in laboratory conditions was carried out. Participants' UX while navigating a PA website through UTAssistant was evaluated by both traditional self-report usability assessment tools (SUS and UMUX) and bio-behavioral measurement techniques (facial expression recognition and electroencephalography). Results showed that using the UTAssistant usability assessment tool did not affect users' perceived usability in terms of self-reports or affective states, which remained mostly neutral throughout the assessment session. However, EEG frontal alpha asymmetry scores showed that UTAssistant users were more sensitive to trial duration, displaying a decrease in motivation as the trial progressed; this result did not seem to affect their emotional experience.
1 INTRODUCTION
In October 2012, the Department of Public Function,
Italian Ministry for Simplification and Public
Administration, founded the GLU (Working Group on Usability), which supports public administration (PA) website staff in performing usability evaluation activities on their websites and other Italian e-government systems. Its primary goals were to collect the best usability experiences of PA websites, to develop a practical tool that could operationally support the analysis and evaluation of user interfaces, and to test this tool. In 2013, the GLU developed the first guide protocol (eGLU; Borsci et al., 2014) for Italian web editors of the PA, to investigate any difficulties a site user could have in finding information, consulting documents, or filling in online forms. The current version of the eGLU protocol, released in 2015, is 2.1 (Dipartimento della Funzione Pubblica, 2015). Based
on the eGLU 2.1 protocol, a new web platform called
UTAssistant (Usability Tool Assistant) was
developed. It is a lightweight, simple, semi-automatic usability evaluation tool that supports practitioners in the usability evaluation of web systems and services provided by the PA (Federici et al., 2018; Desolda et al., 2017; Catarci et al., 2018).
Federici and colleagues (2018) described the
design of an experimental methodology aimed at
evaluating the User eXperience (UX) of UTAssistant.
This methodology was applied toward assessing the
platform with end-users and web managers of PA
websites, both in a laboratory setting and through a
web-based recruitment platform. The adopted methodology combined usability assessment and psychophysiological measurements applied in four
different conditions: (1) UX in laboratory conditions;
(2) usability evaluation in remote online conditions,
(3) usability evaluation in workplace conditions; and
(4) heuristic evaluation.
This paper describes Phase 1, in which evaluators
used the Partial Concurrent Thinking Aloud (PCTA)
technique (Federici et al., 2010; Borsci and Federici,
2009) to apply traditional usability methods based on
self-report questionnaires. Users' psychophysiological reactions occurring during the interaction were also measured using two bio-behavioral detection techniques, facial expression recognition and electroencephalography (EEG), in order to evaluate the emotional impact that the UTAssistant interface had on users.
2 USING BIO-BEHAVIORAL
MEASURES IN UX
EVALUATION
The term "UX" refers to a broad concept that can be defined from different perspectives. In this paper, we adopt the definition of UX as a "person's perceptions and responses resulting from the use and/or anticipated use of a product" (Brooke, 1996; Borsci et al., 2015). In particular, the interactive experience of a user is affected by the amount of user interaction with a product and by the perceived usability and aesthetics of the interface (Brooke, 1996; Borsci et al., 2015). Therefore, a UX evaluation test should include a set of subjective and objective measures of the interactive experience.
Traditional methods to assess practical and
psychological aspects of UX are based on self-
reporting methodologies, which include
questionnaires, interviews and contextual inquiry and
represent the subjective perception of product use
(Ganglbauer et al., 2009; Isbister et al., 2006). All the
abovementioned techniques ask users to express their
opinion about or evaluate the usability of a product or
its features (Mele and Federici, 2012). Self-reporting
methodologies can be used before and/or after the
product use according to the dimensions that they are
intended to evaluate. Even though evaluating short-
term practical experience is relevant to understanding
the dynamic changes related to situational and
emotional factors, it is also necessary to employ
reliable methods that are able to evaluate user
experience during the interaction with a product as
well (Ganglbauer et al., 2009; Vermeeren et al., 2010;
Jokela, 2010).
In recent decades, implicit methods have been increasingly employed in the evaluation of UX. Implicit methods are based on the analysis of users' behavior independent of their awareness of perceptions and their conscious control (Mele and Federici, 2012; Jimenez-Molina et al., 2018). Specifically, psychophysiological measurements have been shown to be reliable approaches for measuring many aspects of the interactive user experience (Ganglbauer et al., 2009).
The psychophysiological techniques most commonly used in the field of UX evaluation are electromyography, galvanic skin response (GSR), respiration rate, electroencephalography (EEG), electrocardiography (ECG), eye tracking, and electrodermal activity (EDA).
The state of the art regarding psychophysiological
methods to evaluate user experience shows three
main advantages of employing these techniques: (a)
they are able to investigate cognitive processes such
as, for example, the changes in mental workload that
may be caused by a difficult interaction; (b) they
provide insights into the affective aspects of interaction, such as users' engagement and satisfaction during product use; (c) they provide constant feedback during the interaction, allowing designers to customize product features for users during the interaction itself. Furthermore, using implicit methods
may involve several areas of interest such as web
browsing, gaming, software and website design, and
research on packaging and organoleptic properties of
food. To further investigate the advantages of using psychophysiological methods, we conducted a literature review addressing the following questions: first, which types of implicit methods are mainly used in the evaluation of user experience; second, whether implicit techniques have ever been combined with traditional methods; and last, what additional value psychophysiological techniques provide.
The literature shows that the psychophysiological
method most employed for UX purposes is ECG,
followed by EEG, with GSR and EDA used equally often. The primacy of ECG can be explained by its temporal precision and by its recording, which is simple, effective, low-cost, noninvasive, and continuous (Goshvarpour et al., 2017). Moreover, the literature
shows that implicit methods are frequently combined
with traditional ones because using self-report
together with psychophysiological techniques allows
practitioners to overcome shortcomings of traditional
methods (Poore et al., 2017) and to enhance the data
concerning cognitive and affective aspects of UX
(Muñoz Cardona et al., 2016). A particularly useful
example of such a combined instrument is the Game
Engagement Questionnaire (Brockmyer et al., 2009).
Finally, it is widely accepted that an additional
value of psychophysiological methods in evaluating
UX is provided by their objectivity and their capacity
to bypass language and memory restrictions (Muñoz
Cardona et al., 2016). Further, they make it possible to assess the UX continuously, with high temporal precision and without interference stemming from product exploration (Zhang et al.,
2018; Jimenez-Molina et al., 2018; Rodriguez-
Guerrero et al., 2017; Muñoz Cardona et al., 2016;
Poore et al., 2017; Vourvopoulos and Bermúdez i
Badia, 2016; Yan et al., 2016; Pallavicini et al., 2013;
Nacke et al., 2011; Jun et al., 2010). Furthermore, the large amount of information collected permits a holistic understanding of UX regarding both practical and psychological aspects. Finally, psychophysiological techniques are helpful in guiding the adaptation of product features to user needs (Mendoza-Denton et al., 2017; Christensen and Estepp, 2013).
3 UTASSISTANT: A NEW
USABILITY TESTING TOOL
FOR ITALIAN PUBLIC
ADMINISTRATION
UTAssistant is a web platform that provides the Italian Public Administration with a lightweight and simple tool to conduct user studies, according to the eGLU 2.1 protocol, without requiring any installation on user devices (Federici et al., 2018).
One of the most important objectives driving the
platform development was the need to perform
remote usability studies, with the aim of stimulating
users to participate in usability tests in a simpler and
more comfortable way (Jokela, 2010). To meet these requirements, UTAssistant was developed as a web platform so that the stakeholders involved, namely evaluators and web users, can act from their PCs wherever and whenever they desire. With respect to the state of the art of usability test tools, this aspect represents an important contribution, since remote participation fosters a wider adoption of the tool and, consequently, of the usability test technique. Indeed, existing tools for usability tests require software installation on PCs with specific requirements (e.g. Morae®, https://www.techsmith.com/morae.html).
3.1 Usability test design
Every usability test starts from the evaluation design,
which mainly consists of: (i) creating a script to
introduce the users to the test; (ii) defining a set of
tasks; (iii) identifying data to be gathered (e.g.
number of clicks and time required by the user to
accomplish a task, audio/video/desktop recording,
logs, etc.); and (iv) deciding which questionnaire(s)
to administer to users.
UTAssistant facilitates evaluators in performing
these activities by means of three wizard procedures.
The first procedure guides evaluators in specifying:
(a) general information (e.g. a title, the script); (b)
data to gather during user task execution (e.g.
mouse/keyboard data logs,
webcam/microphone/desktop recordings); (c) post-
test questionnaire(s) to administer. After that, the
second procedure assists evaluators in creating the
task lists. For each task, starting/ending URLs, the
goal and the duration must be specified. Finally, the
third procedure asks evaluators to select the users to be involved, either by choosing them from a list of users already registered on the platform or by typing their email addresses.
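To make the output of these wizard procedures concrete, the sketch below shows one plausible shape for a complete test design; the field names and structure are illustrative assumptions, not UTAssistant's actual internal schema.

```python
# Hypothetical representation of a test design produced by the three
# wizard procedures; all field names are illustrative assumptions.
test_design = {
    "title": "MiSE website evaluation",                  # general information
    "script": "Welcome! You will complete four tasks...",
    "capture": {                                         # data to gather
        "mouse_keyboard_logs": True,
        "webcam": True,
        "microphone": True,
        "desktop_recording": True,
    },
    "questionnaires": ["SUS", "UMUX"],                   # post-test questionnaires
    "tasks": [
        {
            "goal": "Find information about incentives for start-ups",
            "start_url": "http://www.sviluppoeconomico.gov.it/",
            "end_url": "http://www.sviluppoeconomico.gov.it/",  # placeholder
            "max_duration_s": 300,                       # five minutes per task
        },
        # ...one entry per task
    ],
    "participants": ["user1@example.org", "user2@example.org"],
}
```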
3.2 Usability test execution
After creation of the usability test design, the users
receive an email with information about the
evaluation they must complete and a link to access UTAssistant. By clicking on the link, the users can
carry out the evaluation test, which starts by
providing general instructions about the platform use
(e.g. a short description of the toolbar with all the
useful commands), the script of the evaluation and,
finally, privacy policies pertaining to how data such
as mouse or keyboard logs and webcam, microphone
and desktop recordings will be captured.
Afterwards, UTAssistant administers all the tasks
one at a time. Each task execution is strongly guided
by the platform that, after displaying the task
description in a pop-up window, opens the webpage
from which users must start the task execution. To
minimize the invasiveness of the platform during the
evaluation test execution, we grouped all the
functions and indications—such as the current task
goal and instructions, duration time, task number, and
buttons to go to the next task or stop the evaluation—
in a toolbar placed at the top of the webpage. The
toolbar's next-task button changes to "Complete Questionnaire" when the users finish the last task, leading them to the questionnaire(s). During task execution, the platform
collects all the data the evaluator set during the study
design in a transparent and noninvasive way.
3.3 Evaluation test data analysis
UTAssistant automates all the activities related to data analysis (such as collecting, storing, merging, and analyzing data), removing barriers in gathering usability test data. The evaluators access the data analysis results in their control panel, exploiting different tools that provide useful support in finding usability issues. The next subsections give an overview of some of these tools.
3.4 Task success rate
UTAssistant calculates the task success rate (the percentage of tasks that users correctly complete during the test, which can also be calculated for each task as the percentage of users who complete that task) and visualizes it in a table in which the columns represent the tasks and the rows represent the users. The last row reports the success rate for each task, while the last column reports the success rate for each user. The global success rate is reported below the table.
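The computation behind this table is simple enough to sketch; the following minimal example derives the per-task, per-user, and global success rates from a user-by-task completion matrix (the data are placeholders, not study results).

```python
import numpy as np

# Placeholder completion matrix: rows = users, columns = tasks,
# 1 = task completed correctly, 0 = not completed.
completions = np.array([
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
])

task_success = completions.mean(axis=0) * 100   # last row of the table
user_success = completions.mean(axis=1) * 100   # last column of the table
global_success = completions.mean() * 100       # value reported below the table

print("Per-task success rate (%):", task_success)
print("Per-user success rate (%):", user_success)
print(f"Global success rate: {global_success:.1f}%")
```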
3.5 Questionnaire results
Thanks to UTAssistant, the evaluators can administer
one or more questionnaires at the end of the usability
evaluation. The platform automatically stores all the
user’s answers and produces results by means of
statistics and graphs. For example, if the System Usability Scale (SUS) (Brooke, 1996; Borsci et al., 2015; Jokela, 2010) has been chosen, UTAssistant calculates statistics such as the global SUS score, the usability score, and the learnability score. In addition,
different visualizations depict the results from
different perspectives, e.g. a histogram of users’ SUS
scores, a box-plot of SUS score/learnability/usability,
and a score that is compared to SUS evaluation scales.
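The SUS statistics mentioned above follow the standard scoring rules (Brooke, 1996). The sketch below shows one way to compute them; the function is ours, and the sub-score mapping (items 4 and 10 forming learnability) follows Lewis and Sauro's earlier factor analysis, which may differ from UTAssistant's exact implementation.

```python
import numpy as np

def sus_scores(responses):
    """Standard SUS scoring: `responses` holds ten ratings on a 1-5 scale.
    Odd items are positively worded (contribute rating - 1), even items
    negatively worded (contribute 5 - rating); the sum is rescaled to 0-100.
    """
    r = np.asarray(responses, dtype=float)
    contrib = np.where(np.arange(10) % 2 == 0, r - 1, 5 - r)
    global_sus = contrib.sum() * 2.5                      # 0-100 scale
    learnability = contrib[[3, 9]].sum() * 12.5           # items 4 and 10
    usability = np.delete(contrib, [3, 9]).sum() * 3.125  # remaining 8 items
    return global_sus, usability, learnability

print(sus_scores([4, 2, 4, 2, 3, 2, 4, 1, 4, 2]))  # -> (75.0, 75.0, 75.0)
```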
3.6 Audio/video analysis
While users execute the study tasks, UTAssistant
collects and stores the user’s voice (through the
microphone), facial expressions (through the
webcam), and desktop (through a browser plug-in).
The implemented player also provides an annotation tool so that, when evaluators detect difficulties externalized through verbal comments or facial expressions, they can annotate the recorded audio/video tracks. If the evaluators decide to record
both camera and desktop videos, their tracks are
merged in a picture-in-picture fashion.
3.7 Mouse/keyboard logs analysis
Finally, UTAssistant tracks users' behavior by collecting mouse and keyboard logs and, starting from the collected data, the platform shows performance statistics for each task.
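As an illustration only, the sketch below aggregates a hypothetical event log into per-task statistics; the log format is an assumption, not UTAssistant's actual schema.

```python
from collections import defaultdict

# Hypothetical event log (format assumed for illustration).
events = [
    {"task": 1, "type": "click", "t_ms": 1200},
    {"task": 1, "type": "keypress", "t_ms": 3400},
    {"task": 1, "type": "click", "t_ms": 8100},
    {"task": 2, "type": "click", "t_ms": 500},
]

stats = defaultdict(lambda: {"clicks": 0, "keys": 0, "last_t": 0})
for e in events:
    s = stats[e["task"]]
    s["clicks"] += e["type"] == "click"
    s["keys"] += e["type"] == "keypress"
    s["last_t"] = max(s["last_t"], e["t_ms"])  # time of last event per task

for task, s in sorted(stats.items()):
    print(f"Task {task}: {s['clicks']} clicks, {s['keys']} keypresses, "
          f"{s['last_t'] / 1000:.1f} s")
```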
4 UX EVALUATION OF
UTASSISTANT
This paper aims at evaluating the UX of UTAssistant
under laboratory conditions through two bio-
behavioral implicit measures, i.e. facial expression
recognition and electroencephalography, and two
explicit measures, the SUS (Lewis and Sauro, 2017;
Borsci et al., 2015) and the Usability Metric for User
Experience (UMUX) (Finstad, 2010; Borsci et al.,
2015). The methodology adopted in this usability
study was the PCTA (Borsci et al., 2013).
4.1 Methods
The quality of the interaction with UTAssistant is assessed through bio-behavioral techniques that capture the underlying psychophysiological reactions of participants via (i) the recognition of facial expressions and (ii) electroencephalography (EEG).
(i) The scientific community recognizes a limited number of facial movements (45 action units) whose combinations are considered universally connected to hundreds of emotions resulting from seven basic emotions (Ekman et al., 2002): joy, anger, surprise, fear, contempt, sadness, and disgust. In humans, facial movements related to basic emotions are unconscious and automatic. The analysis of
involuntary facial expressions provides information
on the emotional impact that an interface can elicit
during the interaction.
(ii) The EEG method allows recording the
electrical activity generated by the neurons through
electrodes positioned on the participant’s scalp.
Thanks to its high temporal resolution, the EEG allows analysis of which brain areas are active at a given
moment. In this study, the frontal alpha (8–12 Hz)
asymmetry index was derived for each participant.
Frontal alpha asymmetry (FAA) reflects the level of approach/withdrawal cognitive processes. It is calculated as the difference between the log alpha power of right-hemispheric electrodes, where an increase in alpha power is an index of withdrawal motivation or related negative emotion (Gruzelier, 2014), and that of their left-hemispheric counterparts, where an increase in alpha power reflects positive emotions and approach motivation:

FAA = ln(R) − ln(L)    (1)

Since positive FAA values indicate larger relative right-hemispheric alpha power, an increase in FAA may reflect withdrawal motivation and negative emotions. The alpha power asymmetry between the left and right hemispheres is normalized between 0 (perfect symmetry) and 1 (maximal asymmetry).
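As a minimal sketch of Equation (1), the function below estimates alpha-band power with Welch's method and returns the FAA index for one pair of frontal electrodes. The channel pair (e.g. F3/F4), the windowing, and the 128 Hz sampling rate are assumptions for illustration, not the exact pipeline used in this study.

```python
import numpy as np
from scipy.signal import welch

def frontal_alpha_asymmetry(left, right, fs=128.0):
    """FAA as in Equation (1): ln(alpha power, right electrode) minus
    ln(alpha power, left electrode). `left`/`right` are 1-D EEG signals
    from a homologous frontal pair (e.g. F3/F4); fs is the sampling rate."""
    def alpha_power(x):
        f, pxx = welch(x, fs=fs, nperseg=int(2 * fs))  # PSD over 2-s windows
        band = (f >= 8) & (f <= 12)                    # alpha band, 8-12 Hz
        return np.sum(pxx[band]) * (f[1] - f[0])       # integrate PSD in band
    return np.log(alpha_power(right)) - np.log(alpha_power(left))

# Example on 10 s of synthetic data (positive values = more right alpha power).
rng = np.random.default_rng(0)
print(frontal_alpha_asymmetry(rng.standard_normal(1280),
                              rng.standard_normal(1280)))
```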
The experiment follows the PCTA method (Borsci et al., 2013). The PCTA requires that the user and the evaluator do not verbally interact for the whole duration of a task; whenever users encounter a difficulty or want to express an opinion about the quality of their navigation, they are instructed to indicate this with a signal (generally the sound of a desk bell). In order to recall the problem at the end of the task, the evaluator takes note of the actions the user was performing before ringing the bell, while the interaction is video recorded. The signal is designed to serve as a memorandum for discussing problems with the users at the end of the trial (Federici et al., 2010; Borsci and Federici, 2009).
4.2 Materials and Apparatuses
The UTAssistant web-based platform was used to
evaluate the Italian Ministry of Economic
Development (MiSE) website
(http://www.sviluppoeconomico.gov.it) on a 15”
Lenovo ThinkPad T540p laptop computer, with a
1920 × 1200 screen resolution. The browser used to access the UTAssistant platform was Google Chrome (http://www.google.com/intl/en/chrome). During the
test, the computer was set to maximum brightness and
was constantly plugged in to power.
We used the iMotions platform (https://imotions.com) to synchronize the data collected by the Affectiva facial expression recognition software (https://www.affectiva.com/) through an integrated webcam and the EEG data recorded by the EPOC+ 16-electrode headset (https://www.emotiv.com/epoc/). A table bell was
placed next to the mouse. Affectiva is able to calculate emotional valence (positive, negative, and neutral) and the seven basic human emotions (both ranging from 1 to 100). In this work, basic emotions were computed with a threshold of 20: only facial expressions valued by the algorithm as equal to or higher than a 20% probability of expressing a basic emotion were accepted. An external Logitech Webcam 250
camera was positioned on a tripod stand behind the
participant.
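The 20% acceptance threshold can be illustrated as a per-frame filter over the emotion scores; the data layout below is hypothetical and does not reflect Affectiva's actual export format.

```python
import numpy as np

EMOTIONS = ["joy", "anger", "surprise", "fear", "contempt", "sadness", "disgust"]

# Hypothetical per-frame scores (rows = frames, columns = basic emotions,
# values roughly 0-100), standing in for the recognition engine's output.
scores = np.random.default_rng(1).uniform(0, 100, size=(1000, 7))

THRESHOLD = 20.0               # accept only scores >= 20% probability
active = scores >= THRESHOLD   # boolean: emotion counted as present in a frame

# Share of session time each emotion is expressed (frames are assumed
# equally spaced, so frame counts are proportional to time).
time_percent = active.mean(axis=0) * 100
for emotion, pct in zip(EMOTIONS, time_percent):
    print(f"{emotion}: {pct:.1f}% of session time")
```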
Two questionnaires are used to assess the
perceived usability of the system, i.e. the SUS and the
UMUX. The SUS is an easy and short tool for
measuring the usability of a system (Lewis and Sauro,
2017; Borsci et al., 2015). The SUS is a reliable 10-
item questionnaire based on a five-point Likert scale
ranging from 1 (strong disagreement) to 5 (strong
agreement). Compared to the SUS, the UMUX is
shorter and is based on the effectiveness, efficiency
and satisfaction dimensions of usability as defined by
the ISO 9241 (Finstad, 2010; Borsci et al., 2015;
Finstad, 2013). The UMUX is a reliable four-item
questionnaire based on a seven-point Likert scale.
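For completeness, the standard UMUX scoring rule (Finstad, 2010) can be sketched as follows; the function name is ours.

```python
def umux_score(responses):
    """UMUX scoring: four items on a 1-7 scale. Odd items are positively
    worded (contribute rating - 1), even items negatively worded
    (contribute 7 - rating); the sum is rescaled to 0-100."""
    contrib = [(r - 1) if i % 2 == 0 else (7 - r)
               for i, r in enumerate(responses)]
    return sum(contrib) * 100 / 24

print(umux_score([6, 2, 5, 3]))  # -> 75.0
```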
4.3 Procedure
Participants are invited to perform the test in a
sufficiently bright and silent laboratory environment,
sitting in a comfortable chair placed at least 50
centimeters away from the computer screen. After
being informed about the general aim and methods of
the test, the subjects are asked to sign a consent form.
After the application of the EEG headset, the test
instructions are presented either by the UTAssistant
platform (experimental group) or directly by the
experimenter (control group).
Participants were divided into two groups: (i) an experimental group and (ii) a control group. Subjects were asked to browse the MiSE website following four consecutive tasks, each presented to users in the form of a scenario. (i) The experimental group received
the description of the four scenarios automatically
presented in a written form one by one by the
UTAssistant platform. The participants assigned to
(ii) the control group were instructed about the
content of each task directly by the experimenter. The
maximum duration allowed for completion of each
task was five minutes, beyond which participants
were invited to proceed with the next task until the
test was completed. In order to follow the PCTA
method described in Section 4.1, participants are
asked not to verbalize any problems that may arise
during the test, but to report them at the end of the
trial. At the end of the session, participants are asked
to complete the two usability assessment
questionnaires in digital form, i.e. the SUS and the
UMUX scales.
4.4 Subjects
Thirty participants took part in the experiment (mean age = 27.35 years), equally divided by gender. Ten subjects were assigned to the control group (mean age = 20.67 years) and 20 subjects to the experimental group (mean age = 26.76 years).
4.5 Results
The results of the SUS and UMUX questionnaires and the UX bio-behavioral data are described below.
4.5.1 Questionnaire results
The means and standard deviations of the scores obtained by the experimental group (SUS mean = 59.875, SD = 23.500; UMUX mean = 54.85, SD = 15.806) and the control group (SUS mean = 63.75, SD = 17.209; UMUX mean = 59.191, SD = 13.997) were computed.
A one-way between-groups ANOVA found no significant difference between the experimental and control groups for either the SUS scores (F(1,28) = 0.213, p > 0.05) or the UMUX scores (F(1,28) = 0.213, p > 0.05).
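A one-way between-groups ANOVA of this kind can be reproduced with standard tools, as in the sketch below; the arrays are placeholder data, not the study's raw scores.

```python
from scipy.stats import f_oneway

# Placeholder SUS scores for the two groups (illustration only).
sus_experimental = [55.0, 72.5, 47.5, 85.0, 60.0]
sus_control = [62.5, 70.0, 57.5, 65.0]

f_stat, p_value = f_oneway(sus_experimental, sus_control)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")  # mirrors the F(1,28) test above
```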
4.5.2 UX bio-behavioral data
EEG. The mean frontal alpha asymmetry (FAA) index over the whole session was calculated for both the experimental group (mean FAA = 0.122) and the control group (mean FAA = 0.005). Figure 1 shows the mean FAA index values for each task. The correlation between time and FAA (N = 204,987 samples) is positive, small, and highly significant (Pearson's r = 0.035, p < 0.001).
Figure 1: Frontal alpha asymmetry of both experimental
group and control group.
A one-way between-groups ANOVA shows a significant difference in FAA values between the control and experimental groups (F(1,119) = 5.351, p = 0.022). No significant difference in frontal alpha asymmetry values among the four tasks was found (one-way ANOVA, F(3,117) = 0.643, p > 0.05).
Figure 2: Overall mean affective valence time percentage
of both experimental group and control group.
Facial expression recognition. The overall affective valence for both the experimental and control groups was calculated (Figure 2). No significant difference between the experimental and control groups was found for affective valence (one-way between-groups ANOVA, F(1,85) = 0.271, p > 0.05), whereas a significant difference was found for valence type in both groups (one-way ANOVA, F(2,84) = 1397.407, p < 0.001).
The mean percentage of time (in milliseconds) spent in each basic emotion was calculated for the experimental and control groups in the four tasks (Figures 3 and 4).
Figure 3: Mean time (ms) percent of emotions calculated
during tasks for the experimental group.
Figure 4: Mean time (ms) percent of emotions calculated
during tasks for the control group.
A one-way between-groups ANOVA shows no difference between the experimental and control groups related to basic emotions (F(1,54) = 0.042, p > 0.05) or among tasks (F(3,52) = 0.342, p > 0.05) for either group. A significant difference was found among emotion values (F(6,49) = 8.901, p < 0.001). Bonferroni post hoc tests were used to determine which basic emotions differed significantly from each other; they showed that the only emotion significantly different at p < 0.01 was surprise (mean = 1.355, SD = 0.186) compared to anger, sadness, disgust, and joy.
4.5.3 Comparisons between questionnaires
and bio-behavioral measures
Correlations between the SUS and UMUX scores and the frontal alpha asymmetry values were computed, showing no significant correlation with either the SUS (Pearson's r = 0.135, p > 0.05) or the UMUX (Pearson's r = -0.240, p > 0.05). Nor was a correlation found between SUS and UMUX scores across all participants (Pearson's r = 0.018, p > 0.05). Moreover, correlations between frontal alpha asymmetry and affective values showed no significant correlation with either basic emotions (Pearson's r = -0.124, p > 0.05) or affective valence (Pearson's r = 0.017, p > 0.05).
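These are plain Pearson correlation tests between per-participant measures; a sketch on placeholder data follows.

```python
import numpy as np
from scipy.stats import pearsonr

# Placeholder per-participant values; the real analysis pairs each
# participant's questionnaire score with their FAA index.
rng = np.random.default_rng(2)
sus = rng.uniform(40, 90, size=30)
faa = rng.normal(0.1, 0.2, size=30)

r, p = pearsonr(sus, faa)
print(f"Pearson's r = {r:.3f}, p = {p:.3f}")
```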
4.6 Discussion
The study provides four main findings on
comparisons between the experimental group
(browsing the studied website through UTAssistant)
and the control group (conducting the test directly on
the studied website):
(1). There is a significant difference between the frontal alpha asymmetry (FAA) measured in the control group during the four usability tasks and that measured in the experimental group. FAA showed no increase during the tasks in the control group, but it increased in the experimental group over the trial time. The positive asymmetry values of the experimental group are therefore significantly higher than those of the control group, indicating greater relative alpha power over the frontal area of the right hemisphere. As described in Section 4.1, increasing right frontal alpha power is related to withdrawal from stimuli. In this work, the increase in FAA suggests a decrease in motivation and a more negative approach to the interaction with the system.
(2). There is no significant difference in affective valence between the groups. Despite a possible decrease in motivation, no difference in emotion or emotional valence (positive, negative, neutral) emerged at the emotional level during the test for either group. The prevailing, constant affective valence throughout the test was neutral for both groups. The post hoc analyses of basic emotions show that the most significant emotion for all subjects was surprise, which can most likely be attributed to the content of the browsed website or the nature of the task.
(3). What happens for the FAA between the groups does not happen for the self-report results. There is no significant difference between the control group and the experimental group for either the SUS or the UMUX. This means that the two groups are homogeneous in their attribution of usability quality, although there is a decrease in motivation for those who use UTAssistant. We should also note that, unlike the FAA, neither measure investigates the level of motivation.
(4). The self-report results correlate neither with the FAA values nor with the affective values. In this study, it seems that self-reports are not enough to predict the bio-behavioral measures that indicate motivation levels during the interaction.
5 CONCLUSIONS
This work concerns the user experience (UX) assessment of a web-based usability assessment platform called UTAssistant, primarily applied to the field of public administration. The study was carried out under laboratory conditions using traditional usability assessment methodologies (the Partial Concurrent Thinking Aloud method and usability questionnaires) and bio-behavioral measures (electroencephalography and facial expression recognition). Results showed that the perceived usability of the system as measured by self-reports is similar for the experimental and control groups. Also, the emotions accompanying the interaction are mostly neutral in affective valence for both groups. However, electroencephalographic measurements of alpha activity seem to be more sensitive to the duration effect for participants using UTAssistant to complete tasks, with motivation decreasing as test duration increased. This decrease was not accompanied by negative emotions observable in facial expressions. Future work will focus on extending the investigation to other bio-behavioral information provided by technologies such as eye-tracking tools and electrodermal activity sensors.
ACKNOWLEDGEMENTS
This work was supported and funded by the Department of Public Function, Italian Ministry for Simplification and Public Administration (PA). The UTAssistant software was designed and developed by G.D. and R.L.
REFERENCES
Borsci, S. & Federici, S. 2009. The Partial Concurrent
Thinking Aloud: A New Usability Evaluation
Technique for Blind Users. In: Emiliani, P. L.,
Burzagli, L., Como, A., Gabbanini, F. & Salminen, A.-
L. (eds.) Assistive Technology from Adapted
Equipment to Inclusive Environments: AAATE 2009.
Amsterdam, NL: IOS Press.
Borsci, S., Federici, S., Bacci, S., Gnaldi, M. & Bartolucci,
F. 2015. Assessing User Satisfaction in the Era of User
Experience: Comparison of the SUS, UMUX and
UMUX-LITE as a Function of Product Experience.
International Journal of Human-Computer Interaction,
31, 484–495.
Borsci, S., Federici, S. & Mele, M. L. 2014. eGLU 1.0: un
protocollo per valutare la comunicazione web delle PA.
Diritto e Pratica Amministrativa. Milan, IT: Il Sole 24
Ore.
Borsci, S., Kurosu, M., Federici, S. & Mele, M. L. 2013.
Computer Systems Experiences of Users with and
without Disabilities: An Evaluation Guide for
Professionals, Boca Raton, FL, CRC Press.
Brockmyer, J. H., Fox, C. M., Curtiss, K. A., McBroom, E.,
Burkhart, K. M. & Pidruzny, J. N. 2009. The
development of the Game Engagement Questionnaire:
A measure of engagement in video game-playing.
Journal of Experimental Social Psychology, 45, 624–
634.
Brooke, J. 1996. SUS: A “quick and dirty” usability scale.
In: Jordan, P. W., Thomas, B., Weerdmeester, B. A. &
McClelland, I. L. (eds.) Usability Evaluation in
Industry. London, UK: Taylor & Francis.
Catarci, T., Amendola, M., Bertacchini, F., Bilotta, E.,
Bracalenti, M., Buono, P., Cocco, A., Costabile, M. F.,
Desolda, G., Di Nocera, F., Federici, S., Gaudino, G.,
Lanzilotti, R., Marrella, A., Laura Mele, M., Pantano,
P. S., Poggi, I. & Tarantino, L. Digital interaction:
where are we going? Proceedings of the 2018
International Conference on Advanced Visual
Interfaces: AVI 2018, May 29–Jun 01 2018 Castiglione
della Pescaia (GR), IT. ACM Digital Library, 1–5.
Christensen, J. C. & Estepp, J. R. 2013. Coadaptive Aiding
and Automation Enhance Operator Performance.
Human Factors, 55, 965–975.
Desolda, G., Gaudino, G., Lanzilotti, R., Federici, S. &
Cocco, A. UTAssistant: A Web Platform Supporting
Usability Testing in Italian Public Administrations. In:
Bottoni, P., Gena, C., Giachetti, A., Iacolina, S. A.,
Sorrentino, F. & Spano, L. D., eds. 12th Edition of
CHItaly: CHItaly 2017, Sep 18–20 2017 Cagliari, IT.
CEUR-WS.org, 138–142.
Dipartimento della Funzione Pubblica 2015. Il Protocollo
eGLU 2.1: Come realizzare test di usabilità semplificati
per i siti web e i servizi online delle PA. Rome, IT:
Formez PA.
Ekman, P., Friesen, W. V. & Hager, J. C. 2002. Facial
Action Coding System. The Manual, Salt Lake City,
UT, A Human Face.
Federici, S., Borsci, S. & Stamerra, G. 2010. Web usability
evaluation with screen reader users: Implementation of
the Partial Concurrent Thinking Aloud technique.
Cognitive Processing, 11, 263–272.
Federici, S., Mele, M. L., Lanzilotti, R., Desolda, G.,
Bracalenti, M., Meloni, F., Gaudino, G., Cocco, A. &
Amendola, M. UX Evaluation Design of UTAssistant:
A New Usability Testing Support Tool for Italian
Public Administrations. 20th International Conference
on Human-Computer Interaction, Jul 15–20 2018 Las
Vegas, NV. Springer International Publishing, 55–67.
Finstad, K. 2010. The Usability Metric for User Experience.
Interacting with Computers, 22, 323–327.
Finstad, K. 2013. Response to commentaries on ‘The
Usability Metric for User Experience’. Interacting with
Computers, 25, 327–330.
Ganglbauer, E., Schrammel, J., Deutsch, S. & Tscheligi, M.
Applying psychophysiological methods for measuring
user experience: possibilities, challenges and
feasibility. Workshop on user experience evaluation
methods in product development, 2009. Citeseer.
Goshvarpour, A., Abbasi, A. & Goshvarpour, A. 2017.
Fusion of heart rate variability and pulse rate variability
for emotion recognition using lagged poincare plots.
Australasian Physical & Engineering Sciences in
Medicine, 40, 617–629.
Gruzelier, J. H. 2014. EEG-neurofeedback for optimising
performance. I: A review of cognitive and affective
outcome in healthy participants. Neuroscience &
Biobehavioral Reviews, 44, 124–141.
Isbister, K., Höök, K., Sharp, M. & Laaksolahti, J. The
sensual evaluation instrument: developing an affective
evaluation tool. Conference on Human Factors in
Computing Systems: SIGCHI '06, Apr 22–27 2006 Montréal, CA. ACM, 1163–1172.
Jimenez-Molina, A., Retamal, C. & Lira, H. 2018. Using
Psychophysiological Sensors to Assess Mental
Workload During Web Browsing. Sensors, 18, 458.
Jokela, T. Determining usability requirements into a call-
for-tenders: a case study on the development of a
healthcare system. 6th Nordic Conference on Human-
Computer Interaction: Extending Boundaries, Oct 16-
20 2010 Reykjavik, IS. ACM, 256–265.
Jun, J., Ou, L.-C., Oicherman, B., Wei, S.-T., Luo, M. R.,
Nachilieli, H. & Staelin, C. 2010. Psychophysical and
Psychophysiological Measurement of Image Emotion.
Color and Imaging Conference, 2010, 121–127.
Lewis, J. R. & Sauro, J. 2017. Revisiting the Factor
Structure of the System Usability Scale. Journal of
Usability Studies, 12, 183–192.
Mele, M. L. & Federici, S. 2012. A psychotechnological
review on eye-tracking systems: Towards user
experience. Disability and Rehabilitation: Assistive
Technology, 7, 261–281.
Mendoza-Denton, N., Eisenhauer, S., Wilson, W. & Flores,
C. 2017. Gender, electrodermal activity, and
videogames: Adding a psychophysiological dimension
to sociolinguistic methods. Journal of Sociolinguistics,
21, 547–575.
Muñoz Cardona, J. E., Cameirão, M. S., Paulino, T.,
Bermudez i Badia, S. & Rubio, E. Modulation of
Physiological Responses and Activity Levels during
Exergame Experiences. 8th International Conference
on Games and Virtual Worlds for Serious Applications:
VS-GAMES '16, Sep 7–9 2016 Barcelona, ES.
1–8.
Nacke, L. E., Stellmach, S. & Lindley, C. A. 2011.
Electroencephalographic Assessment of Player
Experience: A Pilot Study in Affective Ludology.
Simulation & Gaming, 42, 632–655.
Pallavicini, F., Cipresso, P., Raspelli, S., Grassi, A., Serino,
S., Vigna, C., Triberti, S., Villamira, M., Gaggioli, A.
& Riva, G. 2013. Is virtual reality always an effective stressor for exposure treatments? Some insights from a controlled trial. BMC Psychiatry, 13, 52.
Poore, J. C., Webb, A. K., Cunha, M. G., Mariano, L. J.,
Chappell, D. T., Coskren, M. R. & Schwartz, J. L.
2017. Operationalizing Engagement with Multimedia
as User Coherence with Context. IEEE Transactions on
Affective Computing, 8, 95–107.
Rodriguez-Guerrero, C., Knaepen, K., Fraile-Marinero, J.
C., Perez-Turiel, J., Gonzalez-de-Garibay, V. &
Lefeber, D. 2017. Improving Challenge/Skill Ratio in a
Multimodal Interface by Simultaneously Adapting
Game Difficulty and Haptic Assistance through
Psychophysiological and Performance Feedback.
Frontiers in Neuroscience, 11.
Vermeeren, A. P. O. S., Law, E. L.-C., Roto, V., Obrist, M.,
Hoonhout, J. & Väänänen-Vainio-Mattila, K. User
experience evaluation methods: current state and
development needs. 6th Nordic Conference on Human-
Computer Interaction Extending Boundaries:
NordiCHI '10, Oct 16–20 2010 Reykjavik, IS.
ACM, 521–530.
Vourvopoulos, A. & Bermúdez i Badia, S. 2016. Motor
priming in virtual reality can augment motor-imagery
training efficacy in restorative brain-computer
interaction: a within-subject analysis. Journal of
NeuroEngineering and Rehabilitation, 13, 69.
Yan, S., Ding, G., Li, H., Sun, N., Wu, Y., Guan, Z., Zhang,
L. & Huang, T. 2016. Enhancing Audience
Engagement in Performing Arts Through an Adaptive
Virtual Environment with a Brain-Computer Interface.
21st International Conference on Intelligent User
Interfaces: IUI '16. Sonoma, CA: ACM.
Zhang, L., Sun, S., Zhang, K., Xing, B. & Xu, W. 2018.
Using psychophysiological measures to evaluate the
multisensory and emotional dynamics of the tea
experience. Journal of Food Measurement and
Characterization, 12, 1399–1407.