Measuring Cognitive Load in Virtual Reality Training via Pupillometry

Abstract

Pupillometry is known as a reliable technique to measure cognitive load in learning and performance. However, its applicability to virtual reality (VR) environments, an emerging technology for simulation-based training, has not been well verified in educational contexts. Specifically, the VR display causes light reflexes that confound task-evoked pupillary responses (TEPRs), impairing cognitive load measures. In this pilot study, we tested whether task difficulty predicts cognitive load as measured by TEPRs corrected for the light reflex, and whether these TEPRs correlate with cognitive load self-ratings and performance. Fourteen students in health sciences performed observation tasks in two conditions, difficult versus easy, whilst watching a VR scenario in home health care. A cognitive load self-rating followed each task. We used a VR system with a built-in eye-tracker and an added photosensor to assess pupil diameter and light intensity during the scenario. Employing a method from the human-computer interaction field, we determined TEPRs by modeling the pupil light reflexes using a baseline. As predicted, the difficult task caused significantly larger TEPRs than the easy task. Only in the difficult task condition did TEPRs positively correlate with the performance measure. These results suggest that TEPRs are valid measures of cognitive load in VR training when corrected for the light reflex. This opens up possibilities to use real-time cognitive load for assessment and instructional design in VR training. Future studies should test our findings with a larger sample size, in various domains, and with complex VR functions such as haptic interaction.
Joy Yeonjoo Lee, Nynke de Jong, Jeroen Donkers, Halszka Jarodzka, Jeroen J.G. van Merriënboer
Index Terms: Educational simulations, Virtual and augmented
reality, Personalized e-learning, Mobile and personal devices,
Cognitive load, Medical training
This work was supported by the
Netherlands Organization for Scientific Research (NWO) under Grant
055.16.117. (Corresponding author: Joy Yeonjoo Lee.)
J. Y. Lee is with the Faculty of Governance and Global Affairs, Leiden
University, The Hague, 2501 EE, the Netherlands (e-mail:
j.y.lee@luc.leidenuniv.nl).
N. de Jong is with Health Services Research, Faculty of Health, Medicine
and Life Sciences, Maastricht University, Maastricht, 6200 MD, the
Netherlands (e-mail: n.dejong@maastrichtuniversity.nl).
J. Donkers is with the School of Health Professions Education, Faculty of
Health, Medicine and Life Sciences, Maastricht University, Maastricht, 6200
MD, the Netherlands (e-mail: jeroen.donkers@maastrichtuniversity.nl).
H. Jarodzka is with the Faculty of Education Sciences, Open University,
Heerlen, 6401 DL, the Netherlands (e-mail: Halszka.Jarodzka@ou.nl).
J. J. G. van Merriënboer is with the School of Health Professions Education,
Faculty of Health, Medicine and Life Sciences, Maastricht University,
Maastricht, 6200 MD, the Netherlands (e-mail:
j.vanmerrienboer@maastrichtuniversity.nl).
I. INTRODUCTION
Virtual Reality (VR) has become a powerful
alternative to physical simulation-based training [1].
VR creates immersive task environments that
provoke a sense of presence, emotions, and engagement [2, 3],
which allows for a favorable training environment with high
fidelity of specific tasks [4-6]. Some studies have reported
positive effects of VR training on learning outcomes and
perceived usability [7, 8]. However, key challenges still
remain: instructional design or course structure for VR
training have not been fully established [6, 9, 10], and few
studies have actually measured students’ performance during
VR training [7]. In order to build an effective instructional
design for a new task environment, performance assessment
and progress monitoring are fundamental [11]. This
necessitates the development of real-time indicators of
learning progress and task performance, which can be
informed by cognitive load.
A. Cognitive Load Theory
Cognitive load provides a useful indicator to analyze
learning processes in simulation training [12-15]. Cognitive
load theory posits an “element interactivity” where
heterogeneous processes in cognitive, affective, and social
domains coincide in working memory [13, 16], which is
particularly relevant to simulation training that requires
multiple tasks to be performed simultaneously within complex
environments [12, 17, 18]. Cognitive load refers to the
burden that these processes, evoked by given tasks, impose on
working memory, which has limited capacity [13, 19].
Cognitive load is not always harmful to task performance.
There are different types of cognitive load that can be
beneficial or detrimental, depending on the sources of
cognitive load. In the traditional framework, three types of
cognitive load are identified: intrinsic cognitive load which
reflects the complexity of a task and a learner’s competency
for performing the task, germane cognitive load that pertains
to learning, and extraneous cognitive load that stems from
suboptimal instructional design [13, 20]. For instance, if
learning processes are stimulated in working memory, they can
benefit performance but lead to higher germane load.
If the total cognitive load exceeds the working memory
capacity, performance may deteriorate. In a recent framework
of cognitive load, other types of cognitive load are identified:
primary load for domain-specific task performance, and
secondary load for domain-general metacognitive performance
that supports the primary performance [21]. If the secondary
This article has been accepted for publication in IEEE Transactions on Learning Technologies. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TLT.2023.3326473
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
load is activated, performance can be enhanced but at the cost
of higher total cognitive load [22]. Accordingly, the literature
has shown that the correlation between cognitive load and
performance can be either positive or negative, depending on
characteristics of the measurements or the research context
[12, 21-23].
If measured properly, cognitive load can be an effective
indicator of performance, learning, and expertise, which in
turn informs instructional design [13, 24]. Thus, measuring
cognitive load with a valid and reliable method has been an
important issue in research on learning and instruction [13,
25]. In general, three methods have been used to measure
cognitive load: self-rating, secondary tasks, and
psychophysiological indices [12, 25]. In simulation training,
psychophysiological indices are shown to be the most
sensitive measure of the three [12]. Specifically, eye-tracking
indices have been used as a cognitive load measure for
decades [26-28], demonstrating higher validity than other
psychophysiological measures such as heart rate or heart rate
variability [12].
B. Using Pupillometry in Virtual Reality
Pupil dilation is a well-validated real-time measure of
cognitive load in simulation training [24]. A large body of
literature has confirmed that pupil dilation correlates with
cognitive demands imposed by tasks such as solving
arithmetic problems or spelling difficult words [29-33]. At a
physiological level, pupil dilation is known to be an
involuntary response that reflects noradrenergic activity in the
locus coeruleus which regulates arousal, mental activity, and
emotion [34, 35]. Since pupil dilation reflects emotional
activity, it can be an effective measure of cognitive load
caused by task difficulty especially when the task includes the
management of emotion [23]. Moreover, pupil dilation may
capture real-time changes in cognitive load more robustly than
self-rating in dynamic environments such as computer-based
simulation [23].
Nonetheless, when using pupillometry to measure cognitive
load in dynamic task environments, researchers should be
wary of a major confounder, the pupillary light reflex. The
pupil dilation caused by cognitive processing for the given
task, or task-evoked pupillary responses (TEPR), is notably
small while the magnitude of change caused by light reflex is
large [36]. A traditional way to control for this artifact is to
obtain a baseline prior to the actual tasks, and calculate the
difference between the baseline and the pupil size measured
during the tasks [30, 37]. This baseline must be recorded in the
same lighting conditions as the actual tasks, and be developed
for each participant as pupil diameter is highly idiosyncratic
[38]. In VR environments with a head-mounted display (HMD),
the most common approach is to use this method with fixed
light conditions or fixed targets [39, 40].
However, when using real-world VR scenarios, using fixed
lighting or targets is neither practical nor plausible. An alternative
is to develop baseline formulas that compute pupil
dilation as a function of the changing level of luminance, and
to calculate the difference between the measured pupil size and
the baseline (i.e., $\mathrm{TEPR} = \mathrm{pupil}_{\mathrm{measured}} - \mathrm{pupil}_{\mathrm{baseline}}$). To
quantify the luminance level, some studies used a photo sensor
implemented in the HMD [41, 42], while some estimated
luminance by using color values of the pixels presented on the
HMD [43, 44]. To find the baseline formulas, various
mathematical models can be fitted depending on the
experimental setups [45-49]. The subtractive correction of
TEPR is applicable as the TEPR is reportedly independent of
the baseline pupil size [50, 51].
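As a rough illustration of the pixel-based alternative to a photosensor mentioned above, the sketch below estimates the relative luminance of a displayed frame from 8-bit sRGB pixel values. It is not taken from the cited studies: the function names are hypothetical, the frame is assumed to be encoded in sRGB, and the weights are the standard Rec. 709 luminance coefficients. A real HMD would additionally need a display-specific calibration to map this value to physical light intensity at the eye.

```python
def srgb_to_linear(channel: float) -> float:
    """Undo the sRGB transfer curve for one channel given in [0, 1]."""
    if channel <= 0.04045:
        return channel / 12.92
    return ((channel + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance (0 = black, 1 = white) of one 8-bit sRGB pixel."""
    r, g, b = (srgb_to_linear(v / 255.0) for v in rgb)
    # Rec. 709 weights for linear red, green, and blue.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def mean_frame_luminance(pixels):
    """Average relative luminance over the pixels of one displayed frame."""
    return sum(relative_luminance(p) for p in pixels) / len(pixels)
```

Averaging over all pixels treats the whole display as equally important; weighting pixels near the gaze position more heavily would be another plausible design, which this sketch does not attempt.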
Research on the use of pupillometry to measure cognitive
states in VR is still in its infancy, and studies using light-reflex
correction in VR are scarce. Moreover, most of the existing
studies are in the field of engineering or human-computer
interaction that use simple cognitive tasks, rather than
complex real-life tasks for education. In the few studies that used
a learning paradigm, the learning tasks were neither sufficiently
structured nor well presented [52]. Among the studies
measuring cognitive load in VR, the majority used a 2D
computer screen rather than an HMD [52].
C. Present Study
This study aims to demonstrate TEPRs as an effective real-
time measure of cognitive load in VR with HMD by using
real-life problem solving for the education of healthcare
professionals. First, TEPRs are determined by employing a
correction method from the human-computer interaction field
[41, 42, 53]. They are then validated as a cognitive load
measure through two approaches: predictive validity that
examines whether task difficulty as a factor of cognitive load
predicts changes in TEPRs, and concurrent validity to test
whether TEPRs positively correlate with other cognitive load
measures such as self-rating. For this determination and
validation, we test a hypothesis through an experimental setup:
H1. Difficult tasks evoke higher cognitive load (measured by
TEPRs) than easy tasks in a VR training environment.
From an educational perspective, cognitive load should be
interpreted in relation to performance [13]. Assuming the
validity found from testing H1, we further explore correlations
between cognitive load and performance in VR. According to
the literature on cognitive load, the direction of the correlation
can differ depending on task contexts and the sources of
cognitive load. As our task deals with complex learning that
requires diverse skills in real life (e.g., cognitive and
metacognitive skills for patient safety), we do not specify the
direction of correlation a priori, resulting in the second
hypothesis: H2. The level of cognitive load correlates with the
performance level in a VR training environment.
II. METHODS
A. Participants and Design
Fourteen undergraduate students (12 females; mean age 20.5;
SD = 1.5) in Health Sciences were recruited at Maastricht
University, the Netherlands. We used a within-subjects design
with task difficulty as the single factor. Two different
conditions were presented with the presentation order
counterbalanced: easy task condition (ET) where participants
performed a simple observation task whilst watching a VR
scenario, and difficult task condition (DT) with high task
complexity.
Fig. 1. On the upper side, a photosensor was installed inside
the headset to measure the light intensity of the display. On
the lower side, the Arduino board was mounted on the outside
of the headset to connect the sensor to the computer via USB.
B. Materials and Technical Setup
A VR scenario for home health care was developed by a
research team at Maastricht University. In this scenario, a
healthcare provider visits a patient to deliver medical care and
share social interaction. This scenario takes 9 minutes and is
formatted for a 360-degree HMD.
Two experts in healthcare professions education designed
the observation tasks and the assessment. In ET, a simple
instruction was given: “Observe and report: What is the
homecare provider doing in the scenario?” In DT, more
detailed observations were requested: “What are the patient’s
symptoms? Describe at least three symptoms; What are the
strengths and weaknesses of the provider’s performance?
Describe at least three for each strengths and weaknesses.”
The observation results were reported in writing. The two
experts composed a checklist consisting of 7 to 15 items
extracted from the CanMEDS model [54], the internationally
recognized protocol for the competency of healthcare
professionals. These items included professional roles (e.g.,
medical expertise, communication), patient information (e.g.,
age, past medical history), diagnosis (e.g., breathing,
abnormality), and intervention (e.g., medication, social
interaction) [55, 56]. They were selected as the most relevant
parameters for the given scenario. The experts then assessed
the accuracy of participants’ reports based on the list.
A personal computer (Intel Core i9-9900K, 32 GB RAM)
ran the scenario, displaying it through an HTC Vive Pro Eye
headset (2880 x 1600 pixels, 110-degree visual field). This
headset has a built-in eye tracker with a 120 Hz sampling rate
that uses the HTC SRanipal SDK as an interface (version 1.3.1.0,
www.vive.com). To quantify the luminance level, we
implemented a photo sensor (LDR sensor Iduino ST1107) in
the HMD to import the luminance data. An Arduino board
(Arduino UNO Rev3) connected the sensor to the computer
via USB (Fig. 1). The WorldViz Vizard software (version 6.0,
www.worldviz.com) was used to arrange stimuli presentation
and data recording. We used the VR Eye-tracking Analytics
Lab package to synchronize the data from both the eye-tracker
and the photosensor.
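To make the synchronization step concrete, the following sketch pairs each eye-tracker sample with the photosensor reading closest in time. This is a generic nearest-timestamp alignment written for illustration only; it does not reproduce the VR Eye-tracking Analytics Lab package, and the function name and the tuple layout of the samples are assumptions. Both streams are assumed to share one clock and to be sorted by time.

```python
from bisect import bisect_left


def align_nearest(eye_samples, light_samples):
    """Pair each eye-tracker sample with the photosensor reading nearest in time.

    eye_samples:   list of (timestamp_s, pupil_diameter), sorted by time
    light_samples: list of (timestamp_s, light_intensity), sorted by time
    Returns a list of (timestamp_s, pupil_diameter, light_intensity).
    """
    light_times = [t for t, _ in light_samples]
    merged = []
    for t, pupil in eye_samples:
        i = bisect_left(light_times, t)
        # Candidates are the readings just before and just after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(light_times)]
        nearest = min(candidates, key=lambda j: abs(light_times[j] - t))
        merged.append((t, pupil, light_samples[nearest][1]))
    return merged
```

Because the eye tracker samples at 120 Hz, a slower sensor stream would simply have each reading reused for several consecutive pupil samples; interpolating between readings is an alternative that this sketch leaves out.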
For baseline recording, we used a separate VR scene where
an empty room is presented. In this scene, we recorded
participants’ individual light reflexes at different light intensities
[41, 57]. The light intensity was arranged to increase in 10
stages, where each stage takes 5 seconds [38]. This scene does
not include any additional visual or auditory effects to form a
neutral stimulus [44]. For the self-rating of cognitive load, we
employed the widely used 9-point Paas Scale with the value 1
representing the lowest mental effort and the value 9
representing the highest [58].
C. Procedure
The participants were asked to read an informed consent
form and sign it. Individual sessions consisted of two trials,
ET and DT in counterbalanced order. Initially, participants
were presented with instructions about the task. Next, they
were positioned to stand at the center of the VR area, then put
on the VR headset. After a 5-point calibration and the baseline
session, the scenario was presented. Participants could rotate
their heads or walk around in the VR area during the scenario.
After the scenario, they reported their task results, and self-
rated their cognitive load for the task on the Paas Scale.
D. Data Analysis
The raw data included timestamps, light intensity, and pupil
diameter. Using the baseline data, we developed models that
represent the relationship between pupil diameter and light
intensity. In the literature, various mathematical models of this
relationship have been suggested for different contexts [45-49].
After visual inspection for initial data analysis, we applied an
exponential model using least squares optimization. It is one
of the most universal models [44, 49], with reportedly the
lowest error in VR setups [59]. The model fits for all trials
were sufficient to show that the method can function properly
(F mean = 125.21, SD = 82.30; p mean = 0.00, SD = 0.00).
Applying this model, pupil dilation caused by light reflexes
was predicted to form the pupil baseline. The subtractive
correction (i.e., $\mathrm{TEPR} = \mathrm{pupil}_{\mathrm{measured}} - \mathrm{pupil}_{\mathrm{baseline}}$) was
applied, resulting in TEPRs [50, 51]. The baseline pupil size
before the correction was not significantly different between
the two task conditions. The TEPRs then were averaged over
time for each trial. The difference between ET and DT was
tested via a dependent-samples t test.
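The correction pipeline described in this paragraph can be sketched as follows. The exact exponential form is not given in the text, so the sketch assumes pupil = a * exp(-b * light) + c, and it replaces a general least-squares optimizer with a simple grid search over the decay rate b (for a fixed b the model is linear in a and c, which can then be solved in closed form). The function names and the grid of candidate rates are hypothetical.

```python
import math


def fit_exponential(light, pupil, b_grid):
    """Least-squares fit of pupil = a * exp(-b * light) + c (assumed form).

    For each candidate decay rate b the model is linear in (a, c), so those
    two are solved in closed form; the b with the smallest squared error wins.
    """
    n = len(light)
    best = None
    for b in b_grid:
        x = [math.exp(-b * L) for L in light]
        mx, my = sum(x) / n, sum(pupil) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # constant regressor: a and c cannot be separated
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, pupil))
        a = sxy / sxx
        c = my - a * mx
        sse = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, pupil))
        if best is None or sse < best[0]:
            best = (sse, a, b, c)
    if best is None:
        raise ValueError("no usable decay rate in b_grid")
    _, a, b, c = best
    return a, b, c


def mean_tepr(light, pupil, params):
    """Average of (measured pupil - predicted light-reflex baseline) over a trial."""
    a, b, c = params
    residuals = [p - (a * math.exp(-b * L) + c) for L, p in zip(light, pupil)]
    return sum(residuals) / len(residuals)
```

In use, `fit_exponential` would be run once per participant on the baseline-scene recording, and `mean_tepr` once per trial on the scenario recording with that participant's parameters. The resolution of `b_grid` limits the precision of the fit, so a finer grid or a dedicated optimizer would be preferable for real data.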
For calculating correlations, the two conditions of ET and
DT were averaged for each participant, except when focusing
on each condition. Based on the normality test of datasets, we
used Spearman’s method to calculate correlations. The task
scores were averaged between the two raters. The inter-rater
reliability was examined (rs = 0.89, p = .00). R (version 3.5.1,
R Core Team, 2019) was used for statistical analysis. We
considered p < .05 to be statistically significant.
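For readers unfamiliar with Spearman's method, the sketch below computes the coefficient as the Pearson correlation of the ranks, assigning tied values the mean of their ranks (relevant here because the 9-point Paas Scale produces ties). It is an illustrative stand-alone version, not the R routine used in the study, and it omits the significance test.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```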
III. RESULTS
Fig. 2. The changes in TEPRs between ET and DT for all
participants. The boxplots depict the quartile-based
distribution of individual samples in each condition.
A. TEPRs as a Cognitive Load Measure
Fig. 2 demonstrates the change in TEPRs between ET and
DT for all 14 participants. As predicted, TEPRs were
significantly larger in DT (mean = 0.07, SD = 0.31) than in ET
(mean = -0.21, SD = 0.43) (p = .01). There was no significant
difference in eye fixation position between ET and DT. The
Paas Scale significantly correlated with TEPRs (rs = 0.65, p =
.00). The self-rating score was higher in DT (median = 5,
range = 2-7) than in ET (median = 4, range = 2-6), yet the
difference between the conditions was not statistically
significant (p = .07).
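The comparison of mean TEPRs between ET and DT relies on a dependent-samples t test. A minimal sketch of the test statistic is given below, assuming one averaged TEPR per participant and condition; the function name is hypothetical, and the p value (which requires the t distribution with the returned degrees of freedom) is deliberately left to a statistics package.

```python
import math


def paired_t(easy, difficult):
    """Dependent-samples t statistic for difficult minus easy, with its df."""
    if len(easy) != len(difficult):
        raise ValueError("each participant needs one value per condition")
    diffs = [d - e for e, d in zip(easy, difficult)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((x - mean_diff) ** 2 for x in diffs) / (n - 1)
    t = mean_diff / math.sqrt(var_diff / n)
    return t, n - 1
```

With 14 participants the statistic would be evaluated against a t distribution with 13 degrees of freedom.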
B. Correlations between TEPRs and Performance
When averaging ET and DT, TEPRs did not significantly
correlate with performance. However, when focusing on either
condition, DT showed significant correlations. In DT, task
score was positively correlated with TEPRs (rs = 0.55, p = .04)
and the Paas Scale (rs = 0.68, p = .01) (Table I).
TABLE I
CORRELATIONS BETWEEN TEPRS, PAAS SCALE,
AND TASK SCORE IN DT CONDITION
Variable      TEPR                     Paas Scale
Paas Scale    0.54 (.04) [.02, .83]
Task score    0.55 (.04) [.03, .84]    0.68 (.01) [.24, .89]
p values are presented in parentheses. Significant effects (p < .05) are
in boldface. Values in square brackets indicate the 95% confidence
interval for each correlation.
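The text does not state how the confidence intervals in Table I were obtained. One common approximation, shown here purely as an assumption, is the Fisher z transformation with standard error 1/sqrt(n - 3); other standard errors are sometimes preferred for rank correlations. The function name is hypothetical.

```python
import math


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

For r = 0.55 and n = 14, this approximation yields an interval of roughly .03 to .84. The width of that interval illustrates how little precision a sample of 14 affords.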
IV. DISCUSSION
The present study tested the validity of TEPRs as a measure
of cognitive load in a VR training environment. We have
developed a VR learning environment where task complexity
can be adjusted, established a VR system with an eye tracker,
and controlled for the light reflex to determine TEPRs. We
have confirmed the predictive validity as cognitive load
increased with task difficulty, and the concurrent validity as
the cognitive load correlated with the self-ratings.
Additionally, performance measures correlated with TEPRs
only in DT.
The first hypothesis (H1) assumed that cognitive load is
higher in difficult tasks than in easy tasks in a VR training
environment. We found a significant impact of task
complexity on cognitive load by using TEPRs, whereas the
cognitive load self-rating did not show statistically significant
effects. An explanation for the higher sensitivity of TEPRs
compared with the self-rating might reside in the unique
characteristics of VR task environments that provoke dynamic
emotions through immersion [41]. While self-rating scales
have been developed for classroom-based settings and depend
solely on participants’ judgment, pupil dilation is an
involuntary response that reflects arousal and emotions [60].
Simulation training often involves affective and social tasks
that require the management of emotions. Our observation
tasks also included detecting the healthcare provider’s
empathy for the patient. In such task environments,
pupillometry could be a more sensitive measurement of
cognitive load than self-rating. This finding is consistent with
previous research in simulation-based training [12, 23]. Further
research is needed to study this effect
of emotional engagement on cognitive load in VR training.
Consequently, we largely confirm H1 based on the higher
sensitivity of TEPRs as a cognitive load measure in VR
training environments. This supports TEPR’s predictive
validity as a cognitive load measure. Although the cognitive
load self-rating showed lower sensitivity, it significantly
correlated with TEPRs, which endorses TEPR’s concurrent
validity as a cognitive load measure.
The second hypothesis (H2) posited that the level of
cognitive load correlates with performance level in a VR
training environment. We found that cognitive load measured
by the two indices (i.e., TEPRs and the Paas Scale) positively
correlates with the performance measure, but only in the DT
condition. Here we suspect there is a ceiling effect in ET that
might have caused a weak discrimination in performance
measures. We recommend that future studies include a
task analysis beforehand in order to prevent scale attenuation
effects.
Concerning the positive direction of the correlation, it is
likely that the sources of cognitive load in this study were
rather beneficial to performance. For instance, they might
have been stimulating information processing rather than
harming performance. It appears that the task environment in
this study is not as overwhelming as other simulation
environments. The scenario was slow-paced, and the
observation tasks were passive without any psychomotor skills
required. Studies have shown that correlations between
cognitive load and performance varied from positive to
negative across different research contexts [12]. While this
inconsistency may partly stem from measurement limitations
[12], we argue that the nature of the factors that caused
cognitive load determines the direction of the correlations. If
the factors are negative to performance (e.g., distraction) or
make the total cognitive load exceed working memory
capacity, cognitive load should be inversely proportional to
performance. If the factors are positive (e.g., germane
cognitive load [13, 61], self-regulation [23, 62]), cognitive
load can positively correlate with performance. Again, we
emphasize the importance of preceding task analysis to define
the sources of cognitive load in future studies.
The validity and utility of TEPRs found by the present study
open up new possibilities to improve research on VR training
and instructional design for VR environments. Researchers
may expand this finding to more diverse VR training
environments, investigate the potential of pupillometry to
assess cognitive processes during VR training, and test if this
new measure can be used to evaluate performers’ expertise.
For educators who search for an effective assessment tool for
VR training, TEPRs can be a good option that provides an
objective indicator of performers’ competence to manage
situational and emotional challenges in complex
environments. This assessment might compensate for the
limited discriminatory power of traditional performance
measures (e.g., questionnaires, checklists), improving
instructional design and training programs for VR
environments.
Our study has several limitations. First, as a pilot study, we
used a small sample of participants. This might reduce the
generalizability of our findings. Second, the scenario included
only one domain, i.e., home health care. Whether our findings
apply to other domains remains to be tested. We
recommend that future studies conduct a task analysis for the given
domain before testing, so that the measures can be properly
operationalized for targeted constructs [11]. Third, the VR
content we used was formatted for a 360-degree HMD, which
is only one type of VR technology. Future studies should
examine our methods in more advanced settings such as 3D-
rendered VR with haptic interaction. For these studies, we
propose a careful control for confounding factors as various
sensory modalities are involved in such complex
environments. Also, in different setups, methods to control for
the light reflex should be carefully chosen (e.g., using the same
luminance level, photo sensors, or calculation of pixels on
HMD). Lastly, other confounding factors in pupillometry such
as pupil foreshortening error (i.e., the influence of gaze
position on pupil size) [60, 63] were not corrected, due to a
lack of corresponding methods for VR environments. In the
absence of such methods, we recommend making the
experimental conditions as comparable as possible (e.g.,
minimizing the difference in gaze position across the
conditions).
To our knowledge, this study is the first to show the validity
of TEPRs as a cognitive load measure in VR healthcare
training. The hidden potential of using VR training lies in the
utility of datasets from diverse sources such as eye tracking,
which provides rich information about training development.
Continued study is needed to improve the understanding of
these datasets and make VR healthcare training more
effective.
ACKNOWLEDGMENT
The authors would like to thank Dr. Silke Metzelthin for her
assistance with the performance assessment.
REFERENCES
[1] J. O. Woolliscroft, "Innovation in Response to the COVID-19
Pandemic Crisis," Acad. Med., vol. 95, no. 8, pp. 1140-1142, 2020, doi:
10.1097/acm.0000000000003402.
[2] M. Slater, "Measuring presence: A response to the Witmer and Singer
presence questionnaire," Presence, vol. 8, no. 5, pp. 560-565, 1999.
[3] G. Riva et al., "Affective Interactions Using Virtual Reality: The Link
between Presence and Emotions," CyberPsychol. Behav., vol. 10, no. 1,
pp. 45-56, 2007, doi: 10.1089/cpb.2006.9993.
[4] N. Pellas, I. Kazanidis, and G. Palaigeorgiou, "A systematic literature
review of mixed reality environments in K-12 education," Education
and Information Technologies, pp. 1-40, 2019.
[5] R. Hite et al., "Investigating potential relationships between adolescents’
cognitive development and perceptions of presence in 3-D, haptic-
enabled, virtual reality science instruction," Journal of Science
Education and Technology, vol. 28, no. 3, pp. 265-284, 2019.
[6] Z. Merchant, E. T. Goetz, L. Cifuentes, W. Keeney-Kennicutt, and T. J.
Davis, "Effectiveness of virtual reality-based instruction on students'
learning outcomes in K-12 and higher education: A meta-analysis,"
Computers & Education, vol. 70, pp. 29-40, 2014, doi:
10.1016/j.compedu.2013.07.033.
[7] N. Pellas, A. Dengel, and A. Christopoulos, "A Scoping Review of
Immersive Virtual Reality in STEM Education," IEEE Transactions on
Learning Technologies, vol. 13, no. 4, pp. 748-761, 2020.
[8] P. Pantelidis et al., "Virtual and Augmented Reality in Medical
Education," in Medical and Surgical Education: Past, Present and
Future, G. Tsoulfas Ed. London, UK: IntechOpen, 2018, pp. 77-97.
[9] L. Jensen and F. Konradsen, "A review of the use of virtual reality
head-mounted displays in education and training," Education and
Information Technologies, vol. 23, no. 4, pp. 1515-1529, 2018.
[10] C. Fertleman et al., "A Discussion of Virtual Reality As a New Tool for
Training Healthcare Professionals," Frontiers in Public Health, vol. 6,
2018, doi: 10.3389/fpubh.2018.00044.
[11] J. J. Van Merriënboer and P. A. Kirschner, Ten steps to complex
learning: A systematic approach to four-component instructional
design. New York: Routledge, 2018.
[12] L. M. Naismith and R. B. Cavalcanti, "Validity of cognitive load
measures in simulation-based training: a systematic review," Acad.
Med., vol. 90, no. 11, pp. S24-S35, 2015, doi:
10.1097/ACM.0000000000000893.
[13] J. Sweller, J. J. Van Merriënboer, and F. Paas, "Cognitive architecture
and instructional design: 20 years later," Educ. Psychol. Rev., vol. 31,
no. 2, pp. 261-292, 2019, doi: 10.1007/s10648-019-09465-5.
[14] K. L. Fraser, P. Ayres, and J. Sweller, "Cognitive Load Theory for the
Design of Medical Simulations," Simulation in Healthcare, vol. 10, no.
5, pp. 295-307, 2015, doi: 10.1097/SIH.0000000000000097.
[15] R. E. Moreno and B. Park, "Cognitive load theory: Historical
development and relation to other theories," 2010.
[16] J. Q. Young, P. S. O’Sullivan, V. Ruddick, D. M. Irby, and O. Ten Cate,
"Improving Handoffs Curricula: Instructional Techniques from
Cognitive Load Theory," Acad. Med., vol. 92, no. 5, p. 719, 2017, doi:
10.1097/acm.0000000000001664.
[17] F. A. Haji, D. Rojas, R. Childs, S. De Ribaupierre, and A. Dubrowski,
"Measuring cognitive load: performance, mental effort and simulation
task complexity," Med. Educ., vol. 49, no. 8, pp. 815-827, 2015, doi:
10.1111/medu.12773.
[18] J. Q. Young, S. M. Van Dijk, P. S. O'Sullivan, E. J. Custers, D. M. Irby,
and O. Ten Cate, "Influence of learner knowledge and case complexity
on handover accuracy and cognitive load: results from a simulation
study," Med. Educ., vol. 50, no. 9, pp. 969-978, 2016, doi:
10.1111/medu.13107.
[19] J. J. Van Merriënboer and J. Sweller, "Cognitive load theory and
complex learning: Recent developments and future directions," Educ.
Psychol. Rev., vol. 17, no. 2, pp. 147-177, 2005.
[20] J. Sweller, "Element interactivity and intrinsic, extraneous, and
germane cognitive load," Educ. Psychol. Rev., vol. 22, no. 2, pp. 123-
138, 2010.
[21] J. Y. Lee, A. Szulewski, J. Q. Young, J. Donkers, H. Jarodzka, and J. J.
G. Van Merriënboer, "The Medical Pause: Importance, Processes, and
Training," Med. Educ., 2021, doi: 10.1111/medu.14529.
[22] T. Seufert, "The interplay between self-regulation in learning and
cognitive load," Educational Research Review, vol. 24, pp.
116-129, Jun 2018, doi: 10.1016/j.edurev.2018.03.004.
[23] J. Y. Lee, J. Donkers, H. Jarodzka, G. Sellenraad, and J. J. G. Van
Merriënboer, "Different effects of pausing on cognitive load in a
medical simulation game," Comput. Human Behav., vol. 110, p.
106385, 2020, doi: 10.1016/j.chb.2020.106385.
[24] A. Szulewski, N. Roth, and D. Howes, "The Use of Task-Evoked
Pupillary Response as an Objective Measure of Cognitive Load in
Novices and Trained Physicians: A New Tool for the Assessment of
Expertise," Acad. Med., vol. 90, no. 7, pp. 981-987, Jul
2015, doi: 10.1097/ACM.0000000000000677.
[25] A. Korbach, R. Brünken, and B. Park, "Differentiating different types
of cognitive load: A comparison of different measures," Educ. Psychol.
Rev., vol. 30, no. 2, pp. 1-27, 2018, doi: 10.1007/s10648-017-9404-8.
[26] M. G. Glaholt, "Eye tracking in the cockpit: a review of the
relationships between eye movements and the aviator's cognitive state,"
2014.
[27] J. L. Rosch and J. J. Vogel-Walcutt, "A review of eye-tracking
applications as tools for training," Cognition Technol. Work, vol. 15, no.
3, pp. 313-327, 2013, doi: 10.1007/s10111-012-0234-7.
[28] B. A. Wilbanks and S. P. McMullan, "A review of measuring the
cognitive workload of electronic health records," CIN: Computers,
Informatics, Nursing, vol. 36, no. 12, pp. 579-588, 2018.
[29] E. H. Hess and J. M. Polt, "Pupil size in relation to mental activity
during simple problem-solving," Science, vol. 143, no. 3611, pp. 1190-
1192, 1964.
[30] J. Hyönä, J. Tommola, and A.-M. Alaja, "Pupil dilation as a measure of
processing load in simultaneous interpretation and other language
tasks," The Quarterly Journal of Experimental Psychology, vol. 48, no.
3, pp. 598-612, 1995.
[31] D. Kahneman and J. Beatty, "Pupil diameter and load on memory,"
Science, vol. 154, no. 3756, pp. 1583-1585, 1966.
[32] K. F. Van Orden, W. Limbert, S. Makeig, and T.-P. Jung, "Eye activity
correlates of workload during a visuospatial memory task," Hum.
Factors, vol. 43, no. 1, pp. 111-121, 2001.
[33] J. Klingner, B. Tversky, and P. Hanrahan, "Effects of visual and verbal
presentation on cognitive load in vigilance, memory, and arithmetic
tasks," Psychophysiology, vol. 48, no. 3, pp. 323-332, 2011, doi:
10.1111/j.1469-8986.2010.01069.x.
[34] M. K. Eckstein, B. Guerra-Carrillo, A. T. Miller Singley, and S. A.
Bunge, "Beyond eye gaze: What else can eyetracking reveal about
cognition and cognitive development?," Dev. Cogn. Neurosci., vol. 25,
pp. 69-91, 2017, doi: 10.1016/j.dcn.2016.11.001.
[35] J. Brisson, M. Mainville, D. Mailloux, C. Beaulieu, J. Serres, and S.
Sirois, "Pupil diameter measurement errors as a function of gaze
direction in corneal reflection eyetrackers," Behav. Res. Methods, vol.
45, no. 4, pp. 1322-1331, 2013, doi: 10.3758/s13428-013-0327-0.
[36] J. Beatty and B. Lucero-Wagoner, "The pupillary system," in
Handbook of Psychophysiology, J. T. Cacioppo, L. G. Tassinary, and
G. G. Berntson, Eds. Cambridge, United Kingdom: Cambridge University
Press, 2000, ch. 6, pp. 142-162.
[37] P. W. M. Van Gerven, F. Paas, J. J. G. Van Merriënboer, and H. G.
Schmidt, "Memory load and the cognitive pupillary response in aging,"
Psychophysiology, vol. 41, no. 2, pp. 167-174, 2004, doi:
10.1111/j.1469-8986.2003.00148.x.
[38] K. Holmqvist, M. Nyström, R. Andersson, R. Dewhurst, H. Jarodzka,
and J. Van de Weijer, Eye tracking: A comprehensive guide to methods
and measures. OUP Oxford, 2011.
[39] P. Bækgaard, J. P. Hansen, K. Minakata, and I. S. MacKenzie, "A Fitts'
law study of pupil dilations in a head-mounted display," presented at
the Proceedings of the 11th ACM Symposium on Eye Tracking
Research & Applications, Denver, Colorado, 2019. [Online]. Available:
https://doi.org/10.1145/3314111.3319831.
[40] C. Hirt, M. Eckard, and A. Kunz, "Stress generation and non-intrusive
measurement in virtual environments using eye tracking," Journal of
Ambient Intelligence and Humanized Computing, vol. 11, no. 12, pp.
5977-5989, 2020, doi: 10.1007/s12652-020-01845-y.
[41] H. Chen, A. Dey, M. Billinghurst, and R. W. Lindeman, "Exploring
pupil dilation in emotional virtual reality environments," 2017.
[42] J. Iskander et al., "Exploring the Effect of Virtual Depth on Pupil
Diameter," IEEE, 2019, doi: 10.1109/smc.2019.8913975.
[43] M. Eckert, E. A. P. Habets, and O. S. Rummukainen, "Cognitive Load
Estimation Based on Pupillometry in Virtual Reality with Uncontrolled
Scene Lighting," IEEE, 2021, doi: 10.1109/qomex51781.2021.9465417.
[44] M. Eckert, T. Robotham, E. A. P. Habets, and O. S. Rummukainen,
"Pupillary Light Reflex Correction for Robust Pupillometry in Virtual
Reality," Proceedings of the ACM on Computer Graphics and
Interactive Techniques, vol. 5, no. 2, pp. 1-16, 2022, doi:
10.1145/3530798.
[45] L. L. Holladay, "The Fundamentals of Glare and Visibility," J. Opt. Soc.
Am., vol. 12, no. 4, pp. 271-319, 1926, doi:
10.1364/JOSA.12.000271.
[46] S. G. de Groot and J. W. Gebhard, "Pupil Size as Determined by
Adapting Luminance," J. Opt. Soc. Am., vol. 42, no. 7, pp. 492-495,
1952, doi: 10.1364/JOSA.42.000492.
[47] P. A. Stanley and A. K. Davies, "The effect of field of view size on
steady-state pupil diameter," Ophthalmic Physiol. Opt., vol. 15, no. 6,
pp. 601-603, 1995, doi: 10.1016/0275-5408(94)00019-V.
[48] B. Winn, D. Whitaker, D. B. Elliott, and N. J. Phillips, "Factors
affecting light-adapted pupil size in normal human subjects," Invest.
Ophthalmol. Vis. Sci., vol. 35, no. 3, pp. 1132-1137, 1994.
[49] A. B. Watson and J. I. Yellott, "A unified formula for light-adapted
pupil size," Journal of Vision, vol. 12, no. 10, p. 12, 2012, doi:
10.1167/12.10.12.
[50] J. Beatty, "Task-evoked pupillary responses, processing load, and the
structure of processing resources," Psychological Bulletin, vol. 91, no. 2,
p. 276, 1982.
[51] J. Reilly, A. Kelly, S. H. Kim, S. Jett, and B. Zuckerman, "The human
task-evoked pupillary response function is linear: Implications for
baseline response scaling in pupillometry," Behav. Res. Methods, vol.
51, no. 2, pp. 865-878, 2019, doi: 10.3758/s13428-018-1134-4.
[52] A. D. Souchet, S. Philippe, D. Lourdeaux, and L. Leroy, "Measuring
Visual Fatigue and Cognitive Load via Eye Tracking while Learning
with Virtual Reality Head-Mounted Displays: A Review," International
Journal of Human-Computer Interaction, vol. 38, no. 9, pp. 801-824,
2022, doi: 10.1080/10447318.2021.1976509.
[53] B. Cebeci, U. Celikcan, and T. K. Capin, "A comprehensive study of
the affective and physiological responses induced by dynamic virtual
reality environments," Computer Animation and Virtual Worlds, vol. 30,
no. 3-4, 2019, doi: 10.1002/cav.1893.
[54] J. R. Frank and D. Danoff, "The CanMEDS initiative: implementing an
outcomes-based framework of physician competencies," Med. Teach.,
vol. 29, no. 7, pp. 642-647, 2007, doi: 10.1080/01421590701746983.
[55] J. Sherbino, K. Kulasegaram, A. Worster, and G. R. Norman, "The
reliability of encounter cards to assess the CanMEDS roles," Advances
in Health Sciences Education, vol. 18, no. 5, pp. 987-996, 2013, doi:
10.1007/s10459-012-9440-6.
[56] S. Chou, G. Cole, K. McLaughlin, and J. Lockyer, "CanMEDS
evaluation in Canadian postgraduate training programmes: tools used
and programme director satisfaction," Med. Educ., vol. 42, no. 9, pp.
879-886, 2008, doi: 10.1111/j.1365-2923.2008.03111.x.
[57] O. Palinko and A. L. Kun, "Exploring the effects of visual cognitive
load and illumination on pupil diameter in driving simulators," ACM,
2012, doi: 10.1145/2168556.2168650.
[58] F. G. W. C. Paas, "Training Strategies for Attaining Transfer of
Problem-Solving Skill in Statistics: A Cognitive-Load Approach,"
J. Educ. Psychol., vol. 84, no. 4, pp. 429-434, Dec 1992, doi:
10.1037/0022-0663.84.4.429.
[59] B. John, P. Raiturkar, A. Banerjee, and E. Jain, "An evaluation of
pupillary light response models for 2D screens and VR HMDs," ACM,
2018, doi: 10.1145/3281505.3281538.
[60] K. Holmqvist and R. Andersson, Eye-tracking: A comprehensive guide
to methods, paradigms and measures. Lund, Sweden: Lund Eye-
Tracking Research Institute, 2017.
[61] A. Szulewski, D. Howes, J. J. G. Van Merriënboer, and J. Sweller,
"From Theory to Practice: The Application of Cognitive Load Theory
to the Practice of Medicine," Acad. Med., vol. Publish Ahead of Print,
2020, doi: 10.1097/acm.0000000000003524.
[62] A. B. H. De Bruin and J. J. G. Van Merriënboer, "Bridging Cognitive
Load and Self-Regulated Learning Research: A complementary
approach to contemporary issues in educational research," Learning
and Instruction, vol. 51, pp. 1-9, 2017, doi:
10.1016/j.learninstruc.2017.06.001.
[63] T. R. Hayes and A. A. Petrov, "Mapping and correcting the influence of
gaze position on pupil size measurements," Behav. Res. Methods, vol.
48, no. 2, pp. 510-527, Jun 2016, doi: 10.3758/s13428-015-0588-x.
Dr. Joy Y. Lee is assistant professor of
Instructional Technology and Health Data
Science at Leiden University, the
Netherlands. She received her PhD (Cum
Laude, SHE Dissertation Award 2022)
from the School of Health Professions
Education at Maastricht University, the
Netherlands. Her research utilizes eye-tracking and data
science to investigate human cognition and technology-
enhanced learning based on educational theories (e.g., the
Medical Pause, cognitive load). In particular, she has an active
interest in using eye-tracking in VR/AR environments and
application of AI to performance assessment and learning
analytics.
Dr. Nynke de Jong is associate professor
at the Department of Health Services Research
as well as the School of Health Professions
Education at Maastricht University, the
Netherlands. She finished her studies in
nursing. She holds a master’s degree and a
PhD degree in health sciences. Her
research focuses on e-reality education.
Dr. Jeroen Donkers is assistant professor
at the Department of Educational
Development & Research as well as the
School of Health Professions Education at
Maastricht University, the Netherlands. He
has a PhD degree in artificial intelligence.
He focusses his activities on smart use of
computers in education for learning and for
assessment.
Prof. Dr. Halszka Jarodzka is full
professor of online learning and instruction
at the Open University, the Netherlands.
She holds a diploma in Psychology and a
PhD in the field of Pedagogical and Media
Psychology (dr.rer.nat.). Her research is on
the use of eye-tracking in education to study
and to foster learning, testing, and expertise development.
Dr. Jeroen J.G. van Merriënboer is full
professor of Learning and Instruction at
the Department of Educational
Development & Research as well as the
School of Health Professions Education at
Maastricht University, the Netherlands. He
holds a master’s degree in experimental
psychology and a PhD degree in educational sciences. His
research focuses on instructional design, specifically,
cognitive load theory, four-component instructional design
(4C/ID), and learning in the health professions.
... More specifically in educational settings, this technology is commonly used in higher education, especially with health sciences students [33][34][35][36][37][38][39]. One of the specific functions the technology offers is that it can give the teacher or researcher data that, once processed and analysed, will help in the design of explanatory models of how learning occurs in different types of learners [30]. ...
... Eye tracking technology can be used in educational settings to analyse different learning styles [30], for educational rehabilitation from problems related to dyslexia [31], and to examine the effectiveness of educational methodologies such as evidenced-based learning [32]. More specifically in educational settings, this technology is commonly used in higher education, especially with health sciences students [33][34][35][36][37][38][39]. One of the specific functions the technology offers is that it can give the teacher or researcher data that, once processed and analysed, will help in the design of explanatory models of how learning occurs in different types of learners [30]. ...
Article
Full-text available
The use of eye tracking technology, together with other physiological measurements such as psychogalvanic skin response (GSR) and electroencephalographic (EEG) recordings, provides researchers with information about users’ physiological behavioural responses during their learning process in different types of tasks. These devices produce a large volume of data. However, in order to analyse these records, researchers have to process and analyse them using complex statistical and/or machine learning techniques (supervised or unsupervised) that are usually not incorporated into the devices. The objectives of this study were (1) to propose a procedure for processing the extracted data; (2) to address the potential technical challenges and difficulties in processing logs in integrated multichannel technology; and (3) to offer solutions for automating data processing and analysis. A Notebook in Jupyter is proposed with the steps for importing and processing data, as well as for using supervised and unsupervised machine learning algorithms.
... Eye tracking provides insights into cognitive processing through metrics like pupil dilation, fixation duration, and fixation count [13]. Increased pupil dilation and prolonged fixations are linked to higher cognitive load [24]. HRV is a well-established physiological marker of autonomic nervous system activity and cognitive effort [40]. ...
Preprint
Full-text available
Adaptive Virtual Reality (VR) systems have the potential to enhance training and learning experiences by dynamically responding to users' cognitive states. This research investigates how eye tracking and heart rate variability (HRV) can be used to detect cognitive load and stress in VR environments, enabling real-time adaptation. The study follows a three-phase approach: (1) conducting a user study with the Stroop task to label cognitive load data and train machine learning models to detect high cognitive load, (2) fine-tuning these models with new users and integrating them into an adaptive VR system that dynamically adjusts training difficulty based on physiological signals, and (3) developing a privacy-aware approach to detect high cognitive load and compare this with the adaptive VR in Phase two. This research contributes to affective computing and adaptive VR using physiological sensing, with applications in education, training, and healthcare. Future work will explore scalability, real-time inference optimization, and ethical considerations in physiological adaptive VR.
... Previous studies reported inconclusive results when comparing induced cognitive load in VR (head-mounted displays) with in-person manikin-based, augmented reality, and screen-based simulation environments [91][92][93]. however, research has indicated that familiarity with VR environments can reduce cognitive load when users are engaging with complex tasks [94]. since we did not compare previous experience with VR between groups, it is possible that expert groups may have previous training experience with these environments resulting in decreased cognitive load when engaging with these complex tasks. ...
Article
Full-text available
Background Team leadership during medical emergencies like cardiac arrest resuscitation is cognitively demanding, especially for trainees. These cognitive processes remain poorly characterized due to measurement challenges. Using virtual reality simulation, this study aimed to elucidate and compare communication and cognitive processes-such as decision-making, cognitive load, perceived pitfalls, and strategies-between expert and novice code team leaders to inform strategies for accelerating proficiency development. Methods A simulation-based mixed methods approach was utilized within a single large academic medical center, involving twelve standardized virtual reality cardiac arrest simulations. These 10- to 15-minutes simulation sessions were performed by seven experts and five novices. Following the simulations, a cognitive task analysis was conducted using a cued-recall protocol to identify the challenges, decision-making processes, and cognitive load experienced across the seven stages of each simulation. Results The analysis revealed 250 unique cognitive processes. In terms of reasoning patterns, experts used inductive reasoning, while novices tended to use deductive reasoning, considering treatments before assessments. Experts also demonstrated earlier consideration of potential reversible causes of cardiac arrest. Regarding team communication, experts reported more critical communications, with no shared subthemes between groups. Experts identified more teamwork pitfalls, and suggested more strategies compared to novices. For cognitive load, experts reported lower median cognitive load (53) compared to novices (80) across all stages, with the exception of the initial presentation phase. Conclusions The identified patterns of expert performance — superior teamwork skills, inductive clinical reasoning, and distributed cognitive strategiesn — can inform training programs aimed at accelerating expertise development.
... Cognitive load can be inferred with the help of subjective scales (self-report questionnaires), physiological measures, and dual-task approaches. Physiological measures such as pupil dilation (Lee et al., 2024), functional magnetic resonance imaging (fMRI; Whelan et al., 2007), or galvanic skin response (Shi et al., 2007) are assumed to be indicative of changes in cognitive load. Dual-task methods require learners to perform two tasks simultaneously, with the assumption that performance on the secondary task (e.g., foot tapping) decreases when the primary task (e.g., reading a multimedia instruction) induces high cognitive load (Park & Brünken, 2015). ...
Article
Full-text available
In research practice, it is common to measure cognitive load after learning using self-report scales. This approach can be considered risky because it is unclear on what basis learners assess cognitive load, particularly when the learning material contains varying levels of complexity. This raises questions that have yet to be answered by educational psychology research: Does measuring cognitive load during and after learning lead to comparable assessments of cognitive load depending on the sequence of complexity? Do learners rely on their first or last impression of complexity of a learning material when reporting the cognitive load of the entire learning material after learning? To address these issues, three learning units were created, differing in terms of intrinsic cognitive load (low, medium, or high complexity) as verified by a pre-study (N = 67). In the main-study (N = 100), the three learning units were studied in two sequences (increasing vs. decreasing complexity) and learners were asked to report cognitive load after each learning unit and after learning as an overall assessment. The results demonstrated that the first impression of complexity is the most accurate predictor of the overall cognitive load associated with the learning material, indicating a primacy effect. This finding contrasts with previous studies on problem-solving tasks, which have identified the most complex task as the primary determinant of the overall assessment. This study suggests that, during learning, the assessment of the overall cognitive load is influenced primarily by the timing of measurement.
... Then, we used the Fast Fourier Transform (FFT) to denoise the signal. Afterward, we normalized the data on a scale of 0 to 1. Previous studies show that pupil dilation is strongly correlated with CL [2,8,14] and longer fixation duration can reflect high CL [15,16]. Hence, we extracted fixation duration and pupil dilation using iMotions Lab software (version 10) [12] with built-in R Notebooks. ...
... Then, we used the Fast Fourier Transform (FFT) to denoise the signal. Afterward, we normalized the data on a scale of 0 to 1. Previous studies show that pupil dilation is strongly correlated with CL [2,8,14] and longer fixation duration can reflect high CL [15,16]. Hence, we extracted fixation duration and pupil dilation using iMotions Lab software (version 10) [12] with built-in R Notebooks. ...
Preprint
Full-text available
Virtual Reality (VR) has been a beneficial training tool in fields such as advanced manufacturing. However, users may experience a high cognitive load due to various factors, such as the use of VR hardware or tasks within the VR environment. Studies have shown that eye-tracking has the potential to detect cognitive load, but in the context of VR and complex spatiotemporal tasks (e.g., assembly and disassembly), it remains relatively unexplored. Here, we present an ongoing study to detect users' cognitive load using an eye-tracking-based machine learning approach. We developed a VR training system for cold spray and tested it with 22 participants, obtaining 19 valid eye-tracking datasets and NASA-TLX scores. We applied Multi-Layer Perceptron (MLP) and Random Forest (RF) models to compare the accuracy of predicting cognitive load (i.e., NASA-TLX) using pupil dilation and fixation duration. Our preliminary analysis demonstrates the feasibility of using eye tracking to detect cognitive load in complex spatiotemporal VR experiences and motivates further exploration.
Article
Full-text available
Cognitive Load Theory describes the concept of limited working memory capacity for information processing and cognitive tasks requiring mental effort. Cognitive load is affected by the difficulty levels involved in the learning process and mental tasks. The present meta-analysis examines the effect of difficulty levels in educational tasks based on cognitive load using fixation duration, number of fixations, and pupil dilation. The present study is based on 21 articles containing 34 comparisons between high and low cognitive load conditions in education settings. We evaluated Cohen's d for effect size, the I ² statistic and Q-test to estimate heterogeneity, and Egger's regression test for publication bias. The heterogeneity of the experimental settings was relatively high, nevertheless, we observed meaningful differences in pupil metrics. Pupil dilation significantly increased in hard compared to easy task conditions ( d = 0.72, CI = 0.3672; 1.072, p -value < 0.0001), serving as an indicator of cognitive load level in various educational settings. Fixation duration and number of fixations did not differentiate between the easy and hard levels.
Article
Full-text available
Virtual reality (VR) headsets with an integrated eye tracker enable the measurement of pupil size fluctuations correlated with cognition during a VR experience. We present a method to correct for the light-induced pupil size changes, otherwise masking the more subtle cognitively-driven effects, such as cognitive load and emotional state. We explore multiple calibration sequences to find individual mapping functions relating the luminance to pupil dilation that can be employed in real-time during a VR experience. The resulting mapping functions are evaluated in a VR-based n-back task and in free exploration of a six-degrees-of-freedom VR scene. Our results show estimating luminance from a weighted average of the fixation area and the background yields the best performance. Calibration sequence composed of either solid gray or realistic scene brightness levels shown for 6 s in a pseudo-random order proved most robust.
Article
Full-text available
Research has shown that taking “timeouts” in medical practice improves performance and patient safety. However, the benefits of taking timeouts, or pausing, is not sufficiently acknowledged in workplaces and training programs. To promote this acknowledgment, we suggest a systematic conceptualization of the medical pause, focusing on its importance, processes, and implementation in training programs. By employing insights from educational and cognitive psychology, we first identified pausing as an important skill to interrupt negative momentum and bolster learning. Subsequently, we categorized constituent cognitive processes for pausing skills into two phases: the decision-making phase (determining when and how to take pauses) and the executive phase (applying relaxation or reflection during pauses). We present a model that describes how relaxation and reflection during pauses can optimize cognitive load in performance. Several strategies to implement pause training in medical curricula are proposed: intertwining pause training with training of primary skills, providing second-order scaffolding through shared control, and employing auxiliary tools such as computer-based simulations with a pause function.
Article
Full-text available
Cognitive load theory was introduced in the 1980s as an instructional design theory based on several uncontroversial aspects of human cognitive architecture. Our knowledge of many of the characteristics of working memory, long-term memory and the relations between them had been well-established for many decades prior to the introduction of the theory. Curiously, this knowledge had had a limited impact on the field of instructional design with most instructional design recommendations proceeding as though working memory and long-term memory did not exist. In contrast, cognitive load theory emphasised that all novel information first is processed by a capacity and duration limited working memory and then stored in an unlimited long-term memory for later use. Once information is stored in long-term memory, the capacity and duration limits of working memory disappear transforming our ability to function. By the late 1990s, sufficient data had been collected using the theory to warrant an extended analysis resulting in the publication of Sweller et al. (Educational Psychology Review, 10, 251–296, 1998). Extensive further theoretical and empirical work have been carried out since that time and this paper is an attempt to summarise the last 20 years of cognitive load theory and to sketch directions for future research.
Article
Full-text available
Although Virtual Reality (VR) simulation training has gained prominence, review studies to inform instructors and educators on the use of this technology in Science, Technology, Engineering, and Mathematics (STEM) are still scarce. This scoping review presents various VR-supported instructional design practices in K-12 (Primary and Secondary) and Higher Education (HE) in terms of participants' characteristics, methodological features, and pedagogical uses in alignment with applications, technological equipment, and instructional design strategies. During the selection and screening process, forty-one (n=41) studies published in the period 2009-2019 were included for a detailed analysis and synthesis. This review's results indicate that many studies were focused on the description and evaluation of the appropriateness or the effectiveness of applied teaching practices with VR support. Several studies pointed out improvements in learning outcomes or achievements, positive perspectives on user experience, and perceived usability. Nevertheless, fewer studies were conducted to measure students' learning performance. The current scoping review aims to encourage instructional designers to develop innovative VR applications or integrate existing approaches in their teaching procedures. It will also inform researchers to conduct further research for an in-depth understanding of the educational benefits of immersive VR applications in STEM fields.
Article
Full-text available
In medical training, allowing learners to take pauses during tasks is known to enhance performance. Cognitive load theory assumes that insertion of pauses positively affects cognitive load, thereby enhancing performance. However, empirical studies on how allowing and taking pauses affects cognitive load and performance in dynamic task environments are scarce. We investigated the pause effect, using a computerized simulation game in emergency medicine. Medical students (N = 70) were randomly assigned to one of two conditions: simulation with (n = 40) and without (n = 30) the option to take pauses. All participants played the same two scenarios, during which game logs and eye-tracking data were recorded. Overall, both cognitive load and performance were higher in the condition with pauses than in the one without. The act of pausing, however, temporarily lowered cognitive load, especially during intense moments. Two different manifestations of the pause effect were identified: (1) by stimulating additional cognitive and meta-cognitive processes, pauses increased overall cognitive load; and (2) through relaxation, the act of pausing temporarily decreased heightened cognitive load. Consequently, our results suggest that in order to enhance students’ performance and learning it is important that we encourage them to utilize the different effects of pausing.
Article
Virtual Reality Head-Mounted Displays (HMDs) have reached the consumer market and are used for learning purposes. Risks regarding visual fatigue and high cognitive load arise while using HMDs. These risks could impact learning efficiency. Visual fatigue and cognitive load can be measured with eye tracking, a technique that is progressively being implemented in HMDs. Thus, we investigate how to assess visual fatigue and cognitive load via eye tracking. We conducted this review based on five research questions. We first described visual fatigue and possible cognitive overload while learning with HMDs. Based on thirty-seven included papers, the review indicates that visual fatigue can be measured with blinks and cognitive load with pupil diameter. Yet, distinguishing visual fatigue from cognitive load with such measures is challenging due to possible links between them. Despite measure interpretation issues, eye tracking is promising for live assessment. More research is needed to make data interpretation more robust and to document human factor risks when learning with HMDs.
Conference Paper
Virtual reality (VR) technology enables and requires new ways of user experience testing in immersive environments. For various aspects of user experience, objective assessment of the cognitive load can be a useful parameter. With eye-tracking becoming a more widespread feature of current VR headsets, pupillometry is an appealing option to unobtrusively measure cognitive load during a VR experience. This paper shows that pupil size measured by an off-the-shelf VR headset with an integrated eye tracker positively correlates with the self-reported cognitive load during a standard n-back task adapted to a VR environment. To overcome the need for steady scene-lighting conditions, we present a method to correct for the light-induced pupil size changes, otherwise masking the cognitive load effects. Our results show that a commercially available VR headset with eye tracking can be used to measure the cognitive load in unpredictable lighting conditions without additional hardware.
Article
Cognitive load theory has become a leading model in educational psychology and has started to gain traction in the medical education community over the last decade. The theory is rooted in our current understanding of human cognitive architecture in which an individual's limited working memory and unlimited long-term memory interact during the process of learning. Though initially described as primarily a theory of learning, parallels between cognitive load theory and broader aspects of medical education as well as clinical practice are now becoming clear. These parallels are particularly relevant and evident in complex clinical environments, like resuscitation medicine. The authors have built on these connections to develop a recontextualized version of cognitive load theory that applies to complex professional domains and in which the connections between the theory and clinical practice are made explicit, with resuscitation medicine as a case study. Implications of the new model for medical education are also presented along with suggested applications.