Abstract
The Low/High Index of Pupillary Activity (LHIPA), an eye-tracked measure of pupil diameter oscillation, is redesigned and implemented to function in real-time. The novel Real-time IPA (RIPA) is shown to discriminate cognitive load in re-streamed data from earlier experiments. The rationale for the RIPA is tied to the functioning of the human autonomic nervous system, yielding a hybrid measure based on the ratio of low/high frequencies of pupil oscillation. The paper's contribution is its full documentation of the calculation of the RIPA. As with the LHIPA, researchers can apply this metric to their own experiments where a measure of cognitive load is of interest.
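The full RIPA computation is documented in the paper itself; as a hedged illustration of the underlying idea only, the sketch below computes a crude low/high band-power ratio over a window of pupil diameters using SciPy. The band edges, window length, and the use of Welch's method are assumptions for illustration, not the published LHIPA/RIPA wavelet procedure.

```python
# Illustrative sketch only: a spectral LF/HF ratio of pupil oscillation.
# Band edges and method are assumptions, not the published RIPA algorithm.
import numpy as np
from scipy.signal import welch

def lf_hf_ratio(pupil_mm, fs, lf_band=(0.0, 0.5), hf_band=(0.5, 2.0)):
    """pupil_mm: artifact-free pupil diameters; fs: sampling rate (Hz)."""
    f, pxx = welch(pupil_mm, fs=fs, nperseg=min(len(pupil_mm), 256))
    lf = pxx[(f >= lf_band[0]) & (f < lf_band[1])].sum()
    hf = pxx[(f >= hf_band[0]) & (f < hf_band[1])].sum()
    return lf / hf if hf > 0 else float("inf")
```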
Cognitive load has been shown, over hundreds of validated studies, to be an important variable for understanding human performance. However, establishing practical, non-contact approaches for automated estimation of cognitive load under real-world conditions is far from a solved problem. Toward the goal of designing such a system, we propose two novel vision-based methods for cognitive load estimation, and evaluate them on a large-scale dataset collected under real-world driving conditions. Cognitive load is defined by which of 3 levels of a validated reference task the observed subject was performing. On this 3-class problem, our best proposed method of using 3D convolutional neural networks achieves 86.1% accuracy at predicting task-induced cognitive load in a sample of 92 subjects from video alone. This work uses the driving context as a training and evaluation dataset, but the trained network is not constrained to the driving environment as it requires no calibration and makes no assumptions about the subject's visual appearance, activity, head pose, scale, and perspective.
Quantitative assessment of cognitive load and mental stress is very important in optimizing human-computer system designs to improve performance and efficiency. Traditional physiological measures, such as heart rate variability (HRV), blood pressure, and electrodermal activity (EDA), are widely used but still have limitations in sensitivity, reliability, and usability. In this study, we propose a novel photoplethysmogram-based stress-induced vascular index (sVRI) to measure cognitive load and stress. We also provide the basic methodology and a detailed algorithm framework. We employed a classic experiment with three levels of task difficulty and three stages of testing period to verify the new measure. Compared with the blood pressure, heart rate, and HRV components recorded simultaneously, the sVRI was as sensitive to the effects of task difficulty and testing period as the most significant of those measures. Our findings show the sVRI's potential as a sensitive, reliable, and usable parameter.
There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
We use simulations to investigate the effect of sampling frequency on common dependent variables in eye-tracking. We identify two large groups of measures that behave differently, but consistently. The effects of sampling frequency on these two groups of measures are explored, and simulations are performed to estimate how much data are required to overcome the uncertainty of a limited sampling frequency. Both simulated and real data are used to estimate the temporal uncertainty of data produced by low sampling frequencies. The aim is to provide easy-to-use heuristics for researchers using eye-tracking. For example, we show how to compensate for the uncertainty of a low sampling frequency with more data and post-experiment adjustments of measures. These findings have implications primarily for researchers using naturalistic setups where sampling frequencies typically are low.
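As a toy illustration of the temporal uncertainty the study quantifies, the simulation below measures how far a sampled event onset lands from its true time at a low sampling frequency; the numbers are illustrative and are not the paper's heuristics.

```python
# Toy simulation: temporal uncertainty of event onsets at a low sampling rate.
import numpy as np

rng = np.random.default_rng(0)
fs = 50.0                                       # Hz, a typical low rate
true_onsets = rng.uniform(0.0, 10.0, 100_000)   # seconds
detected = np.ceil(true_onsets * fs) / fs       # first sample at/after onset
error = detected - true_onsets
print(error.mean())  # ~1/(2*fs): mean latency of half a sample interval
print(error.std())   # ~1/(fs*sqrt(12)): sd of a uniform sampling offset
```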
Due to its simple calculation and good denoising effect, the wavelet threshold denoising method has been widely used in signal denoising. In this method, the threshold is an important parameter that affects the denoising effect. In order to improve on existing methods, a new threshold that takes interscale correlation into account is presented. First, a new correlation index is proposed based on the propagation characteristics of the wavelet coefficients. Then, a threshold determination strategy is derived using the new index. Finally, a simulation experiment is presented to verify the effectiveness of the proposed method. In the experiment, four benchmark signals are used as test signals. Simulation results show that the proposed method achieves a good denoising effect across various signal types, noise intensities, and thresholding functions.
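The paper's interscale-correlation threshold is not reproduced here; for context, the sketch below shows the generic decompose-threshold-reconstruct pipeline with Donoho's universal threshold, assuming the PyWavelets package.

```python
# Baseline wavelet threshold denoising (universal threshold), for context.
# This is NOT the paper's interscale-correlation threshold.
import numpy as np
import pywt

def wavelet_denoise(x, wavelet="db4", level=4):
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # MAD noise estimate
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))     # universal threshold
    coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)
```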
Pupil size is known to correlate with changes in cognitive task workload, but how the pupil responds to the requirements of basic goal-directed motor tasks involved in human-machine interactions is not yet clear. This work conducted a user study to investigate pupil dilations during aiming in a tele-operation setting, with the purpose of better understanding how changes in task requirements are reflected by changes in pupil size. The task requirements, manipulated via Fitts' index of difficulty (ID), i.e., the size of and distance between the targets, were varied between tasks, and pupil responses to different task IDs were recorded. The results showed that pupil diameter can be employed as an indicator of task requirements in goal-directed movements—higher task difficulty evoked higher valley-to-peak pupil dilation, and the peak pupil dilation occurred after a longer delay. These findings contribute to the foundation for developing methods to objectively evaluate interactive task requirements using pupil parameters during goal-directed movements in HCI.
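For reference, Fitts' index of difficulty as it is commonly defined (the Shannon formulation); this is the quantity the study varies across tasks.

```python
# Fitts' index of difficulty (Shannon formulation), in bits.
import math

def fitts_id(distance, width):
    """distance: center-to-center target distance; width: target size."""
    return math.log2(distance / width + 1.0)
```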
We propose a method based on wavelet transform and neural networks for relating pupillary behavior to psychological stress. We tested the proposed method by recording pupil diameter and electrodermal activity during a simulated driving task. Self-report measures were also collected. Participants performed a baseline run with the driving task only, followed by three stress runs where they were required to perform the driving task along with sound alerts, the presence of two human evaluators, and both. Self-reports and pupil diameter successfully indexed stress manipulation, and significant correlations were found between these measures. However, electrodermal activity did not vary accordingly. After training, our four-way parallel neural network classifier could guess whether a given unknown pupil diameter signal came from one of the four experimental trials with 79.2 % precision. The present study shows that pupil diameter signal has good discriminating power for stress detection.
This research program was designed to develop predictive (based on cognitive modeling) and descriptive (based on physiological data) measures of cognitive workload that are highly correlated. Such measures must be theoretically grounded and empirically verified. Our main engineering goals in this project were to show: (1) how the predictive measures (cognitive modeling) could be applied to guide the design of novel interfaces and communication protocols for decision making tasks, and (2) how the descriptive measures (physiological) could be used to measure workload during real-time task performance.
With a focus on presenting information at the right time, the ubicomp community can benefit greatly from learning the most salient human measures of cognitive load. Cognitive load can be used as a metric to determine when or whether to interrupt a user. In this paper, we collected data from multiple sensors and compared their ability to assess cognitive load. Our focus is on visual perception and cognitive speed-focused tasks that leverage cognitive abilities common in ubicomp applications. We found that across all participants, the electrocardiogram median absolute deviation and median heat flux measurements were the most accurate at distinguishing between low and high levels of cognitive load, providing a classification accuracy of over 80% when used together. Our contribution is a real-time, objective, and generalizable method for assessing cognitive load in cognitive tasks commonly found in ubicomp systems and situations of divided attention.
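The median absolute deviation feature named above is simple to compute; a minimal sketch over a window of ECG samples follows (the windowing and the pairing with heat flux are assumptions, not the paper's exact pipeline).

```python
# Median absolute deviation (MAD) of a window of ECG samples.
import numpy as np

def median_abs_deviation(window):
    """window: 1-D array of samples; robust spread measure."""
    return np.median(np.abs(window - np.median(window)))
```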
Notifications can have reduced interruption cost if delivered at moments of lower mental workload during task execution. Cognitive theorists have speculated that these moments occur at subtask boundaries. In this article, we empirically test this speculation by examining how workload changes during execution of goal-directed tasks, focusing on regions between adjacent chunks within the tasks, that is, the subtask boundaries. In a controlled experiment, users performed several interactive tasks while their pupil dilation, a reliable measure of workload, was continuously measured using an eye tracking system. The workload data was extracted from the pupil data, precisely aligned to the corresponding task models, and analyzed. Our principal findings include (i) workload changes throughout the execution of goal-directed tasks; (ii) workload exhibits transient decreases at subtask boundaries relative to the preceding subtasks; (iii) the amount of decrease tends to be greater at boundaries corresponding to the completion of larger chunks of the task; and (iv) different types of subtasks induce different amounts of workload. We situate these findings within resource theories of attention and discuss important implications for interruption management systems.
Event detection is used to classify recorded gaze points into periods of fixation, saccade, smooth pursuit, blink, and noise. Although there is an overall consensus that current algorithms for event detection have serious flaws and that a de facto standard for event detection does not exist, surprisingly little work has been done to remedy this problem. We suggest a new velocity-based algorithm that takes several of the previously known limitations into account. Most important, the new algorithm identifies so-called glissades, a wobbling movement at the end of many saccades, as a separate class of eye movements. Part of the solution involves designing an adaptive velocity threshold that makes the event detection less sensitive to variations in noise level and renders the algorithm settings-free for the user. We demonstrate the performance of the new algorithm on eye movements recorded during reading and scene perception and compare it with two of the most commonly used algorithms today. Results show that, unlike the currently used algorithms, fixations, saccades, and glissades are robustly identified by the new algorithm. Using this algorithm, we found that glissades occur in about half of the saccades, during both reading and scene perception, and that they have an average duration close to 24 msec. Due to the high prevalence and long durations of glissades, we argue that researchers must actively choose whether to assign the glissades to saccades or fixations; the choice affects dependent variables such as fixation and saccade duration significantly. Current algorithms do not offer this choice, and their assignments of each glissade are largely arbitrary.
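A minimal sketch of a data-driven velocity threshold in the spirit of the adaptive scheme described above; the initial threshold, the multiplier k, and the convergence rule are illustrative assumptions rather than the published algorithm.

```python
# Adaptive saccade-velocity threshold: iterate mean + k*sd of the
# sub-threshold samples until the threshold stabilizes (sketch only).
import numpy as np

def adaptive_velocity_threshold(vel, init_thr=100.0, k=6.0, tol=1.0):
    """vel: gaze velocity in deg/s; returns a noise-adapted threshold."""
    thr = init_thr
    while True:
        below = vel[vel < thr]           # samples assumed to be non-saccadic
        new_thr = below.mean() + k * below.std()
        if abs(new_thr - thr) < tol:     # threshold has stabilized
            return new_thr
        thr = new_thr
```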
An experiment was conducted to measure very short-term retention in younger and older Ss by means of a visual display involving a rapidly moving light. Results indicated that older Ss slumped in performance much sooner than younger Ss, in both relative and absolute terms. Older Ss also tended to make more errors of omission and more random responses, indicating a lack of ability to "keep up." It was concluded that the inability to organize incoming and outgoing information as rapidly as the younger Ss caused the older Ss' poorer performance.
The mathematical characterization of singularities with Lipschitz exponents is reviewed. Theorems that estimate local Lipschitz exponents of functions from the evolution across scales of their wavelet transform are reviewed. It is then proven that the local maxima of the wavelet transform modulus detect the locations of irregular structures and provide numerical procedures to compute their Lipschitz exponents. The wavelet transform of singularities with fast oscillations has a particular behavior that is studied separately. The local frequency of such oscillations is measured from the wavelet transform modulus maxima. It has been shown numerically that one- and two-dimensional signals can be reconstructed, with a good approximation, from the local maxima of their wavelet transform modulus. As an application, an algorithm is developed that removes white noise from signals by analyzing the evolution of the wavelet transform maxima across scales. In two dimensions, the wavelet transform maxima indicate the location of edges in images.
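A hedged sketch of locating wavelet-transform modulus maxima with a continuous wavelet transform; PyWavelets is assumed here, and the original work's implementation details differ.

```python
# Locate wavelet-transform modulus maxima across scales (sketch).
import numpy as np
import pywt

def modulus_maxima(signal, scales, wavelet="gaus1"):
    coef, _ = pywt.cwt(signal, scales, wavelet)
    mod = np.abs(coef)                       # |W f(scale, position)|
    center = mod[:, 1:-1]
    # A point is a modulus maximum if it exceeds both position neighbors.
    is_max = (center > mod[:, :-2]) & (center >= mod[:, 2:])
    return np.pad(is_max, ((0, 0), (1, 1)))  # boolean mask, input-aligned
```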
When conducting eye tracking studies, having a mechanism to collect data, build workflows, and validate results in a FAIR (i.e., findable, accessible, interoperable, and reusable) manner, facilitates automation. Given the vast landscape of vendor-specific eye tracking software, adopting FAIR metadata standards for the eye tracking domain is one step towards this. In this paper, we propose an approach to simplify the creation, execution, and validation of eye tracking studies through metadata. Using a metadata format that we developed, we first describe two eye trackers, and two datasets collected using them. Next, we use this metadata to simulate real-time data collection by replaying each dataset. From this replayed data, we analyze eye movements in real-time, and synthesize eye movement data from analytics in real-time. Based on our results, we discuss the utility of metadata in real-time eye tracking studies, and how this idea can be generalized into other applications.
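To make the replay idea concrete, here is a hypothetical metadata record and a paced re-streaming loop; the field names and schema are invented for illustration and are not the paper's actual format.

```python
# Hypothetical metadata record and replay loop; schema is invented.
import json
import time

metadata = json.loads("""{
  "tracker": {"model": "ExampleTracker", "sampling_rate_hz": 60},
  "dataset": {"file": "session01.csv", "columns": ["t", "x", "y", "pupil"]}
}""")

def replay(samples, rate_hz):
    """Re-stream recorded samples at the original acquisition rate."""
    period = 1.0 / rate_hz
    for sample in samples:
        yield sample        # hand off to real-time analysis
        time.sleep(period)  # pace playback like live data collection

# Usage (hypothetical loader):
# for s in replay(read_csv_rows(metadata["dataset"]["file"]), 60): ...
```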
Continuous assessment of task difficulty and mental workload is essential in improving the usability and accessibility of interactive systems. Eye tracking data has often been investigated to achieve this ability, with reports on the limited role of standard blink metrics. Here, we propose a new approach to the analysis of eye-blink responses for automated estimation of task difficulty. The core module is a time-frequency representation of eye-blink, which aims to capture the richness of information reflected on blinking. In our first study, we show that this method significantly improves the sensitivity to task difficulty. We then demonstrate how to form a framework where the represented patterns are analyzed with multi-dimensional Long Short-Term Memory recurrent neural networks for their non-linear mapping onto difficulty-related parameters. This framework outperformed other methods that used hand-engineered features. This approach works with any built-in camera, without requiring specialized devices. We conclude by discussing how Rethinking Eye-blink can benefit real-world applications.
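A rough sketch of the time-frequency representation step, using a spectrogram of a blink-related time series; the parameters are illustrative, and the paper's exact representation and its LSTM stage are not reproduced here.

```python
# Sketch: time-frequency representation of a blink signal (assumed form).
import numpy as np
from scipy.signal import spectrogram

def blink_tf_representation(blink_signal, fs=30.0):
    """blink_signal: e.g., per-frame eye-openness values from a camera."""
    f, t, sxx = spectrogram(blink_signal, fs=fs, nperseg=64, noverlap=48)
    return f, t, np.log1p(sxx)  # log-compressed power over time
```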
Real-time evaluation of a person's cognitive load can be desirable in many situations. It can be employed to automatically assess or adjust the difficulty of a task, as a safety measure, or in psychological research. Eye-related measures, such as the pupil diameter or blink rate, provide a non-intrusive way to assess the cognitive load of a subject and have therefore been used in a variety of applications. Usually, workload classifiers trained on these measures are highly subject-dependent and transfer poorly to other subjects. We present a novel method to generalize from a set of trained classifiers to new and unknown subjects. We use normalized features and a similarity function to match a new subject with similar subjects, for which classifiers have been previously trained. These classifiers are then used in a weighted voting system to detect workload for an unknown subject. For real-time workload classification, our method performs at 70.4% accuracy. A higher accuracy of 76.8% can be achieved in an offline classification setting.
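A minimal sketch of the subject-matching idea: classifiers trained on similar subjects vote with similarity weights. The similarity function, the voting rule, and the sklearn-style classifier interface are all assumptions, not the paper's specification.

```python
# Similarity-weighted voting across per-subject classifiers (sketch).
import numpy as np

def predict_workload(features, profile, trained):
    """features: 1-D numpy feature vector for the new subject;
    profile: the subject's normalized profile vector;
    trained: list of (subject_profile_vector, fitted_classifier) pairs."""
    sims = np.array([1.0 / (1.0 + np.linalg.norm(profile - p))
                     for p, _ in trained])
    votes = np.array([clf.predict(features.reshape(1, -1))[0]
                      for _, clf in trained])
    # Weighted vote over binary labels {0: low load, 1: high load}.
    return int(np.average(votes, weights=sims) >= 0.5)
```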
A novel eye-tracked measure of the frequency of pupil diameter oscillation is proposed for capturing what is thought to be an indicator of cognitive load. The proposed metric, termed the Index of Pupillary Activity, is shown to discriminate task difficulty vis-a-vis cognitive load (if the implied causality can be assumed) in an experiment where participants performed easy and difficult mental arithmetic tasks while fixating a central target (a requirement for replication of prior work). The paper's contribution is twofold: first, full documentation is provided for the calculation of the proposed measurement, which can be considered an alternative to the existing proprietary Index of Cognitive Activity (ICA); researchers can thus replicate the experiment and build their own software implementing this measurement. Second, several aspects of the ICA are approached in a more data-sensitive way with the goal of improving the measurement's performance.
We present Brain Automated Chorales (BACh), an adaptive brain-computer system that dynamically increases the levels of difficulty in a musical learning task based on pianists' cognitive workload measured by functional near-infrared spectroscopy. As users' cognitive workload fell below a certain threshold, suggesting that they had mastered the material and could handle more cognitive information, BACh automatically increased the difficulty of the learning task. We found that learners played with significantly increased accuracy and speed in the brain-based adaptive task compared to our control condition. Participant feedback indicated that they felt they learned better with BACh and they liked the timings of the level changes. The underlying premise of BACh can be applied to learning situations where a task can be broken down into increasing levels of difficulty.
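A toy illustration of the BACh premise: advance task difficulty once a workload estimate stays below a threshold. The threshold, the patience window, and the update rule are assumptions, not the system's published logic.

```python
# Toy difficulty adapter in the spirit of BACh (all constants assumed).
def next_level(level, workload_history, threshold=0.4, patience=5):
    """workload_history: recent normalized workload estimates in [0, 1]."""
    recent = workload_history[-patience:]
    if len(recent) == patience and max(recent) < threshold:
        return level + 1  # learner shows spare capacity: raise difficulty
    return level
```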
In attempting to analyze, on digital computers, data from basically continuous physical experiments, numerical methods of performing familiar operations must be developed. The operations of differentiation and filtering are especially important both as ends in themselves and as a prelude to further treatment of the data. Numerical counterparts of analog devices that perform these operations, such as RC filters, are often considered. However, the method of least squares may be used without additional computational complexity and with considerable improvement in the information obtained. The least squares calculations may be carried out in the computer by convolution of the data points with properly chosen sets of integers. These sets of integers and their normalizing factors are described, and their use is illustrated in spectroscopic applications. The computer programs required are relatively simple. Two examples are presented as subroutines in the FORTRAN language.
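The least-squares convolution described above survives today as the Savitzky-Golay filter; a brief SciPy illustration follows (modern library code, not the original FORTRAN subroutines).

```python
# Savitzky-Golay smoothing and differentiation via SciPy.
import numpy as np
from scipy.signal import savgol_coeffs, savgol_filter

x = np.linspace(0.0, 2.0 * np.pi, 200)
noisy = np.sin(x) + 0.1 * np.random.default_rng(0).standard_normal(200)

smoothed = savgol_filter(noisy, window_length=11, polyorder=3)
derivative = savgol_filter(noisy, 11, 3, deriv=1, delta=x[1] - x[0])

# The "properly chosen sets of integers": small windows reduce to the
# classic tabulated coefficients, e.g. (-3, 12, 17, 12, -3)/35.
print(savgol_coeffs(5, 2))
```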
Pupil diameter is an important measure of cognitive load. However, pupil diameter is also influenced by the amount of light reaching the retina. In this study we explore the interaction between these two effects in a simulated driving environment. Our results indicate that it is possible to separate the effects of illumination and visual cognitive load on pupil diameter, at least in certain situations.
The pupil diameter (PD), controlled by the autonomic nervous system, seems to provide a strong indication of affective arousal, as found by previous research, but it has not been investigated fully yet. In this study, new approaches based on monitoring and processing the PD signal for off-line and on-line "relaxation" vs. "stress" differentiation are proposed. For the off-line approach, wavelet denoising, Kalman filtering, data normalization, and feature extraction are sequentially utilized. For the on-line approach, a hard threshold, a moving average window and three stress detection steps are implemented. In order to use only the most reliable data, two types of data selection methods (paired t test based on galvanic skin response (GSR) data and subject self-evaluation) are applied, achieving average classification accuracies up to 86.43 and 87.20% for off-line and 72.30 and 73.55% for on-line algorithms, with each set of selected data, respectively. The GSR was also monitored and processed in our experiments for comparison purposes, with the highest classification rate achieved being only 63.57% (based on the off-line processing algorithm). The overall results show that the PD signal is more effective and robust for differentiating "relaxation" vs. "stress," in comparison with the traditionally used GSR signal.
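A toy analogue of the on-line pipeline named above (hard threshold plus moving-average window); the baseline rule and constants here are assumptions, not the study's three-step detection procedure.

```python
# Sketch: flag "stress" when a moving-average pupil diameter exceeds a
# hard threshold derived from a resting baseline (logic assumed).
import numpy as np

def online_stress_flags(pd_signal, win=30, k=2.0):
    base_mean = np.mean(pd_signal[:win])  # resting-pupil baseline
    base_std = np.std(pd_signal[:win])
    flags = []
    for i in range(win, len(pd_signal)):
        window_mean = np.mean(pd_signal[i - win:i])
        flags.append(window_mean > base_mean + k * base_std)
    return flags
```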
This article has attempted to answer the question "What is a Savitzky-Golay filter?" in terms that will be familiar to the DSP community and readers of IEEE Signal Processing Magazine. The article reviewed the definition and properties of S-G filters and showed how they can be designed easily using polynomial approximation of an impulse sequence. In contrast to most discussions of S-G filters, it focused on the frequency-domain properties and offered an approximate formula for the 3-dB cutoff frequency as a function of polynomial order N and impulse response half-length M. Engineers with a frequency-domain mindset may find this useful if they choose to use S-G filters in their application.
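The snippet below inspects the frequency-domain behavior the article emphasizes, comparing a numerically measured 3-dB cutoff against the closed-form approximation commonly quoted from this article; verify the exact formula and its validity range against the paper itself.

```python
# Measured vs. approximate 3-dB cutoff of an S-G filter (sketch).
import numpy as np
from scipy.signal import savgol_coeffs, freqz

N, M = 4, 25                      # polynomial order, half-length (window 2M+1)
h = savgol_coeffs(2 * M + 1, N)   # S-G impulse response
w, H = freqz(h, worN=4096)        # frequency response, w in rad/sample
measured_fc = w[np.argmin(np.abs(np.abs(H) - 2 ** -0.5))] / (2.0 * np.pi)
approx_fc = (N + 1) / (3.2 * M - 4.6)  # commonly quoted rule, cycles/sample
print(measured_fc, approx_fc)
```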
With the continually increasing complexity of e-learning environments, there is a need for integrating concepts of cognitive load theory (CLT) with concepts of human–computer interaction (HCI). Basic concepts of both fields were reviewed and contrasted. A literature review was conducted within the literature database “The Guide to Computing Literature,” searching for “cognitive load theory” and “Sweller.” Sixty-five publications contained “cognitive load” in their titles or abstracts. Each publication was checked to see whether it contained the concepts of intrinsic, extraneous, or germane cognitive load. The review showed that CLT concepts have been adopted in HCI. However, the concept of germane cognitive load has attracted less attention up to the present time. Two conceptual models are proposed. The first model divides extraneous cognitive load into load induced by the instructional design and load caused by software usage. The model clarifies the focus of traditional usability principles and of existing instructional design principles derived from CLT. The second model fits CLT concepts into the basic components of user-centered design. The concept of germane cognitive load illustrates that an increase of cognitive load can be desirable when designing e-learning environments. Areas for future interdisciplinary research are sketched.
This paper is concerned with some of the factors that determine the difficulty of material that needs to be learned. It is suggested that when considering intellectual activities, schema acquisition and automation are the primary mechanisms of learning. The consequences of cognitive load theory for the structuring of information in order to reduce difficulty by focusing cognitive activity on schema acquisition are briefly summarized. It is pointed out that cognitive load theory deals with learning and problem solving difficulty that is artificial in that it can be manipulated by instructional design. Intrinsic cognitive load, in contrast, is constant for a given area because it is a basic component of the material. Intrinsic cognitive load is characterized in terms of element interactivity. The elements of most schemas must be learned simultaneously because they interact, and it is the interaction that is critical. If, as in some areas, interactions between many elements must be learned, then intrinsic cognitive load will be high. In contrast, in different areas, if elements can be learned successively rather than simultaneously because they do not interact, intrinsic cognitive load will be low. It is suggested that extraneous cognitive load, which interferes with learning, is a problem only under conditions of high cognitive load caused by high element interactivity. Under conditions of low element interactivity, re-designing instruction to reduce extraneous cognitive load may have no appreciable consequences. In addition, the concept of element interactivity can be used to explain not only why some material is difficult to learn but also why it can be difficult to understand. Understanding becomes relevant when high element interactivity material with a naturally high cognitive load must be learned.
Considerable evidence indicates that domain-specific knowledge in the form of schemas is the primary factor distinguishing experts from novices in problem-solving skill. Evidence that conventional problem-solving activity is not effective in schema acquisition is also accumulating. It is suggested that a major reason for the ineffectiveness of problem solving as a learning device is that the cognitive processes required by the two activities overlap insufficiently, and that conventional problem solving in the form of means-ends analysis requires a relatively large amount of cognitive processing capacity, which is consequently unavailable for schema acquisition. A computational model and experimental evidence provide support for this contention. Theoretical and practical implications are discussed.
Historically, the development of computer systems has been primarily a technology-driven phenomenon, with technologists believing that "users can adapt" to whatever they build. Human-centered design advocates that a more promising and enduring approach is to model users' natural behavior to begin with so that interfaces can be designed that are more intuitive, easier to learn, and freer of performance errors. In this paper, we illustrate different user-centered design principles and specific strategies, as well as their advantages and the manner in which they enhance users' performance. We also summarize recent research findings from our lab comparing the performance characteristics of different educational interfaces that were based on user-centered design principles. One theme throughout our discussion is human-centered design that minimizes users' cognitive load, which effectively frees up mental resources for performing better while also remaining more attuned to the world around them.
This paper presents an improved 3D eye movement analysis algorithm for binocular eye tracking within Virtual Reality for visual inspection training. The user's gaze direction, head position and orientation are tracked to allow recording of the user's fixations within the environment. The paper summarizes methods for (1) integrating the eye tracker into a Virtual Reality framework, (2) calculating the user's 3D gaze vector, and (3) calibrating the software to estimate the user's inter-pupillary distance post-facto. New techniques are presented for eye movement analysis in 3D for improved signal noise suppression. The paper describes (1) the use of Finite Impulse Response (FIR) filters for eye movement analysis, (2) the utility of adaptive thresholding and fixation grouping, and (3) a heuristic method to recover lost eye movement data due to miscalibration. While the linear signal analysis approach is itself not new, its application to eye movement analysis in three dimensions advances traditional 2D approaches since it takes into account the 6 degrees of freedom of head movements and is resolution independent. Results indicate improved noise suppression over our previous signal analysis approach.
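A minimal sketch of FIR-based differentiation of gaze position for velocity-based eye movement analysis; a central-difference kernel is shown for illustration, and the paper's actual filters are not reproduced.

```python
# FIR differentiation of gaze position to obtain velocity (sketch).
import numpy as np

def gaze_velocity(pos_deg, fs_hz):
    """pos_deg: 1-D gaze position (degrees); returns velocity in deg/s."""
    kernel = np.array([1.0, 0.0, -1.0]) * (fs_hz / 2.0)  # central difference
    return np.convolve(pos_deg, kernel, mode="same")
```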
Table of contents: Introduction to the Human Visual System (HVS); Visual Attention; Neurological Substrate of the HVS; Visual Psychophysics; Taxonomy and Models of Eye Movements; Eye Tracking Systems; Eye Tracking Techniques; Head-Mounted System Hardware Installation; Head-Mounted System Software Development; Head-Mounted System Calibration; Table-Mounted System Hardware Installation; Table-Mounted System Software Development; Table-Mounted System Calibration; Eye Movement Analysis; Eye Tracking Methodology; Experimental Design; Suggested Empirical Guidelines; Case Studies; Eye Tracking Applications; Diversity and Types of Eye Tracking Applications; Neuroscience and Psychology; Industrial Engineering and Human Factors; Marketing/Advertising; Computer Science; Conclusion.
The Index of Cognitive Activity is an innovative technique that provides an objective psychophysiological measurement of cognitive workload. As users operate in increasingly complex environments, it is essential that the designers of these environments understand the cognitive demands placed on the users. The Index of Cognitive Activity (ICA) provides an important estimate of the levels of cognitive effort of the user. The ICA is based on changes in pupil dilation that occur as a user interacts with a visual display. The Index is described, and several applications are presented.