FEEDBACK ON THE USE OF MATB-II TASK FOR MODELING OF COGNITIVE
CONTROL LEVELS THROUGH PSYCHO-PHYSIOLOGICAL BIOSIGNALS
AKIANI - Bordeaux, France
AKIANI - Bordeaux, France
Univ. Bordeaux, CNRS IMS UMR 5218 – Bordeaux France
Dassault Aviation – Paris, France
AKIANI – Bordeaux, France
Modeling individuals’ cognitive control levels in operational situations is a major
challenge for safety in the aeronautical industry. Standardized experimental tasks, such as
the Multi-Attribute Task Battery II (MATB-II), are suited to this challenge,
which can be addressed using psycho-physiological biosignals. These biosignals are
known to be sensitive to cognitive workload, performance, and expertise, which are
intertwined features of MATB-II subtasks. It therefore remained necessary to investigate
whether these features could be set so as to ensure controlled experimental conditions.
Two groups (15 experts in time-pressured decision making and 13 novices)
completed 3 MATB-II sub-tasks (tracking, monitoring, and resource management
tasks). Biosignals accounting for autonomic nervous system activity were
measured continuously, as objective markers of cognition.
Comparing performance data with (objective and subjective) cognitive
markers yielded contrasting perspectives on the use of MATB-II
as a pertinent tool to ensure controlled experimental conditions in the context of
cognitive control characterization.
Designing adaptive human-machine interfaces is a major challenge in aeronautics, where the
stakes relate to safety. To this end, we sought to characterize experts’ cognitive states in an
operational context, using psycho-physiological objectification tools. This paper sets out the
details of the approach chosen to take up this challenge using the Multi-Attribute Task Battery
(MATB-II, Santiago-Espada et al. 2011) computer-based task.
Theoretical framework of this study: Hollnagel’s Extended Control Model
This study is based on Hollnagel’s cognitive control theory (Hollnagel 1998) to address
cognitive resource management. Although cognitive resources are usually seen as a form of
“fuel” for cognitive processes – a fuel that could be assessed to determine the operators’ margins
and limits (Yerkes & Dodson 1908) – the specificity of this model is that it considers a principle
of cognitive resource saving, which is more a means of optimization than a simple process of
reducing cognitive resource consumption. Its main advantage is therefore that it approaches cognitive
processes at an integrative level: there would be cognitive shortcuts to face familiar situations,
and means to protect oneself against the unknown. Mental representation, abstraction ability,
the sufficiency principle and anticipation could be some of these means. The ECOM
model (Extended COntrol Model, Hollnagel 1998) describes 4 identifiable levels of cognitive
control, from long-term planning with the highest level of abstraction to short adaptive loops
with little available time. It was thus necessary to provide a human-machine interaction
environment in which these levels of cognitive control could be simulated, while also granting
access to the operator’s performance.
Experimental simulation: MATB-II
To this end, the Multi-Attribute Task Battery computer-based task developed by NASA
(MATB-II in its revised version, Santiago-Espada et al. 2011) was identified as a suitable
simulation environment. Although MATB-II is originally a multi-task environment, it offers
isolated subtasks that resemble levels of the ECOM model. Among the 5 proposed tasks,
3 stood out: a “tracking” task (Track), which consists in holding a sight in the center of a target
using a joystick (short adaptive loop with little available time); a “monitoring” task (Monit),
which consists in checking 4 gauges for abnormal oscillations and correcting them as quickly as
possible by pressing the corresponding key on the keyboard (short-term planning); and a “resource
management” task (Manag), which consists in maintaining the level of 2 tanks that consume
resources, by activating/deactivating pumps that enable the transfer of resources from different
tanks (highest planning level). To meet the aims of this project, the cognitive states needed to
be characterized while these tasks were performed.
Objectification of the cognitive states in operational situation and interpretations
Heart rate variability (HRV), electrodermal activity (EDA), and pupillary dilation are known
means of objectifying cognitive states in operational situations (e.g., Wilson 2002). As indirect
markers of the autonomic nervous system’s activity, these physiological indicators are also
considered representative of workload, involvement in the task, emotional states, and
wakefulness. Given their ubiquitous nature, we needed to ensure that the nature of the task was
the only independent variable (Track vs. Monit vs. Manag). The confounding variables, namely
workload, operator’s involvement and emotional states, therefore needed to be controlled. Since
the confounding variables and the nature of the task are closely intertwined, the only way to
guarantee control of the experimental conditions was to check the stability of the confounding
variables between tasks. Since MATB-II makes it possible to 1) program difficulty levels,
2) measure the operators’ operational performance, and 3) subjectively assess the workload, it
was a well-suited tool to obtain feedback on the participants’ involvement through their
performance in each task, and on the perceived workload through subjective assessment scales
(NASA-TLX, Hart 2006).
Experimental conditions programming
The MATB-II computer-based task offers an environment allowing events to be programmed for each
task, independently of the others. To our knowledge, no gold standard yet exists to ensure
standardized difficulty. Since pre-testing experimental conditions is a common approach in human
sciences, a pre-experiment with 5 participants was used to adjust the difficulty of the tasks
according to subjective feedback on perceived difficulty.
For Track, level 2 out of 3 (pre-programmed medium level) was chosen as the default level. For
Monit, adjustments during the pre-tests led us to consider that one faulty gauge event scheduled
every 10 seconds, at random, made the difficulty similar to that of the Track task. For Manag,
pre-tests led us to program one faulty pump event every 10 to 20 seconds for a duration of 15
seconds, to match the perceived difficulty of Track and Monit.
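The event timing described above can be illustrated with a minimal sketch. It only generates onset times and does not reproduce the MATB-II event file format. The function names, the number of pumps (8), the reading of "randomly" as a random choice of gauge, and the reading of "every 10 to 20 seconds" as a random gap between onsets are all assumptions made for illustration.

```python
import random


def gauge_fault_schedule(duration_s=300, period_s=10, n_gauges=4, seed=0):
    """One gauge fault every `period_s` seconds, on a randomly chosen gauge.

    Returns a list of (onset_s, gauge_index) tuples. Assumption: the random
    element is which of the 4 gauges fails, not when it fails.
    """
    rng = random.Random(seed)
    return [(t, rng.randrange(1, n_gauges + 1))
            for t in range(period_s, duration_s, period_s)]


def pump_fault_schedule(duration_s=300, min_gap_s=10.0, max_gap_s=20.0,
                        fault_duration_s=15.0, n_pumps=8, seed=0):
    """Pump faults whose onsets are separated by a random 10-20 s gap.

    Returns a list of (onset_s, pump_index, fault_duration_s) tuples.
    `n_pumps=8` is a hypothetical default, not a value given in the text.
    """
    rng = random.Random(seed)
    events = []
    t = rng.uniform(min_gap_s, max_gap_s)
    # Keep only faults that finish before the end of the 5-minute condition.
    while t + fault_duration_s <= duration_s:
        events.append((t, rng.randrange(1, n_pumps + 1), fault_duration_s))
        t += rng.uniform(min_gap_s, max_gap_s)
    return events
```

With a 15-second fault and gaps as short as 10 seconds, two pump faults can overlap under this reading; whether that matches the configuration actually used is not stated in the text.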
Given the pre-experimentation efforts, we expected to observe no inter-condition
effects on the confounding variables, whether on the subjectively declared workload or on
performance. Given the ubiquitous nature of the psycho-physiological variables, we
expected to observe significant correlations with cognitive load and performance levels,
indicating the need for data correction.
A group of experts in task management under high time pressure (high-level handball players,
N = 15; 16.6 ± 1.1 years old; 8.2 ± 2.9 years of practice) was compared with novices (N = 13;
20.6 ± 2.0 years old; < 2 years of practice), for a total of 28 participants. After being
informed of the experimental conditions, adult participants and legal representatives of
underage participants gave written consent to participate, in accordance with the Declaration of
Helsinki (General Assembly of the World Medical Association, 2014).
Participants were checked for sufficient sleep the night before and no energy drinks in the
previous 6 hours. The experimental session was held in a temperate (20 °C), constantly lit,
soundproof room. Participants then trained on each condition for 1 minute as a habituation
session. We made sure that the performance instructions were understood for each condition; if
not, a second attempt was carried out. Participants then performed the experimental conditions
in random order. Each condition lasted 5 minutes, with at least 3 minutes of rest between
conditions.
As presented in the introduction, 3 of the 5 MATB-II subtasks led to 3 distinct
experimental conditions. These single-task conditions create a human-machine interaction
consistent with the ECOM model’s definition of control levels.
The participants’ electrodermal activity (EDA) and cardiac activity (ECG) were measured
continuously at 1000 Hz and amplified with a dedicated acquisition chain and a 24-bit A/D
converter (MP150 and BioNomadix system, Biopac, California, USA). For EDA, 2 Ag/AgCl
electrodes were placed on the first phalanx of the index and middle fingers of the
non-dominant hand. For the ECG, 3 Ag/AgCl electrodes were placed in accordance with
Einthoven’s triangle.
Pupillary dilation data were collected continuously at 60 Hz by a dedicated system (T60 XL
Eyetracker, Tobii, Sweden), after individual calibration.
All physiological data were recorded on a shared computer for synchronization.
Subjective measurements (cognitive load)
At the end of each experimental task, the level of cognitive load perceived by the participants
was assessed using the NASA-TLX. Participants assigned a score from 0 to 100 (in 5-point
increments) to 6 subscales: mental demand, physical demand, temporal demand, overall effort,
frustration level, and estimated performance.
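When the six ratings are combined into one unweighted global score out of 100, the computation could be sketched as follows. The subscale keys and the function name are hypothetical labels chosen for illustration, and the validation of the 5-point steps simply mirrors the rating format described above.

```python
# Hypothetical keys for the six NASA-TLX subscales.
SUBSCALES = ("mental", "physical", "temporal",
             "effort", "frustration", "performance")


def global_tlx_score(ratings):
    """Unweighted mean of the six subscale ratings (0-100, steps of 5)."""
    for name in SUBSCALES:
        value = ratings[name]
        if not 0 <= value <= 100 or value % 5 != 0:
            raise ValueError(f"invalid rating for {name!r}: {value}")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)
```

For example, ratings of 70, 20, 60, 65, 35 and 50 on the six subscales sum to 300 and give a global score of 50.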
For Track, the performance indicator chosen was the root-mean-square deviation (sampled at
1 Hz). For Monit, reaction times sampled at 100 Hz (the maximum allowed by the MATB-II software)
were collected as the dependent variable. For Manag, the difference between the target and
actual filling levels of both tanks was collected at 0.1 Hz.
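As an illustration of two of these indicators, the sketch below computes a root-mean-square deviation from cursor offsets and a tank deviation from logged levels. The function names are hypothetical, the use of horizontal and vertical offsets from the target center is an assumption about the logged data, and taking the absolute gap to the target is one possible reading of "difference".

```python
import math


def tracking_rmsd(x_offsets, y_offsets):
    """Root-mean-square distance between the sight and the target center."""
    squared = [x * x + y * y for x, y in zip(x_offsets, y_offsets)]
    return math.sqrt(sum(squared) / len(squared))


def tank_deviation(levels_a, levels_b, target):
    """Gap to the target level, averaged over both tanks, then over time.

    Assumption: the gap is taken as an absolute value so that over- and
    under-filling do not cancel out.
    """
    gaps = [(abs(a - target) + abs(b - target)) / 2.0
            for a, b in zip(levels_a, levels_b)]
    return sum(gaps) / len(gaps)
```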
Data were processed in Matlab programming environment (Matlab 2017a, The MathWorks,
Natick, MA, USA).
Time-frequency analysis of the 0.08 to 0.24 Hz band was applied to the EDA signal using complex
demodulation, in accordance with Posada-Quintero et al. (2016). Spectral power was averaged over
time to provide an index of sympathetic autonomic activity (EDAsymp, no unit).
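The idea of summarizing EDA by its power in a fixed frequency band can be sketched as below. This is not the complex demodulation method cited above: it swaps in a plain Hann-windowed periodogram over the whole recording, which is simpler but gives no time resolution. The function name and the assumption that the signal has already been downsampled to a few hertz are illustrative choices.

```python
import numpy as np


def band_power(signal, fs, f_lo, f_hi):
    """Power of `signal` between f_lo and f_hi (Hz), from a Hann periodogram."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                      # remove the tonic offset (DC)
    n = x.size
    window = np.hanning(n)
    spectrum = np.fft.rfft(x * window)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # One-sided power spectral density, normalized by the window energy.
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    psd[1:] *= 2.0
    if n % 2 == 0:
        psd[-1] /= 2.0                    # the Nyquist bin is not doubled
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    df = freqs[1] - freqs[0]
    return float(psd[in_band].sum() * df)
```

A call such as `band_power(eda, fs=4.0, f_lo=0.08, f_hi=0.24)` would return the power in the band used in this study, in squared signal units.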
Heart rate variability
The raw ECG signal was preprocessed in accordance with Pan and Tompkins (1985, QRS complex
detection) and Dos Santos et al. (2013, correction of abnormal values in the R-R interval
tachogram). Spectral analysis was performed for each condition on the entire time window (5 min)
for low frequencies (0.04 Hz < LF < 0.15 Hz) and high frequencies (0.15 Hz < HF < 0.4 Hz) (Task
Force, 1996). Spectral powers LFpow and HFpow are expressed in s²/Hz. The LFpow/HFpow ratio
reflects sympatho-vagal balance (ratio, no unit).
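A minimal sketch of the LF/HF computation, starting from an already corrected series of R-R intervals, is given below. QRS detection and artifact correction are not reproduced. The 4 Hz resampling rate, the linear interpolation of the tachogram, and the single Hann-windowed periodogram are assumptions made for illustration, not the settings used in this study.

```python
import numpy as np


def lf_hf(rr_s, fs=4.0):
    """Return (LF power, HF power, LF/HF) from R-R intervals in seconds."""
    rr = np.asarray(rr_s, dtype=float)
    beat_times = np.cumsum(rr)
    # Resample the unevenly spaced tachogram on a regular grid.
    grid = np.arange(beat_times[0], beat_times[-1], 1.0 / fs)
    x = np.interp(grid, beat_times, rr)
    x = x - x.mean()
    window = np.hanning(x.size)
    spectrum = np.fft.rfft(x * window)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    # One-sided density; the doubling is only approximate at DC and Nyquist,
    # which both lie outside the LF and HF bands.
    psd = 2.0 * np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    df = freqs[1] - freqs[0]
    lf = float(psd[(freqs > 0.04) & (freqs < 0.15)].sum() * df)
    hf = float(psd[(freqs >= 0.15) & (freqs < 0.40)].sum() * df)
    return lf, hf, lf / hf
```

The function integrates the density over each band, so the returned powers are in s²; dividing by the bandwidth would give an average density in s²/Hz.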
Time series of left and right pupillary diameters were linearly interpolated, merged, then
averaged over time and normalized by the time series’ standard deviation to provide the EyeT
index (no unit).
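The pupil processing chain described above could look like the following sketch. It assumes that missing samples (for example during blinks) are coded as NaN, that "merged" means averaging the two eyes sample by sample, and that the sample standard deviation is used; the function names are hypothetical.

```python
import numpy as np


def fill_gaps(diameter):
    """Linearly interpolate NaN samples in one eye's diameter series."""
    x = np.asarray(diameter, dtype=float)
    idx = np.arange(x.size)
    valid = ~np.isnan(x)
    return np.interp(idx, idx[valid], x[valid])


def eye_index(left, right):
    """Mean of the merged pupil series divided by its standard deviation."""
    merged = (fill_gaps(left) + fill_gaps(right)) / 2.0
    return float(merged.mean() / merged.std(ddof=1))
```

Because the mean and the standard deviation share the same unit, the resulting index is unitless.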
Cognitive load (subjective)
Scores from the 6 NASA-TLX subscales were averaged to obtain a global score out of 100 (Hart
2006).
For Track and Monit, the data were averaged over time to provide performance indexes,
respectively Trackperf (in millimeters, mm) and Monitperf (in milliseconds, ms). For Manag, data
from both managed tanks were averaged together, then averaged over time to
provide the Managperf index (no unit).
Subsequently, each data series has been transformed into z-scores to be compared and merged as
Statistical tests were carried out with XLSTAT (XLSTAT 2018.1, Addinsoft, France). A
non-parametric analysis of variance crossing 2 expertise levels [between-group comparison: Exp
vs. Nov] with 3 experimental conditions [inter-condition comparison: Track vs. Monit vs. Manag]
was applied to compare the different variables.
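The text does not name the exact non-parametric procedure run in XLSTAT. As a rough illustration only, the sketch below uses two standard rank-based tests from SciPy: a Friedman test across the three conditions and a Mann-Whitney U test between the two groups. This is an assumed stand-in; it does not test the expertise-by-condition interaction, and the function name and array layout are hypothetical.

```python
import numpy as np
from scipy import stats


def condition_and_group_tests(scores, is_expert):
    """Rank-based tests on one variable.

    scores    : array of shape (n_participants, 3), columns Track/Monit/Manag
    is_expert : boolean array of length n_participants
    Returns (p_condition, p_group).
    """
    scores = np.asarray(scores, dtype=float)
    is_expert = np.asarray(is_expert, dtype=bool)
    # Within-participant effect of the experimental condition.
    friedman = stats.friedmanchisquare(scores[:, 0], scores[:, 1], scores[:, 2])
    # Between-group effect, on each participant's mean across conditions.
    per_participant = scores.mean(axis=1)
    mwu = stats.mannwhitneyu(per_participant[is_expert],
                             per_participant[~is_expert],
                             alternative="two-sided")
    return float(friedman.pvalue), float(mwu.pvalue)
```

Each variable would be passed separately, for instance the z-scored workload ratings with one row per participant.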
Each variable’s means and standard deviations are presented in Table 1a.
Table 1: Results of data processing.
This work is part of a project whose overall goal is to model the cognitive states of operators
in operational situations, using psycho-physiological tools. To this end, MATB-II was proposed as
an appropriate experimental environment, because it offers a range of standardized experimental
tasks addressing situations that mobilize distinct cognitive states as defined in Hollnagel’s
theory of cognitive control levels. Because the psycho-physiological variables measured during
these tasks are indirect markers of the autonomic nervous system’s activity, interpreting the
experimental effects can be tricky, considering their ubiquitous nature. The aim of this work
was thus to make sure that only the nature of the tasks differed between experimental conditions
(i.e., that cognitive load and/or involvement in the task, assessed via performance, were not
confounding factors). Precautions were taken in the form of pre-tests to adjust the difficulty
of the experimental tasks. Consequently, we hypothesized that the experimental condition would
have no detectable effect on either the subjectively declared workload or the performance.
Interpretation of effects
Contrary to our first hypothesis, a major effect of experimental conditions was detected on
subjective scores of cognitive load. Although adjustments were carefully made prior to the
experimentation, in accordance with common experimental approaches, it is not
possible to confirm that cognitive load was standardized, and thus that only the nature of the
task differed between experimental conditions. This is particularly problematic for the
interpretation of psycho-physiological data, since we observe a similar effect. The difficulty of
cognitive load standardization resides in the fact that, to our knowledge, there is no existing
gold standard to normalize difficulty between cognitive tasks of different natures. The multiple
setting parameters used to adjust the difficulty of a given task make it particularly difficult
to implement similar experimental conditions. As an example, the Track condition offered only 3
spatial difficulty levels (i.e., the more difficult the level, the wider the sight’s random
movements). By comparison, the Manag condition offered dynamic (flow management), spatial
(which pumps?) and temporal (when? for how long?) setting parameters. It was therefore difficult
to establish configuration rules that would come close to standardizing difficulty levels.
Regarding performance measurements, the analysis of variance revealed an interaction
effect between expertise level and experimental condition. Since performance is one of the
behavioral markers of involvement in a task, it is once again tricky to confirm that
participants’ involvement was similar from one task to the next, and that only the nature of the
task differed between experimental conditions. We should also note that task performance and
cognitive load are closely linked. Poor performance can indeed reflect either an overload or an
underload, according to Yerkes and Dodson’s law (Yerkes and Dodson 1908). The configuration of
difficulty levels therefore also plays a crucial role in making sure that poor performance is not
linked to a form of boredom when a task is too simple, or to a total disengagement from a task
that is too complex.
About the MATB-II task
Despite the difficulties mentioned in this paper, we should note that MATB-II was used in an
unusual way in this study. This experimental environment was originally
developed to simulate a given level of cognitive load in multi-task experimental conditions (e.g.,
Fairclough & Venables 2006). To our knowledge, no previous experiment had compared
MATB-II subtasks with one another while attempting to control or normalize
confounding factors such as cognitive load and involvement in the task (via performance). Even
though the solution put forward to address needs such as this project’s still seems relevant to us
today, it is good practice to use this feedback to take precautions when processing and
interpreting data, by systematically measuring subjectively assessed cognitive load and task
performance. This would make it possible to adjust the (psycho-physiological) dependent
variables in order to model the operators’ cognitive states as accurately as possible.
Boucsein, W., Fowles, D.C., Grimnes, S., Ben-Shakhar, G., Roth, W.T., Dawson, M.E., &
Filion, D.L.; Society for Psychophysiological Research Ad Hoc Committee on
Electrodermal Measures. Publication recommendations for electrodermal measurements.
Psychophysiol (2012) 1017-1034.
Dos Santos, L., Barroso, J. J., Macau, E. E. N., & de Godoy, M. F. Application of an automatic
adaptive filter for Heart Rate Variability analysis. Med Eng Physics (2013), 1778–1785.
Fairclough, S. H., & Venables, L. Prediction of subjective states from psychophysiology: a
multivariate approach. Biol Psy (2006) 100–110.
Hart S. Nasa-Task Load Index (NASA-TLX); 20 Years Later. Proc Hum Factors Ergon Soc
Annu Meet (2006) 904–908.
Hollnagel, E. Context, cognition and control. In Y. Waern (Ed.), Co-operative process
management, cognition and information technology. In Collection, London: Taylor &
Francis (1998) 27–52.
Posada-Quintero, H.F., Florian, J.P., Orjuela-Cañón, Á.D., Chon, K.H. Highly sensitive index of
sympathetic activity based on time-frequency spectral analysis of electrodermal activity.
Am J Physiol Regul Integr Comp Physiol (2016) 582-591.
Santiago-Espada, Y., Myer, R., Latorella, K., Comstock, J. The Multi-Attribute Task Battery II
(MATB-II) software for human performance and workload research: a user’s guide (2011).
Task Force of the European Society of Cardiology and the North American Society of Pacing
and Electrophysiology. Heart rate variability. Standards of measurement, physiological
interpretation, and clinical use. Eur Heart J (1996) 354–381.
Wilson, G. F. An analysis of mental workload in pilots during flight using multiple
psychophysiological measures. Int J Aviation Psych (2002) 3–18.
Yerkes, R. M., & Dodson, J. D. The relation of strength of stimulus to rapidity of habit-
formation. J Comp Neurol Psychol, (1908) 459–482.