
A multi-modal sensor dataset for continuous stress detection of nurses in a hospital


Abstract

Advances in wearable technologies provide the opportunity to continuously monitor many physiological variables. Stress detection has gained increased attention in recent years, especially because early stress detection can help individuals better manage health to minimize the negative impacts of long-term stress exposure. This paper provides a unique stress detection dataset that was created in a natural working environment in a hospital. This dataset is a collection of biometric data of nurses during the COVID-19 outbreak. Studying stress "in the wild" in a work environment is complex due to the influence of many social, cultural, and individual factors in dealing with stressful conditions. To address these concerns, we captured both the physiological data and associated context pertaining to the stress events. Specific physiological variables that were monitored included electrodermal activity, heart rate, skin temperature, and accelerometer data of the nurse subjects. A periodic smartphone-administered survey also captured the contributing factors for the detected stress events. A database containing the signals, stress events, and survey responses is available upon request.
Scientific DATA | (2022) 9:255 |
A multimodal sensor dataset for
continuous stress detection of
nurses in a hospital
Seyedmajid Hosseini1, Raju Gottumukkala1 ✉ , Satya Katragadda1, Ravi Teja Bhupatiraju1,
Ziad Ashkar1, Christoph W. Borst1 & Kenneth Cochran1,2
Advances in wearable technologies provide the opportunity to monitor many physiological variables
continuously. Stress detection has gained increased attention in recent years, mainly because early
stress detection can help individuals better manage health to minimize the negative impacts of
long-term stress exposure. This paper provides a unique stress detection dataset created in a natural
working environment in a hospital. This dataset is a collection of biometric data of nurses during the
COVID-19 outbreak. Studying stress in a work environment is complex due to many social, cultural, and
psychological factors in dealing with stressful conditions. Therefore, we captured both the physiological
data and associated context pertaining to the stress events. We monitored specific physiological
variables such as electrodermal activity, Heart Rate, and skin temperature of the nurse subjects. A
periodic smartphone-administered survey also captured the contributing factors for the detected stress
events. A database containing the signals, stress events, and survey responses is publicly available on
Background & Summary
Prolonged exposure to stress factors such as high workload, lack of autonomy, and long hours can negatively
impact employee health. Many studies point out that prolonged exposure to stress leads to chronic conditions
such as obesity1 or hypertension2, which may exacerbate conditions such as type-II diabetes3. Monitoring and
understanding stress in workplaces is important, especially in professions with increased exposure to stress,
oen leading to burnout and increased turn over4.
This dataset provides physiological stress signals from a nursing population working in real-world hospital
settings during the COVID-19 outbreak. The primary motivation to create a wearable biometric nursing stress
dataset is to help advance research into understanding and improving the emotional health of nurses in a
naturalistic environment through the development of algorithms for early detection of work-related stress.
Our work was inspired by several previous works on wearables to monitor physiological signals related to
stress. The AffectiveRoad dataset5 used Empatica E4 and Zephyr Bioharness 3.0 to study the effect of driving
conditions on the stress levels of 10 drivers. This study was conducted with drivers taking a 1 hour 26 minute
driving test. The WESAD dataset6 used RespiBAN and Empatica E4 to study the stress of 15 students watching
a movie and taking a Trier Social Stress Test (TSST)7. The SWELL dataset8 used Mobi, uLog, video, and Kinect to
study stress and the associated postures and facial expressions of 25 students. Finally, MDPSD (multi-modal
dataset for psychological stress detection)9 collected a comprehensive multimodal stress detection dataset on
university students using electrodermal activity of skin (EDA) and photoplethysmography (PPG) signals while the
students performed different tests (e.g., the TSST10 and color-word tests11).
The TILES-2018 dataset from Mundnich et al.12 is a multi-sensor dataset that uses a battery of surveys to cover
personality traits, behavioral states, job performance, and well-being over time. Compared to TILES-2018, our
dataset is much smaller in scope. Our dataset was generated during COVID-19, and the stress events we
captured were linked to stress contributors. Tesserae13–19 is a large multi-university project that studied various
aspects of the workplace performance of information workers using wearables. Compared to Tesserae, our
dataset focuses on nurses instead. Our dataset is much smaller in scope but focuses on stress biometrics. We
contextualize this dataset as a unique data collection during the first year of COVID when nursing was stressed
1University of Louisiana at Lafayette, Lafayette, LA, USA. 2Opelousas General Health System, Opelousas, LA, USA.
Content courtesy of Springer Nature, terms of use apply. Rights reserved
like never before. In addition to the pulse rate and accelerometer data common to contemporary wrist-worn
wearables, we use EDA and skin temperature, which were not covered by either the TILES-2018
dataset or the Tesserae project.
Our dataset provides physiological stress signals collected using signal streams from Empatica E4 for a nursing
population. Our primary motivation to create this dataset was to conduct a stress study under real-world
work conditions. Further, this study was conducted during the COVID-19 outbreak. Nursing is a stressful
profession, and prior literature identified several factors that contribute to stress. Moreover, the study was conducted
during the second wave of the COVID-19 outbreak, and all the nurses were dealing with the influx of COVID-19
patients, which made it an event-rich environment. The combination of wearable data and end-of-shift surveys
offers a useful window into nursing stress.
Methods
In this section, we describe the experimental procedure and materials used for data collection. Fig. 1 shows the
overall stress detection apparatus used for data collection. The Institutional Review Board of the University of
Louisiana at Lafayette approved the protocol of the study: FA19–50 INFOR.
Experimental procedure. Recruitment. The research team presented the overall study design and
approach to measure stress to the executive of the hospital, nurse managers, and human resources. After the
nursing department expressed interest, the research team obtained approval from hospital compliance. The
consenting nurse subjects were enrolled in the study. The stress detection scenario was described to the nurses from
the Emergency Room (ER) department. The research team also ran a pilot of the study for a week and adjusted
the frequency of survey responses to minimize inconvenience to the nurses' job duties. The overall study was done
in three phases, where 7 nurses were included in each phase. No incentives were offered. Six nurses, however, did
not complete the study, and these participants were dropped from the dataset. The subjects consented for the data
to be publicly released with anonymization. Each subject was assigned a unique identifier that cannot be linked
to the participant.
Duration and demographics. Data was gathered for approximately one week from 15 female nurses working
regular shifts at a hospital. The age of the nurses ranged from 30 to 55 years. This amounts to 1,250 hours of data
collected between 04/14/2020 and 12/11/2020 in two phases (Phase-I is from 4/15/2020 to 8/6/2020 and Phase-II
from 10/08/2020 to 12/11/2020). The exclusion criteria were pregnancy, heavy smoking, mental disorders, and
chronic or cardiovascular diseases. Table 1 presents the signals captured from individual participants that can
be used for stress analysis.
Data collection. e subjects recruited in the study were nurses working on their usual schedules. Wearable
devices can be worn during regular shis and continuously monitor the physiological signals with minimum
intrusion. In this study, we aimed to detect the stress of nurses during their daily routine using only wearables.
An Empatica E4 was worn on the wrist of the dominant arm. We instructed our subjects to keep the phone in
proximity and wear the device to maintain close contact with the skin to minimize data loss. We only include the
data where the signals are collected continuously throughout the nurses work shi (which is typically 8 hours).
e nurse can terminate the data collection by turning o the device. In some cases, the collected data does not
include Inter-Beat Interval (IBI) and Heart Rate data due to noisy artifacts from the PPG sensor. We did not
notice any missing data from the EDA sensor. If any of the data collected during the shi is missing, the data
from that shi is removed.
Fig. 1 Stress detection apparatus (labeled components: stressful event, data collection app, machine learning, stress detection).
During these experiments, we collected the physiological data from the nurses continuously from the start
of their shift to the end of the shift. The overall end-to-end system was designed to perform data collection,
processing, and stress detection in near real-time. While the proposed data collection mechanism, stress detection
algorithm, and survey apparatus were designed for real-time stress detection and feedback collection,
the stress notifications were sent and the survey responses were collected at the end of the shift for each of the
nurses. This process reduces interruptions for nurses while on duty. While the delay may have produced some
degree of recall bias, it is still an improvement over traditional surveys, which do not specify the precise time of
stress events and are conducted at monthly or quarterly intervals. Data is transmitted from the wearable to the
phone using Bluetooth, and the phone transmits the data to the analytics server using Wi-Fi in the background
without the involvement of the participant. The stress detection algorithm uses a Random Forest model to
identify epochs of potential stress. The nurses responded to a survey at the end of the day from their mobile
phones, where they validated the detected events. In total, 171 hours of data were detected as stressful. Table 2 shows
the data description. The data was anonymized to remove publicly identifiable information and is available on
Data collection tools. Figure 1 presents the data collection framework. A detailed explanation of each of
the components is presented below.
Empatica wristband. An E4 wristband device (Empatica Inc., Milano, Italy) collects physiological data
such as EDA, Heart Rate, skin temperature, and accelerometer data from the right wrist of the subject. EDA
is measured via E4’s silver (Ag) electrode (valid range [0.01–100] μS), while Heart Rate is measured via E4’s
PPG sensor. The E4 wristband is powered by a rechargeable lithium battery and transmits data to the subject's
smartphone, using Bluetooth, in near real-time. All the data collected from the E4 wristband and the sampling
frequencies are presented in Table 1. The physiological data is then transmitted to the data collection app on a
nurse’s phone in near real-time. The nurses can also tag the data using the tag button on the E4 device to indicate
an undetected stress experience, and this is also transmitted to the data collection app.
Nurses accessed the survey instrument through their mobile phone to validate the stress response. The survey
instrument has the list of events detected (by the model) along with the time these events occurred. The survey
instrument allows them to provide context for the event. Because some events may not have been detected,
the nurses were asked to report any stress events undetected by the model through the survey instrument. The
Signal                   Abbreviation   Frequency
electrodermal activity   EDA            4.0 Hz
Heart Rate               HR             1.0 Hz
skin temperature         ST             1.0 Hz
accelerometer            ACC            32 Hz
inter-beat interval      IBI            64 Hz
blood volume pulse       BVP            64 Hz
Table 1. Signals and frequency of Empatica E4.
ID   Total Data   Duration of       Number of Stress   Feedback (Agreement)   Feedback Not
     Collected    Stress Detected   Events Detected    Yes     No             Received
15   72:50        9:28              30                 16      2              12
83   149:47       15                30                 16      5              9
94   80:30        14:06             43                 9       11             23
5C   99:19        11:48             15                 10      2              3
6B   58:06        14:42             23                 11      2              10
6D   23:57        4:08              4                  3       1              0
7A   79:18        16:52             47                 32      3              12
7E   50:42        2:12              7                  3       4              0
8B   27:14        4:41              17                 13      3              1
BG   76:02        8:28              25                 14      4              7
CE   111:01       16:12             20                 7       0              13
DF   127:41       17:27             21                 6       2              13
E4   116:12       17:56             40                 29      6              5
EG   107:52       8:07              11                 5       0              6
F5   69:38        4:39              26                 25      1              0
Table 2. Stress events and feedback.
nurses also had an additional mechanism to report the events from the wristwatch by pushing the E4 button to
tag stress events. However, no one used the watch to report additional stress events.
Data collection application. The data collection application is a mobile application that runs on iOS and
Android. It connects to the E4 wristband through Bluetooth. The physiological signals are collected in near
real-time. These signals and any tags input by the nurses are transmitted to the machine learning based stress
detection model. Fig. 2 shows screenshots of the data collection app.
The survey instrument was designed for collecting survey responses. We offered nurses two ways of
completing the surveys after stress detection: a custom mobile application and a web application (both offered the
same functionality). The installation of the mobile application on the phone proved to be inconvenient as the
application was not available on the App store and needed to be sideloaded onto the subject’s phone. So, the
nurses opted to fill out the survey through the web application. The source code of the custom applications is
made available on GitHub20.
The data collected on the watch is transmitted to the Empatica application on the subject’s phone through
Bluetooth. These signals are then transmitted to the Empatica server through the mobile phone’s Wi-Fi
connection. In the event of a loss of network connectivity, data is buffered (stored temporarily) on the phone and
uploaded when the connection is restored. The researcher downloads data from the Empatica server to inspect
it for any losses and to run the machine learning model.
Machine learning and stress detection model. At the end of the working day, the machine learning based stress
detection model consumes the physiological and accelerometer data gathered from the E4 wristbands. Stress
signals are detected from these physiological signals over time using a stress detection algorithm. The skin
temperature and lag-based features were used in the stress detection model. In cases where the model detects
Fig. 2 Mobile and Web application screenshots.
several stress incidents, the nurses were asked to label the six longest durations per shift. The nurses had two
ways to enter the labels. They could select and edit the stress events detected by the stress detection system at
various time slots and fill out a survey to indicate whether they experienced stress and, if they did, the stress level and
the contributors of stress. They also had the option to report additional time slots where they experienced stress,
along with the stress level and contributors of stress.
The machine learning based stress detection model was run on the physiological data (collected from the E4
wristbands at the end of each shift). The events detected by the model were transmitted back to the participant as
part of the survey instrument. More details about the survey instrument are provided in the Survey sub-section.
Stress detection algorithm. We trained, tested, and validated the Random Forest based machine learning model
on the AffectiveRoad dataset. Three signals were used, namely EDA, skin temperature, and Heart Rate. This
trained model was used for stress detection in nurses.
The stress detection algorithm has three parts. First, a sliding window of 10 seconds and a step size of 5 seconds
were used to extract signal features. The features used by the model include statistical features (mean, min,
max, skewness, kurtosis, number of peaks) of EDA, skin temperature, and Heart Rate for the current window.
We include three features (mean of skin temperature, Heart Rate, and EDA) from the previous 10 windows to
provide an antecedent context of signals before stress. This resulted in 48 features. In the second stage, these 48
features are fed into the pre-trained machine learning algorithm to generate labels that represent stress categories
(represented as 0, 1, and 2, where 0 = no stress; 1 = low stress; 2 = high stress). The Random Forest model was
trained on the AffectiveRoad dataset; hyper-parameters were optimized using a grid search across a range of
valid parameters: number of estimators (300, 400, 500, 700, and 1000), minimum samples per leaf (3, 4, 5, 6, and
7), maximum depth (10, 12, 15, 18, 20), and maximum number of features (auto, sqrt, and log2). 10-fold
cross-validation was used to ensure that the model is not overfitting. Finally, a sliding window-based change point
detection algorithm from the Ruptures package21 was used to identify discrete sessions from continuous stress
signals. A sliding window based segmentation is used to determine a set of change points22. Each session is
represented by a start time, end time, and stress label, where the label is determined by the average stress value ‘S’
between the start time and end time: ‘no stress’ if S <= 0.65, ‘medium stress’ if 0.65 <= S <= 1.3, and ‘high stress’ if
S >= 1.3. These thresholds were adopted based on the AffectiveRoad dataset5 and were further validated using
the survey feedback during the initial phase of the study. The source code for the stress detection algorithm, which
comprises feature extraction, stress detection, and change-point detection, is provided in the GitHub repository20.
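The windowing arithmetic and the session-labelling rule described above can be sketched as follows. This is a minimal illustration rather than the released code: the function names (`window_starts`, `feature_count`, `session_label`) are hypothetical, the 4 Hz rate in the usage lines is taken from the Preprocessing sub-section, and, because the stated thresholds share their boundary values, the sketch assumes that S = 0.65 maps to 'no stress' and S = 1.3 maps to 'high stress'.

```python
from statistics import mean

WINDOW_S = 10   # window length in seconds (from the text)
STEP_S = 5      # step size in seconds (from the text)
N_LAGS = 10     # previous windows that contribute lagged means (from the text)
SIGNALS = ("EDA", "TEMP", "HR")
STATS = ("mean", "min", "max", "skewness", "kurtosis", "num_peaks")


def window_starts(n_samples, rate_hz):
    """Return the start index of every full sliding window over a signal."""
    win = WINDOW_S * rate_hz
    step = STEP_S * rate_hz
    return list(range(0, n_samples - win + 1, step))


def feature_count():
    """Current-window statistics plus lagged means, summed over all signals."""
    current = len(STATS) * len(SIGNALS)   # 6 statistics x 3 signals
    lagged = N_LAGS * len(SIGNALS)        # 10 lagged means x 3 signals
    return current + lagged


def session_label(window_labels):
    """Map the average per-window label S (each 0, 1 or 2) to a session label.

    Assumption: boundary ties go to 'no stress' at 0.65 and 'high stress' at 1.3.
    """
    s = mean(window_labels)
    if s <= 0.65:
        return "no stress"
    if s < 1.3:
        return "medium stress"
    return "high stress"


# One minute of a 4 Hz signal, and a hypothetical run of per-window labels.
print(len(window_starts(60 * 4, 4)))
print(feature_count())
print(session_label([0, 1, 1, 2, 1]))
```

The feature count reproduces the 48 features quoted in the text: 18 statistics for the current window plus 30 lagged means.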
Survey. e nurses were requested to complete a questionnaire about their experience during the detected
stress events to identify the cause of the stress. e survey itself was not administered during the event in order
to not add to the stress. Instead, the survey was administered at the end of the shi. Our approach to conduct
surveys aer a relatively short duration aer the stress event (i.e., immediately aer the shi) helps minimize the
recall bias23 associated with traditional surveys, which are administered aer much longer durations such as a
month24 or a quarter. Our original research design before the pandemic had alerts sent to nurses an hour aer
the stress event had subsided. However, with the increased workload during the pandemic, it was suggested that
we limit surveys to minimize disruptions. Conventional surveys also do not target specic stress events and are
instead based on general recall of events across the epoch.
The questions in the questionnaire were selected based on a review of literature studying stress on nurses
in a hospital environment, as well as from our discussions with nurses25–36. A list of questions in the survey is
presented in Table 3.
Category                  Stress inducer
COVID                     COVID related [CR]
                          Treating a COVID patient [TCP]
Medical                   Patient in crisis [PiC]
Interaction related       Patient or patient’s family [PoPF]
                          Doctors or colleagues [DoC]
                          Administration, lab, pharmacy, radiology, or other ancillary services [Ad]
Office-related stress     Increased Workload [IWL]
                          Technology related stress [TR]
                          Lack of supplies [LoS]
                          Documentation [Doc]
                          Competency related stress [CRS]
Environment and safety    Safety (physical or physiological threats) [Saf]
                          Work Environment - Physical or others: work processes or procedures [WE]
Table 3. Survey questions and their categories.
Data Records
The complete dataset is available on Dryad37. The dataset contains 15 folders of physiological signals extracted
from Empatica E4 wristbands, one for each participant. The corresponding stress survey responses from the 15 nurses
who wore the wristband are provided in the SurveyResults.xlsx file. Both of these files are linked by the nurse identifier
and the date-time field.
Preprocessing. e four signals: Heart Rate, skin temperature, EDA, and BVP, have dierent sampling rates.
e frequency of these signals ranges from 1 Hz for the Heart Rate to 64Hz for the BVP. e frequency of EDA
and skin temperature range from 4 Hz and 10 Hz each. For our stress detection, we use a frequency of 4 Hz to
minimize information loss; this provided higher accuracy with a reasonable computation time on the pre-trained
data that was used to predict the stress level. We evaluated various frequencies to detect stressful events using
physiological stress. Our analysis shows that the computation time increases as the sampling frequency of the
signals increase. In addition, certain signals like Heart Rate with lower sampling frequency are ignored due to low
variability in the signal. Given the dierences in the frequency of the signals from E4, we need to select a single
frequency rate to process the signals. We evaluated dierent scenarios with various window sizes and frequencies
for stress detection using the AectiveRoad dataset. We observed that as the window size increases, the accuracy
of stress detection decreases. Similarly, the accuracy is higher for higher frequencies. However, higher frequencies
also result in an increase in computation time. erefore a frequency of 4Hz is selected to maintain a balance
between higher accuracy and reasonably near real-time stress detection. e data points for the three signals
are interpolated to 4 Hz linearly. e code associated with the signal interpolation, cleaning, and pre-processing
is provided along with the code repository. In addition, the raw signals with their original frequencies from
Empatica E4 are also provided.
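The linear interpolation step can be illustrated with a short sketch. It is not the code from the repository: `resample_linear` is a hypothetical helper, and it assumes evenly spaced samples whose first sample sits at time zero.

```python
def resample_linear(samples, src_hz, dst_hz=4.0):
    """Linearly interpolate evenly spaced samples onto a dst_hz time grid.

    Assumes sample k was taken at time k / src_hz (hypothetical helper).
    """
    if len(samples) < 2:
        return list(samples)
    duration = (len(samples) - 1) / src_hz        # time of the last sample
    n_out = int(duration * dst_hz + 1e-9) + 1     # epsilon guards float rounding
    out = []
    for k in range(n_out):
        pos = (k / dst_hz) * src_hz               # fractional index in the source
        i = min(int(pos), len(samples) - 2)       # left neighbour, kept in range
        frac = pos - i
        out.append(samples[i] + frac * (samples[i + 1] - samples[i]))
    return out


# Upsampling a 1 Hz signal (such as Heart Rate) to 4 Hz.
print(resample_linear([60.0, 64.0, 68.0], src_hz=1.0))

# Downsampling a 64 Hz signal (such as BVP) to 4 Hz keeps every 16th point.
print(resample_linear([float(v) for v in range(65)], src_hz=64.0))
```

The same function covers both directions, which matters here because the target rate of 4 Hz lies between the slowest and fastest E4 streams.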
Filename   Columns      Measure                Description                                                             Unit
ACC.csv    Column I     Accelerometer x-axis   Acceleration of the device along the x-axis                            m/s2
           Column II    Accelerometer y-axis   Acceleration of the device along the y-axis                            m/s2
           Column III   Accelerometer z-axis   Acceleration of the device along the z-axis                            m/s2
BVP.csv    Column I     BVP                    The volume of blood that passes through the tissues in the wrist and   N/A
                                               is used to measure IBI and Heart Rate
HR.csv     Column I     Heart Rate             A derived metric that measures the number of beats per minute based    bpm
                                               on Blood Volume Pulse
EDA.csv    Column I     EDA                    Measurement of the skin conductivity levels                            μS
IBI.csv    Column I     Time                   Time interval                                                           Second
           Column II    IBI                    Beat-to-beat interval                                                   Second
TEMP.csv   Column I     Skin Temperature       The external temperature of the skin                                    Celsius
Table 4. Empatica E4 Signal Description.
Feature            Entropy   Information Gain
EDA
EDA_Mean           0.126     0.102
EDA_Min            0.112     0.219
EDA_Max            0.122     0.218
EDA_Std            0.053     0.018
EDA_Kurtosis       0.014     0
EDA_Skew           0.017     0.002
EDA_Num_Peaks      0.003     0
EDA_Amphitude      0.016     0.035
EDA_Duration       0.009     0.015
Heart Rate
HR_Mean            0.041     0.005
HR_Min             0.042     0.009
HR_Max             0.043     0.01
HR_Std             0.019     0.002
HR_RMS             0.023     0.002
Skin Temperature
temp_Mean          0.111     0.056
temp_Min           0.096     0.036
temp_Max           0.105     0.04
temp_Std           0.041     0.012
Table 5. Entropy and Information gain of different features.
Data description. e following is a description of various directories and les in the dataset. e zip le holds the data of 15 participants in dierent folders. Each folder contains raw
data signals in CSV format in a sub-folder. A raw data folder consists of 6 dierent CSV les, including (1) EDA.
csv (EDA), (2) HR.csv (Heart Rate), (3) TEMP.csv (skin temperature), (4) IBI.csv (IBI), (5) BVP.csv (BVP), and
(6) ACC.csv (accelerometer data). Each biometric signal data has the following information:
• Start time (epoch): e DateTime oating point value that contains the time that signal was generated using
the internal clock of the wristband. e DateTime is stored in the rst row of every data column.
• Frequency: e second cell of each column shows the data collection frequency (32)
Table4 shows the csv les, column descriptions and units.
Each le name is identical to the participants ID in both data and survey les. All of the signals were syn-
chronized to bring them to a common frequency. e accelerometer data is not used in the stress detection
model. Some of the basic physical activities can be estimated from the accelerometer sensor, which could be
further used to potentially include the activity context in stress detection38.
SurveyResults.xlsx: e Excel le holds all participant survey results and their annotated stress level in 11
Excel sheets (a sheet for each participant). Sheet names are the participant’s IDs. However, the IDs are generated
in an ID column for all les for more convenience. e following are the excel sheet columns:
Group 1: General information of the stress event.
• Column A: ID: Anonymized ID of the user.
• Column B: Start time: Event start time.
• Column C: End time: Event end time.
• Column D: Duration: Duration of the event.
• Column E: Date: Date of data collection.
Group 2: Stress Level.
• Column F: Stress level: Reported stress level by the nurse.
Group 3: Nurses’ responses regarding the nature of the stress.
• Column G: COVID Related
• Column H: Treating a COVID patient
• Column I: Patient in Crisis
• Column J: Patient or patient’s family
• Column K: Doctors or colleagues
• Column L: Administration, lab, pharmacy, radiology, or other ancillary services
• Column M: Increased Workload
• Column N: Technology related stress
• Column O: Lack of supplies
• Column P: Documentation
• Column Q: Safety (physical or physiological threats)
• Column R: Lack of supplies
• Column S: Work Environment - Physical or others: work processes or procedures
• Column T: Description
We consulted with the nurses ahead of the study to ensure that the stress labels were meaningful to them. The
description field was used in place of “None of the Above.” If the nurses agreed that there was stress and it was
not covered by any stress label, the field was used to describe the stress event. It should be noted that it was also
used to elaborate on the stress event even when one of the stress classes was selected.
Technical Validation
Methods for evaluating psychological stress detection include self-report questionnaires and interviews.
While stress surveys are considered sufficiently reliable and widely adopted39, they offer insights mainly on the
moments of their administration. The responses are coarse designations of stress and are unable to detect subtler
shifts in stress over time40. In this paper, we used near-real-time stress evaluation surveys at the end of the shift
in conjunction with our biometric stress detection to minimize the biases of recall.
Label description. e dataset provides more than 1,250 hours of accelerometer and physiological sig-
nals collected from 15 nurses during their daily routine responsibilities. 83 hours of data are labeled with stress
descriptors based on the validated stressful events by nurses. We included the unlabeled signals since we expect
the unlabeled signals to have predictive value in anticipating future stress events. Table2 shows the stress detec-
tion results of participants based on their answers. Fig.3 shows the distribution of stress events in the nurses.
Table2 shows the data durations we collected from each participant.
While the nurses were instructed to use the E4 buttons to indicate the situations where they were stressed,
none of the nurses actually used them. The false positives are identified based on the surveys of stress events
where the nurses suggested that they were not stressed during the stress event detected by the stress detection
model. e nurses were also provided the opportunity to provide additional times when they were stressed but
were not detected by the model. But no additional data points were provided by the nurses beyond what was
detected by the stress detection model.
The “na” events are labelled using the stress detection model but are not validated by the subject, unlike the
stress events, which are validated. We requested the subjects to add undetected stress events at the end of the
survey. This would validate the No Stress epochs, but we did not receive responses. So, only the labels 0, 1, and 2,
corresponding to no-stress, medium-stress, and high-stress, respectively, are validated.
Physiological signals versus reported stress events. The box plots (Figs. 5–8) show four individual
physiological signals recorded by the wearable device, namely the Heart Rate, EDA, skin temperature, and BVP, together with
the reported stress levels from the survey. Pekka et al.41 have shown that stress detection models depend on
personalized signals and features, whose importance in the machine learning models has to be computed at the user
level. Given the same signals, feature selection from within the signal streams at the user level improves
performance, as demonstrated by Pekka’s 5-fold cross-validations. Figs. 4, 5, 6 illustrate that biometric signals like
EDA, Heart Rate, and skin temperature vary between subjects.
The relationship between these signals and stress is not necessarily linear. In addition, there is also an
interplay between these signals. Machine learning techniques can model stress behavior with respect to each of these
features derived from the signals. In Table 5, we provide the entropy and information gain for the random forest
Fig. 3 Distribution of stress levels for each subject.
Fig. 4 Overall skin temperatures of participants.
model to identify the stress level of the subjects. The entropy is higher for EDA-based signals compared to skin
temperature and Heart Rate. The EDA-based features (mean, min, and max) for each of the subjects within a
given window are better predictors of an individual under stress. While earlier studies indicated that Heart Rate
and Heart Rate variability are good features to use for stress detection42,43, this is not evident in our analysis. This
could be due to various reasons. Earlier studies relied on simulated data in laboratory settings where the subjects
were in stationary/sedentary positions. However, in our real-world analysis, there is a lot of physical activity
performed by nurses, leading to their Heart Rates being elevated.
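For readers unfamiliar with the two quantities reported in Table 5, the textbook definitions of Shannon entropy and of the information gain of a single threshold split are sketched below. This is an assumption-laden illustration: the paper does not spell out how its values were computed, so the function names, the toy data, and the single-threshold split are hypothetical and may differ from the authors' procedure.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the samples at value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    # Weighted entropy of the non-empty partitions after the split.
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


# Toy data: a feature that separates stress (1) from no stress (0) perfectly.
eda_mean = [0.1, 0.2, 0.8, 0.9]
stress = [0, 0, 1, 1]
print(entropy(stress))
print(information_gain(eda_mean, stress, threshold=0.5))
```

A feature with high information gain splits the windows into groups that are each dominated by one stress level, which is the sense in which the EDA features are described as better predictors.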
The skin temperature of a healthy individual is about 33°C or 91°F. Skin temperature varies during various
activities due to skin blood temperature and its flow and is normally within the range of 33.5°C to 36.9°C44.
However, this can vary quite widely based on the type and length of activity and room temperature. Given the
open-ended nature of the experiment, there are some anomalies in the data. The subjects were twice as likely
to have a higher skin temperature with high stress than during no stress. The general trend agrees with reports
from prior literature. However, the trend was not statistically significant in our dataset. We analyzed the data
with a paired sample t-test. With a test statistic of 1.3 and a p-value of 0.22, we cannot reject the null
hypothesis that there is no difference in temperature with respect to stress45. The relationship between skin temperature
and stress has been discussed by several authors45,46, who observed skin warming in stressful events.
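For readers who want to reproduce this kind of comparison, the test statistic of a paired sample t-test can be sketched as follows. The per-subject temperatures are hypothetical placeholders rather than dataset values, and the function name is an assumption. The p-value would then be read from Student's t distribution with the returned degrees of freedom; a statistics library routine such as `scipy.stats.ttest_rel` performs the complete test.

```python
import math
import statistics

def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a paired-sample t-test of a vs. b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical per-subject mean skin temperature (deg C); not real data.
temp_stress = [33.9, 34.2, 33.1, 34.8, 33.5]
temp_no_stress = [33.6, 34.0, 33.3, 34.1, 33.4]
t_stat, dof = paired_t_statistic(temp_stress, temp_no_stress)
```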
The Heart Rate of a healthy individual, irrespective of gender, ranges from 60 to 100 beats per minute when in a resting state.
However, the Heart Rate varies greatly with activity. Given that the subjects are typically performing different
activities, one can observe high variations in Heart Rate. Fig. 5 shows the distribution of Heart Rate and
associated stress level for all the subjects. Based on the visual inspection of the plot, stress does not appear to have
a strong correlation with the Heart Rate. An elevated Heart Rate is generally associated with high-stress situations, but high
Heart Rate should not imply high stress since it is more commonly influenced by non-stressful physical activity.
Fig. 5 Overall HR of participants.
Fig. 6 Overall EDA of participants.
Heart rate is best interpreted with accelerometer data, which can signify physical activity. We do not currently
provide activity recognition labels and models from wrist-based accelerometer data.
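One simple way to contextualize Heart Rate with movement is a per-window movement-intensity proxy computed from the three-axis accelerometer samples. The sketch below uses the mean absolute deviation of the acceleration magnitude. It assumes samples expressed in units of g and an arbitrary window length; it is a hypothetical illustration, not an activity recognition model and not part of the released code.

```python
import math

def movement_intensity(samples, window):
    """Mean absolute deviation of acceleration magnitude per window.

    samples: list of (x, y, z) tuples; window: samples per window (assumed).
    """
    magnitudes = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    intensities = []
    for start in range(0, len(magnitudes) - window + 1, window):
        chunk = magnitudes[start:start + window]
        mean = sum(chunk) / window
        intensities.append(sum(abs(m - mean) for m in chunk) / window)
    return intensities

# Made-up samples: a still wrist (gravity only), then a moving wrist.
still = [(0.0, 0.0, 1.0)] * 4
moving = [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8), (0.0, 0.0, 2.0), (0.0, 0.0, 0.0)]
intensity = movement_intensity(still + moving, window=4)
```

Windows with a high proxy value could be treated as physically active, so that an elevated Heart Rate in those windows is not read as stress on its own.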
Figure6 shows the distribution of EDA and associated stress levels for all the subjects. Based on visual
inspection of the plot, stress has a positive correlation with the EDA. e average EDA is higher in stressful sit-
uations for some participants. However, for some participants, the EDA is not a good indicator of stress because
EDA does not vary or it is not positively correlated. ere is high variability in EDA signals for various subjects
in stressful events. e normal range for humans is from 1 to 20 μS. We observe that the average skin EDA for
all the participants when there is no stress reported is below 5, and the range for medium stress is the same as
stress-free situations. e EDA contains two main components, skin conductance response (SCR) and skin con-
ductance level (SCL). e skin conductance response is a phasic response to external stimuli, whereas SCL is a
gradual change in skin conductance over time. In our analysis, the SCL plays a major role, as the stress detection
is performed in windows and compared to a normal baseline. For example, the EDA was observed to reach a
maximum of 60 μS compared to 10 μS during non-stress time-periods on average.
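The idea of windowed EDA features compared against a baseline can be sketched as follows. This is a simplified illustration, not the authors' detection algorithm: the window length, the baseline mean, and the ratio threshold of 2.0 are arbitrary assumptions, and the samples are made up.

```python
def window_features(eda, window):
    """Mean, min, and max of EDA for each non-overlapping window."""
    features = []
    for start in range(0, len(eda) - window + 1, window):
        chunk = eda[start:start + window]
        features.append({
            "mean": sum(chunk) / window,
            "min": min(chunk),
            "max": max(chunk),
        })
    return features

def elevated_windows(features, baseline_mean, ratio=2.0):
    """Indices of windows whose mean exceeds `ratio` times the baseline.

    The ratio of 2.0 is an arbitrary illustrative threshold.
    """
    return [i for i, f in enumerate(features)
            if f["mean"] > ratio * baseline_mean]

# Made-up EDA samples in microsiemens; not taken from the dataset.
eda = [2.0, 3.0, 4.0, 3.0, 9.0, 12.0, 15.0, 12.0]
feats = window_features(eda, window=4)
flagged = elevated_windows(feats, baseline_mean=3.0)
```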
The Heart Rate and Heart Rate variability signals can be derived from the BVP signal by computing the
inverse of the time between two successive peaks. Fig. 7 shows the overall distribution of the BVP signal of
different participants. We analyzed the BVP signals using a paired sample t-test, which gave a test statistic of 2.3 and a
p-value of 0.0501. We cannot reject the null hypothesis that there is no difference in BVP with respect to stress.
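The derivation of Heart Rate from successive BVP peaks can be written out in a few lines. The sketch assumes the peak timestamps (in seconds) have already been detected by some other step, which is not shown; the timestamps and the function name are illustrative only.

```python
def heart_rate_from_peaks(peak_times_s):
    """Instantaneous heart rate (beats per minute) from successive peak times.

    Each rate is the inverse of the inter-beat interval, scaled to minutes.
    """
    rates = []
    for earlier, later in zip(peak_times_s, peak_times_s[1:]):
        interval = later - earlier
        if interval <= 0:
            raise ValueError("peak times must be strictly increasing")
        rates.append(60.0 / interval)
    return rates

# Made-up peak timestamps in seconds; peak detection itself is not shown.
peaks = [0.0, 1.0, 1.75, 2.5]
bpm = heart_rate_from_peaks(peaks)
```

The sequence of inter-beat intervals used here is also the starting point for Heart Rate variability measures.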
Figure8 shows the distribution of stress intensity within each of the contributors to stress, as well as the
cumulative number of hours under each stress level per contributor. Table3 provides additional details about
the factors. e survey results indicate that treating a COVID-19 patient was the most signicant contributor
to detected high-stress events. Not all COVID-related stress classes ranked high, however. ere was no event
where the nurses are worried about contracting COVID. In addition to COVID, the nurses indicate that they
were impacted by the lack of supplies; however, it was not mentioned as a contributor of acute stress for any of
our detected events. is could mean that while COVID-19 and supplies could have been signicant contribu-
tors to stress in the background, they themselves did not cause acute stress events detectable by our apparatus.
ese results are also veried by other researchers studying the impact of COVID on nurses47. is should
be seen as a limitation of biometric stress detection; it cannot detect general concerns, and in a holistic stress
evaluation, biometric data collection must always be paired with qualitative interviews. It is a complementary
modality, rather than a replacement, to traditional stress monitoring.
Usage Notes
Potential applications. Human well-being is an important consideration both for individuals and
organizations. As such, organizations need mechanisms to carefully monitor and manage high-stress environments such
as hospitals in order to improve both employee well-being and patient satisfaction. We believe this study can
be useful for researchers from many domains. First, researchers in signal processing and machine learning might
be able to use the dataset to develop new machine learning models that improve stress detection performance.
Second, the accelerometer signals can be used to contextualize stress in order to understand the relationship
between movement patterns and stress.
The data collected here can be useful for nursing, machine learning, and hospital management communities.
The physiological signals, along with the activity data, can be used to better contextualize signal streams that
are impacted by the intensity of physical activity, such as Heart Rate. The interpretation context of our dataset is
significantly different given that our subjects were engaged in real-world tasks that have varied physical
components to them. Thus, further analyses can benefit from augmentation with inferred metrics of mobility.
[Figure labels: axes "Stress Contributor" and "Duration (Hours)"; legend: No Stress, Medium Stress, High Stress]
Fig. 7 Overall BVP of participants.
Finally, researchers in human resources / human factors / organizational psychology would find the survey
dataset, along with the biometric signals, useful because it is a unique dataset that makes the association between
biometric signals and stress-related factors during the COVID-19 outbreak.
Additionally, the dataset illustrates the relative frequency of various work-related stressors during the
Limitations. We conceptualized the study before the pandemic. The original design had an onsite observer
making independent assessments of tasks and apparent stress behaviors alongside the system stress assessments.
The outbreak prevented us from placing an investigator onsite but provided a dataset under critically unique
clinical circumstances. The nurses were busier than usual, and we had to ensure that we were not interrupting
them too often.
Not all of the dataset is covered by stress labels. Unlabelled data does not necessarily imply a lack of stress;
it just means that we did not detect stress using our signal streams and that the subjects did not independently
report it as a stressful period. We did not insist on complete coverage of labels, as the most important priority of
nurses was taking care of the patients.
We also provide the unlabelled data because we suspect that it may contain predictive markers of stress that
future analyses may reveal.
Our study is distinguished by the following features and limitations that must be taken into account while
interpreting the data. These are simply inherent to the naturalistic setting and due to the mitigating effects of the
pandemic scenario.
• Compared to the laboratory scenarios where the undivided attention of the subject is available, in
high-impact real-world scenarios, the subjects may not be distracted or interrupted frequently from their professional
tasks for labeling.
• We only validated stressful events because of our focus on acute stress detection. Because nursing was
stressful during the peaks of the COVID-19 pandemic, we provided no more than 6 events a day for the nurses.
When there were more than 6 stress events, we prioritized stress events of longer duration since we expected
the subjects to recall these events better at the end of the day. We also chose to spread the events across the
nurses’ work shifts since this is more likely to reduce recall conflicts.
• In prior literature, researchers have claimed that chronic stress can be mitigated by early detection of acute
stress40,48. The literature does not, however, validate this claim. There is not sufficient research on chronic
stress detection using biometric signals49. This dataset is also focused on acute stress detection and is unable
to detect chronic stress.
• Social distancing requirements upended a prior experimental design that included onsite investigator data
collection describing nurse tasks in conjunction with stress signals.
• While we had methods for immediate stress detection, we opted to delay the surveys to the end of the day in
order to not interrupt the work of the nurses. While the latency may have produced some degree of recall bias,
it is still an improvement over traditional surveys, which do not specify the precise time of stress events and
are conducted at monthly or quarterly intervals24.
• The nurses agreed with the stress detection algorithm far more than they disagreed. Given the stress of the
pandemic, and because the stress reports of nurses were not corroborated by onsite investigator observations
of stressful behavior, the data can only be interpreted as subject reports of stress.
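The daily cap on validated events described in the list above can be sketched as a small selection routine. This is only an illustration of the stated priority (longer events first, at most 6 per day); how events were actually spread across a shift is not specified here and is not modeled, and the event tuples and function name are hypothetical.

```python
def select_events(events, max_events=6):
    """Keep at most `max_events` events, preferring longer durations.

    events: list of (start_minute, duration_minutes) tuples for one day.
    Returns the kept events in chronological order.
    """
    longest_first = sorted(events, key=lambda e: e[1], reverse=True)
    return sorted(longest_first[:max_events])

# Made-up detected events for one nurse on one day.
day = [(30, 5), (90, 20), (150, 12), (200, 3),
       (260, 45), (330, 8), (400, 15), (470, 10)]
kept = select_events(day)
```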
Finally, unlike laboratory studies that are typically conducted in a controlled environment, stress detection in
a natural environment is more complex due to the influence of many social, cultural, and psychological factors.
Fig. 8 Distribution of stress levels within detected events across stress contributors.
While we have attempted to provide some context to stress by conducting a survey based on the available
literature, there are several social, cultural, and individual variables we did not consider in our survey. Given the
stress of the pandemic, and because the stress reports of nurses were not corroborated by onsite investigator
observations of stressful behavior, the data should specifically be interpreted as subject reports of stress, which
may be mitigated by other factors such as a desire to quickly complete the survey questions after a stressful day.
Code availability
The code is available on GitHub20. The data collection was performed in Central Standard Time of the United
States, which is 6 hours behind GMT.
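When aligning signal timestamps with survey times, the stated 6-hour offset can be applied as in the sketch below. It assumes the timestamps are Unix seconds referenced to UTC, which is an assumption about the file format rather than something stated here, and it uses a fixed offset that ignores daylight saving time. The example timestamp is made up.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset stated in the text: Central Standard Time is 6 hours behind GMT.
CST = timezone(timedelta(hours=-6), name="CST")

def to_local(unix_seconds):
    """Convert a Unix timestamp (assumed UTC-based) to a CST datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc).astimezone(CST)

# Made-up timestamp corresponding to 2020-12-01 18:30:00 UTC.
local = to_local(1606847400)
```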
Received: 6 October 2021; Accepted: 28 April 2022;
Published: xx xx xxxx
References
1. Peternel, K., Pogačnik, M., Tavčar, R. & Kos, A. A presence-based context-aware chronic stress recognition system. Sensors 12,
15888–15906 (2012).
2. Bickford, M. Stress in the workplace: A general overview of the causes, the effects, and the solutions. Canadian Mental Health
Association Newfoundland and Labrador Division 8, 1–3 (2005).
3. Wellen, K. E. et al. Inflammation, stress, and diabetes. The Journal of clinical investigation 115, 1111–1119 (2005).
4. Greenglass, E. R., Burke, R. J. & Fiksenbaum, L. Workload and burnout in nurses. Journal of community & applied social psychology
11, 211–215 (2001).
5. Haouij, N. E., Poggi, J.-M., Sevestre-Ghalila, S., Ghozi, R. & Jaïdane, M. Affectiveroad system and database to assess driver’s
attention. In Proceedings of the 33rd Annual ACM Symposium on Applied Computing, 800–803 (2018).
6. Schmidt, P., Reiss, A., Duerichen, R., Marberger, C. & Van Laerhoven, K. Introducing wesad, a multimodal dataset for wearable
stress and affect detection. In Proceedings of the 20th ACM international conference on multimodal interaction, 400–408 (2018).
7. Kirschbaum, C., Pirke, K.-M. & Hellhammer, D. H. The ‘trier social stress test’–a tool for investigating psychobiological stress
responses in a laboratory setting. Neuropsychobiology 28, 76–81 (1993).
8. Sriramprakash, S., Prasanna, V. D. & Murthy, O. R. Stress detection in working people. Procedia computer science 115, 359–366
9. Chen, W., Zheng, S. & Sun, X. Introducing mdpsd, a multimodal dataset for psychological stress detection. In Big Data: 8th CCF
Conference, BigData 2020, Chongqing, China, October 22–24, 2020, Revised Selected Papers, vol. 1320, 59 (Springer Nature, 2021).
10. Birkett, M. A. The trier social stress test protocol for inducing psychological stress. JoVE (Journal of Visualized Experiments) e3238
11. Scarpina, F. & Tagini, S. The stroop color and word test. Frontiers in psychology 8, 557 (2017).
12. Mundnich, K. et al. Tiles-2018, a longitudinal physiologic and behavioral data set of hospital workers. Scientific Data 7, 1–26 (2020).
13. Martinez, G. J. et al. Improved sleep detection through the fusion of phone agent and wearable data streams. In 2020 IEEE
International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), 1–6 (IEEE, 2020).
14. Martinez, G. J. et al. On the quality of real-world wearable data in a longitudinal study of information workers. In 2020 IEEE
International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), 1–6 (IEEE, 2020).
15. DATA, I. P. & EDUCATION, I. Curriculum vitae–aaron d. striegel. Ethics 16 (2004).
16. Mirjafari, S. et al. Differentiating higher and lower job performers in the workplace using mobile sensing. Proceedings of the ACM on
Interactive, Mobile, Wearable and Ubiquitous Technologies 3, 1–24 (2019).
17. Saha, K. et al. Imputing missing social media data stream in multisensor studies of human behavior. In 2019 8th International
Conference on Affective Computing and Intelligent Interaction (ACII), 178–184 (IEEE, 2019).
18. Saha, K. et al. Social media as a passive sensor in longitudinal studies of human behavior and wellbeing. In Extended Abstracts of the
2019 CHI Conference on Human Factors in Computing Systems, 1–8 (2019).
19. Mattingly, S. M. et al. The tesserae project: Large-scale, longitudinal, in situ, multimodal sensing of information workers. In Extended
Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 1–8 (2019).
20. avi, M. S. Stress-detection-in-nurse. (2021).
21. Truong, C., Oudre, L. & Vayatis, N. ruptures: change point detection in python. arXiv preprint arXiv:1801.00826 (2018).
22. Truong, C., Oudre, L. & Vayatis, N. Selective review of offline change point detection methods. Signal Processing 167, 107299 (2020).
23. Tarrant, M. A., Manfredo, M. J., Bayley, P. B. & Hess, R. Effects of recall bias and nonresponse bias on self-report estimates of angling
participation. North American Journal of Fisheries Management 13, 217–222 (1993).
24. Sveinsdottir, H., Biering, P. & Ramel, A. Occupational stress, job satisfaction, and working environment among icelandic nurses: a
cross-sectional questionnaire survey. International journal of nursing studies 43, 875–889 (2006).
25. Adriaenssens, J., De Gucht, V. & Maes, S. Causes and consequences of occupational stress in emergency nurses, a longitudinal study.
Journal of nursing management 23, 346–358 (2015).
26. Brown, S., Whichello, R. & Price, S. The impact of resiliency on nurse burnout: An integrative literature review. Medsurg Nursing 27,
349 (2018).
27. Jovanov, E., Frith, K., Anderson, F., Milosevic, M. & Shrove, M. T. Real-time monitoring of occupational stress of nurses. In 2011
Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 3640–3643 (IEEE, 2011).
28. Gelsema, T. I., Van Der Doef, M., Maes, S., Akerboom, S. & Verhoeven, C. Job stress in the nursing profession: The influence of
organizational and environmental conditions and job characteristics. International Journal of Stress Management 12, 222 (2005).
29. Hersch, R. K. et al. Reducing nurses’ stress: A randomized controlled trial of a web-based stress management program for nurses.
Applied nursing research 32, 18–25 (2016).
30. Khamisa, N., Oldenburg, B., Peltzer, K. & Ilic, D. Work related stress, burnout, job satisfaction and general health of nurses.
International journal of environmental research and public health 12, 652–666 (2015).
31. uri, . Stress management among nurses: Literature review of causes and coping strategies (2018).
32. Kurnat-Thoma, E., Ganger, M., Peterson, K. & Channell, L. Reducing annual hospital and registered nurse staff turnover–a
10-element onboarding program intervention. SAGE Open Nursing 3, 2377960817697712 (2017).
33. Lo, W.-Y., Chien, L.-Y., Hwang, F.-M., Huang, N. & Chiou, S.-T. From job stress to intention to leave among hospital nurses: A
structural equation modelling approach. Journal of advanced nursing 74, 677–688 (2018).
34. Lu, H., Zhao, Y. & While, A. Job satisfaction among hospital nurses: A literature review. International journal of nursing studies 94,
21–31 (2019).
35. Gjoreski, M., Gjoreski, H., Luštrek, M. & Gams, M. Continuous stress detection using a wrist device: in laboratory and real life. In
proceedings of the 2016 ACM international joint conference on pervasive and ubiquitous computing: Adjunct, 1185–1193 (2016).
36. Lopez-Martinez, D., El-Haouij, N. & Picard, R. Detection of real-world driving-induced affective state using physiological signals
and multi-view multi-task machine learning. In 2019 8th International Conference on Affective Computing and Intelligent Interaction
Workshops and Demos (ACIIW), 356–361 (IEEE, 2019).
37. Hosseini, S. et al. A multi-modal sensor dataset for continuous stress detection of nurses in a hospital. Dryad
dryad.5hqbzh6f (2021).
38. Foerster, F., Smeja, M. & Fahrenberg, J. Detection of posture and motion by accelerometry: a validation study in ambulatory
monitoring. Computers in human behavior 15, 571–583 (1999).
39. Tsutsumi, A., Inoue, A. & Eguchi, H. How accurately does the brief job stress questionnaire identify workers with or without
potential psychological distress? Journal of occupational health 17–0011 (2017).
40. Alberdi, A., Aztiria, A. & Basarab, A. Towards an automatic early stress recognition system for office environments based on
multimodal measurements: A review. Journal of biomedical informatics 59, 49–75 (2016).
41. Siirtola, P. & Röning, J. Comparison of regression and classification models for user-independent and personal stress detection.
Sensors 20, 4402 (2020).
42. Melillo, P., Bracale, M. & Pecchia, L. Nonlinear heart rate variability features for real-life stress detection. case study: students under
stress due to university examination. Biomedical engineering online 10, 1–13 (2011).
43. Boonnithi, S. & Phongsuphap, S. Comparison of heart rate variability measures for mental stress detection. In 2011 Computing in
Cardiology, 85–88 (IEEE, 2011).
44. Tanda, G. The use of infrared thermography to detect the skin temperature response to physical activity. In Journal of Physics:
Conference Series, vol. 655, 012062 (IOP Publishing, 2015).
45. Karthikeyan, P., Murugappan, M. & Yaacob, S. Descriptive analysis of skin temperature variability of sympathetic nervous system
activity in stress. Journal of Physical Therapy Science 24, 1341–1344 (2012).
46. Baker, L. M. & Taylor, W. M. The relationship under stress between changes in skin temperature, electrical skin resistance, and pulse
rate. Journal of experimental psychology 48, 361 (1954).
47. Morgantini, L. A. et al. Factors contributing to healthcare professional burnout during the covid-19 pandemic: a rapid turnaround
global survey. PloS one 15, e0238217 (2020).
48. Giannakakis, G. et al. Review on psychological stress detection using biosignals. IEEE Transactions on Affective Computing (2019).
49. Baumgartl, H., Fezer, E. & Buettner, R. Two-level classification of chronic stress using machine learning on resting-state eeg
recordings (2020).
Acknowledgements
This project is funded by NSF Grants 1650551, CNS-1429526, and by the Louisiana Board of Regents Support
Fund contract LEQSF (2019–20)-ENH-DE-22. The authors would also like to acknowledge the reviewers for their
constructive inputs that helped improve the clarity of the manuscript.
Author contributions
Conceptualization: R.G., Z.A., K.C., and C.B.; Methodology: S.K., S.H., and R.B.; Software: S.K., S.H., and R.B.;
Validation: S.K., S.H., R.B., C.B., K.C., and R.G.; Formal analysis: S.H., S.K., and R.B.; Investigation: R.G., C.B.,
C.K., and Z.A.; Resources: R.G. and C.B.; Data curation: R.B. and S.K.; Writing – original draft preparation: S.H.
and R.B.; Writing – review and editing: R.G., Z.A., C.B., and K.C.; Visualization: R.B., S.K., and S.H.; Supervision:
R.G., Z.A., C.B., and K.C.; Project administration: R.G.; Funding acquisition: R.G., C.B., Z.A., and R.B. All authors
reviewed the manuscript.
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to R.G.
Reprints and permissions information is available at
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons license, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons license and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the
copyright holder. To view a copy of this license, visit
© The Author(s) 2022
Content courtesy of Springer Nature, terms of use apply. Rights reserved
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
... Despite limitations with battery time and number of available sensors in certain models, compared to controlled laboratory measurement devices, wearables are non-intrusive and easier to use. This ease has facilitated many experiments using wearables [4][5][6][7][8][9][10][11][12][13][14][15], and predominantly utilizing Empatica's latest E4 device [16], which have yielded a number of well-studied public datasets [6,[17][18][19][20][21][22][23]. Table 1 provides a summary of the datasets which were considered in this study. ...
... In supervised learning, models are trained using data that is accurately labeled for the response you are predicting for; in the context of this paper, the labeling would be for elevated levels of stress with labels as binary yes/no indicators or a numeric scale to indicate stress level, generally a range between 0 (no perceived stress) and 1 (maximum perceived stress). Table 1, the datasets included in this study were labeled using one of two methods: (i) periodic [6,17,18,20,22], where specific time frames during the experiment were either labeled as stressed or non-stressed, while the test subject was placed under that perceived condition (a stressful test or action, or non-stressed, restful period), or (ii) scored as experiencing stress or no stress during a particular period, either by completing a self-scoring evaluation [21,23], or by an observer [19] who perceived a level of stress by observing the emotional reaction of the subject during that period. ...
... Based on the prior experiments and reviewed literature (Table 2), publicly available datasets including WESAD [6] and SWELL [17] were utilized in this study. Additionally, since these datasets all included Empatica E4 sensor biomarker data, the Toadstool [20], UBFC-Phys [21], Non-EEG Dataset for Assessment of Neurological Status (NEURO) [18], Wearable Exam Stress Dataset (EXAM) [22], AffectiveROAD [19] and Multimodal Sensor Dataset for Continuous Stress Detection of Nurses in a Hospital [23] public datasets, which also are collected using Empatica E4, were considered. Table 1 provides a summary of all datasets considered. ...
Full-text available
Introduction. We investigate the generalization ability of models built on datasets containing a small number of subjects, recorded in single study protocols. Next, we propose and evaluate methods combining these datasets into a single, large dataset. Finally, we propose and evaluate the use of ensemble techniques by combining gradient boosting with an artificial neural network to measure predictive power on new, unseen data. Methods. Sensor biomarker data from six public datasets were utilized in this study. To test model generalization, we developed a gradient boosting model trained on one dataset (SWELL), and tested its predictive power on two datasets previously used in other studies (WESAD, NEURO). Next, we merged four small datasets, i.e. (SWELL, NEURO, WESAD, UBFC-Phys), to provide a combined total of 99 subjects,. In addition, we utilized random sampling combined with another dataset (EXAM) to build a larger training dataset consisting of 200 synthesized subjects,. Finally, we developed an ensemble model that combines our gradient boosting model with an artificial neural network, and tested it on two additional, unseen publicly available stress datasets (WESAD and Toadstool). Results. Our method delivers a robust stress measurement system capable of achieving 85% predictive accuracy on new, unseen validation data, achieving a 25% performance improvement over single models trained on small datasets. Conclusion. Models trained on small, single study protocol datasets do not generalize well for use on new, unseen data and lack statistical power. Ma-chine learning models trained on a dataset containing a larger number of varied study subjects capture physiological variance better, resulting in more robust stress detection.
... We have used five different datasets to train and evaluate M3Sense: VerBIO [111], WESAD [91], K-EmoCon [71], MMSDN [32], and a new collected dataset, called MAYA. Among these, MMSDN and MAYA were used as unlabeled datasets for training only, the others were used for both training and evaluation. ...
... and 5.1.5). Previous works in the literature have shown significant correlations among these domains with stress and anxiety [1,32,47]. On the contrary, compared to stress and anxiety detection, emotion detection is a more complex and difficult task, as it involves a variety of emotions placed in the circumplex model [84]. Thus, these results show clinical relevance with the aforementioned works in the literature. ...
... Unlabeled Dataset -1: MMSDN dataset, presented in[32] is a multi-modal affective domain dataset which aimed for detecting the stressful situations and mental workload of overall 15 nurses in a hospital. The ...
Full-text available
Modern smartwatches or wrist wearables having multiple physiological sensing modalities have emerged as a subtle way to detect different mental health conditions, such as anxiety, emotions, and stress. However, affect detection models depending on wrist sensors data often provide poor performance due to inconsistent or inaccurate signals and scarcity of labeled data representing a condition. Although learning representations based on the physiological similarities of the affective tasks offer a possibility to solve this problem, existing approaches fail to effectively generate representations that will work across these multiple tasks. Moreover, the problem becomes more challenging due to the large domain gap among these affective applications and the discrepancies among the multiple sensing modalities. We present M3Sense, a multi-task, multimodal representation learning framework that effectively learns the affect-agnostic physiological representations from limited labeled data and uses a novel domain alignment technique to utilize the unlabeled data from the other affective tasks to accurately detect these mental health conditions using wrist sensors only. We apply M3Sense to 3 mental health applications, and quantify the achieved performance boost compared to the state-of-the-art using extensive evaluations and ablation studies on publicly available and collected datasets. Moreover, we extensively investigate what combination of tasks and modalities aids in developing a robust Multitask Learning model for affect recognition. Our analysis shows that incorporating emotion detection in the learning models degrades the performance of anxiety and stress detection, whereas stress detection helps to boost the emotion detection performance. Our results also show that M3Sense provides consistent performance across all affective tasks and available modalities and also improves the performance of representation learning models on unseen affective tasks by 5% - 60%.
... HR or pulse rate is a widely used measure in stress research as HR increases significantly during stressful events [28,29]. Similarly, core body temperature increases during periods of stress and anxiety; studies have validated wrist ST as an indicator of stress [30,31]. EDA is the skin conductance of an individual and is influenced by the surface sweat glands. ...
... Phasic EDA, the acute, time-varying spikes in skin conductance level, significantly increases during states of high emotional arousal and stress [32,33]. Though a few studies used physiological indices to measure occupational stress of ICU nurses with promising results [30,34], the studies were cross-sectional rather than comparative. ...
Intensive care nurses are highly prone to occupational stress and burnout, affecting their physical and mental health. The occurrence of the pandemic and related events increased nurses’ workload and exacerbated stress and burnout. We conducted a prospective longitudinal mixed-methods study with a cohort of nurses working in a medical ICU (COVID unit; n = 14) and cardiovascular ICU (non-COVID unit; n = 5). Each participant was followed for six 12-hour shifts. Validated questionnaires measured occupational stress and burnout prevalence. Wrist-worn wearable technologies recorded physiological indices of stress. Participants elaborated on the contributors to stress via post-study questionnaire. Data were analyzed using statistical and qualitative methods. Participants who cared for COVID patients at the COVID unit were 3.71 times more likely to experience stress (p < .001) in comparison to non-COVID unit participants. No differences in stress levels were found when the same participants worked with COVID and non-COVID patients at different shifts at the COVID unit. The cohorts expressed similar contributors to stress including communication tasks, patient acuity, clinical procedures, admission processes, proning, labs, and assisting coworkers. Nurses in COVID units, irrespective of whether they care for a COVID patient, may experience high occupational stress and burnout.
... The combination of this information resulted in the highest three-level (low, moderate and high) stress classification accuracy of 61% using logistic regression-based leave-one-out cross-validation. Hosseini et al. [22] created a multi-sensor dataset of nurses working in the hospital during the COVID-19 outbreak. They used Empatica E4 watches to collect information about the electrodermal activity, heart rate, and skin temperature of the subjects. ...
... In this study, an Empatica E4 watch (Figure 2, adopted from [35]) was used to measure individual physiological changes based on PPG, which was previously used in several similar studies [16,18,22,36-39]. The watch is a medical-grade device that is classified as a Class IIa Medical Device according to the 93/42/EEC Directive. ...
With the recent advancements in the field of wearable technologies, the opportunity to monitor stress continuously using different physiological variables has gained significant interest. The early detection of stress can help improve healthcare and minimize the negative impact of long-term stress. This paper reports outcomes of a pilot study and associated stress-monitoring dataset, named the “Stress-Predict Dataset”, created by collecting physiological signals from healthy subjects using wrist-worn watches with a photoplethysmogram (PPG) sensor. While wearing these watches, 35 healthy volunteers underwent a series of tasks (i.e., Stroop color test, Trier Social Stress Test and Hyperventilation Provocation Test), along with a rest period between each task. They also answered questionnaires designed to induce stress levels compatible with daily life. The changes in the blood volume pulse (BVP) and heart rate were recorded by the watch and were labelled as occurring during stress-inducing tasks or a rest period (no stress). Additionally, respiratory rate was estimated using the BVP signal. Statistical models and personalised adaptive reference ranges were used to determine the utility of the proposed stressors and the extracted variables (heart rate and respiratory rate). The analysis showed that the interview session was the most significant stress stimulus, causing a significant variation in heart rate of 27 (77%) participants and respiratory rate of 28 (80%) participants out of 35. The outcomes of this study contribute to understanding the role of stressors and their association with physiological response and provide a dataset to help develop new wearable solutions for more reliable, valid, and sensitive physiological stress monitoring.
... AffectiveROAD [31] included physiological and environmental sensors for driver attention assessment. In [32], biometric data were collected from nurses in a hospital during the COVID-19 outbreak, using a wristband while on duty. WESAD [33] is a multimodal dataset where subjects were exposed to stressful activities, even if it is limited by low sample size and lack of diversity. ...
The roles of emergency responders are challenging and often physically demanding, so it is essential that their duties are performed safely and effectively. In this article, we address real-time bio-signal sensor monitoring for responders in disaster scenarios. In particular, we propose the integration of a set of health monitoring sensors suitable for detecting stress, anxiety and physical fatigue in an Internet of Cooperative Agents architecture for search and rescue (SAR) missions (SAR-IoCA), which allows remote control and communication between human and robotic agents and the mission control center. With this purpose, we performed proof-of-concept experiments with a bio-signal sensor suite worn by firefighters in two high-fidelity SAR exercises. Moreover, we conducted a survey, distributed to end-users through the Fire Brigade consortium of the Provincial Council of Málaga, in order to analyze the firefighters’ opinion about biological signals monitoring while on duty. As a result of this methodology, we propose a wearable sensor suite design with the aim of providing some easy-to-wear integrated-sensor garments, which are suitable for emergency worker activity. The article offers discussion of user acceptance, performance results and learned lessons.
... The reported accuracy from their work is 70%. Other datasets available for stress detection include the one in [38]; it is a real-world biometric dataset collected from nurses working in a hospital at the time of COVID-19. The physiological variables measured in this dataset include EDA, heart rate, GSR, and accelerometer reading. ...
The early diagnosis of stress symptoms is essential for preventing various mental disorders such as depression. Electroencephalography (EEG) signals are frequently employed in stress detection research and are an inexpensive and noninvasive modality. This paper proposes a stress classification system that uses EEG signals. EEG signals from thirty-five volunteers were acquired using a commercially available 4-electrode Muse EEG headband and analysed. Four movie clips were chosen as stress elicitation material. Two clips were selected to induce stress because they contain emotionally inductive scenes. The other two clips were chosen not to induce stress because they contain many comedy scenes. The recorded signals were then used to build the stress classification model. We compared the Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) for classifying stress and non-stress groups. The maximum classification accuracy of 93.17% was achieved using a two-layer LSTM architecture.
Affective computing has garnered researchers' attention and interest in recent years as there is a need for AI systems to better understand and react to human emotions. However, analyzing human emotions, such as mood or stress, is quite complex. While various stress studies use facial expressions and wearables, most existing datasets rely on processing data from a single modality. This paper presents EmpathicSchool, a novel dataset that captures facial expressions and the associated physiological signals, such as heart rate, electrodermal activity, and skin temperature, under different stress levels. The data was collected from 20 participants at different sessions for 26 hours. The data includes seven different signal types, including both computer vision and physiological features that can be used to detect stress. In addition, various experiments were conducted to validate the signal quality.
We present six datasets containing telemetry data of the Mars Express Spacecraft (MEX), a spacecraft orbiting Mars operated by the European Space Agency. The data consisting of context data and thermal power consumption measurements, capture the status of the spacecraft over three Martian years, sampled at six different time resolutions that range from 1 min to 60 min. From a data analysis point-of-view, these data are challenging even for the more sophisticated state-of-the-art artificial intelligence methods. In particular, given the heterogeneity, complexity, and magnitude of the data, they can be employed in a variety of scenarios and analyzed through the prism of different machine learning tasks, such as multi-target regression, learning from data streams, anomaly detection, clustering, etc. Analyzing MEX’s telemetry data is critical for aiding very important decisions regarding the spacecraft’s status and operation, extracting novel knowledge, and monitoring the spacecraft’s health, but the data can also be used to benchmark artificial intelligence methods designed for a variety of tasks.
Measurement(s): Overall Sleep Quality Rating • Step Unit of Distance • Speech • Mean Heart Rate • Proximity • Electrocardiogram Sequence • heart rate variability measurement • Respiratory Rate • physical activity measurement • light • door motion • Changes in Ambient Temperature in Medical Device Environment • humidity • Overall Emotional Well-Being • Stress • psychological flexibility • work-related acceptance • work engagement • psychological capital • intelligence • job performance • organizational citizenship behavior • counter-productive work behavior • personality trait measurement • Negative affectivity • positive affectivity • anxiety-related behavior trait • Alcohol Use History • Overall Health Rating During Past Week
Technology Type(s): photoplethysmography • Accelerometer • Microphone Device • Bluetooth-enabled Activity Monitor • electrocardiogram • Sensor Device • Photodetector Device • Temperature Sensor Device • questionnaire • Multidimensional Psychological Flexibility Inventory (MPFI) • Utrecht work engagement scale • survey method • individual task proficiency • Organizational Citizenship Behavior Checklist • big five inventory • Positive and Negative Affect Schedule (PANAS-X) • State-Trait Anxiety Inventory
Sample Characteristic - Organism: Homo sapiens
Sample Characteristic - Environment: hospital
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.12465101
Background: Healthcare professionals (HCPs) on the front lines against COVID-19 may face increased workload and stress. Understanding HCPs' risk for burnout is critical to supporting HCPs and maintaining the quality of healthcare during the pandemic. Methods: To assess exposure, perceptions, workload, and possible burnout of HCPs during the COVID-19 pandemic we conducted a cross-sectional survey. The main outcomes and measures were HCPs' self-assessment of burnout, indicated by a single item measure of emotional exhaustion, and other experiences and attitudes associated with working during the COVID-19 pandemic. Findings: A total of 2,707 HCPs from 60 countries participated in this study. Fifty-one percent of HCPs reported burnout. Burnout was associated with work impacting household activities (RR = 1·57, 95% CI = 1·39-1·78, P<0·001), feeling pushed beyond training (RR = 1·32, 95% CI = 1·20-1·47, P<0·001), exposure to COVID-19 patients (RR = 1·18, 95% CI = 1·05-1·32, P = 0·005), and making life prioritizing decisions (RR = 1·16, 95% CI = 1·02-1·31, P = 0·03). Adequate personal protective equipment (PPE) was protective against burnout (RR = 0·88, 95% CI = 0·79-0·97, P = 0·01). Burnout was higher in high-income countries (HICs) compared to low- and middle-income countries (LMICs) (RR = 1·18; 95% CI = 1·02-1·36, P = 0·018). Interpretation: Burnout is present at higher than previously reported rates among HCPs working during the COVID-19 pandemic and is related to high workload, job stress, and time pressure, and limited organizational support. Current and future burnout among HCPs could be mitigated by actions from healthcare institutions and other governmental and non-governmental stakeholders aimed at potentially modifiable factors, including providing additional training, organizational support, and support for family, PPE, and mental health resources.
In this article, regression and classification models are compared for stress detection. Both personal and user-independent models are evaluated. The article is based on a publicly open dataset called AffectiveROAD, which contains data gathered using the Empatica E4 sensor and, unlike most other stress detection datasets, contains continuous target variables. The classification model used is Random Forest and the regression model is a bagged tree-based ensemble. Based on the experiments, regression models outperform classification models when classifying observations as stressed or not-stressed. The best user-independent results are obtained using a combination of blood volume pulse and skin temperature features; with these, the average balanced accuracy was 74.1% with the classification model and 82.3% with the regression model. In addition, regression models can be used to estimate the level of stress. Moreover, the results based on models trained using personal data are not encouraging, showing that biosignals vary considerably not only between the study subjects but also between sessions gathered from the same person. On the other hand, it is shown that with subject-wise feature selection for the user-independent model, it is possible to improve recognition models more than by using personal training data to build personal models. In fact, it is shown that with subject-wise feature selection, the average detection rate can be improved by as much as 4 percentage points, and it is especially useful for reducing the variance in the recognition rates between the study subjects.
While there are several works that diagnose acute stress using electroencephalographic recordings and machine learning, there are hardly any works that deal with chronic stress. Currently, chronic stress is mainly determined using questionnaires, which are, however, subjective in nature. While chronic stress has negative influences on health, it also greatly influences decision-making processes in humans. In this paper we propose a novel machine learning approach based on the fine-graded spectral analysis of resting-state EEG recordings, to diagnose chronic stress. By using this new machine learning approach, we achieve a very good balanced accuracy of 81.33%, outperforming the current benchmark by 10%. Our algorithm allows an objective assessment of chronic stress, is accurate, robust, fast and cost-efficient and substantially contributes to decision-making research, as well as Information Systems research in healthcare.
Besides passive sensing, ecological momentary assessments (EMAs) are one of the primary methods to collect in-the-moment data in ubiquitous computing and mobile health. While EMAs have the advantage of low recall bias, a disadvantage is that they frequently interrupt the user and thus long-term adherence is generally poor. In this paper, we propose a less-disruptive self-reporting method, "assisted recall," in which in the evening individuals are asked to answer questions concerning a moment from earlier in the day assisted by contextual information such as location, physical activity, and ambient sounds collected around the moment to be recalled. Such contextual information is automatically collected from phone sensor data, so that self-reporting does not require devices other than a smartphone. We hypothesized that providing assistance based on such automatically collected contextual information would increase recall accuracy (i.e., if recall responses for a moment match the EMA responses at the same moment) as compared to no assistance, and we hypothesized that the overall completion rate of evening recalls (assisted or not) would be higher than for in-the-moment EMAs. We conducted a two-week study (N=54) where participants completed recalls and EMAs each day. We found that providing assistance via contextual information increased recall accuracy by 5.6% (p = 0.032) and the overall recall completion rate was on average 27.8% (p < 0.001) higher than that of EMAs.
Long-term stress can have a serious impact on human health, which calls for continuous and automatic stress monitoring systems. However, there is a lack of commonly used standard datasets for psychological stress detection in affective computing research. Therefore, we present a multimodal dataset for the detection of human stress (MDPSD). A setup was arranged for the synchronized recording of facial videos, photoplethysmography (PPG), and electrodermal activity (EDA) data. 120 participants of different genders and ages were recruited from universities to participate in the experiment. The data collection experiment was divided into eight sessions, including four different kinds of psychological stress stimuli: the classic Stroop Color-Word Test, the Rotation Letter Test, the Stroop Number-Size Test, and the Kraepelin Test. Participants completed the test of each session as required and then reported their self-assessed stress for each session, which serves as the data label. To demonstrate the dataset’s utility, we present an analysis of the correlations between participants’ self-assessments and their physiological responses. Stress is detected using well-known physiological signal features and standard machine learning methods to create a baseline on the dataset. The accuracy of binary stress recognition reached 82.60%, and that of three-level stress recognition was 61.04%.