Remote Photoplethysmography: Evaluation of
Contactless Heart Rate Measurement in an
Information Systems Setting
Philipp V. Rouast 1, Marc T. P. Adam 2, Verena Dorner 1, Ewa Lux 1
1 Karlsruhe Institute of Technology, Germany
2 The University of Newcastle, Australia
Abstract. As a source of valuable information about a person’s affective state, heart
rate data has the potential to improve both understanding and experience of human-
computer interaction. Conventional methods for measuring heart rate use skin contact
methods, where a measuring device must be worn by the user. In an Information Sys-
tems setting, a contactless approach without interference in the user’s natural environ-
ment could prove to be advantageous. We develop an application that fulfils these con-
ditions. The algorithm is based on remote photoplethysmography, taking advantage of
the slight skin color variation that occurs periodically with the user’s pulse. When eval-
uating this application in an Information Systems setting with various arousal levels
and naturally moving subjects, we achieve an average root mean square error of 7.32
bpm for the best performing configuration. We find that a higher frame rate yields
better results than a larger size of the moving measurement window. Regarding algo-
rithm specifics, we find that a more detailed algorithm using the three RGB signals
slightly outperforms a simple algorithm using only the green signal.
1 Introduction
Throughout the past decade, interest in affective states has been steadily increasing
within Information Systems (IS) research [1]. Affective states provide valuable insights
for the evaluation of artifacts in a number of IS related domains, with heart rate meas-
urement (HRM) as one of the physiological measures typically employed for their as-
sessment [2]. These domains include human-computer interaction and decision support
systems. For instance, Teubner et al. [3] used HRM to evaluate the impact of computerized agents on bidding behaviour in electronic auctions, and Léger et al. [4] used neurophysiological correlates to investigate cognitive absorption in enactive training.
There are many promising applications of real-time heart rate (HR) data as feedback
signal in various IS domains, such as technostress applications [5], e-learning systems
[6, 7], financial decision making [8–10], and electronic auctions [11–14].
Established methods for collecting HR data typically involve skin contact with elec-
tronic (electrocardiogram) or optical (photoplethysmogram) sensors. However, relatively new developments in affective computing render skin contact for HRM increasingly unnecessary. Subtle changes in the facial region can be captured remotely with RGB imaging, and an estimate of the HR can be derived from them. Due to their similarity
to traditional photoplethysmography (PPG), such approaches are known as remote pho-
toplethysmography (rPPG) [15]. The primarily used signal source in rPPG is a periodic
color variation that occurs as light reflects off the skin and varies with blood volume
[16]. While much of the earlier research on rPPG demonstrated its feasibility in stationary settings, more recent work focuses on settings where users are allowed to move naturally [15]. So far, very few studies have discussed online (i.e., real-time) applications of rPPG algorithms. We believe that real-time HRM could prove to be particularly use-
ful for a range of applications in IS research, such as technostress applications, elec-
tronic commerce, and technology enhanced learning.
In this paper, we develop and evaluate a customizable approach for rPPG that is
suitable for real-time applications. Our first research objective is to design an artifact
with customizable parameters, based on existing approaches for rPPG, which enables
both online and offline measurements and permits parameterization of the algorithm
based on the computing capabilities of the platform [17]. We propose an algorithm
based on the phenomenon of facial skin color variation to transform images from a
video feed to HRM, in line with the general framework for rPPG proposed by [15]. In
this way, unobtrusive HRM can be made available to researchers in various domains or
directly integrated in systems as a real-time input. Our second research objective is to
evaluate the artifact in an IS context, and use offline computations to study the impact
of parameter variations on the feasibility of an online application. For this purpose, we
conduct a lab experiment in which participants are asked to complete a series of arousal-
inducing tasks.
The remainder of this paper is structured as follows: In Section 2, we discuss the
theoretical foundation for the algorithm and review existing approaches for rPPG. Sec-
tion 3 features a detailed description of our proposed algorithm, providing an overview
of the configurable parameters of the algorithm. The experimental evaluation in an IS context is described in Section 4 and its results in Section 5. We conclude with a discussion and outlook in Section 6.
2 Theoretical Background
In PPG, human HR is derived from an optically obtained volumetric measurement (ple-
thysmogram) of the heart. Hertzman and Spealman [18] first noted that a variation in
light transmission of a finger could be measured using a photoelectronic cell. This
change in light transmission and reflection on the skin as an indication of cardiac activ-
ity is related to the optical properties of blood in motion [19]. Today, PPG using skin
contact and dedicated light sources is also commonly used in smart watches and fitness
bands, such as Fitbit Charge HR and Microsoft Band.
Only recently, researchers have started using ambient light sources and digital cam-
eras to capture the plethysmographic signal remotely. Verkruysse et al. [20] showed
that a video captured using an inexpensive, consumer-grade camera contained a rich
enough plethysmographic signal to measure functions like HR and respiration rate.
Fig. 1. A typical application of rPPG. An RGB camera captures at least the facial region of the
subject which is illuminated by ambient light. The distance between camera and subject may be
up to several meters
A typical application of rPPG (Figure 1) involves a subject – often seated at a desk –
and a video camera positioned up to several meters away. The camera captures at least
the subject’s face, which is illuminated by ambient light. Any continuous segment of
the resulting video sequence may be used to produce a HR estimate. If the temporal
development of the HR is of interest, a sliding time window can be used to produce a
series of HR estimates. Choosing the size of this sliding time window presents a trade-
off: While a smaller time window reduces computational complexity and allows for a
higher temporal resolution, a greater time window reduces the theoretically expected
minimum estimation error. This estimation error follows from the frequency resolution of the DFT, $\Delta f = 1/\ell$ Hz (equivalently, $60/\ell$ bpm), where $\ell$ denotes the size of the sliding time window in seconds. For example, with a window size of 6 seconds, HR can only be measured with an accuracy of $60/6 = 10$ bpm. Assuming uniformly distributed HR, it follows that the expected minimum estimation error equals a quarter of this resolution, $60/(4\ell)$ bpm, or 2.5 bpm.
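For illustration, the following minimal Python sketch (helper names are ours) computes the frequency resolution and the resulting expected minimum error for the two window sizes considered later:

```python
# Illustration of the window-size trade-off for DFT-based HR estimation.

def hr_resolution_bpm(window_s: float) -> float:
    """DFT frequency resolution 1/l Hz, expressed in bpm."""
    return 60.0 / window_s

def expected_min_error_bpm(window_s: float) -> float:
    """Expected minimum error for uniformly distributed HR: the mean
    absolute quantization error, i.e. a quarter of the resolution."""
    return hr_resolution_bpm(window_s) / 4.0

for l in (6, 12):
    print(f"window {l:2d} s: resolution {hr_resolution_bpm(l):5.2f} bpm, "
          f"expected minimum error {expected_min_error_bpm(l):4.2f} bpm")
# window  6 s: resolution 10.00 bpm, expected minimum error 2.50 bpm
# window 12 s: resolution  5.00 bpm, expected minimum error 1.25 bpm
```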
In the following, we discuss the three key steps in rPPG: (i) extraction of the raw
signal, (ii) estimation of the plethysmographic signal, and (iii) HR estimation. There
exists a multitude of possible choices for each of these three steps, choices being in part
dependent on the specifics of the planned application. These include, e.g., expected
movement of the subject and available resources for computation. In our case, we are
specifically interested in an IS setting, i.e., users moving naturally while working at a
desktop workstation.
2.1 Extraction of the Raw Signal
The first step in rPPG is extracting the raw signal from an input sequence of images of
the subject’s head. This generally involves a number of computations which are re-
peated and yield one or multiple real values for each input frame. A region of interest
(ROI), usually in the subject’s face, is marked in each frame. The raw signal is extracted
as one or multiple of the RGB color channels using spatial pooling.
While in earlier work about rPPG the ROI was selected manually in the first frame
of the video (e.g., [20, 21]), a common option nowadays is to use an algorithm for
automated face detection to find facial boundaries [e.g., 19–21]. For more accurate po-
sition information, some researchers use algorithms for facial landmark detection [e.g.,
22, 23] or skin detection [e.g., 21, 24].
The simplest choice for ROI is the bounding box returned by the classifier [e.g., 18,
25]. As this naïve ROI may introduce noise due to included background pixels, many authors include only 60% of its width [e.g., 19, 20, 26]. Further research has shown that
signal strength is not uniformly distributed over facial skin. The forehead and the
cheeks exhibit maximum signal strength [30]. These areas are therefore common
choices for ROI [e.g., 18, 28, 29].
Unless a subject remains absolutely stationary, the ROI needs to be updated for each
frame in order to make the pixels in the ROI invariant to subject motion. In an IS setting
with natural motion, this is an important component of the first step. Re-running the
detection step for every frame [e.g., 19, 21, 30] is a simple, but not computationally
efficient way to achieve this functionality. Some work [25, 31, 34] estimates an affine
transformation for the ROI from frame to frame by tracking a set of suitable points in
the face. This way, tracking arbitrary ROIs at reasonable levels of complexity becomes
possible.
Finally, the raw signal is computed by spatially pooling all pixels comprising the
chosen ROI [e.g., 17–19], i.e., averaging the values of the desired color channels within
the ROI. While the green channel contains the strongest plethysmographic signal [20],
both the red and blue channel also contain complementary information. Combinations
of all three RGB channels [e.g., 19–21], two channels [21] as well as the green channel
only [25, 26] have been used successfully.
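As an illustration of this pooling step, a minimal sketch in Python (assuming OpenCV-style BGR frames and a boolean ROI mask; function and variable names are ours) might look as follows:

```python
import numpy as np

def raw_signal_sample(frame_bgr: np.ndarray, roi_mask: np.ndarray) -> np.ndarray:
    """Spatially pool one frame: average each color channel over the ROI.

    frame_bgr: H x W x 3 uint8 image (OpenCV's BGR channel order).
    roi_mask:  H x W boolean array marking the ROI pixels.
    Returns the (R, G, B) means for this frame.
    """
    roi_pixels = frame_bgr[roi_mask]        # N x 3 array of ROI pixels
    b, g, r = roi_pixels.mean(axis=0)       # per-channel spatial average
    return np.array([r, g, b])

# Appending one such sample per frame yields the raw multi-channel time
# series; a green-only variant simply keeps the G component.
```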
2.2 Estimation of the Plethysmographic Signal
The raw signal can be interpreted as the temporal development of the absolute intensi-
ties of the selected RGB color channels. This multidimensional time series contains a
periodic component, which corresponds to the HR, but also contains unwanted high-
and low frequency noise. Low frequency noise can be caused by gradual movements
and illumination changes; high frequency noise by sudden movements. The second step
of rPPG aims at improving the signal-to-noise ratio by removing frequencies that lie
outside the frequency band expected for the HR. When multiple color channels are
used, this step also reduces the signal to one dimension.
Since solely the periodicity of the signal is of interest, the raw signal is typically
normalized before it is processed any further [e.g., 19, 20, 30]. Both unwanted high-
and low frequency noise can be removed using a bandpass filter [e.g., 18, 22, 32]. Cut-
off frequencies of 0.7 Hz and 4 Hz are usually applied [15]. Alternatively, low frequency noise can be removed using a detrending filter [36], which acts as a high-pass equivalent. Correspondingly, high frequency noise can be removed with a low-pass equivalent such as a moving average filter [e.g., 20, 22, 34].
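For illustration, such a band-pass could be realized as follows (a zero-phase Butterworth design from SciPy; the concrete filter order is an illustrative assumption, not prescribed by the literature):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(signal: np.ndarray, fs: float,
             low_hz: float = 0.7, high_hz: float = 4.0) -> np.ndarray:
    """Zero-phase Butterworth band-pass keeping the plausible HR band
    (0.7-4 Hz, i.e. 42-240 bpm); third order is an illustrative choice."""
    nyq = fs / 2.0
    b, a = butter(3, [low_hz / nyq, high_hz / nyq], btype="band")
    return filtfilt(b, a, signal)

# Typical use: normalize the raw signal first, then band-pass it.
# x = (x - x.mean()) / x.std()
# x_filtered = bandpass(x, fs=30.0)
```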
If multiple channels are used, the dimensionality of the signal is typically reduced
by linearly combining the channels. The optimal parameter choice for this combination
is a much discussed issue. Most authors rely on techniques from the field of Blind
Source Separation (BSS) such as Independent Component Analysis (ICA) [e.g., 19, 20,
35] or Principal Component Analysis (PCA) [e.g., 18, 34, 36]. From the results, the
component with the highest periodicity is selected, according to spectral power [e.g.,
20, 21, 30].
2.3 HR Estimation
Given the estimated plethysmographic signal, the HR is estimated using frequency
analysis. Most authors use an algorithm such as the Fast Fourier Transform (FFT) to
perform a Discrete Fourier Transform (DFT) [e.g., 18, 19, 21]. Then, the index of the
maximum power response in the frequency domain corresponds to the detected HR. If
the individual beat-by-beat intervals are of interest, a peak detection algorithm should
be applied [e.g., 20].
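A compact sketch of this estimation step (Python/NumPy; restricting the search to the 0.7-4 Hz band from Section 2.2 is our choice for the illustration):

```python
import numpy as np

def estimate_hr_bpm(signal: np.ndarray, fs: float) -> float:
    """Estimate HR as the frequency with maximum spectral power
    within the plausible HR band."""
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    band = (freqs >= 0.7) & (freqs <= 4.0)   # 42-240 bpm
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz
```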
3 Approach
Between the choice of rPPG algorithm – e.g., signal used, steps to filter the signal and
estimate the HR – and practical choices such as temporal window size and frame rate
(due to limited computing resources, particularly in online analysis), there is a multi-
tude of options for algorithm parametrization. We narrow the range of possible param-
eters down to three major choices, and evaluate their impact on the accuracy of HR
estimation in the following section.
Table 1. Command line arguments for the rPPG application. Each argument has several options
and a default parameter setting.

  Flag   Description                                 Options
  -i     Path to input video                         Omit flag to use webcam
  -a     Specify rPPG algorithm variant              g to use only the green channel (default);
                                                     rgb to use red, green, and blue channels with PCA
  -max   Maximum size of the sliding time window     Any positive integer (default: 6)
         in seconds
  -ds    Down-sample by using every x-th frame       Any positive integer (default: 1)
  -gui   Display the GUI                             true or false (default: true)
  -r     Re-detection interval in seconds            Any positive integer (default: 1)
We developed a command line rPPG application that takes as input either a video file
or a real-time feed from a video camera. The application supports a simple rPPG algo-
rithm that uses only the green channel, and a more advanced rPPG algorithm that uses
all RGB channels. Both algorithms use filtering methods commonly used in past works
on rPPG. HR estimates are calculated and written to a log file for every step using a
sliding window with customizable size. If a video is used as input, the frame rate can
optionally be downsampled. Table 1 lists the available parameters.
Both pre-recorded input video and real-time webcam feed are handled by the same
algorithm. For pre-recorded input, the effectively achieved frame rate is pre-deter-
mined, but can be downsampled. For real-time video, the achieved frame rate is dy-
namic and dependent on the computation rate. Once a face is recognized, the time win-
dow is populated with raw data and estimates are produced once the minimum window
size is reached. The window starts moving once the maximum size is reached, such that new estimates are always based on the most recent seconds of data in the window. If the GUI is
activated, this process is visualized.
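The following simplified sketch illustrates the buffering logic described above (class and parameter names are illustrative, not the application's actual interface):

```python
from collections import deque

class SlidingWindow:
    """Fixed-duration buffer of (timestamp, sample) pairs. Estimates start
    once a minimum amount of data is buffered; samples older than the
    maximum window size are dropped, so the window 'moves'."""

    def __init__(self, min_s: float = 3.0, max_s: float = 6.0):
        self.min_s, self.max_s = min_s, max_s
        self.buffer = deque()                 # (time in s, raw sample)

    def push(self, t: float, sample) -> bool:
        self.buffer.append((t, sample))
        # Drop samples that fall outside the maximum window size.
        while t - self.buffer[0][0] > self.max_s:
            self.buffer.popleft()
        # True once enough data is buffered to produce an estimate.
        return t - self.buffer[0][0] >= self.min_s
```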
As most previous works do [15], we use the Viola-Jones (VJ) object detector [40] to find the biggest face in the frame. Using Haar-like features, this classifier is trained to
detect frontal faces and returns a bounding box of the detected object. Once a face has
been detected, we use the coordinates of the bounding box to select a rectangle on the
forehead as the ROI. Specifically, the ROI has 40% of bounding box width and 15% of
bounding box height, as shown graphically in Fig. 2. Both the bounding box and ROI
are tracked in subsequent frames. For this, we find a set of prominent tracking points within the ROI, selected using the algorithm of Shi and Tomasi [41]. These points are subsequently tracked from frame to frame using the Kanade-Lucas-Tomasi algorithm [42]. We then
use the two sets of original and tracked points to calculate an optimal affine transform
which is applied to the bounding box of the face and ROI, similar to the approach of
[31]. Thus, we are able to track the ROI smoothly without having to run face detection
for every frame. For greater robustness, we re-detect the face at an adjustable interval.
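A condensed sketch of this detect-then-track scheme using OpenCV (the exact vertical offset of the forehead ROI and the tracker parameters are illustrative assumptions):

```python
import cv2
import numpy as np

# Viola-Jones face detection via OpenCV's pre-trained Haar cascade.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def forehead_roi(face):
    """ROI with 40% of the bounding box width and 15% of its height;
    the exact vertical placement on the forehead is an assumption."""
    x, y, w, h = face
    rw, rh = int(0.40 * w), int(0.15 * h)
    return (x + (w - rw) // 2, y + int(0.10 * h), rw, rh)

def seed_tracking_points(gray, roi):
    """Prominent points inside the ROI via Shi-Tomasi [41]."""
    x, y, w, h = roi
    mask = np.zeros_like(gray)
    mask[y:y + h, x:x + w] = 255
    return cv2.goodFeaturesToTrack(gray, maxCorners=50,
                                   qualityLevel=0.01, minDistance=5,
                                   mask=mask)

def update_transform(prev_gray, gray, points):
    """Track the points with pyramidal Lucas-Kanade [42] and estimate
    the affine transform mapping the old ROI into the current frame."""
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray,
                                                  points, None)
    ok = status.ravel() == 1
    transform, _ = cv2.estimateAffinePartial2D(points[ok], new_pts[ok])
    return transform, new_pts[ok]
```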
Fig. 2. The ROI is defined based on the bounding box from the Viola-Jones algorithm. A set of
tracking points is used to update the ROI in subsequent frames
By applying the respective ROI as a mask, we extract the raw signal as the average R,
G, and B channels for every frame. This step gives the one-dimensional green signal
for the simple algorithm variant and the three-dimensional RGB signal for the more
advanced rPPG algorithm. Depending on the effective frame rate and window size, the
length of the signal can vary, e.g., from 90 frames (at a window size of 6 seconds and
effective frame rate of 15 frames per second (fps)) to 360 frames (at a window size of
12 seconds and effective frame rate of 30 fps).
The rPPG application then removes unwanted high- and low frequency noise. Since
re-detection can cause the ROI to ‘jump’, which is reflected in the raw signal, we ini-
tially apply a custom filter to clear any rapid leaps caused by re-detection. To this end,
we keep track of when re-detection occurred and set the first difference in the signal to
zero in these instances. For the following steps, we adopt common choices from existing work on rPPG [15]. The resulting de-noised signal is first normalized, since its absolute level is irrelevant for our analysis. Low frequency noise, typically a trend in the signal,
is subsequently removed with the advanced detrending filter proposed by [36]. Finally,
we remove high frequency noise by applying a moving average filter to the signal. Fig.
3 illustrates these steps using exemplary data from the green channel.
Fig. 3. Exemplary values for a simple rPPG algorithm using only the green channel
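The following sketch outlines these de-noising steps (Python/SciPy; the smoothing parameter of the detrending filter [36] and the sparse-matrix formulation are illustrative):

```python
import numpy as np
from scipy.sparse import eye, spdiags
from scipy.sparse.linalg import spsolve

def clear_jumps(x: np.ndarray, redetect_frames) -> np.ndarray:
    """Set the first difference to zero at frames where face re-detection
    occurred, removing the resulting leaps from the signal."""
    d = np.diff(x, prepend=x[0])
    d[list(redetect_frames)] = 0.0
    return x[0] + np.cumsum(d)

def detrend(z: np.ndarray, lam: float = 300.0) -> np.ndarray:
    """Smoothness-priors detrending after Tarvainen et al. [36]:
    subtract the trend (I + lam^2 * D2'D2)^-1 z."""
    n = len(z)
    identity = eye(n, format="csc")
    d2 = spdiags([np.ones(n), -2 * np.ones(n), np.ones(n)],
                 [0, 1, 2], n - 2, n)
    return z - spsolve((identity + lam ** 2 * (d2.T @ d2)).tocsc(), z)

def moving_average(x: np.ndarray, k: int = 5) -> np.ndarray:
    """Simple low-pass smoothing; the kernel width is illustrative."""
    return np.convolve(x, np.ones(k) / k, mode="same")
```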
In the case of the simple rPPG algorithm variant, the steps described above are applied
to the one-dimensional signal from the green channel as in Fig. 3, to yield the estimated
plethysmographic signal with a distinct periodicity. For the advanced approach using
the RGB channels, the first three steps are applied to each channel individually: Re-
moval of noise due to re-detection, normalization, and detrending. Thereafter, we run a
PCA using the three filtered RGB channels. The PCA produces three linearly uncorre-
lated components, each a linear combination of the three RGB signals. Following [21],
we assume that one of the components corresponds to the plethysmographic signal,
containing a distinct periodicity. We hence select the component with the most distinct
periodicity: After converting each component to the frequency domain using a DFT,
we find the maximum power response of a single frequency for each component. The
component with the greatest power response is selected. Finally, we apply a moving
average filter to this component to remove the remaining high frequency noise, yielding
the estimated plethysmographic signal for this algorithm. Fig. 4 reports exemplary data
for this approach using the same video as Fig. 3. Note that the selected principal com-
ponent in this example is very similar to the filtered signal from the green channel.
Fig. 4. Exemplary values for an rPPG algorithm using all three RGB channels. The PCA is used
to produce three components, from which the one with the highest periodicity is selected
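A minimal sketch of this selection step (Python/NumPy; realizing the PCA via an eigendecomposition of the 3 x 3 channel covariance is one of several equivalent formulations):

```python
import numpy as np

def select_component(channels: np.ndarray) -> np.ndarray:
    """PCA over the filtered RGB channels, then pick the principal
    component with the greatest single-frequency power response.

    channels: N x 3 array, one filtered (zero-mean) signal per column.
    """
    # PCA via eigendecomposition of the 3 x 3 channel covariance matrix.
    cov = np.cov(channels, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    components = channels @ eigvecs           # N x 3, linearly uncorrelated

    # Score each component by its maximum spectral power (DC excluded).
    scores = [np.abs(np.fft.rfft(c)[1:]).max() for c in components.T]
    return components[:, int(np.argmax(scores))]
```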
Estimation of the HR concludes each rPPG algorithm. Using the DFT, the plethysmographic signal is converted to the frequency domain, and we find the frequency with the maximum power response. Using the frequency index of the maximum power response $i_{\max}$, the size of the signal $N$, and the effective sampling rate $f_s$, we calculate the corresponding HR estimate as $\mathit{HR} = 60 \cdot i_{\max} \cdot f_s / N$ bpm.
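For illustration (hypothetical values): a signal of $N = 360$ frames at $f_s = 30$ fps with its spectral peak at index $i_{\max} = 11$ yields $\mathit{HR} = 60 \cdot 11 \cdot 30 / 360 = 55$ bpm.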
4 Experimental Evaluation
Data for the evaluation of both rPPG algorithm variants was collected in a lab experiment at KD2Lab, a computer-based experimental laboratory in Karlsruhe, Germany (see http://www.kd2lab.kit.edu/). A total of 20 participants (8 females, 12
males) were recruited from a pool of students. Each participant was seated at a desk in
front of a computer monitor and asked to participate in four different experiment phases
with differing tasks. Meanwhile, the participant was recorded on video using a Logitech
C270 webcam at 640x480 VGA resolution and a frame rate of 30 fps. The video was
encoded using the H.264 codec and stored in an mp4 container. Distance from the cam-
era was approximately 0.5 m. Baseline HR data was collected simultaneously using
Bioplux finger PPG and Bioplux ECG [43]. During the experiment, participants moved
naturally when interacting with the computer and working on the experiment tasks. All
participants gave consent to having their video and physiological data used for HR es-
timation and validation.
Fig. 5. Experimental setup: The subject is seated at a desk and presented with an experiment task.
Video and HR data are captured using webcam and ECG/PPG
The experiment comprised four phases which differed with regard to levels of arousal
and mobility. Before each experiment phase, the participant received written instruc-
tions on paper, such that he/she had the opportunity to read them while they were played
back from an audio recording. Instructions also included information about the perfor-
mance-based payoff in real money. After each phase, the participant filled out a short
questionnaire on-screen.
The first phase was a rest phase, where participants were asked to relax for five
minutes. This was followed by two phases with dynamic auctions. We built on the de-
sign of a recent Dutch auction experiment by [44], since this dynamic auction format
is known to induce emotional arousal. In order to induce different levels of emotional
arousal, one block of six auctions was configured with low value uncertainty and low
time pressure (clock speed: 0.4 seconds per price step), the other block of six auctions
with high value uncertainty and high time pressure (clock speed: 0.2 seconds per price
step). The order of these two phases was varied randomly and duration was approxi-
mately 5 and 8 minutes for the fast and slow Dutch auctions, respectively. The fourth and last phase consisted of an arousal-inducing task as described in [45]. Here, partici-
pants were asked to find a specific sequence of symbols amongst 20 alternatives under
time pressure. This last phase took 5 minutes. Including instructions, questionnaires
and test rounds, experiment duration averaged approximately 32 minutes. The experi-
mental software was implemented in Brownie [46, 47].
5 Results
Our evaluation focuses on the effects of (i) the selection of color channels, (ii) the frame rate, and (iii) the size of the time window on HRM accuracy. First, with respect
to selection of color channels, we apply one parametrization where only the green chan-
nel (henceforth the G algorithm) is used. In a further parametrization, all three RGB
channels are combined using a PCA (henceforth the RGB algorithm). All other steps
(apart from signal choice and additional use of the PCA) are identical. Second, with
respect to the impact of the effective frame rate on HR accuracy, we compare the results
achieved using video at 30 fps to the results achieved using down-sampled video at 15
fps. Third, with respect to size of the time window, we investigate the difference in
accuracy at window sizes of 6 seconds and 12 seconds. Theoretically, a larger window
decreases the expected minimum estimation error as discussed in Section 2, but possi-
ble side-effects on typical errors in rPPG are unclear.
Table 2. Average RMSE for different algorithm and parameter combinations

                 Algorithm G                  Algorithm RGB
                 6 s window    12 s window    6 s window    12 s window
  15 fps         12.26 bpm     13.70 bpm      10.53 bpm     11.20 bpm
  30 fps          8.72 bpm      9.12 bpm       8.18 bpm      7.32 bpm
To reiterate, we are interested in detecting the temporal development of HR using rPPG.
We calculated HR as mean HR based on rPPG every 10 seconds and, for validation,
mean HR based on the finger clip PPG sensor. Missing data for this baseline measure-
ment was complemented using the ECG measurements. For each participant and ex-
periment phase, this gives us the root mean square error (RMSE) between a given rPPG
configuration and the baseline HRM. Our analysis is based on all four phases of the
experiment. Table 2 lists the mean RMSE for the different algorithm and parameter
combinations. For each algorithm-parameter combination, this represents the mean
RMSE across all participants and phases.
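For clarity, this error computation amounts to the following sketch (Python/NumPy; array names are illustrative):

```python
import numpy as np

def rmse_bpm(rppg_hr: np.ndarray, baseline_hr: np.ndarray) -> float:
    """RMSE between paired 10-second mean HR series from rPPG and the
    contact baseline (finger PPG, gaps filled from ECG)."""
    return float(np.sqrt(np.mean((rppg_hr - baseline_hr) ** 2)))

# One RMSE per participant and experiment phase; Table 2 reports the
# mean of these values for each algorithm-parameter combination.
```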
In the following, we discuss the implications of these results with regard to the choice of algorithm, frame rate, and window size. A visualization of the results includ-
ing error bars is displayed in Fig. 6.
Fig. 6. Average RMSE for each algorithm and parameter combination
An immediate observation from Fig. 6 is that the higher frame rate of 30 fps seems to
lead to more accurate HRM across both algorithms and window sizes. This intuitive
finding is supported by Welch t-tests: The null hypothesis of error rates being equal can
be rejected for each combination of algorithm and window size (algorithm G and win-
dow size 6s: p = .0015; G and 12s: p = .0001; RGB and 6s: p = .015; RGB and 12s: p
= .0002). In contrast, the window size used in our rPPG algorithms does not have a significant effect on the average RMSE, despite the theoretically smaller minimum estimation error; since the actual error rates far exceed this theoretical minimum, the effect appears to be irrelevant here. Note that on average, a greater window size even leads to a higher RMSE for the G algorithm, while also incurring significantly higher computational complexity.
Table 3. Number of channels and frames for each algorithm and parameter combination. The
RGB algorithm uses three channels.

                 Algorithm G                  Algorithm RGB
                 6 s window    12 s window    6 s window    12 s window
  15 fps         1 channel,    1 channel,     3 channels,   3 channels,
                 90 frames     180 frames     90 frames     180 frames
  30 fps         1 channel,    1 channel,     3 channels,   3 channels,
                 180 frames    360 frames     180 frames    360 frames
In general, the number of frames upon which HR estimation is based is a major deter-
minant of the algorithm’s computational complexity, which increases at least linearly
with the number of frames. Both a combination of a 12 second window with a frame
rate of 15 fps and a 6 second window with a frame rate of 30 fps lead to 180 frames in
the buffer per channel (Table 3), such that the respective increase in computational
complexity is comparable. Hence, our results indicate that for both implemented algo-
rithms, a higher frame rate should be preferred over a larger window size.
Comparing the two rPPG algorithm variants, the RGB version on average performs
better for all combinations of frame rate and window size. Using Welch t-tests, a sig-
nificant difference can be detected for the combinations with a window size of 12 seconds (for frame rate 15 fps: p = .0444; marginally for 30 fps: p = .0559). Hence, considering the additional computational complexity of the two extra channels and the PCA, we recommend the G approach for scenarios where computing power is costly, such as online application scenarios, particularly in mobile settings, and the RGB approach when computation with a larger window size can be done offline.
Since we are particularly interested in online non-stationary settings, we now have a
closer look at the G algorithm with a window of 6 seconds and the RGB algorithm with
a window of 12 seconds. For each, we choose the full frame rate of 30 fps. Fig. 7
gives an example for a participant where rPPG using the RGB algorithm performed
comparably well, with RMSE between 5 and 7 bpm. The experiment phases (rest phase,
two auction phases and arousal task) are marked in grey.
Fig. 7. Timeline of a participant’s HR Baseline measurement and corresponding rPPG measure-
ments. Experiment phases are marked in grey
In the first auction phase and the arousal game, the participant’s HR peaks when the
task starts and then decreases. This temporal development of the participant’s affective
state is captured by the rPPG algorithm. In between phases, and occasionally within
phases, outliers are observable that could be removed in a more sophisticated rPPG
algorithm, e.g., by removing values that are outside a certain range. Note that partici-
pants were reading instructions in between phases, possibly turning their faces away
from the camera, which may explain some of the inaccuracies between phases.
In a direct comparison between the selected G and RGB algorithms, the difference
in accuracy can be compared beyond the average RMSE reported in Table 2. Using all
individual pairs of HRM from rPPG and Bioplux baseline, we find a correlation of
Pearson’s r = .64 for the G algorithm, and Pearson’s r = .73 for the RGB algorithm.
This difference is visualized in Fig. 8. Note that due to the large amount of data points,
outliers appear visually slightly exaggerated. Points are colored according to the exper-
iment phase they were recorded in.
Fig. 8. Scatterplot of Baseline versus rPPG HRM for two selected algorithms
For both algorithms, many of the extreme outliers appear to belong to the phase of the
arousal task, which may be attributed to both the higher HR and increased subject
movement in this phase. There does not appear to be any significant measurement bias
for the algorithms: On average, the G algorithm underestimates the baseline HR by 1.01
bpm, while the RGB algorithm overestimates the baseline by .85 bpm.
6 Conclusion and Outlook
In an IS context, HR data are becoming increasingly valuable as a source of information
about a subject’s affective states [3, 4, 48]. The recently explored methods for remote
HRM using rPPG [15] promise low-cost applications that do not interfere with a professional work environment, enabling less obtrusive measurements in situ.
In this paper, we introduced a customizable implementation of rPPG with low-cost
RGB cameras. This implementation is based on an approach representative of existing work on rPPG and draws on methods commonly used to measure HR based on
rPPG. Customizing options include (i) choice of using the green channel only or all
available RGB channels, (ii) sampling rate, and (iii) window size used for measure-
ments. As computational resources are limited in online environments and particularly
when using mobile devices, we evaluated different parametrizations of our rPPG im-
plementation in a laboratory experiment with 20 participants who participated in four
tasks to induce different levels of emotional arousal.
We find that the frame rate has a significant influence on HRM accuracy. Higher
frame rates, rather than larger window sizes, improve HRM accuracy considerably.
Concerning the choice of signal channels, we find that using all three RGB channels
delivers slightly better results on average, especially in combination with a larger meas-
urement window. If computational resources are sparse however, we recommend fall-
ing back to the green channel, which carries the strongest plethysmographic signal.
While the overall RMSE is not as small as reported in other work on rPPG [15], it is
known that error rates are difficult to compare, since they depend on a number of cir-
cumstances, such as the movement patterns due to the experimental task or laboratory
setting. Our application concentrates on the temporal development of HR as we con-
sider a continuous series of measurements made using rPPG in tasks with different lev-
els of arousal. We developed an application for rPPG measurements that can be used with video files or real-time feeds from a video camera, and that provides a set of parameters which can be adjusted to increase measurement accuracy. Hence, our work is encouraging
for future work on real-time applications of rPPG.
References
1. Riedl, R., Davis, F.D., Hevner, A.R.: Towards a NeuroIS Research
Methodology: Intensifying the Discussion on Methods, Tools, and
Measurement. J. Assoc. Inf. Syst. 15 (2014) i–xxxv
2. Adam, M.T.P., Krämer, J., Gamer, M., Weinhardt, C.: Measuring emotions in
electronic markets. In: ICIS 2011 Proceedings. (2011) 1–19
3. Teubner, T., Adam, M.T.P., Riordan, R.: The Impact of Computerized Agents
on Immediate Emotions, Overall Arousal and Bidding Behavior in Electronic
Auctions. J. Assoc. Inf. Syst. 16 (2015) 838–879
4. Léger, P.-M., Davis, F.D., Cronan, T.P., Perret, J.: Neurophysiological
Correlates of Cognitive Absorption in an Enactive Training Context. Comput.
Human Behav. 34 (2014) 273–283
5. Adam, M.T.P., Gimpel, H., Maedche, A., Riedl, R.: Design Blueprint for
Stress-sensitive Adaptive Enterprise Systems. Bus. Inf. Syst. Eng. (2016)
6. Shen, L., Wang, M., Shen, R.: Affective E-Learning: Using “Emotional” Data
to Improve Learning in Pervasive Learning Environment. Educ. Technol. Soc.
12 (2009) 176–189
7. Astor, P.J., Adam, M.T.P., Jerčić, P., Schaaff, K., Weinhardt, C.: Integrating
biosignals into information systems: A neurois tool for improving emotion
regulation. J. Manag. Inf. Syst. 30 (2013) 247–278
8. Hariharan, A., Adam, M.T.P.: Blended Emotion Detection For Decision
Support. IEEE Trans. Human-Machine Syst. 45 (2015) 510–517
9. Adam, M.T.P., Kroll, E.B.: Physiological evidence of attraction to chance. J.
Neurosci. Psychol. Econ. 5 (2012) 152–165
10. Hariharan, A., Adam, M.T.P., Astor, P.J., Weinhardt, C.: Emotion regulation
and behavior in an individual decision trading experiment: Insights from
psychophysiology. J. Neurosci. Psychol. Econ. 8 (2015) 186–202
11. Adam, M.T.P., Krämer, J., Müller, M.B.: Auction fever! How time pressure
and social competition affect bidders’ arousal and bids in retail auctions. J.
Retail. 91 (2015) 468–485
12. Adam, M.T.P., Krämer, J., Weinhardt, C.: Excitement up! Price down!
Measuring emotions in Dutch auctions. Int. J. Electron. Commer. 13 (2012) 7–39
13. Adam, M.T.P., Astor, P.J., Krämer, J.: Affective images, emotion regulation
and bidding behavior: An experiment on the influence of competition and
community emotions in internet auctions. J. Interact. Mark. 35 (2016) 56–69
14. Müller, M.B., Adam, M.T.P., Cornforth, D.J., Chiong, R., Krämer, J.,
Weinhardt, C.: Selecting physiological features for predicting bidding behavior
in electronic auctions. In: Proceedings of the Forty-Ninth Annual Hawaii
International Conference on System Sciences (HICSS). (2016) 396–405
15. Rouast, P.V., Adam, M.T.P., Chiong, R., Cornforth, D.J., Lux, E.: Remote heart
rate measurement using low-cost RGB face video: A technical literature
review. Front. Comput. Sci. (2016)
16. Allen, J.: Photoplethysmography and its application in clinical physiological
measurement. Physiol. Meas. 28 (2007) R1–R39
17. Rouast, P.V., Adam, M.T.P., Cornforth, D.J., Lux, E., Weinhardt, C.: Using
contactless heart rate measurements for real-time assessment of affective states.
In: Davis, F.D., Riedl, R., Vom Brocke, J., Léger, P.-M., and Randolph, A.B.
(eds.): Information Systems and Neuroscience. (2016)
18. Hertzman, A.B., Spealman, C.R.: Observations on the finger volume pulse
recorded photoelectrically. Am. J. Physiol. 119 (1937) 334–335
19. Roberts, V.C.: Photoplethysmography - fundamental aspects of the optical
properties of blood in motion. Trans. Inst. Meas. Control. 4 (1982) 101–106
20. Verkruysse, W., Svaasand, L.O., Nelson, J.S.: Remote plethysmographic
imaging using ambient light. Opt. Express. 16 (2008) 21434–21445
21. Lewandowska, M., Ruminski, J., Kocejko, T.: Measuring pulse rate with a
webcam - A non-contact method for evaluating cardiac activity. In:
Proceedings of the 2011 Federated Conference on Computer Science and
Information Systems (FedCSIS). (2011) 405–410
22. Poh, M.-Z., McDuff, D.J., Picard, R.W.: Non-contact, automated cardiac pulse
measurements using video imaging and blind source separation. Opt. Express.
18 (2010) 10762–10774
23. Poh, M.-Z., McDuff, D.J., Picard, R.W.: Advancements in noncontact,
multiparameter physiological measurements using a webcam. IEEE Trans.
Biomed. Eng. 58 (2011) 7–11
24. De Haan, G., Jeanne, V.: Robust pulse rate from chrominance-based rPPG.
IEEE Trans. Biomed. Eng. 60 (2013) 2878–2886
25. Li, X., Chen, J., Zhao, G., Pietikäinen, M.: Remote heart rate measurement
from face videos under realistic situations. In: Proceedings of the 2014 IEEE
Computer Society Conference on Computer Vision and Pattern Recognition.
(2014) 4264–4271
26. Tasli, H.E., Gudi, A., Den Uyl, M.: Remote PPG based vital sign measurement
using adaptive facial regions. In: Proceedings of the 2014 IEEE International
Conference on Image Processing (ICIP). (2014) 1410–1414
27. Lee, K.-Z., Hung, P.-C., Tsai, L.-W.: Contact-free heart rate measurement
using a camera. In: Proceedings of the 2012 Ninth Conference on Computer
and Robot Vision (CRV). (2012) 147–152
28. Xu, S., Sun, L., Rohde, G.K.: Robust efficient estimation of heart rate pulse
from video. Biomed. Opt. Express. 5 (2014) 1124–35
29. Wei, L., Tian, Y., Wang, Y., Ebrahimi, T.: Automatic webcam-based human
heart rate measurements using Laplacian eigenmap. In: Lecture Notes in
Computer Science. (2013) 281–292
30. Lempe, G., Zaunseder, S., Wirthgen, T., Zipser, S., Malberg, H.: ROI selection
for remote photoplethysmography. In: Meinzer, H.-P., Deserno, M.T., Handels,
H., and Tolxdorff, T. (eds.): Informatik aktuell. (2013) 99–103
31. Feng, L., Po, L.-M., Xu, X., Li, Y.: Motion artifacts suppression for remote
imaging photoplethysmography. In: Proceedings of the 19th International
Conference on Digital Signal Processing (DSP). (2014) 18–23
32. Feng, L., Po, L.M., Xu, X., Li, Y., Ma, R.: Motion-resistant remote imaging
photoplethysmography based on the optical properties of skin. IEEE Trans.
Circuits Syst. Video Technol. 25 (2015) 879–891
33. Kwon, S., Kim, H., Park, K.S.: Validation of heart rate extraction using video
imaging on a built-in camera system of a smartphone. In: Proceedings of the
2012 IEEE Annual International Conference of the Engineering in Medicine
and Biology Society. (2012) 2174–2177
34. Kumar, M., Veeraraghavan, A., Sabharwal, A.: DistancePPG: Robust non-
contact vital signs monitoring using a camera. Biomed. Opt. Express. 6 (2015)
1565–1588
35. Hsu, Y., Lin, Y.L., Hsu, W.: Learning-based heart rate detection from remote
photoplethysmography features. In: Proceedings of the 2014 IEEE
International Conference on Acoustics, Speech and Signal Processing
(ICASSP). (2014) 4433–4437
36. Tarvainen, M.P., Ranta-Aho, P.O., Karjalainen, P.A.: An advanced detrending
method with application to HRV analysis. IEEE Trans. Biomed. Eng. 49 (2002)
172–175
37. Holton, B.D., Mannapperuma, K., Lesniewski, P.J., Thomas, J.C.: Signal
recovery in imaging photoplethysmography. Physiol. Meas. 34 (2013) 1499–
1511
38. McDuff, D., Gontarek, S., Picard, R.W.: Improvements in remote
cardiopulmonary measurement using a five band digital camera. IEEE Trans.
Biomed. Eng. 61 (2014) 2593–2601
39. Wang, W., Stuijk, S., De Haan, G.: Exploiting spatial redundancy of image
sensor for motion robust rPPG. IEEE Trans. Biomed. Eng. 62 (2015) 415–425
40. Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple
features. In: Proceedings of the 2001 IEEE Computer Society Conference on
Computer Vision and Pattern Recognition. (2001) 511–518
41. Shi, J., Tomasi, C.: Good features to track. In: Proceedings of the 1994 IEEE
Computer Society Conference on Computer Vision and Pattern Recognition.
(1994) 593–600
42. Lucas, B.D., Kanade, T.: An iterative image registration technique with an
application to stereo vision. In: Proceedings of the 7th International Joint
Conference on Artificial Intelligence (IJCAI). (1981) 674–679
43. Bioplux: Wireless Biosignals, http://www.plux.info/index.php/en/ [accessed
2016-08-19]
44. Hariharan, A., Adam, M.T.P., Teubner, T., Weinhardt, C.: Think, feel, bid: The
impact of environmental conditions on the role of bidders’ cognitive and
affective processes in auction bidding. Electron. Mark. (2016) 1–17
45. Schaaff, K., Degen, R., Adler, N., Adam, M.T.P.: Measuring Affect Using a
Standard Mouse Device. Biomed. Eng. (NY). 57 (2012) 761–764
46. Hariharan, A., Adam, M.T.P., Dorner, V., Lux, E., Müller, M.B., Pfeiffer, J.,
Weinhardt, C.: Brownie: A platform for conducting neurois experiments.
(2015)
47. Müller, M.B., Hariharan, A., Adam, M.T.P.: A NeuroIS Platform for Lab
Experiments. In: Gmunden Retreat on NeuroIS. (2014) 15–17
48. Riedl, R.: On the biology of technostress: Literature review and research
agenda. ACM SIGMIS Database. 44 (2013) 18–55