AUTOMATIC HEART RATE ESTIMATION FROM PAINFUL FACES
Philipp Werner¹, Ayoub Al-Hamadi¹, Steffen Walter², Sascha Gruss², Harald C. Traue²
¹Institute for Information Technology and Communications, University of Magdeburg, Germany
{Philipp.Werner, Ayoub.Al-Hamadi}@ovgu.de
²Department for Psychosomatic Medicine and Psychotherapy, University of Ulm, Germany
ABSTRACT
Non-contact measurement of the heart rate is more comfort-
able than classical methods and can facilitate new applica-
tions. However, current approaches are very susceptible to
motion. Aiming at overcoming this limitation, we propose
a new, more robust approach to estimate the heart rate from
a videotaped face. It features non-planar motion compen-
sation, fusion of multiple ROI signals, and a RANSAC-like
time-domain heart rate estimation algorithm. In experiments
with a comprehensive pain recognition dataset we show that
our approach outperforms previous methods in the presence
of spontaneous head movement and facial expression.
Index Terms— heart rate, imaging photoplethysmography (PPG), motion artifacts, facial expression, pain
1. INTRODUCTION
The heart rate (HR) and its variability are important physi-
ological parameters, not only for patients in life-threatening
conditions, but also for risk assessment [1] or finding the
right balance in sporting activities [2]. The gold standard
measurement method is electrocardiography (ECG). However,
it requires medical staff to properly attach electrodes to the
patient. These can cause skin irritation and discomfort. An
easier-to-use alternative for obtaining the heart rate is a pulse
oximetry sensor. It measures the peripheral blood perfusion
optically through a method called photoplethysmography
(PPG). As blood absorbs more light than surrounding tis-
sue, the blood volume pulse is reflected in periodic changes
of the light absorption. Usually, the sensor is attached to a
finger or earlobe with a spring-loaded clip, which may be
uncomfortable or even painful when worn too long.
Next to the clinically established contact PPG method,
several approaches for remote PPG have been proposed, most
working with cheap consumer cameras. They promise very
comfortable heart rate measurement and open up prospects
for new applications, e. g. in tele-medicine or sports. Some of
the methods rely on a dedicated light source [3, 4, 5, 6, 2, 7, 8]
as contact PPG. Others showed that it is possible to measure
This work was funded by the German Research Foundation (DFG),
project AL 638/3-1 AOBJ 585843.
several physiological parameters with ambient light only [9,
10, 11, 12, 13, 14].
In general, imaging-based PPG methods extract the mean
intensity of a region of interest (ROI) on face or hand for
each video frame, obtaining a temporal signal for further pro-
cessing. In the case of color cameras, the signal is either obtained from the mean of the green channel [9, 6, 7, 13], or
by extracting the mean-of-region signal for each color chan-
nel and applying independent component analysis [10, 11],
principal component analysis [12] or non-linear dimension
reduction techniques [14]. The latter methods aim at sepa-
rating the original PPG from interfering signals. The imag-
ing PPG signal, i. e. mean intensity, mean of green channel
or result of blind source separation is either directly analyzed
using frequency-domain methods (e. g. [9, 10, 2]) or band-
pass filtered and interpolated for detection of peaks or zero-
crossings in the time domain, which ideally correspond to
heart beats (e. g. [11, 14]). The resulting time series is filtered
for wrong beats followed by the calculation of the inter-beat-
intervals (IBIs, corresponding to RR intervals in ECG). The
mean heart rate is then determined from the mean of IBIs.
Imaging PPG measures the amount of reflected light
which changes with the cardiac cycle. However, this change
has a low amplitude compared to variations of other factors
like the location of measurement, its illumination or the cam-
era configuration. Consider a slight involuntary movement
of the head. For example, it might 1) alter the measurement
location to one with a different tissue structure (skin texture),
2) shadow the measurement location, or 3) trigger an automatic
gain correction of the camera. Each of these effects induces
artifacts in the measured signal, very often with higher am-
plitude than the pulsatile PPG signal component.
Most previous studies tried to avoid motion-induced arti-
facts by instructing the participants to minimize movement.
Even studies which aimed at motion-compensation only al-
lowed slow movement [10] or planar movement [2]. How-
ever, these constraints restrict the practical applicability. In
various potential applications the subject cannot be expected
to be motionless (e. g. for long-term monitoring or during
sports), and it can never be enforced. Further, several of the
above-mentioned works do not describe complete algorithms,
but rely on manual intervention e. g. for selecting the blind
This is the accepted manuscript. The final, published version is available on IEEE Xplore.
P. Werner, A. Al-Hamadi, S. Walter, S. Gruss, und H. C. Traue, "Automatic Heart Rate Estimation from Painful Faces", in
IEEE International Conference on Image Processing, Paris, France, 2014, pp. 1947–1951.
Fig. 1. Regions of interest and facial points used as anchors.
source separation component [2, 12] or ensuring correctness
of the extracted IBIs [8]. This is also not acceptable in practi-
cal applications.
In this paper, we contribute a novel, fully automatic imag-
ing PPG method that is more robust to strong head motion and
facial expression than previous approaches. Sect. 2 describes
our algorithm in detail. In Sect. 3 we summarize the experi-
ments we conducted with the BioVid Heat Pain Database [15,
16]. Conclusion and outlook follow in Sect. 4.
2. ROBUST HEART RATE ESTIMATION
The new approach we propose here extracts multiple PPG
signals from motion-stabilized ROIs (Sect. 2.1), detects and
groups peaks from the signals to create a set of hypothetical
beat-IBI pairs (Sect. 2.2), and looks for the most plausible IBI
series (Sect. 2.3), which is used to calculate the HR. For the
following description and our experiments we assumed the HR
to be in the range of 30 to 200 beats per minute (bpm).
2.1. Motion-Compensated PPG Signals from Video
As in several previous works, we use the face detector by
Lienhart et al. [17] as provided in the OpenCV library. Pre-
vious works determine their region of interest (ROI) directly
from the facial bounding box. In contrast, we refine the lo-
cation using state-of-the-art facial feature point detection by
Xiong et al. [18]. Among others, it provides the points de-
picted in Fig. 1. We temporally smooth the points with a five
frame median filter. From the resulting points we calculate
four ROIs as specified in the appendix. One is on the fore-
head, one on each of the cheeks, and one covering a larger
part of the face (see Fig. 1). This way of determining the ROIs
stabilizes the location of measurement for a wide range of
motions, including out-of-plane rotations, and significantly
reduces motion artifacts. Using multiple ROIs reduces the
probability of losing PPG peaks through local artifacts, which
are common with facial expressions (e. g. deepening of
the nasolabial fold or closing of the eyes). For each ROI
and frame we calculate the mean of the green channel (discussion in Sect. 3). The resulting signals are band-pass filtered (zero-phase FIR filter, 64-point Hamming window, 0.5–3.33 Hz).

Fig. 2. Raw mean-of-green-channel signals (top), band-pass
filtered signals (middle), ECG signal (bottom), and some of
the corresponding video frames.

However, as evident in Fig. 2, the proposed motion
compensation and filtering is not sufficient to remove artifacts
completely. Several problems like lighting change or occlu-
sions/disocclusions remain. As we allow for spontaneous,
fast movements, the artifacts' frequency spectra often overlap
with the band of interest, so artifacts introduce additional
peaks or phase shifts in the band-pass filtered signal.
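The band-pass filtering step described above can be sketched in a few lines. This is a minimal illustration assuming SciPy; the 64-point Hamming-window FIR design, the 0.5–3.33 Hz pass band, and the zero-phase (forward-backward) application follow the text, while the 25 Hz sampling rate matches the video frame rate reported in Sect. 3:

```python
import numpy as np
from scipy.signal import firwin, filtfilt

def bandpass_ppg(signal, fs=25.0, numtaps=64, band=(0.5, 3.33)):
    """Zero-phase FIR band-pass filter for a raw mean-of-green ROI signal."""
    # FIR design with a Hamming window (firwin's default window).
    taps = firwin(numtaps, band, pass_zero=False, fs=fs)
    # filtfilt runs the filter forward and backward -> zero phase shift.
    return filtfilt(taps, 1.0, signal)
```

Note that filtfilt applies the filter twice, which cancels the phase delay at the cost of squaring the magnitude response.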
2.2. Inter-Beat-Interval Candidates
The next step is to detect potential PPG peaks (corresponding
to heart beats), from which we derive IBI candidates. For this,
the band-pass filtered ROI signals are first interpolated with
a cubic spline function to a sampling rate of 100 Hz. This
refines the temporal resolution for peak detection, which is
the next step. We only consider the upper half of each signal,
i. e. we subtract the signal's median and truncate the resulting
signal at the zero level (see Fig. 3a). We observed that this
reduces the rate of artifact peaks. Further, if there are multiple
peaks within a period shorter than the minimum IBI (0.3 s) in
one signal, only the peak with the highest amplitude is kept.
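The interpolation and half-wave peak picking can be sketched as follows. This is a hedged sketch assuming SciPy, in which find_peaks with a distance constraint stands in for the rule that only the highest peak within the 0.3 s minimum IBI is kept:

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import find_peaks

def detect_ppg_peaks(signal, fs=25.0, fs_up=100.0, min_ibi=0.3):
    """Detect candidate PPG peaks in a band-pass filtered ROI signal."""
    t = np.arange(len(signal)) / fs
    t_up = np.arange(0, t[-1], 1.0 / fs_up)
    # Cubic-spline interpolation refines the temporal resolution to 100 Hz.
    x = CubicSpline(t, signal)(t_up)
    # Keep only the upper half: subtract the median, clip negatives to zero.
    x = np.maximum(x - np.median(x), 0.0)
    # Within any window shorter than the minimum IBI, find_peaks keeps
    # only the highest local maximum (the `distance` constraint).
    idx, _ = find_peaks(x, height=1e-9, distance=int(min_ibi * fs_up))
    return t_up[idx]  # peak times in seconds
```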
Next, regardless of their origin (ROI) or amplitude, the
peaks are grouped in the time dimension employing agglom-
erative clustering with complete linkage to obtain the PPG
peak candidates (see Fig. 3b-c). More precisely, if there are
multiple peaks in a period of less than 0.2 s, they are replaced
by one peak at their average time.
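The grouping step can be illustrated with SciPy's hierarchical clustering; a sketch under the assumption that fcluster's distance criterion with complete linkage captures the intended 0.2 s grouping rule:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_peaks(peak_times, max_dist=0.2):
    """Group peaks from all ROIs that lie within 0.2 s of each other."""
    times = np.sort(np.asarray(peak_times, dtype=float))
    if len(times) < 2:
        return times
    # Agglomerative clustering with complete linkage on the 1-D peak times.
    Z = linkage(times.reshape(-1, 1), method='complete')
    labels = fcluster(Z, t=max_dist, criterion='distance')
    # Replace each cluster by one PPG peak candidate at its average time.
    return np.sort([times[labels == k].mean() for k in np.unique(labels)])
```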
For each PPG peak candidate, we calculate the periods to
all potentially correct PPG peak successors, i. e. to all candi-
dates within the following 2 s (maximum IBI). This yields the
IBI candidates which can be plotted in time-IBI plane. The
plot implicitly defines a directed graph of all possible IBI se-
ries (see Fig. 3d).
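Deriving the IBI candidates from the clustered peak times is then a matter of pairing each peak with all successors within the 2 s maximum IBI. In this sketch each candidate is anchored at the time of the later peak; that placement is our assumption, since the text does not state where in the time-IBI plane a candidate sits:

```python
import numpy as np

def ibi_candidates(peak_times, max_ibi=2.0):
    """Build (time, IBI) candidates from all peak pairs within 2 s."""
    peak_times = np.sort(np.asarray(peak_times, dtype=float))
    cands = []
    for i, t0 in enumerate(peak_times):
        for t1 in peak_times[i + 1:]:
            if t1 - t0 > max_ibi:
                break  # later successors are even further away
            # Each candidate: the successor's time and the hypothesized IBI.
            cands.append((t1, t1 - t0))
    return cands
```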
Fig. 3. Getting the IBI candidates. (a) Upper halves of the
spline interpolated band-pass filtered ROI signals with de-
tected peaks. (b) Temporal distribution of peaks from all
ROIs. (c) Result of peak clustering (PPG peak candidates).
(d) IBI candidates with graph of all possible IBI series. Dot-
ted vertical lines are ECG R-wave peaks (for comparison).
2.3. Interval Series Consensus
The last step is to find the most plausible IBI series in the
time-IBI plane. Our algorithm to solve this task
is inspired by the RANSAC algorithm [19], which can fit a
model to a set of samples, even if it is dominated by outliers.
Basically, we apply the idea of creating a set of hypotheses,
ranking them and selecting the best. Each hypothesis is cre-
ated from a minimal sample set. The ranking is based on
the consensus set, i. e. the samples which are consistent with
the hypothesized model. As the IBI model we choose a lin-
ear function of time. Samples which deviate less than ±20%
from the model are considered to be consistent, i. e. are in the
consensus set (see Fig. 4).
In contrast to RANSAC, we do not determine our hypotheses
from randomly chosen samples. Since the HR does not change
abruptly, most randomly chosen pairs would result in a model
line with too steep a slope. Thus, to reduce computation time,
we select the pairs systematically: first, by only combining
samples from the left half of the plane with samples from the
right; second, by only considering the lowest-slope hypotheses
(10% of all).
A second difference from RANSAC and its variants is how
we rank a consensus set. Again, we exploit application-domain
knowledge. First, the graph of model-consistent IBI series is
created (arrows in Fig. 4). It may consist of multiple
unconnected subgraphs. Next, we look for the longest path in
this graph, i. e. the longest continuous IBI series regarding the
number of intervals (blue arrows in Fig. 4). The length of this
series is used to rank a consensus set. Thus, the longest of all
nearly linear IBI series with low slope is considered the most
plausible. Finally, we calculate the HR as 60/IBI, with IBI
denoting the mean IBI (in seconds) of the longest series.

Fig. 4. Interval Series Consensus. Three hypothetical linear IBI
models (solid lines), each with its consensus set range (gray
background), IBI series graph (arrows) and longest connected
IBI series (thick blue arrows). The middle series is selected
as the best solution. ECG IBIs (orange) are shown for comparison.
The HR estimation error of this sample is 1.14 bpm.
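The whole consensus search can be sketched as below. This is an illustrative reimplementation under several assumptions of ours, not the authors' code: each (time, IBI) candidate is anchored at its later peak, two candidates are chained when one starts where the other ends (within a small tolerance eps), and ties among equal-slope hypotheses are broken arbitrarily:

```python
import numpy as np

def ibisc(cands, tol=0.2, keep_frac=0.1):
    """RANSAC-like Interval Series Consensus (sketch of Sect. 2.3).

    cands: list of (time, ibi) candidates. Returns the mean IBI of the
    longest model-consistent IBI series, or None if nothing chains.
    """
    cands = sorted(cands)
    times = np.array([t for t, _ in cands])
    ibis = np.array([v for _, v in cands])
    mid = (times.min() + times.max()) / 2.0
    # Hypotheses: pair one sample from the left half with one from the right.
    hyps = [(i, j) for i in np.flatnonzero(times <= mid)
                   for j in np.flatnonzero(times > mid)]
    # Keep only the lowest-slope 10% (the HR does not change abruptly).
    def slope(p):
        i, j = p
        return abs((ibis[j] - ibis[i]) / (times[j] - times[i]))
    hyps = sorted(hyps, key=slope)[:max(1, int(keep_frac * len(hyps)))]

    best = []
    for i, j in hyps:
        # Linear IBI model through the two chosen samples.
        a = (ibis[j] - ibis[i]) / (times[j] - times[i])
        b = ibis[i] - a * times[i]
        model = a * times + b
        # Consensus set: candidates deviating less than +-20% from the model.
        idx = np.flatnonzero(np.abs(ibis - model) < tol * model)
        series = longest_series(times[idx], ibis[idx])
        if len(series) > len(best):
            best = series
    return float(np.mean(best)) if best else None

def longest_series(times, ibis, eps=0.02):
    """Longest path in the graph of chainable IBIs: candidate m follows
    candidate k if m's interval starts where k ends (times[m] - ibis[m]
    close to times[k])."""
    n = len(times)
    if n == 0:
        return []
    order = np.argsort(times)
    times, ibis = times[order], ibis[order]
    chains = [[ibis[k]] for k in range(n)]
    for m in range(n):
        for k in range(m):
            if abs((times[m] - ibis[m]) - times[k]) < eps:
                if len(chains[k]) + 1 > len(chains[m]):
                    chains[m] = chains[k] + [ibis[m]]
    return max(chains, key=len)
```

With clean candidates at a steady rhythm, the flat hypothesis wins and the mean IBI of the longest chain yields the HR as 60/IBI.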
3. EXPERIMENTS
We conduct experiments with the BioVid Heat Pain Database
[15, 16], a comprehensive dataset collected in a study with
90 participants. In the study, each participant was subjected
to a series of painful heat stimuli alternating with periods of
rest. Video (25 Hz) of the upper body and ECG (512 Hz)
were recorded synchronously. The participants were allowed
to move freely and behave spontaneously while remaining
seated in front of the camera. So the recorded material con-
tains a lot of head motion, including non-planar motion, and
even more facial expression, namely expression of pain. We
extract 3,600 non-overlapping time windows of 8 s length,
half during stimulation of highest induced pain level and half
during the resting periods. For each of the time windows, we
determined the mean heart rate (HR) from ECG (as ground
truth) and from video employing several methods. ECG heart
beats (R wave peaks) were detected through the algorithm of
Hamilton and Tompkins [20]. Some detection errors that
occurred were manually corrected. The determined ground truth
HR ranges from 41.5 to 112.7 bpm (mean 72.4, SD 11.3). For
each method and sample (time window), we calculate the HR
estimation error as E = HR_iPPG − HR_ECG.
Table 1 and Fig. 5 compare the error of the proposed
imaging PPG method with some other methods. In Table 1
we list the mean and standard deviation of the error, as well
as the percentage of samples with an error in range ±3 and
±10 bpm. Fig. 5 shows the empirical distribution functions
of the absolute error. We compare our approach (IBISC)
with that of Poh et al. [10] (F-ICA-2), since they describe
their method as motion-tolerant.

Fig. 5. Empirical cumulative distribution functions of the
absolute heart rate estimation error |E| for several methods:
F-ICA-2 [10], F-ICA-2 ZP3, F-ICA-P ZP3, F-Green [9], and
IBISC (ours).

                     E (bpm)       P(|E| ≤ ...)
Method             Mean    SD     3 bpm   10 bpm
F-ICA-2 [10]       -5.8   22.7     0.38     0.57
F-ICA-2 ZP3        -5.8   22.0     0.47     0.59
F-ICA-P ZP3        -6.9   15.2     0.64     0.72
F-Green [9]        -6.5   13.8     0.71     0.78
IBISC (ours)        0.2   10.9     0.78     0.91

Table 1. HR estimation error of several methods: mean and
standard deviation of the error E, and empirical probability of
an absolute error |E| less than or equal to 3 and 10 bpm.

For this, we extracted the
mean of the color channels from our motion-compensated
face ROI, then applied the JADE independent component
analysis (ICA) algorithm on the three color signals and fi-
nally selected the frequency with the highest power in the
band of 0.7-4 Hz from the second independent component, as
described in their paper [10]. Due to the characteristics of our
data, it was not possible to use time windows of 30 s and apply
their sliding-window artifact rejection. Under these conditions,
the HR determined by Poh’s method is strongly biased and
has a high deviation from ground truth. More precisely, on
average the HR was estimated about 6 bpm too low, and
only 57% of the samples deviated less than 10 bpm from the
correct HR. Since our shorter time window leads to fewer
frequency bins, we modified the approach by zero-padding
the signal before applying the Fourier transform, as proposed
by Verkruysse et al. [9]: we subtract the mean and append a
zero signal of three times the original signal's length. The
modified method (F-ICA-2 ZP3) is more accurate (higher
initial slope in Fig. 5). We further modified the method to
choose the independent component with the highest power
peak (F-ICA-P ZP3) as proposed by Poh et al. in a follow-up
publication [11]. This significantly improves performance.
However, it is still outperformed by simply skipping the ICA
and using the green channel instead (F-Green), as previously
done by Verkruysse et al. [9] and others. We conclude that
there is no benefit from ICA in the presence of very high
power artifacts, since it introduces the additional problem of
selecting an output component for further processing.
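The frequency-domain baseline with zero-padding can be sketched as follows; a hedged NumPy sketch of the F-Green-style estimate (mean subtraction, appending a zero signal of three times the original length, and picking the highest-power frequency in the 0.7–4 Hz band):

```python
import numpy as np

def hr_freq_estimate(signal, fs=25.0, band=(0.7, 4.0), pad_factor=3):
    """Frequency-domain HR estimate with zero-padding (F-Green style)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # Append a zero signal of pad_factor times the original length:
    # this refines the FFT frequency-bin spacing for short windows.
    x = np.concatenate([x, np.zeros(pad_factor * len(x))])
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    # HR (bpm) from the highest-power in-band frequency.
    f_peak = freqs[in_band][np.argmax(spec[in_band])]
    return 60.0 * f_peak
```

Even with 3x zero-padding, an 8 s window at 25 Hz gives a bin spacing of about 0.03 Hz (roughly 1.9 bpm), which illustrates why short windows hurt pure frequency-domain methods.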
Table 1 shows that all frequency-domain methods
(F-...) suffer from a strong bias, i. e. they tend to measure
too low an HR. The reason is their implicit assumption that
the HR is the dominant frequency in the considered band. This
assumption is often violated, because we allowed the subjects
to move spontaneously. These movements introduce a broad
variety of artifacts, often including in-band spectral compo-
nents with higher power than the pulsatile PPG signal com-
ponent.
In contrast, our method (IBISC) is far less biased (only
0.2 bpm), as it does not rely on the spectral power assumption.
It further has the lowest error standard deviation and the best
absolute error distribution of the compared methods. For
example, the deviation from ground truth is less than 1 bpm
for 55% of the tested samples and less than 3 bpm for 78%.
4. CONCLUSION
In this work, we have introduced a new algorithm for auto-
matic heart rate estimation from a videotaped face. It is based
on motion-compensated imaging PPG signals from multiple
ROIs and an inter-beat-interval series consensus (IBISC) al-
gorithm that finds the most plausible IBI series. Our approach
has been shown to be more robust to strong head movement
and facial expression than previous methods.
However, the measurement accuracy is still not sufficient
for several applications. A promising direction for further re-
search is to reduce the rate of artifact peaks. This decreases
the probability of selecting a false hypothesis in the IBISC
algorithm, and consequently decreases the rate of high mea-
surement errors.
5. APPENDIX
Here we define the regions of interest (ROI) used for the
PPG signal extraction. Given the points p_ell, p_elr, p_erl, p_err,
p_bl, p_br, p_ml, p_mr as shown in Fig. 1 and COG(X) being
the center of gravity of the point set X, we define p_ec =
COG({p_ell, p_elr, p_erl, p_err}), d_e = COG({p_erl, p_err}) −
COG({p_ell, p_elr}), and d_v = COG({p_ml, p_mr}) − p_ec. The
corner points of the face ROI are calculated as r_{1,i} = p_ec +
α_i d_e + β_i d_v with 1 ≤ i ≤ 4 and α_1 = α_4 = −0.8, α_2 = α_3 = 0.8,
β_1 = β_2 = −0.6 and β_3 = β_4 = 1.2. The forehead ROI corner
points are r_{2,i} = p_br + γ_{3−i} d_v for i = 1, 2 and r_{2,i} =
p_bl + γ_{i−2} d_v for i = 3, 4 with γ_1 = −0.02 and γ_2 = −0.3.
The left and right cheek ROI points (r_{cl,i} and r_{cr,i}) are
r_{cj,i} = p_mj + δ_{3−i} (p_ejr − p_mj) for i = 1, 2 and r_{cj,i} =
p_mj + δ_{i−2} (p_ejl − p_mj) for i = 3, 4, both for j ∈ {l, r} (left,
right) with δ_1 = 0.25 and δ_2 = 0.75.
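The face ROI construction can be sketched numerically. The coefficient signs below are chosen so that the four corners span a quadrilateral around the eye center (negative offsets toward the left and the forehead); treat the exact values and signs as an assumption:

```python
import numpy as np

def face_roi_corners(p_ell, p_elr, p_erl, p_err, p_ml, p_mr):
    """Face ROI corners from eye and mouth landmarks (appendix formulas).

    Coefficient signs are our reconstruction, not confirmed values."""
    cog = lambda pts: np.mean(pts, axis=0)            # center of gravity
    p_ec = cog([p_ell, p_elr, p_erl, p_err])          # eye center
    d_e = cog([p_erl, p_err]) - cog([p_ell, p_elr])   # eye-to-eye direction
    d_v = cog([p_ml, p_mr]) - p_ec                    # eyes-to-mouth direction
    alpha = [-0.8, 0.8, 0.8, -0.8]                    # horizontal coefficients
    beta = [-0.6, -0.6, 1.2, 1.2]                     # vertical coefficients
    return [p_ec + a * d_e + b * d_v for a, b in zip(alpha, beta)]
```

Because the corners are expressed relative to the eye-to-eye and eyes-to-mouth axes, the ROI follows in-plane rotation and scale changes of the face automatically.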
6. REFERENCES
[1] A. J. Camm, M. Malik, J. T. Bigger, G. Breithardt,
S. Cerutti, R. J. Cohen, P. Coumel, E. L. Fallen, H. L.
Kennedy, and R. E. Kleiger, “Heart rate variability:
standards of measurement, physiological interpretation
and clinical use.,” European Heart Journal, vol. 17, pp.
354–381, 1996.
[2] Y. Sun, S. Hu, V. Azorin-Peris, S. Greenwald, J. Cham-
bers, and Y. Zhu, “Motion-compensated noncontact
imaging photoplethysmography to monitor cardiorespi-
ratory status during exercise,” Journal of Biomedical
Optics, vol. 16, no. 7, pp. 077010, 2011.
[3] T. Wu, “PPGI: new development in noninvasive and
contactless diagnosis of dermal perfusion using near In-
fraRed light,” J. of the GCPD e.V., vol. 7, no. 1, pp. 17–24,
Oct. 2003.
[4] K. Humphreys, T. Ward, and C. Markham, “Noncontact
simultaneous dual wavelength photoplethysmography:
a further step toward noncontact pulse oximetry,” Re-
view of scientific instruments, vol. 78, no. 4, pp. 044304,
2007.
[5] G. Cennini, J. Arguel, K. Akşit, and A. van Leest, “Heart
rate monitoring via remote photoplethysmography with
motion artifacts reduction,” Optics Express, vol. 18, no.
5, pp. 4867, Feb. 2010.
[6] U. Rubins, V. Upmalis, O. Rubenis, D. Jakovels, and
J. Spigulis, “Real-time photoplethysmography imag-
ing system,” in 15th Nordic-Baltic Conference on
Biomedical Engineering and Medical Physics (NBC
2011), number 34 in IFMBE Proceedings, pp. 183–186.
Springer Berlin Heidelberg, Jan. 2011.
[7] C. G. Scully, J. Lee, J. Meyer, A. M. Gor-
bach, D. Granquist-Fraser, Y. Mendelson, and K. H.
Chon, “Physiological parameter monitoring from opti-
cal recordings with a mobile phone,” IEEE Transactions
on Biomedical Engineering, vol. 59, no. 2, pp. 303–306,
Feb. 2012.
[8] Y. Sun, S. Hu, V. Azorin-Peris, R. Kalawsky, and S.
Greenwald, “Noncontact imaging photoplethysmogra-
phy to effectively access pulse rate variability,” Journal
of Biomedical Optics, vol. 18, no. 6, pp. 061205, 2013.
[9] W. Verkruysse, L. O. Svaasand, and J. Stuart Nel-
son, “Remote plethysmographic imaging using ambient
light,” Optics express, vol. 16, no. 26, pp. 21434-21445,
2008.
[10] M. Z. Poh, D. J. McDuff, and R. W. Picard, “Non-
contact, automated cardiac pulse measurements using
video imaging and blind source separation,” Optics Ex-
press, vol. 18, no. 10, pp. 10762-10774, 2010.
[11] M. Z. Poh, D. J. McDuff, and R. W. Picard, “Advancements
in noncontact, multiparameter physiological measurements
using a webcam,” IEEE Transactions on Biomedical
Engineering, vol. 58, no. 1, pp. 7–11, 2011.
[12] M. Lewandowska, J. Ruminski, T. Kocejko, and
J. Nowak, “Measuring pulse rate with a webcam - a
non-contact method for evaluating cardiac activity,” in
2011 Federated Conference on Computer Science and
Information Systems (FedCSIS), 2011, pp. 405–410.
[13] Y. Sun, C. Papin, V. Azorin-Peris, R. Kalawsky, S.
Greenwald, and S. Hu, “Use of ambient light in re-
mote photoplethysmographic systems: comparison be-
tween a high-performance camera and a low-cost web-
cam,” Journal of Biomedical Optics, vol. 17, no. 3, pp.
037005, Mar. 2012.
[14] L. Wei, Y. Tian, Y. Wang, T. Ebrahimi, and T. Huang,
“Automatic webcam-based human heart rate measurements
using laplacian eigenmap,” in Computer Vision –
ACCV 2012, pp. 281–292. Springer, 2013.
[15] P. Werner, A. Al-Hamadi, R. Niese, S. Walter, S. Gruss,
and H. C. Traue, “Towards pain monitoring: Facial ex-
pression, head pose, a new database, an automatic sys-
tem and remaining challenges,” in Proceedings of the
British Machine Vision Conference. 2013, BMVA Press.
[16] S. Walter, P. Werner, S. Gruss, H. Ehleiter, J. Tan, H. C.
Traue, A. Al-Hamadi, A. O. Andrade, G. Moreira da
Silva, and S. Crawcour, “The BioVid heat pain database:
Data for the advancement and systematic validation of
an automated pain recognition system,” in Cybernetics
(CYBCONF), 2013 IEEE International Conference on,
2013, pp. 128–131.
[17] R. Lienhart, A. Kuranov, and V. Pisarevsky, “Empirical
analysis of detection cascades of boosted classifiers for
rapid object detection,” in DAGM 25th Pattern Recog-
nition Symposium, 2003, pp. 297–304.
[18] X. Xiong and F. De la Torre, “Supervised descent
method and its applications to face alignment,” in
Computer Vision and Pattern Recognition (CVPR), 2013
IEEE Conference on, 2013, pp. 532–539.
[19] M. A. Fischler and R. C. Bolles, “Random sample con-
sensus: A paradigm for model fitting with applications
to image analysis and automated cartography,” Commun.
ACM, vol. 24, no. 6, pp. 381–395, June 1981.
[20] P. S. Hamilton and W. J. Tompkins, “Quantitative in-
vestigation of QRS detection rules using the MIT/BIH
arrhythmia database,” IEEE Transactions on Biomedical
Engineering, vol. 33, no. 12, pp. 1157–1165, 1986.
... Only a few papers used public datasets for remote HR estimation from face videos (Li et al., 2014;Werner et al., 2014;Lam and Yoshinori, 2015;Tulyakov et al., 2016). While other researchers gathered their own datasets, where experiment settings vary substantially from camera settings, lighting situations to ground truth HR measurements as shown in Appendix I in Supplementary Material. ...
... Model-based approaches have been applied with accurate localization and tracking of facial landmarks (Werner et al., 2014;Lam and Yoshinori, 2015;Tulyakov et al., 2016). Datcu et al. (2013) uses a statistical method called active appearance model to handle shape and texture variation. ...
Article
Full-text available
Remotely measuring physiological activity can provide substantial benefits for both the medical and the affective computing applications. Recent research has proposed different methodologies for the unobtrusive detection of heart rate (HR) using human face recordings. These methods are based on subtle color changes or motions of the face due to cardiovascular activities, which are invisible to human eyes but can be captured by digital cameras. Several approaches have been proposed such as signal processing and machine learning. However, these methods are compared with different datasets, and there is consequently no consensus on method performance. In this article, we describe and evaluate several methods defined in literature, from 2008 until present day, for the remote detection of HR using human face recordings. The general HR processing pipeline is divided into three stages: face video processing, face blood volume pulse (BVP) signal extraction, and HR computation. Approaches presented in the paper are classified and grouped according to each stage. At each stage, algorithms are analyzed and compared based on their performance using the public database MAHNOB-HCI. Results found in this article are limited on MAHNOB-HCI dataset. Results show that extracted face skin area contains more BVP information. Blind source separation and peak detection methods are more robust with head motions for estimating HR.
... They also expose superficial blood vessels where the light can easily transmit and bounce back to the HR's camera. Furthermore, these selected regions are less affected by the muscle motions compare to the other facial regions [35][36][37]. Initially, the corresponding forehead region is extracted with respect to the specified facial feature points in the x, y coordinates as illustrated in Fig. 9 and calculated as: ...
Article
Full-text available
Populations are ageing and the healthcare costs are increasing accordingly. Humanoid Robots (HRs) perform basic but crucial health checkups such as the heart rate can efficiently meet the healthcare demands. This paper develops a 9-stage heart rate estimation algorithm and implements it to an HR. The 9-stages cover the recognition of the face with the Viola–Jones algorithm, determination of the facial regions with the geometric-based facial distance measurement technique, extraction of the forehead and cheek regions, tracking of these facial regions with the Hierarchical Multi Resolution algorithm, decomposition of the facial regions in the Red–Green–Blue (RGB) color channels, averaging and normalization of the RGB color data, elimination of the artifacts with the Adaptive Independent Component Analysis (ICA) technique, calculating the power spectrum of the data with the Fast Fourier Transform (FFT) technique, and finally determining the peaks inside the threshold reflecting the human heart rate boundaries. One of the key contributions of this paper is building and incorporating the Hierarchical Multi Resolution technique in the heart rate estimation algorithm to eliminate the deteriorating effects of the human and camera motions. A further contribution of this paper is generating a rule-based approach to discard the effects of the sudden movements. These two contributions have noticeably improved the accuracy of the heart rate estimation algorithm in the dynamic environments. The algorithm has been assessed extensively with 5 different experimental scenarios consisting of 33 conditions.
... Obwohl Signalverläufe eng miteinander korrelieren, können einige Biosignale durch verschiedene Sensoren gemessen werden, z. B. die Herzfrequenz mittels EKG, PPG, ACM, Kameras usw.[33][34][35]. Die Mimik kann sowohl mittels Kameras als auch mittels oEMG erfasst werden, das oEMG ist dabei auf die Ableitung des M. corrugator und M. zygomaticus beschränkt[36]. ...
Article
Background In patients with limited communication skills, the use of conventional scales or external assessment is only possible to a limited extent or not at all. Multimodal pain recognition based on artificial intelligence (AI) algorithms could be a solution.Objective Overview of the methods of automated multimodal pain measurement and their recognition rates that were calculated with AI algorithms.Methods In April 2018, 101 studies on automated pain recognition were found in the Web of Science database to illustrate the current state of research. A selective literature review with special consideration of recognition rates of automated multimodal pain measurement yielded 14 studies, which are the focus of this review.ResultsThe variance in recognition rates was 52.9–55.0% (pain threshold) and 66.8–85.7%; in nine studies the recognition rate was ≥80% (pain tolerance), while one study reported recognition rates of 79.3% (pain threshold) and 90.9% (pain tolerance).Conclusion Pain is generally recorded multimodally, based on external observation scales. With regard to automated pain recognition and on the basis of the 14 selected studies, there is to date no conclusive evidence that multimodal automated pain recognition is superior to unimodal pain recognition. In the clinical context, multimodal pain recognition could be advantageous, because this approach is more flexible. In the case of one modality not being available, e.g., electrodermal activity in hand burns, the algorithm could use other modalities (video) and thus compensate for missing information.
... Continuous valued output is obtained with the related Support Vector Regression [6], [7], [10], [35], [41], [42] and Relevance Vector Regression [5], [12], [13] models. Other widely used models are Random Forests (RF) [38], [46]- [49], [55], [58]- [63], [65], [72], [80], [179], Nearest Neighbor (NN) classifiers [14], [17], [22], [40], [48], [68], [72], [73], variations of Conditional Random Fields (CRF) [8], [16], [30], [31], [33], and various neural networks. Neural networks models used for pain recognition include Convolutional Neural Networks (CNN) [27], [37], [43], [67], [80], Long Short-Term Memory (LSTM) networks [16], [27], [80], Radial Basis Function (RBF) networks [44], [49], Restricted Bolzmann Machine (RBM) [71], [105], multi-task network [50], PCA neural network [81], and traditional multi-layer perceptrons [49], [66], [69], [83]. ...
Article
Full-text available
Pain is a complex phenomenon, involving sensory and emotional experience, that is often poorly understood, especially in infants, anesthetized patients, and others who cannot speak. Technology supporting pain assessment has the potential to help reduce suffering; however, advances are needed before it can be adopted clinically. This survey paper assesses the state of the art and provides guidance for researchers to help make such advances. First, we overview pain’s biological mechanisms, physiological and behavioral responses, emotional components, as well as assessment methods commonly used in the clinic. Next, we discuss the challenges hampering the development and validation of pain recognition technology, and we survey existing datasets together with evaluation methods. We then present an overview of all automated pain recognition publications indexed in the Web of Science as well as from the proceedings of the major conferences on biomedical informatics and artificial intelligence, to provide understanding of the current advances that have been made. We highlight progress in both non-contact and contact-based approaches, tools using face, voice, physiology, and multi-modal information, the importance of context, and discuss challenges that exist, including identification of ground truth. Finally, we identify underexplored areas such as chronic pain and connections to treatments, and describe promising opportunities for continued advances.
Conference Paper
Full-text available
Video-based heart rate estimation has several advantages over the classical contact-based method. Current approaches use long time windows (30 s) to calculate heart rates, which results in high latency and is a major disadvantage for practical use. To overcome this constraint, we propose a low-latency approach for continuous frame-based heart rate estimation. It combines face tracking and skin detection with short time windows (10 s) to filter and analyze the extracted PPG signals in real time. In experiments, the presented approach performs with high accuracy (85.2%, with error < 3 BPM) under stable illumination conditions, using a pain recognition dataset including facial expressions and head movement for validation.
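The short-window idea can be illustrated with a minimal sketch (not the authors' implementation): estimate the heart rate from a 10 s PPG window by locating the spectral peak in the plausible pulse band. The synthetic signal, sampling rate, and band limits below are illustrative assumptions.

```python
import numpy as np

def estimate_hr_fft(ppg, fs, lo=0.7, hi=3.0):
    """Heart rate (BPM) from a short PPG window via the dominant FFT peak.

    The search band 0.7-3.0 Hz (42-180 BPM) excludes respiration and
    high-frequency noise; a 10 s window gives 0.1 Hz (6 BPM) resolution.
    """
    ppg = ppg - np.mean(ppg)                        # remove the DC offset
    spectrum = np.abs(np.fft.rfft(ppg))
    freqs = np.fft.rfftfreq(len(ppg), d=1.0 / fs)
    band = (freqs >= lo) & (freqs <= hi)            # plausible pulse band
    return 60.0 * freqs[band][np.argmax(spectrum[band])]

# Synthetic 10 s window at 30 fps: 1.2 Hz (72 BPM) pulse plus noise
rng = np.random.default_rng(0)
fs = 30.0
t = np.arange(0, 10, 1 / fs)
ppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.normal(size=t.size)
print(estimate_hr_fft(ppg, fs))  # close to 72 BPM
```

The 6 BPM bin width of a 10 s window is exactly the latency/resolution trade-off the abstract refers to: a 30 s window would refine it to 2 BPM at three times the delay.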
Conference Paper
Full-text available
Contact-free, camera-based measurement of human vital signs is more comfortable than classical contact-based methods. However, current approaches still have problems in realistic application scenarios, e.g., with occlusions caused by hair or glasses. At present, geometrically defined regions of the face are used to extract the required color signals, leaving occlusions of the skin unaccounted for. We propose to segment the region to be used based on skin color. In this paper, we compare the quality of heart rate estimation using different regions: classical, geometrically defined facial regions as well as regions selected by skin color segmentation. Our experiments show that skin color segmentation yields clear improvements. The results are robust to variation of a threshold parameter that has to be chosen.
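As a rough illustration of skin-color segmentation for ROI selection (a sketch, not the method evaluated in the paper), a fixed-threshold rule in YCbCr space such as the classic Chai–Ngan chrominance ranges can be applied per pixel; the thresholds are assumptions that would need tuning in practice:

```python
import numpy as np

def skin_mask_ycbcr(ycbcr):
    """Boolean skin mask from an (H, W, 3) YCbCr image with 0..255 values.

    Uses fixed chrominance thresholds (77 <= Cb <= 127, 133 <= Cr <= 173);
    the luma channel is ignored, which makes the rule fairly robust to
    illumination changes but prone to false positives on skin-like colors.
    """
    cb = ycbcr[..., 1].astype(np.int32)
    cr = ycbcr[..., 2].astype(np.int32)
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)

# Tiny example: one skin-like pixel and one non-skin pixel
img = np.array([[[120, 110, 150], [200, 60, 90]]], dtype=np.uint8)
print(skin_mask_ycbcr(img))  # [[ True False]]
```

Restricting the PPG averaging to such a mask excludes hair, glasses, and background pixels that would otherwise dilute the pulse signal.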
Conference Paper
Full-text available
Pain is what the patient says it is. But what about those who cannot speak? Automatic pain monitoring opens up prospects for better treatment, but accurate assessment of pain is challenging due to its subjective nature. To facilitate advances, we contribute a new dataset, the BioVid Heat Pain Database, which contains videos and physiological data of 90 persons subjected to well-defined pain stimuli of 4 intensities. We propose a fully automatic recognition system utilizing facial expression, head pose information, and their dynamics. The approach is evaluated on the task of pain detection on the new dataset, also outlining open challenges for pain monitoring in general. Additionally, we analyze the relevance of head pose information for pain recognition and compare person-specific and general classification models.
Conference Paper
Full-text available
The objective measurement of subjective, multi-dimensionally experienced pain is a problem that has yet to be adequately solved. Though verbal methods (i.e., pain scales, questionnaires) and visual analogue scales are commonly used for measuring clinical pain, they tend to lack reliability or validity when applied to mentally impaired individuals. Expressions of pain and/or its biopotential parameters could represent a solution. While such coding systems already exist, they are either very costly and time-consuming or have been insufficiently evaluated with regard to the theory of mental tests. Building on the experience gained to date, we collected a database of visual and biopotential signals to advance an automated pain recognition system, to determine its theoretical testing quality, and to optimize its performance. For this purpose, participants were subjected to painful heat stimuli under controlled conditions.
Article
Full-text available
Noncontact imaging photoplethysmography (PPG) can provide physiological assessment at various anatomical locations with no discomfort to the patient. However, most previous imaging PPG (iPPG) systems have been limited by a low sample frequency, which restricts their use clinically, for instance, in the assessment of pulse rate variability (PRV). In the present study, plethysmographic signals are remotely captured via an iPPG system at a rate of 200 fps. The physiological parameters (i.e., heart and respiration rate and PRV) derived from the iPPG datasets yield statistically comparable results to those acquired using a contact PPG sensor, the gold standard. More importantly, we present evidence that the negative influence of initial low sample frequency could be compensated via interpolation to improve the time domain resolution. We thereby provide further strong support for the low-cost webcam-based iPPG technique and, importantly, open up a new avenue for effective noncontact assessment of multiple physiological parameters, with potential applications in the evaluation of cardiac autonomic activity and remote sensing of vital physiological signs.
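The interpolation idea — recovering finer beat timing than the camera's frame rate provides — can be sketched with parabolic interpolation around a sampled peak (an illustrative stand-in for the interpolation used in the study; the signal below is synthetic):

```python
import numpy as np

def refine_peak(signal, i):
    """Sub-sample peak location by fitting a parabola through i-1, i, i+1."""
    y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
    denom = y0 - 2.0 * y1 + y2
    if denom == 0.0:
        return float(i)
    return i + 0.5 * (y0 - y2) / denom

# A 1.2 Hz pulse sampled at 30 fps with its true peak at sample 10.3
fs, f, true_peak = 30.0, 1.2, 10.3
n = np.arange(32)
s = np.cos(2 * np.pi * f / fs * (n - true_peak))
i = int(np.argmax(s))       # nearest-sample peak: 10
print(refine_peak(s, i))    # recovers ~10.3, beyond frame-rate resolution
```

Sub-sample beat times sharpen the interbeat intervals, which is exactly what pulse rate variability analysis needs when the raw sample frequency is low.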
Chapter
Full-text available
A real-time non-contact photoplethysmography imaging (PPGI) system for high-resolution blood perfusion mapping in human skin is proposed. The PPGI system comprises an LED lamp, a webcam, and a computer with video processing software. The purpose of this study is to evaluate the reliability of the PPGI system when measuring blood perfusion. A validation study of PPGI against a laser Doppler perfusion imager (LDPI) was performed during local warming of palm skin. Results showed that the PPGI amplitude increases immediately after warming and correlates well with the mean LDPI amplitude (R = 0.92 ± 0.03, p < 0.0001). We found that the PPGI technique has good potential for non-contact monitoring of blood perfusion changes. Keywords: imaging photoplethysmography, blood perfusion, blood pulsations, optophysiological imaging, non-contact technique
Article
Full-text available
Imaging photoplethysmography (PPG) is able to capture useful physiological data remotely from a wide range of anatomical locations. Recent imaging PPG studies have concentrated on two broad research directions involving either high-performance cameras or webcam-based systems. However, little has been reported about the difference between these two techniques, particularly in terms of their performance under illumination with ambient light. We explore these two imaging PPG approaches through the simultaneous measurement of the cardiac pulse acquired from the face of 10 male subjects and the spectral characteristics of ambient light. Measurements are made before and after a period of cycling exercise. The physiological pulse waves extracted from both imaging PPG systems using the smoothed pseudo-Wigner-Ville distribution yield functional characteristics comparable to those acquired using gold standard contact PPG sensors. The influence of ambient light intensity on the physiological information is considered, where results reveal an independent relationship between the ambient light intensity and the normalized plethysmographic signals. This provides further support for imaging PPG as a means for practical noncontact physiological assessment with clear applications in several domains, including telemedicine and homecare.
Conference Paper
Full-text available
In this paper, a simple and robust method of measuring the pulse rate is presented. The elaborated algorithm allows for efficient pulse rate registration directly from a face image captured by a webcam. The desired signal is obtained by proper channel selection and principal component analysis. The proposed non-contact technique may be of great value for monitoring a person at home once adequate enhancements are introduced.
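The channel-selection-plus-PCA step can be sketched as follows (an illustrative reconstruction, not the authors' code): the spatially averaged R, G, B traces are decorrelated and the dominant component is taken as the pulse candidate. The synthetic traces below assume the pulse is strongest in the green channel.

```python
import numpy as np

def pca_pulse(rgb_traces):
    """Pulse candidate from (n_frames, 3) mean R, G, B traces via PCA."""
    x = rgb_traces - rgb_traces.mean(axis=0)       # zero-mean each channel
    cov = np.cov(x, rowvar=False)                  # 3x3 channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    w = eigvecs[:, np.argmax(eigvals)]             # dominant direction
    return x @ w                                   # project traces onto it

# Synthetic traces: pulse strongest in green, plus sensor noise
rng = np.random.default_rng(0)
t = np.arange(0, 10, 1 / 30.0)
pulse = np.sin(2 * np.pi * 1.2 * t)
traces = np.outer(pulse, [0.3, 1.0, 0.5]) + 0.05 * rng.normal(size=(t.size, 3))
component = pca_pulse(traces)                      # highly correlated with pulse
```

The sign of the recovered component is arbitrary (an eigenvector ambiguity), which does not matter for subsequent frequency analysis.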
Conference Paper
Non-contact, long-term monitoring of the human heart rate is of great importance to home health care. Recent studies show that Photoplethysmography (PPG) can provide a means of heart rate measurement by detecting the blood volume pulse (BVP) in the human face. However, most existing methods use linear analysis to uncover the underlying BVP, which may not be adequate for physiological signals. They also lack rigorous mathematical and physiological models for the subsequent heart rate calculation. In this paper, we present a novel webcam-based heart rate measurement method using Laplacian Eigenmap (LE). Usually, the webcam captures the PPG signal mixed with other sources of fluctuations in light. Thus, exactly separating the PPG signal from the collected data is crucial for heart rate measurement. In our method, a more accurate BVP can be extracted by applying LE to efficiently discover the embedding of the PPG signal within the nonlinearly mixed data. We also apply effective data filtering to the BVP and obtain the heart rate from the calculation of interbeat intervals (IBIs). Experimental results show that LE obtains higher degrees of agreement with measurements from finger blood oximetry than Independent Component Analysis (ICA), Principal Component Analysis (PCA), and five other alternative methods. Moreover, filtering and processing of the IBIs are shown to increase the measurement accuracy in experiments.
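The final step — heart rate from interbeat intervals — can be sketched as follows (a minimal illustration assuming an already clean BVP signal; real data would need band-pass filtering and IBI outlier rejection first):

```python
import numpy as np

def heart_rate_from_ibis(bvp, fs):
    """Mean heart rate (BPM) from interbeat intervals (IBIs) of a BVP signal.

    Beats are taken as local maxima above the signal mean; the average
    time between consecutive beats is converted to beats per minute.
    """
    above = bvp > bvp.mean()
    peaks = [i for i in range(1, len(bvp) - 1)
             if above[i] and bvp[i] >= bvp[i - 1] and bvp[i] > bvp[i + 1]]
    ibis = np.diff(peaks) / fs                     # seconds between beats
    return 60.0 / ibis.mean()

# Clean 1.2 Hz (72 BPM) pulse sampled at 30 fps for 10 s
fs = 30.0
t = np.arange(0, 10, 1 / fs)
print(heart_rate_from_ibis(np.sin(2 * np.pi * 1.2 * t), fs))  # ≈ 72
```

Unlike a spectral peak, the IBI sequence also exposes beat-to-beat variability, which is why the paper builds its heart rate calculation on it.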
Conference Paper
Many computer vision problems (e.g., camera calibration, image alignment, structure from motion) are solved through a nonlinear optimization method. It is generally accepted that 2nd order descent methods are the most robust, fast and reliable approaches for nonlinear optimization of a general smooth function. However, in the context of computer vision, 2nd order descent methods have two main drawbacks: (1) The function might not be analytically differentiable and numerical approximations are impractical. (2) The Hessian might be large and not positive definite. To address these issues, this paper proposes a Supervised Descent Method (SDM) for minimizing a Non-linear Least Squares (NLS) function. During training, the SDM learns a sequence of descent directions that minimizes the mean of NLS functions sampled at different points. In testing, SDM minimizes the NLS objective using the learned descent directions without computing the Jacobian nor the Hessian. We illustrate the benefits of our approach in synthetic and real examples, and show how SDM achieves state-of-the-art performance in the problem of facial feature detection. The code is available at www.humansensing.cs.cmu.edu/intraface.
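The core idea — learning a descent map from data instead of computing Jacobians or Hessians — can be shown on a one-dimensional toy problem. Using a single learned linear map applied repeatedly is a simplification of the sequence of maps described in the paper, and all quantities below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 3 + x               # toy "feature" function, monotone

# Training: perturb known solutions and regress the parameter update
# delta = x_true - x0 onto the feature residual f(x0) - f(x_true).
x_true = rng.uniform(-1.0, 1.0, 200)
x0 = x_true + rng.normal(0.0, 0.3, 200)
r = f(x0) - f(x_true)                  # observed residuals
d = x_true - x0                        # updates that would fix them
R = (r @ d) / (r @ r)                  # 1-D least-squares descent map

# Testing: iterate the learned update; no derivative of f is evaluated.
x, target = 0.9, f(0.35)
for _ in range(8):
    x = x + R * (f(x) - target)
print(x)                               # converges toward 0.35
```

In face alignment the same recipe applies with x a landmark-shape vector, f an image feature extractor (e.g., SIFT around the landmarks), and R a regression matrix learned per iteration.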
Conference Paper
Recently, Viola et al. introduced a rapid object detection scheme based on a boosted cascade of simple feature classifiers. In this paper, we introduce and empirically analyze two extensions to their approach. Firstly, a novel set of rotated Haar-like features is introduced. These novel features significantly enrich the simple features of (6) and can also be calculated efficiently. With these new rotated features, our sample face detector shows on average a 10% lower false alarm rate at a given hit rate. Secondly, we present a thorough analysis of the effect of different boosting algorithms (namely Discrete, Real, and Gentle AdaBoost) and weak classifiers on the detection performance and computational complexity. We will see that Gentle AdaBoost with small CART trees as base classifiers outperforms Discrete AdaBoost with stumps. The complete object detection training and detection system, as well as a trained face detector, are available in the Open Computer Vision Library at sourceforge.net (8).
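The efficiency claim rests on the integral image: any rectangle sum, and hence any upright Haar-like feature, costs a constant number of table lookups regardless of its size. A minimal sketch of the upright case (the paper's rotated features use an analogously tilted table, not shown here):

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero top row and left column for O(1) sums."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def rect_sum(ii, r, c, h, w):
    """Sum of img[r:r+h, c:c+w] via four table lookups."""
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def two_rect_feature(ii, r, c, h, w):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    return rect_sum(ii, r, c, h, w // 2) - rect_sum(ii, r, c + w // 2, h, w // 2)

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2))          # 30, equals img[1:3, 1:3].sum()
print(two_rect_feature(ii, 0, 0, 4, 4))  # -16
```

Because every weak classifier in the cascade thresholds one such feature, the whole detector evaluates in a handful of additions per window, which is what makes the boosted cascade fast enough for real-time face detection.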