Accuracy of Commercially Available Smartwatches in Assessing Energy Expenditure During Rest and Exercise

Authors:
Zachary C. Pope
University of Minnesota
Nan Zeng
Colorado State University
Xianxiong Li and Wenfeng Liu
Hunan Normal University
Zan Gao
University of Minnesota
Background: This study examined the accuracy of the Microsoft Band (MB), Fitbit Surge HR (FS), TomTom Cardio Watch (TT),
and Apple Watch (AW) for energy expenditure (EE) estimation at rest and at different physical activity (PA) intensities. Method:
During summer 2016, 25 college students (13 females; M_age = 23.52 ± 1.04 years) completed four separate 10-minute exercise
sessions: rest (i.e., seated quietly), light PA (LPA; 3.0-mph walking), moderate PA (MPA; 5.0-mph jogging), and vigorous PA
(VPA; 7.0-mph running) on a treadmill. Indirect calorimetry served as the criterion EE measure. The AW and TT were placed on
the right wrist and the FS and MB on the left, serving as comparison devices. Data were analyzed in late 2017. Results: Pearson
correlation coefficients revealed only three significant relationships (r = 0.43–0.57) between smartwatches' EE estimates and
indirect calorimetry: rest-TT; LPA-MB; and MPA-AW. Mean absolute percentage error (MAPE) values indicated the MB
(35.4%) and AW (42.3%) possessed the lowest error across all sessions, with MAPE across all smartwatches lowest during the
LPA (33.7%) and VPA (24.6%) sessions. During equivalence testing, no smartwatch's 90% CI fell within the equivalence region
designated by indirect calorimetry. However, the greatest overlap between smartwatches' 90% CIs and indirect calorimetry's
equivalency region was observed during the LPA and VPA sessions. Finally, EE estimate variation attributable to the use of
different manufacturers' devices was greatest at rest (53.7 ± 12.6%), but incrementally decreased as PA intensity increased.
Conclusions: MB and AW appear most accurate for EE estimation. However, smartwatch manufacturers may consider
concentrating most on improving EE estimate accuracy during MPA.
Keywords: measurement bias, indirect calorimetry, validity
Wearable technology devices offer tremendous promise in
promoting physical activity (PA) and health among diverse populations
(Case, Burwick, Volpp, & Patel, 2015; Kenney, Gortmaker,
Evenson, Goto, & Furberg, 2015), with great potential to aid in the
development of personalized health behavior change interventions
(Bai et al., 2016; Ferguson, Rowlands, Olds, & Maher, 2015;
Flores, Glusman, Brogaard, Price, & Hood, 2013; Hood, Balling,
& Auffray, 2012; Sasaki et al., 2015). For example, advancing
technology has facilitated development of sophisticated smartwatches
(Kenney et al., 2015), many providing health metric
data output for heart rate, energy expenditure (EE), PA, and sleep,
among other metrics. Notably, smartwatches' capability to provide
EE estimates has played a crucial role in these devices' popularity
as consumers track this metric and modify kilocalorie (kcal) consumption
and PA in a manner necessary to promote appropriate and
sustainable weight loss (Kenney, Wilmore, & Costill, 2015b). Yet,
if smartwatches are not providing accurate EE estimates, these
inaccuracies may prevent the effective use of these devices as part
of a weight loss strategy or, more generally, for health promotion
purposes.
Currently, several smartwatches are popular among consumers.
Using each manufacturer's proprietary algorithms, these
smartwatches combine demographic (age, sex), anthropometric
(height, weight), and bodily movement data collected via triaxial
accelerometer technology to provide daily EE estimates at rest,
during activities of daily living, and during PA or exercise (Fitbit,
2016; TomTom, 2017). Few studies, however, have validated
smartwatch EE estimates.
Indeed, the literature has mostly examined the validity of smartwatches
in the measurement of laboratory-based and free-living PA duration
and steps (Bai et al., 2016; Bunn, Navalta, Fountaine, & Reece,
2018; Evenson, Goto, & Furberg, 2015; Lee & Gorelick, 2011).
Among the few smartwatch EE estimate validation studies to date
(Bai et al., 2016; Diaz et al., 2015; Ferguson et al., 2015), mean
validity coefficients for EE were moderate to strong (range
r = 0.74–0.85), with mixed findings regarding smartwatches'
over- or under-estimation of EE versus various criterion measures.
Notably, however, these studies were almost exclusively conducted
using specific models of the Fitbit and Jawbone despite
the rising popularity of other smartwatches (e.g., Apple Watch).
Moreover, few studies have employed indirect calorimetry as the
criterion EE measure, an assessment method commonly considered
the "gold standard" for EE measurement (Kenney, Wilmore, &
Costill, 2015c). Finally, a newer statistical methodology, termed
"equivalence testing" (Dixon, Saint-Maurice, Kim, Hibbing, &
Welk, 2018), has been developed and may provide better insight
into smartwatch health metric data accuracy than the validity
statistics employed in past studies.

Pope is with the Division of Epidemiology and Community Health, School of Public
Health, University of Minnesota, Minneapolis, MN. Zeng is with the Department of
Food Science and Human Nutrition, Colorado State University, Fort Collins, CO. Li
and Liu are with the School of Physical Education, Hunan Normal University,
Changsha, China. Gao is with the School of Kinesiology, University of Minnesota,
Minneapolis, MN. Pope (popex157@umn.edu) and Gao (gaoz@umn.edu) are
corresponding authors.
Journal for the Measurement of Physical Behaviour, (Ahead of Print)
https://doi.org/10.1123/jmpb.2018-0037
© 2019 Human Kinetics, Inc. ORIGINAL RESEARCH

These limitations are notable not only given how consumers
often use smartwatch EE estimates (e.g., to monitor daily EE and
subsequently modify PA and/or dietary behaviors), but also because
they may impair health professionals' abilities to
employ smartwatches as a health promotion tool. Specifically,
smartwatches are more often being cited as important components
of a healthcare approach referred to as "systems medicine" (Flores
et al., 2013; Hood et al., 2012), a multi-faceted wellness perspective
leveraging novel technology (e.g., smartwatches, smartphones,
social media) to collect and analyze (via big data analysis) an
individual's health behaviors and then develop personalized health
behavior change interventions based on these data
(Flores et al., 2013; Pope & Gao, 2017). Given smartwatch
technology's emerging uses, a need exists to assess several popular
smartwatches' EE estimates versus a gold standard EE criterion
measure like indirect calorimetry during different PA intensities,
employing the statistical methodology of equivalence testing to
conduct these analyses. Therefore, this study's purpose was to
investigate the accuracy of the Microsoft Band (MB), Fitbit Surge
HR (FS), TomTom Cardio Watch (TT), and Apple Watch (AW) in
estimating EE at rest and at different PA intensities versus indirect
calorimetry EE measurements. The current study's observations
may inform consumers and health professionals alike of the
capability of various popular smartwatches to provide accurate
EE estimates capable of assisting in effective health behavior
change intervention development.
Method
Participants and Research Setting
This cross-sectional study recruited a convenience sample of
healthy young adults at a south-central Chinese university in
summer 2016. Participant inclusion criteria were (a) 18–25 years
old; (b) body mass index ≥18.5 kg/m²; (c) no diagnosed physical
or mental disability; and (d) signed informed consent. Exclusion
criteria included (a) anyone currently using medication which
might affect cardiovascular function (e.g., beta-blockers); (b) a
history of documented cardiovascular or metabolic diseases/
conditions; or (c) unaccustomed to high-intensity exercise eliciting
EE >300 kcals/session. Participants completed a comprehensive
medical and health history questionnaire prior to study participa-
tion, with the experiment conducted in a highly controlled labora-
tory setting. All procedures performed were in accordance with
the ethical standards of the institution and/or national research
committee and with the 1964 Helsinki Declaration and its later
amendments or comparable ethical standards (World Medical
Association, 2018). Additionally, this research was completed in
agreement with the most recent ethical standards for sport and
exercise research (Harriss, Macsween, & Atkinson, 2017). Finally,
University Research Ethics Committee approval and informed
consent were obtained prior to testing.
Instrumentation
Criterion Device. Criterion EE data were collected via indirect
calorimetry with a Cortex Metalyzer II metabolic cart (Cortex;
Germany). Briefly, the exercise tests were performed on a Pulsar
treadmill (H/P/Cosmos; Willich, Germany), with participants
wearing a mask attached to the metabolic cart. The metabolic
cart conducted indirect calorimetry measurements via gas analyses
at rest and during exercise, from which body temperature, pressure,
and saturated-adjusted EE values for each exercise session were
measured. In simplest terms, indirect calorimetry measures participants'
respiratory gas exchange rates of oxygen and carbon
dioxide, which are then used to provide EE measurements (Kenney
et al., 2015c), with more detailed descriptions available regarding
how indirect calorimetry measures EE and why this measure
has been widely considered the "gold standard" EE measurement
method (Branson & Johannigman, 2004; Holdy, 2004). Importantly,
the Pulsar treadmill and Cortex Metalyzer II have been used
in previous studies among various populations when assessing
EE (Bailey et al., 2012; Cockcroft et al., 2015; Peters, Heelan, &
Abbey, 2013). Notably, the Cortex Metalyzer II was calibrated
using a 3-liter syringe prior to each participant's session, with the
calibration process completed per manufacturer specifications.
Comparison Devices. Four wrist-worn smartwatches provided
EE estimates and served as the comparison devices. The smart-
watches included were the MB (Microsoft; Redmond, WA, USA),
FS (Fitbit, Inc.; San Francisco, CA, USA), TT (TomTom;
Amsterdam, The Netherlands), and AW (Apple; Cupertino, CA,
USA). Each smartwatch can assess several metrics including heart
rate, activity (i.e., minutes of activity, steps/day), sleep, stairs
climbed, and calories burned (i.e., EE). Notably, only one smart-
watch from each of the preceding manufacturers was included.
Regarding smartwatch placement, the MB and FS were worn on the
left wrist while the TT and AW were placed on the right, with the
smartwatches spaced 1 cm apart. Smartwatches were monitored
throughout the sessions to ensure no contact was made between
devices that might have impacted the results. This study's smartwatch
placement mirrored that of other studies placing multiple
smartwatches on participants' wrists (Ferguson et al., 2015;
Fokkema, Kooiman, Krijnen, Van Der Schans, & De Groot, 2017).
To ensure the most accurate EE estimates were provided by each
smartwatch, each participant's age, sex, weight, and height were
entered into each smartwatch prior to initiating the testing session
(see procedures below), with the side upon which each smartwatch
was worn (i.e., left or right) programmed as well. Finally, while the
wrist upon which the smartwatches were placed did not differ
between participants, potential bias of smartwatch placement was
reduced by randomizing which smartwatch was distal and which
was more proximal from participant to participant.
Anthropometrics. Height and weight were measured using a
stadiometer and digital weight scale, respectively. Specifically,
height was measured using a Seca stadiometer (Seca; Hamburg,
Germany) and recorded to the nearest half-centimeter. As for
weight, this measurement was performed with a Detecto digital
weight scale (Detecto; Webb City, MO, USA), with weight
documented to the nearest tenth of a kilogram.
Procedures
All participants were instructed to abstain from eating or drinking
anything except water eight hours prior to visiting the lab in addition
to refraining from any vigorous PA (VPA) during the 24 hours prior
to study participation. Participants were asked to come into the lab
in a fasted state for two reasons. First, we wanted to ensure that the
indirect calorimetry measurements during the resting trial were as
accurate as possible. Indeed, basal metabolic rate assessed via
indirect calorimetry may be affected by prior food consumption
(Kenney et al., 2015c). Therefore, having the participants abstain
from food consumption until after study completion was important
to ensure the most valid comparison of indirect calorimetry EE
measurements to smartwatch EE estimates during the resting
(i.e., sitting) session. Second and more practically, participants
were requested to be fasted prior to study participation to ensure
no adverse gastrointestinal discomfort was experienced during the
study, particularly during the higher-intensity sessions. Participants
were informed of all experimental procedures and encouraged
to ask any questions before providing consent. Next, participants
were asked to complete a comprehensive medical/health history
questionnaire and a demographic information sheet, after which
anthropometric data (i.e., height and weight) were gathered. Demographic
and anthropometric data were subsequently loaded into
each smartwatch and into the metabolic cart's software to ensure
accurate EE estimation and measurement, respectively. Finally, a
mask connected to the metabolic cart was placed on each participant
to measure oxygen consumption for determination of criterion EE.
Participants completed an 80-minute experimental protocol
which included four 10-minute PA sessions, each at a different
PA intensity: resting (sitting quietly), light PA (LPA; walking at
3.0 mph on treadmill), moderate PA (MPA; jogging at 5.0 mph), and
VPA (running at 7.0 mph). Sessions were completed from lowest
(i.e., resting) to highest (i.e., VPA) intensity, ensuring the results of
the lower-intensity trials were not biased by prior high-intensity
physiological workload. The PA intensity classification criteria
were consistent with a previous study among Chinese young adults
(Ren, Li, & Liang, 2017). Between each session, participants were
required to sit quietly until heart rate returned to within ±10 beats/minute of
that observed during the initial resting session (Goto et al., 2007).
After each of the four sessions a participant completed,
EE data were obtained directly from the smartwatches
themselves, with these data recorded immediately to prevent any
data loss or misinterpretation. All four smartwatches in this study
provided "average calories burned" (i.e., EE) estimates over the
specified time interval pre-programmed by the researchers. Therefore,
prior to each of the participant's resting and exercise trials, we
pre-programmed the smartwatches for a 10-minute exercise session,
starting each smartwatch's 10-minute program immediately upon
each participant's initiation of their session. This pre-programming
ensured that no EE data were included outside of the 10-minute
exercise session and, further, requested each smartwatch to save the
exercise session to its internal memory in case we needed to verify
these data at a later time. It is also noteworthy that two researchers
collected EE data from the smartwatches immediately after each
participant finished their respective exercise session, allowing
each participant's smartwatch data from each exercise session to
be double-checked (i.e., data quality control protocol). Finally, data
regarding each participant's EE were entered immediately into an
Excel file for later analysis by one researcher and double-checked
by a second researcher after each trial for each participant. Importantly,
the times the smartwatches were started and stopped during
each testing session were recorded per the software reporting
indirect calorimetry EE measurements. Using this software's timestamp
allowed us to ensure that the start and stop times used to
segment indirect calorimetry EE measurements were identical to the
time segments during which the smartwatches were estimating EE.
Statistical Analysis
Data were analyzed in late 2017 and were first screened for
physiologically implausible values. Next, Pearson correlation
coefficients were calculated to observe the association between
smartwatch EE estimates and indirect calorimetry EE measurements
at each intensity (resting, LPA, MPA, and VPA).
Weak, moderate, and strong correlations were categorized as
r values of 0.20–0.39, 0.40–0.59, and 0.60–0.79, respectively,
with r values ≤0.19 classified as very weak and r values ≥0.80
classified as very strong (Thomas, Nelson, & Silverman, 2011).
Mean absolute percent errors (MAPE) were then calculated for
sitting and each PA intensity. Briefly, MAPE was computed as
the average of the absolute differences between smartwatch EE
estimates and indirect calorimetry EE measurements, each divided
by the indirect calorimetry measurement and multiplied by 100. These
MAPE calculations were completed for each smartwatch at each
PA intensity, with MAPE calculations providing an examination of
individual-level measurement error, an approach used in other
smartwatch and accelerometer device validation studies (Fokkema
et al., 2017; Kim & Welk, 2015).
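
The MAPE calculation described above can be sketched in a few lines of Python. This is an illustration only, not the authors' SPSS procedure: the function name `mape` and the kilocalorie values are hypothetical, chosen simply to show the arithmetic for one smartwatch in one 10-minute session.

```python
from statistics import mean


def mape(estimates, criterion):
    """Mean absolute percent error of device estimates against criterion values."""
    if len(estimates) != len(criterion):
        raise ValueError("estimates and criterion must have the same length")
    # Absolute percent error per participant, then averaged across participants.
    errors = [abs(e - c) / c * 100 for e, c in zip(estimates, criterion)]
    return mean(errors)


# Hypothetical kcal values for three participants (not data from this study).
watch_kcal = [40.0, 30.0, 44.0]
calorimetry_kcal = [35.0, 32.0, 40.0]
print(round(mape(watch_kcal, calorimetry_kcal), 1))
```

Because each error is taken as an absolute value before averaging, over- and under-estimates cannot cancel each other out, which is why MAPE reflects individual-level error rather than group-level bias.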
Equivalence testing was then used to assess the agreement of
smartwatch EE estimates with indirect calorimetry EE measurements
using this testing approach's confidence interval (CI) method.
Equivalence testing is given fuller explanation in Dixon et al.
(2018), but the following two aspects of equivalence testing are
important: (1) equivalence testing's null hypothesis states that the
two measurement methods being compared are not equivalent; and
(2) an alpha of 0.05 (i.e., 5%) is consistent with examining whether
the entire 90% CI for a given smartwatch at a given PA intensity
falls within a proposed equivalency region situated around the mean
indirect calorimetry EE measurement made at the same PA intensity.
Congruent with Kim and Welk (2015) and Bai et al. (2016), we
stated the equivalency region to be ±10% of the mean indirect
calorimetry EE measurements made at a given PA intensity. Finally,
coefficients of variation (CV) examined percentage variation in
smartwatch EE estimates attributable to the use of different manufacturers'
devices, as done in prior literature (Driller, McQuillan,
& O'Donnell, 2016). SPSS 25.0 (IBM Inc.; Armonk, NY) was
employed for all analyses, with alpha set at 0.05.
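
The CI method of equivalence testing described above reduces to an interval comparison, sketched below in Python. This is a minimal illustration under stated assumptions: the function names `equivalence_region` and `classify_ci` are hypothetical, the three-way labelling is a convenience for this sketch, and the numbers are invented rather than taken from this study.

```python
def equivalence_region(criterion_mean, margin=0.10):
    """Bounds lying within +/- margin (default 10%) of the criterion mean."""
    return (criterion_mean * (1 - margin), criterion_mean * (1 + margin))


def classify_ci(ci, region):
    """Compare a device's 90% CI with an equivalence region.

    Returns "equivalent" when the whole CI lies inside the region,
    "partial overlap" when the intervals merely intersect,
    and "no overlap" otherwise.
    """
    ci_low, ci_high = ci
    region_low, region_high = region
    if region_low <= ci_low and ci_high <= region_high:
        return "equivalent"
    if ci_high >= region_low and ci_low <= region_high:
        return "partial overlap"
    return "no overlap"


# Hypothetical criterion mean of 100 kcal gives a region of roughly 90 to 110 kcal.
region = equivalence_region(100.0)
for device_ci in [(95.0, 105.0), (85.0, 95.0), (70.0, 80.0)]:
    print(device_ci, classify_ci(device_ci, region))
```

Only the first outcome rejects the null hypothesis of non-equivalence; partial overlap is descriptive and does not establish agreement.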
Results
Participants were 25 college students (13 females; M_age = 23.52 ±
1.04 years; M_height = 168.6 ± 7.4 cm; M_weight = 61.5 ± 10.1 kg). Table 1
presents descriptive statistics for smartwatch EE estimates and
indirect calorimetry EE measurements. As expected, EE values
increased as PA intensity increased.
Pearson correlation coefficients between smartwatch EE estimates
and indirect calorimetry EE measurements at each PA
intensity revealed only three significant correlations (r range =
0.19–0.57; Table 2). Specifically, moderate correlations were
seen for the following smartwatches at the denoted PA intensities
versus indirect calorimetry: Rest-TT (r = 0.57, p < .01); LPA-MB
(r = 0.43, p < .05); and MPA-AW (r = 0.43, p < .05). Notably, a
marginally significant, but weak, correlation was observed between
the AW and indirect calorimetry during LPA (r = 0.37, p = .07).
No significant correlations were found between smartwatch EE
estimates and indirect calorimetry EE measurements during VPA.
Moreover, Table 3 contains MAPE values for each smartwatch's
EE estimates at each PA intensity compared to indirect calorimetry.
Overall, MAPE values were lowest for the MB (35.4%) and AW
(42.3%), with the FS and TT demonstrating higher values (47.7%
and 51.0%, respectively). Finally, MAPE values were higher
during the resting (52.9%) and MPA (65.3%) sessions versus the
LPA (33.7%) and VPA (24.6%) sessions.
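
The correlation categories defined in the Statistical Analysis section can be expressed as a small lookup, sketched here in Python. The function name `correlation_strength` is hypothetical, and treating the category boundaries as continuous cut-points at 0.20, 0.40, 0.60, and 0.80 is an assumption, since the published ranges are stated to two decimals.

```python
def correlation_strength(r):
    """Label a Pearson r using the categories cited from Thomas et al. (2011)."""
    magnitude = abs(r)  # strength is judged on magnitude, regardless of sign
    if magnitude >= 0.80:
        return "very strong"
    if magnitude >= 0.60:
        return "strong"
    if magnitude >= 0.40:
        return "moderate"
    if magnitude >= 0.20:
        return "weak"
    return "very weak"


# Coefficients discussed in the text above.
for label, r in [("Rest-TT", 0.57), ("LPA-MB", 0.43), ("MPA-AW", 0.43), ("LPA-AW", 0.37)]:
    print(label, r, correlation_strength(r))
```

Applied to the coefficients named in the text, the three significant correlations fall in the moderate band and the marginal LPA-AW correlation falls in the weak band, matching the description above.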
The equivalence testing results for each smartwatch's EE
estimates and indirect calorimetry's EE measurements are presented
in Table 4. Further, Figures 1–4 graphically present these
results during the resting, LPA, MPA, and VPA sessions, respectively.
As indicated by Table 4 and each Figure, no smartwatch's
90% CI fell completely within the ±10% equivalency region
established by indirect calorimetry at rest (20.3–22.5 kcals) or
during LPA (33.1–36.9 kcals), MPA (50.5–57.3 kcals), and VPA
(93.8–99.3 kcals). Similar to the MAPE results, however, the
greatest overlap between smartwatches' 90% CIs and indirect
calorimetry's equivalency region was observed during the LPA
and VPA sessions (see Table 4, Figures 2 and 4). Specifically, the
MB, TT, and AW possessed 90% CIs which overlapped with
indirect calorimetry's equivalency region during the LPA session
while all smartwatches achieved some overlap during the VPA
session. Notably, only the FS demonstrated any overlap with
indirect calorimetry's equivalency region during the resting session
whereas no smartwatch demonstrated any overlap during the MPA
session. Lastly, Table 5 presents CVs. This metric indicated that EE
estimate variation attributable to the use of different manufacturers'
devices was highest at rest (53.7 ± 12.6%), but incrementally
decreased as PA intensity increased (LPA: 31.1 ± 10.5%; MPA:
18.3 ± 8.9%; and VPA: 16.9 ± 8.0%).
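
One way a between-device CV of this kind could be computed is sketched below in Python. The article does not spell out the exact procedure, so this sketch assumes a per-participant CV (sample standard deviation of the four smartwatch estimates divided by their mean, times 100) that is then summarized across participants. The function name and the kilocalorie values are hypothetical.

```python
from statistics import mean, stdev


def between_device_cv(estimates):
    """Coefficient of variation (%) across device estimates for one participant."""
    return stdev(estimates) / mean(estimates) * 100


# Hypothetical data: each row holds one participant's kcal estimates
# from four smartwatches during the same session.
session = [
    [10.0, 20.0, 30.0, 40.0],
    [25.0, 25.0, 25.0, 25.0],
]
cvs = [between_device_cv(row) for row in session]
print([round(cv, 1) for cv in cvs])
print(round(mean(cvs), 1), round(stdev(cvs), 1))  # mean and SD of CV across participants
```

A participant whose four devices agree exactly yields a CV of 0%, so a larger mean CV for a session indicates greater disagreement between manufacturers at that intensity.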
Discussion
The present study examined the accuracy of four popular smartwatches'
EE estimates against indirect calorimetry at rest and at
different PA intensities. This comparison was significant given the
fact that few previous investigations have compared the accuracy
of multiple smartwatches' EE estimates to that of gold standard
indirect calorimetry measurements, with most previous studies
having only validated PA duration and step estimates made by
different models of the Fitbit and Jawbone in comparison to
research-grade accelerometers like the ActiGraph.
Our data suggested the MB and AW possess the greatest EE
estimate accuracy, particularly during LPA and VPA. Notably,
despite the fact that all smartwatches demonstrated some EE

Table 1 Descriptive Statistics for Smartwatch Energy Expenditure and Indirect Calorimetry*

                             Microsoft Band  Fitbit        TomTom        Apple Watch   Indirect Calorimetry
                             M (SD)          M (SD)        M (SD)        M (SD)        M (SD)
Resting                      16.7 (3.6)      18.4 (8.2)    33.4 (23.6)   36.3 (7.7)    21.4 (3.2)
Light Physical Activity      38.8 (13.0)     55.9 (16.0)   34.0 (11.8)   36.1 (9.8)    35.0 (5.4)
Moderate Physical Activity   86.7 (14.7)     90.6 (19.8)   82.5 (28.1)   79.9 (16.3)   53.9 (10.0)
Vigorous Physical Activity   102.2 (27.9)    94.4 (25.3)   95.7 (35.1)   88.0 (27.0)   96.5 (7.9)

*M ± SD total kilocalories burned during each 10-minute exercise session.

Table 2 Pearson Correlations Between Smartwatch Energy Expenditure and Indirect Calorimetry at Different PA Intensities#

Indirect Calorimetry vs.     Microsoft Band  Fitbit  TomTom  Apple Watch
Resting                      0.02            0.21    0.57**  0.06
Light Physical Activity      0.43*           0.14    0.19    0.37
Moderate Physical Activity   0.13            0.26    0.12    0.43*
Vigorous Physical Activity   0.03            0.25    0.03    0.09

#Energy expenditure unit is kilocalories, with the correlations reflective of this metric; *Indicates significant correlation at p < .05 level; **Indicates significant correlation at p < .01 level.

Table 3 Mean Absolute Percent Error for Each Smartwatch's Energy Expenditure Measurement at Each Physical Activity Intensity Versus Indirect Calorimetry*

Indirect Calorimetry vs.     Microsoft Band  Fitbit       TomTom       Apple Watch  Overall MAPE by PA Intensity
                             M (SD)          M (SD)       M (SD)       M (SD)       M (SD)
Resting                      23.6 (15.6)     31.8 (30.2)  83.3 (66.4)  73.0 (44.8)  52.9 (22.0)
Light Physical Activity      23.3 (23.9)     64.9 (44.1)  27.8 (26.3)  18.9 (18.8)  33.7 (16.4)
Moderate Physical Activity   69.2 (44.3)     73.3 (43.7)  64.2 (54.0)  54.5 (35.3)  65.3 (38.1)
Vigorous Physical Activity   25.6 (18.5)     21.0 (14.7)  28.8 (23.3)  22.8 (21.1)  24.6 (15.3)
Overall MAPE by Smartwatch   35.4 (12.1)     47.7 (21.5)  51.0 (22.4)  42.3 (14.1)

*Mean absolute percent error ± standard deviation for total kilocalories burned during each 10-minute exercise session.

estimate inaccuracies, these inaccuracies are congruent with past
studies assessing various smartwatches' capability to provide accurate
EE estimates (Alharbi, Bauman, Neubeck, & Gallagher, 2016;
Bai et al., 2016; Ferguson et al., 2015; Kenney et al., 2015; Sasaki
et al., 2015). For example, Bai et al. (2016) suggested the MAPEs for
four smartwatches' (Fitbit Flex, Jawbone Up24, Nike Fuel Band SE,
Misfit Shine) EE estimate accuracy during aerobic activity to vary
between approximately 18–60%, congruent with the current
investigation's mean MAPE values for all smartwatches during the

Table 4 90% Confidence Intervals for Energy Expenditure Measurements Made by Each Smartwatch and Indirect Calorimetry at Each Physical Activity Intensity

                        Kilocalories  90% CI
                        M             (LL, UL)
Resting Session
  Indirect Calorimetry  21.4          (20.3, 22.5)
  Microsoft Band        16.7          (15.5, 18.0)
  Fitbit                18.4          (15.6, 21.2)
  TomTom                33.4          (25.4, 41.5)
  Apple Watch           36.3          (33.7, 38.9)
LPA Session
  Indirect Calorimetry  35.0          (33.1, 36.9)
  Microsoft Band        38.8          (34.4, 43.3)
  Fitbit                55.9          (50.5, 61.4)
  TomTom                34.0          (30.0, 38.1)
  Apple Watch           36.1          (32.7, 39.4)
MPA Session
  Indirect Calorimetry  53.9          (50.5, 57.3)
  Microsoft Band        86.7          (81.7, 91.7)
  Fitbit                90.6          (83.8, 97.4)
  TomTom                82.5          (72.9, 92.1)
  Apple Watch           79.9          (74.3, 85.5)
VPA Session
  Indirect Calorimetry  96.5          (93.8, 99.3)
  Microsoft Band        102.2         (92.6, 111.7)
  Fitbit                94.4          (85.6, 103.2)
  TomTom                95.7          (83.7, 107.7)
  Apple Watch           88.0          (78.7, 97.2)

Abbreviations: CI = Confidence interval; LL = Lower limit for 90% confidence interval; UL = Upper limit for 90% confidence interval.

Figure 1 Comparisons of Smartwatches vs. Indirect Calorimetry at Rest.
Figure 2 Comparisons of Smartwatches vs. Indirect Calorimetry during light physical activity.
Figure 3 Comparisons of Smartwatches vs. Indirect Calorimetry during moderate physical activity.
Figure 4 Comparisons of Smartwatches vs. Indirect Calorimetry during vigorous physical activity.

resting, LPA, and VPA conditions. Moreover, Lee, Kim, and Welk
(2014) confirmed the Fitbit Zip and Fitbit One accurately estimate
EE in free-living conditions (mean overall MAPEs = 10.1% and
10.4%, respectively), with the other smartwatches tested (Jawbone
Up, Directlife, Nike Fuel Band, Basis Band) possessing a MAPE
range of 12.2–23.5%. Therefore, although the present study did not
observe the MAPE values for the MB and AW to be as low as those
observed by Lee et al. (2014) for that study's two Fitbit devices, the
fact that the MB and AW demonstrated relative accuracy versus
similar literature suggests two additional smartwatch options may be
considered by individuals desiring a wearable device to estimate EE.
Further, the MB and AW's relative EE estimation accuracy during
LPA is particularly promising, given the fact that most individuals are
capable of being active at this PA intensity, with a growing body of
literature highlighting the health-promoting benefits of LPA (Powell,
Paluch, & Blair, 2011; U.S. Department of Health and Human
Services, 2018). Therefore, consumers and health professionals
might be able to use the MB and AW to develop PA programs
which focus on higher LPA incorporation among previously sedentary
cohorts. Nonetheless, even the MB and AW demonstrated some
EE estimate inaccuracies, which suggests that these two devices' use
within health programs should still take this error into account, a
topic discussed further below.
It is also noteworthy that the accuracy of smartwatch EE
estimates decreased (i.e., mean differences between smartwatch
EE and indirect calorimetry EE values increased) as PA intensity
increased up to the level of MPA, with the greatest smartwatch EE
overestimation observed during the MPA session. This observation
largely aligns with literature examining smartwatch
EE estimate accuracy at different PA intensity levels (Bai
et al., 2016; Diaz et al., 2015). For example, Diaz and colleagues
(2015) examined smartwatch accuracy at different PA intensities
and indicated smartwatches overestimated EE by 52.4% during
moderate walking and 33.3% during brisk walking. These
researchers' observation of greater smartwatch EE estimate accuracy
during the highest walking intensity session, but less accuracy at
lower walking intensities, is congruent with the current study's
observation of increased accuracy during VPA, but decreased
smartwatch EE estimate accuracy as PA intensity increased to the
level of MPA. Bai and associates (2016) also made observations
similar to the current study. Indeed, these researchers observed
smartwatches to generally overestimate EE during MPA. Given the
observations of prior literature and the present study, it is worth
considering why smartwatch EE
estimates were quite accurate during VPA despite accuracy
worsening as participants increased PA intensity from LPA to MPA,
with the largest inaccuracies during MPA.
The most plausible explanation lies in the difference in how
EE data are calculated by a smartwatch versus being measured by
indirect calorimetry. Specifically, a smartwatch uses proprietary
algorithms to combine the user's demographic and anthropometric
data with bodily movement data determined via an accelerometer
to estimate EE (Fitbit, 2016; TomTom, 2017). Indirect calorimetry,
on the other hand, measures the respiratory gas exchange rates of
oxygen and carbon dioxide as the participant breathes into the mask
during exercise (Branson & Johannigman, 2004; Holdy, 2004;
Kenney et al., 2015c). Thus, when the participants were progressing
from LPA to MPA, the body may have experienced slight
increases in physiological demand but marked increases in bodily
movement. As smartwatches estimate EE based largely upon
bodily movement, it may be that the large changes in bodily
movement observed as PA intensity increased led to systematic
overestimation of EE by smartwatches versus the highly accurate
indirect calorimetry, which measures actual physiological demand
via gas analysis. This explanation appears more plausible, too,
when considering that during VPA smartwatch EE estimates from
all devices were found to be most accurate compared to indirect
calorimetry (see MAPE and equivalence testing results), a PA
intensity requiring an even greater amount of bodily movement
and physiological demand than observed during MPA. Indeed,
great amounts of bodily movement would have resulted in an
increased physiological demand (e.g., increased need for oxygen
and nutrients to be delivered to muscles/removal of carbon dioxide
and other metabolic waste products, all processes which are
facilitated via increased ventilation) and subsequently higher
indirect calorimetry EE measurements. As a final point, more research
is also warranted regarding smartwatch EE estimate inaccuracy at
rest given the continued calls for the ability to accurately track and
modify sedentary behavior (Lewis, Napolitano, Buman, Williams,
& Nigg, 2017; U.S. Department of Health and Human Services,
2018). Undoubtedly, the high MAPE values and large variation
in EE estimates between different manufacturers' smartwatches
during the resting condition suggest improvements are
necessary if health professionals are to develop sedentary behavior
reduction interventions.
Smartwatches' capability to provide EE estimates has increased
interest among health professionals in using these devices to assist
with the development and implementation of personalized health
behavior change programs among clients or patients. Yet, the present
study's observations suggested that while smartwatches may
demonstrate relative accuracy at certain PA intensities, no
smartwatch provided EE estimates within the EE equivalency regions
designated by indirect calorimetry, even under standardized, highly
controlled laboratory-based conditions. Aside from how these
inaccuracies affect consumers' use of smartwatch EE estimates, they
render problematic the use of patient/client smartwatch EE estimates
collected under free-living conditions (i.e., less standardized
conditions than in the present study) by health professionals when
developing health behavior change programs. Healthcare is
experiencing a paradigm shift from reactive treatment (i.e., treating
diseases/conditions following onset) to preventive/proactive
treatment (i.e., treating diseases/conditions prior to onset or in
the early stages of development) (Flores et al., 2013; Hood et al.,
2012). Coinciding with this paradigm shift has been the previously
mentioned idea of "systems medicine" and the development of a
healthcare model which is
(a) predictive: using novel technology like smartwatches to track
health behaviors/indices (e.g., PA, sedentary behavior, EE) may
facilitate subsequent correlation of these health behaviors/indices
with biomarkers (e.g., blood lipid levels, blood sugar), with disease
risk able to be discerned thereafter; (b) preventive: health
behavior change programs can be developed based upon a patient's
health behaviors to improve the patient's participation in health
behaviors conducive to better health and the prevention/attenuation
of disease; (c) personalized: these health behavior change programs
can be personalized to the patient's unique physical activity and/or
dietary preferences, which may improve program adherence and
effectiveness; and (d) participatory: providing health education to
patients via web-based platforms may further improve patients'
ability to engage in proper health behaviors in the long term
(i.e., after cessation of the formal health behavior change program)
through promotion of increased health literacy.

Table 5 CVs for Smartwatch Energy Expenditure at Different Physical Activity Intensities
Condition                      CV*, M (SD)
Resting                        53.7 (12.6)
Light Physical Activity        31.1 (10.5)
Moderate Physical Activity     18.3 (8.9)
Vigorous Physical Activity     16.9 (8.0)
Note. CV = coefficient of variation. *CVs are percentages.

Smartwatch EE overestimation is particularly detrimental to
smartwatch use within a systems medicine framework as overestimation
may diminish the effectiveness of weight loss programs developed
based upon smartwatch EE values. Specifically, individuals may be led
to believe they need to consume more kcals than they actually need,
given the inaccuracies observed for the current study's smartwatches,
particularly during MPA. For instance, an individual briskly walking
for 30 minutes (i.e., MPA) may have an actual EE of 200 kcals. Yet,
even the most accurate watch observed during MPA in the current study
(i.e., the AW) could register an EE estimate of 309 kcals during this
30-minute walking session, based upon the AW's MPA MAPE of 54.5% and
the fact that all smartwatches overestimated EE during MPA. This,
again, is not ideal within a systems medicine framework, and so
caution is urged among health professionals using smartwatches to
develop health behavior change programs for patients/clients. More
broadly, these observations suggest that closer collaboration between
researchers and smartwatch manufacturers is needed to improve the
algorithms used in smartwatch EE estimation.
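The arithmetic behind the 309-kcal figure above can be sketched as follows. This is an illustration only: it assumes the watch overestimates by exactly its MAPE, whereas MAPE is an average of absolute errors, so an individual reading could be closer to or farther from the true value. The function name is hypothetical.

```python
def overestimated_ee(actual_kcal, mape_percent):
    """EE a watch would display if it overestimated by exactly its MAPE."""
    return actual_kcal * (1.0 + mape_percent / 100.0)


actual = 200.0      # kcal, the illustrative 30-minute MPA walk from the text
aw_mpa_mape = 54.5  # %, the AW's MAPE during MPA as reported in the text

displayed = overestimated_ee(actual, aw_mpa_mape)
surplus = displayed - actual
print(f"displayed: {displayed:.0f} kcal, surplus: {surplus:.0f} kcal")
# prints "displayed: 309 kcal, surplus: 109 kcal"
```

An individual who ate back the displayed 309 kcals would therefore consume roughly 109 kcals more than the 200 kcals actually expended in that session.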
The present study has merits in that it (1) was conducted in a
highly controlled laboratory setting, thus limiting many confounding
variables (e.g., different wear times/locations/PA modality choices)
which might have affected the analyses; (2) assessed EE at four
different PA intensities; (3) examined smartwatch accuracy using
equivalency testing; and (4) used indirect calorimetry as the
criterion measure, an assessment method considered the "gold
standard" when assessing EE during aerobic exercise (Kenney et al.,
2015c). However, several limitations in the present study should be
noted. First, all study participants were healthy young adults (i.e.,
a homogeneous sample). Whether smartwatches' EE estimates are
accurate in other populations, particularly clinical populations,
remains unanswered. Second, the sample size was relatively small.
Notably, while the use of indirect calorimetry is a strength of the
current study, connecting each participant to the metabolic cart for
an 80-minute study session was intricate and time-consuming, limiting
the number of participants tested and precluding comparisons of how
sex and BMI differences may influence smartwatch EE estimates. Yet,
the researchers felt the current study's sample size to be adequate
as it was congruent with the most recent smartwatch validation
studies conducted (Diaz et al., 2015; Ferguson et al., 2015; Fokkema
et al., 2017), most of which did not employ indirect calorimetry as
the criterion measure. Third, this study only assessed participants'
EE while neglecting other relevant health metric data output. For
example, heart rate data accuracy might also be examined given that
heart rate is often used by health professionals to facilitate
individuals' participation in PA intensities necessary to promote
improved health outcomes like increased cardiovascular fitness and
aerobic capacity (Kenney, Wilmore, & Costill, 2015a). Fourth, the
exercise tests were conducted solely on a treadmill. This limitation
is noteworthy as the exclusive use of this PA modality limits the
current study's generalizability to other PA modalities that may use
different proportions of muscle mass (e.g., biking), thus influencing
EE values. Moreover, other PA modalities may have differing degrees
of upper body motion, thus contributing to greater or lesser degrees
of motion artifact which some researchers have speculated might
affect smartwatch EE calculations (Lee & Gorelick, 2011). Finally,
although unlikely, placing two smartwatches on each wrist may have
biased smartwatch EE measurements. It must be remembered, however,
that while the FS and MB were always placed on the left wrist and the
AW and TT on the right, which device was distal and which was
proximal was randomized. Moreover, smartwatch placement, whether
distal or proximal, was as close to manufacturer specifications as
possible. Future studies would therefore benefit from larger and more
diverse samples and from the assessment of smartwatch EE and/or heart
rate data accuracy during different PA modalities. These studies may
also assess EE estimate inter-device reliability when employing
multiple smartwatches from the same manufacturer to evaluate the
device-dependency of EE estimations at different PA intensities.
Conclusion
Wearable technology devices like smartwatches are becoming widely
used by consumers, in addition to health professionals, for health
promotion. Therefore, establishing smartwatch data accuracy is
paramount. Indeed, greater smartwatch data accuracy will allow
consumers and, importantly, health professionals to leverage these
devices to track health metrics such as EE and PA, and subsequently
to develop highly personalized health behavior change programs to
improve health and prevent non-communicable diseases (Flores et al.,
2013; Hood et al., 2012). This study indicated the MB and AW to
provide the most accurate EE estimates overall, particularly during
LPA and VPA. Notably, however, the accuracy of all smartwatches
decreased as PA intensity increased, with the most pronounced
inaccuracies during MPA. These observations suggest a prudent
approach should be taken by consumers and health professionals when
interpreting smartwatch EE estimates, particularly when one is
engaging in MPA. Similarly, caution is warranted in using
smartwatches to develop and implement PA and dietary behavior change
programs until health professionals can confirm the accuracy of the
health metric data these devices provide. In the future, researchers
may work alongside smartwatch manufacturers to ensure increased
smartwatch accuracy through the testing and manipulation of
smartwatch health metric data algorithms.
Acknowledgments
This research did not receive any specific grant from funding agencies in
the public, commercial, or not-for-profit sectors. While conducting this
study, the first author played a large role in data analysis and writing the
manuscript. The second author played a role in data sorting and editing
the manuscript. The third author played a role in data collection and editing
the manuscript. The fourth author played a role in data collection and
editing the manuscript. The fifth author played a role in developing the idea,
overseeing data collection/analysis, and writing the manuscript. No financial
disclosures were reported by the authors of this paper. The authors
have no conflicts of interest to disclose in relation to the current research.
The results of this study are presented clearly, honestly, and without
fabrication, falsification, or inappropriate data manipulation.
References
Alharbi, M., Bauman, A., Neubeck, L., & Gallagher, R. (2016). Validation of the Fitbit-Flex as a measure of free-living physical activity in a community-based phase III cardiac rehabilitation population. European Journal of Preventive Cardiology, 23(14), 1476-1485. PubMed ID: 26907794 doi:10.1177/2047487316634883
Bai, Y., Welk, G., Nam, Y., Lee, J., Lee, J.-M., Kim, Y., ... Dixon, P. (2016). Comparison of consumer and research monitors under semistructured settings. Medicine & Science in Sports & Exercise, 48(1), 151-158. PubMed ID: 26154336 doi:10.1249/MSS.0000000000000727
Bailey, T., Jones, H., Gregson, W., Atkinson, G., Cable, N., & Thijssen, D. (2012). Effect of ischemic preconditioning on lactate accumulation and running performance. Medicine & Science in Sports & Exercise, 44(11), 2084-2089. PubMed ID: 22843115 doi:10.1249/MSS.0b013e318262cb17
Branson, R., & Johannigman, J. (2004). The measurement of energy expenditure. Nutrition in Clinical Practice, 19, 622-636. PubMed ID: 16215161 doi:10.1177/0115426504019006622
Bunn, J., Navalta, J., Fountaine, C., & Reece, J. (2018). Current state of commercial wearable technology in physical activity monitoring 2015-2017. International Journal of Exercise Science, 11(7), 503-515. PubMed ID: 29541338
Case, M., Burwick, H., Volpp, K., & Patel, M. (2015). Accuracy of smartphone applications and wearable devices for tracking physical activity data. Journal of the American Medical Association, 313(6), 625-626. PubMed ID: 25668268 doi:10.1001/jama.2014.17841
Cockcroft, E., Williams, C., Tomlinson, O., Vlachopoulos, D., Jackman, S., Armstrong, N., & Barker, A. (2015). High intensity interval exercise is an effective alternative to moderate intensity exercise for improving glucose tolerance and insulin sensitivity in adolescent boys. Journal of Science and Medicine in Sport, 18(6), 720-724. PubMed ID: 25459232 doi:10.1016/j.jsams.2014.10.001
Diaz, K., Krupka, D., Chang, M., Peacock, J., Ma, Y., Goldsmith, J., ... Davidson, K. (2015). Fitbit: an accurate and reliable device for wireless physical activity tracking. International Journal of Cardiology, 185, 138-140. PubMed ID: 25795203 doi:10.1016/j.ijcard.2015.03.038
Dixon, P., Saint-Maurice, P., Kim, Y., Hibbing, P., & Welk, G. (2018). A primer on the use of equivalence testing for evaluating measurement agreement. Medicine & Science in Sports & Exercise, 50(4), 837-845. PubMed ID: 29135817 doi:10.1249/MSS.0000000000001481
Driller, M., McQuillan, J., & O'Donnell, S. (2016). Inter-device reliability of an automatic-scoring actigraph for measuring sleep in healthy adults. Sleep Science, 9, 198-201. PubMed ID: 28123660 doi:10.1016/j.slsci.2016.08.003
Evenson, K., Goto, M., & Furberg, R. (2015). Systematic review of the validity and reliability of consumer-wearable activity trackers. International Journal of Behavioral Nutrition and Physical Activity, 12, 159. doi:10.1186/s12966-015-0314-1
Ferguson, T., Rowlands, A., Olds, T., & Maher, C. (2015). The validity of consumer-level, activity monitors in healthy adults worn in free-living conditions: A cross-sectional study. International Journal of Behavioral Nutrition and Physical Activity, 12, 42. PubMed ID: 25890168 doi:10.1186/s12966-015-0201-9
Fitbit. (2016). How does Fitbit estimate how many calories I've burned. Retrieved from https://help.fitbit.com/articles/en_US/Help_article/1381
Flores, M., Glusman, G., Brogaard, K., Price, N., & Hood, L. (2013). P4 medicine: how systems medicine will transform the healthcare sector and society. Personalized Medicine, 10(6), 565-576. PubMed ID: 25342952 doi:10.2217/pme.13.57
Fokkema, T., Kooiman, T., Krijnen, W., Van Der Schans, C., & De Groot, M. (2017). Reliability and validity of ten consumer activity trackers depend on walking speed. Medicine & Science in Sports & Exercise, 49(4), 793-800. PubMed ID: 28319983 doi:10.1249/MSS.0000000000001146
Goto, C., Nishioka, K., Umemura, T., Jitsuiki, D., Sakagutchi, A., Kawamura, M., ... Higashi, Y. (2007). Acute moderate-intensity exercise induces vasodilation through an increase in nitric oxide bioavailability in humans. American Journal of Hypertension, 20, 825-830. PubMed ID: 17679027 doi:10.1016/j.amjhyper.2007.02.014
Harriss, D., Macsween, A., & Atkinson, G. (2017). Standards for ethics in sport and exercise science research: 2018 update. International Journal of Sports Medicine, 38, 1126-1131. PubMed ID: 29258155 doi:10.1055/s-0043-124001
Holdy, K. (2004). Monitoring energy metabolism with indirect calorimetry: Instruments, interpretation, and clinical application. Nutrition in Clinical Practice, 19, 447-454. PubMed ID: 16215138 doi:10.1177/0115426504019005447
Hood, L., Balling, R., & Auffray, C. (2012). Revolutionizing medicine in the 21st century through systems approaches. Biotechnology Journal, 7, 992-1001. PubMed ID: 22815171 doi:10.1002/biot.201100306
Hopkins, W. (2000). Measures of reliability in sports medicine and science. Sports Medicine, 30(1), 1-15. PubMed ID: 10907753 doi:10.2165/00007256-200030010-00001
Kellar, S., & Kelvin, E. (2012). In S. Kellar & E. Kelvin (Eds.), Munro's statistical methods for health care research (6th ed.). Philadelphia, PA: Lippincott Williams & Wilkins.
Kenney, E., Gortmaker, S., Evenson, K., Goto, M., & Furberg, R. (2015).
Systematic review of the validity and reliability of consumer-
wearable activity trackers. International Journal of Behavioral
Medicine and Physical Activity, 12(1), 510.
Kenney, W., Wilmore, J., & Costill, D. (2015a). Adaptations to aerobic and anaerobic training. In W. Kenney, J. Wilmore, & D. Costill (Eds.), Physiology of sport and exercise (6th ed., pp. 261-291). Champaign, IL: Human Kinetics.
Kenney, W., Wilmore, J., & Costill, D. (2015b). Body composition and nutrition for sport. In W. Kenney, J. Wilmore, & D. Costill (Eds.), Physiology of sport and exercise (6th ed., pp. 371-405). Champaign, IL: Human Kinetics.
Kenney, W., Wilmore, J., & Costill, D. (2015c). Energy expenditure and fatigue. In W. Kenney, J. Wilmore, & D. Costill (Eds.), Physiology of sport and exercise (6th ed., pp. 119-150). Champaign, IL: Human Kinetics.
Kim, Y., & Welk, G. (2015). Criterion validity of competing accelerometry-based activity monitoring devices. Medicine & Science in Sports & Exercise, 47(11), 2456-2463. PubMed ID: 25910051 doi:10.1249/MSS.0000000000000691
Lee, C., & Gorelick, M. (2011). Validity of the smarthealth watch to measure heart rate and energy expenditure during rest and exercise. Measurement in Physical Education and Exercise Science, 15(1), 18-25. doi:10.1080/1091367X.2011.539089
Lee, C., Gorelick, M., & Mendoza, A. (2011). Accuracy of an infrared LED device to measure heart rate and energy expenditure during rest and exercise. Journal of Sports Science, 29(15), 1645-1653. doi:10.1080/02640414.2011.609899
Lee, J., Kim, Y., & Welk, G. (2014). Validity of consumer-based physical activity monitors. Medicine & Science in Sports & Exercise, 46(9), 1840-1848. PubMed ID: 24777201 doi:10.1249/MSS.0000000000000287
Lewis, B., Napolitano, M., Buman, M., Williams, D., & Nigg, C. (2017). Future directions in physical activity intervention research: Expanding our focus to sedentary behaviors, technology, and dissemination. Journal of Behavioral Medicine, 40(1), 112-126. PubMed ID: 27722907 doi:10.1007/s10865-016-9797-8
Peters, B., Heelan, K., & Abbey, B. (2013). Validation of omron pedometers using MTI accelerometers for use with children. International Journal of Exercise Science, 6(2), 106-113.
Pope, Z., & Gao, Z. (2017). Mobile device apps in enhancing physical activity. In Z. Gao (Ed.), Technology in physical activity and promotion (pp. 106-128). London, UK: Routledge Publisher.
Powell, K., Paluch, A., & Blair, S. (2011). Physical activity for health: What kind? how much? how intense? on top of what? In J. Fielding, R. Brownson, & L. Green (Eds.), Annual review of public health (Vol. 32, pp. 349-365). Palo Alto, CA: Annual Reviews.
Ren, Q., Li, Z., & Liang, G. (2017). Comparison of active and passive movement on treadmill in healthy individuals. Space Medicine & Medical Engineering, 30(3), 185-190.
Sasaki, J., Hickey, A., Mavilia, M., Tedesco, J., John, D., Keadle, S., & Freedson, P. (2015). Validation of the Fitbit wireless activity tracker for prediction of energy expenditure. Journal of Physical Activity and Health, 12(2), 149-154. PubMed ID: 24770438 doi:10.1123/jpah.2012-0495
Thomas, J., Nelson, J., & Silverman, S. (2011). Relationships among variables. In J. Thomas, J. Nelson, & S. Silverman (Eds.), Research methods in physical activity (pp. 125-144). Champaign, IL: Human Kinetics.
TomTom. (2017). How calories are estimated on your watch. Retrieved from http://uk.support.tomtom.com/app/answers/detail/a_id/19148/~/how-calories-are-estimated-on-your-watch
U.S. Department of Health and Human Services. (2018). Physical activity guidelines for Americans (2nd ed.). Washington, DC: Author.
World Medical Association. (2018). World medical association declaration of Helsinki: Ethical principles for medical research involving human subjects. Retrieved from https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/
... The AW's underestimation of EE increased with intensity, which contradicts findings from previous AW studies [24,33]. While Pope et al [33] reported lower accuracy at moderate compared to low and high intensity during running, Moreno et al [24] reported consistent accuracy across wheelchair propulsion stages with an increasing stroke rate. ...
... The AW's underestimation of EE increased with intensity, which contradicts findings from previous AW studies [24,33]. While Pope et al [33] reported lower accuracy at moderate compared to low and high intensity during running, Moreno et al [24] reported consistent accuracy across wheelchair propulsion stages with an increasing stroke rate. The test protocol with the standardized stroke rate increases in Moreno et al [24] might more closely resemble the way the AW estimation algorithm works. ...
... The Fitbit did, in contrast to the AW and other SW studies on EE [24,33], show a systematic decrease in the error with higher intensities for both groups, which resulted in lower overestimations. This finding was most likely related to using the "treadmill running mode" in the absence of a wheelchair-specific setting. ...
Article
Full-text available
Background The Apple Watch (AW) Series 1 provides energy expenditure (EE) for wheelchair users but was found to be inaccurate with an error of approximately 30%, and the corresponding error for heart rate (HR) provided by the Fitbit Charge 2 was approximately 10% to 20%. Improved accuracy of estimated EE and HR is expected with newer editions of these smart watches (SWs). Objective This study aims to assess the accuracy of the AW Series 4 (wheelchair-specific setting) and the Fitbit Versa (treadmill running mode) for estimating EE and HR during wheelchair propulsion at different intensities. Methods Data from 20 manual wheelchair users (male: n=11, female: n=9; body mass: mean 75, SD 19 kg) and 20 people without a disability (male: n=11, female: n=9; body mass: mean 75, SD 11 kg) were included. Three 4-minute wheelchair propulsion stages at increasing speed were performed on 3 separate test days (0.5%, 2.5%, or 5% incline), while EE and HR were collected by criterion devices and the AW or Fitbit. The mean absolute percentage error (MAPE) was used to indicate the absolute agreement between the criterion device and SWs for EE and HR. Additionally, linear mixed model analyses assessed the effect of exercise intensity, sex, and group on the SW error. Interclass correlation coefficients were used to assess relative agreement between criterion devices and SWs. Results The AW underestimated EE with MAPEs of 29.2% (SD 22%) in wheelchair users and 30% (SD 12%) in people without a disability. The Fitbit overestimated EE with MAPEs of 73.9% (SD 7%) in wheelchair users and 44.7% (SD 38%) in people without a disability. Both SWs underestimated HR. The device error for EE and HR increased with intensity for both SWs (all comparisons: P<.001), and the only significant difference between groups was found for HR in the AW (–5.27 beats/min for wheelchair users; P=.02). 
There was a significant effect of sex on the estimation error in EE, with worse accuracy for the AW (–0.69 kcal/min; P<.001) and better accuracy for the Fitbit (–2.08 kcal/min; P<.001) in female participants. For HR, sex differences were found only for the AW, with a smaller error in female participants (5.23 beats/min; P=.02). Interclass correlation coefficients showed poor to moderate relative agreement for both SWs apart from 2 stage-incline combinations (AW: 0.12-0.57 for EE and 0.11-0.86 for HR; Fitbit: 0.06-0.85 for EE and 0.03-0.29 for HR). Conclusions Neither the AW nor Fitbit were sufficiently accurate for estimating EE or HR during wheelchair propulsion. The AW underestimated EE and the Fitbit overestimated EE, and both SWs underestimated HR. Caution is hence required when using SWs as a tool for training intensity regulation and energy balance or imbalance in wheelchair users.
... M EASURING human metabolic rate (MR) represents an important step in determining nutritional needs. Estimation of MR through wearable devices, often relying on heart rate measurements, has been widely discussed in the scientific literature and results are often inaccurate, with authors' stated errors in MR estimation ranging from 25% to 50% [1], [2]. Recently, Levikari et al. [3] proposed an integrative approach to increase the accuracy of MR estimation by combining a thermoelectric heat flux sensor with heart rate measurement, in conjunction with a humidity sensor to take into account evaporative heat transfer, resulting in an accuracy improvement at low MR levels (e.g., resting conditions), while still achieving error ≥30% for higher MR levels. ...
... For each propane combustion experiment, actual measurements of gas concentrations and inflow rate were applied to the steady-state model in (2). Fig. 5 shows the measurements of f O 2,in , f CO 2,in , f O 2,WRIC , and f CO 2,WRIC obtained during one propane combustion experiment (EXP1). ...
... For the static model equations to be correctly applied to propane combustion data, the hypothesis Fig. 4. Weight (%) of each WRIC measured variable on total variance oḟ VO 2 ,VCO 2 , RER, and MR estimates for a CO 2 concentration inside the WRIC equal to 0.2%. of steady-state condition must hold true. Specifically, steadystate conditions can be considered valid if the results of the dynamic model in (1) are compatible with those of the static model in (2), namely, the contribution of time derivative terms of gas concentration multiplied by chamber volume in (1) must be negligible (e.g., 10-fold smaller) compared to the other terms in (1). Fig. 6 shows the measured values ofVO 2 anḋ VCO 2 calculated using the steady-state model equations in (2) applied to the data of five propane combustion experiments arbitrarily named EXP1, EXP2, EXP3, EXP4, and EXP5. ...
Article
Whole-Room Indirect Calorimeters (WRIC) are accurate tools to precisely measure energy metabolism in humans via calculation of oxygen consumption and carbon dioxide production. Yet, overall accuracy of metabolic measurements relies on the validity of the dynamic model for gas exchange inside the calorimeter volume in addition to experimental and environmental conditions that contribute to the uncertainty of WRIC outcome variables. The aim of this work is to formally study the sensitivity of a WRIC system operated in a push configuration at the steady-state condition to identify the optimal experimental conditions to obtain the best degree of accuracy for outcome metabolic measurements. The results of our sensitivity analysis are then validated with measurements obtained during propane combustion tests performed at the WRIC located at the University Hospital of Pisa. Our results demonstrate that achieving a fractional concentration of carbon dioxide inside the calorimeter >0.2% leads to relative uncertainty <5% for the outcome metabolic measurements when assuming an accuracy class of 1% for gas analyzer instruments.
... Combining wearable sensors and data-driven methods enables portable and computationally efficient estimation, but many methods rely on subject-specific data to train their models and do not evaluate the accuracy for new subjects. Data-driven methods may use subject-specific information, such as weight and height 24 , as well as a variety of wearable sensors including accelerometers and inertial measurement units (IMUs) 25,26 , heart rate monitors, or electrocardiography 27,28 , electromyography, impedance pneumography 29,30 , and various combinations 16,[29][30][31] . These methods have shown a high correlation between sensor data and energy expenditure 32,33 and can accurately evaluate physical fitness 34 . ...
... Activity monitors typically only estimate during walking or running because they require significant wrist or pelvis motion, precluding activities like biking. Smartwatches and wearable data-driven models report large errors, from 27% to 93% when evaluated with new subjects 30,31,39 , with errors varying across brands. Heart rate and respirometry have a delayed response to changes in energy expenditure, which causes errors at the start of steady-state conditions and during time-varying activities. ...
... During steady-state conditions, the Wearable System had 13% steady-state error, about half the error of the second-most accurate model, the Activity-Specific Smartwatch (Fig. 2c). The steady-state errors for the Smartwatch and the Activity-Specific Smartwatch match those from previous studies 31,39 . Even the Activity-Specific Model, which used manual labeling during steadystate conditions to achieve perfect activity classification, had higher steady-state error than the Wearable System (18%). ...
Article
Full-text available
Physical inactivity is the fourth leading cause of global mortality. Health organizations have requested a tool to objectively measure physical activity. Respirometry and doubly labeled water accurately estimate energy expenditure, but are infeasible for everyday use. Smartwatches are portable, but have significant errors. Existing wearable methods poorly estimate time-varying activity, which comprises 40% of daily steps. Here, we present a Wearable System that estimates metabolic energy expenditure in real-time during common steady-state and time-varying activities with substantially lower error than state-of-the-art methods. We perform experiments to select sensors, collect training data, and validate the Wearable System with new subjects and new conditions for walking, running, stair climbing, and biking. The Wearable System uses inertial measurement units worn on the shank and thigh as they distinguish lower-limb activity better than wrist or trunk kinematics and converge more quickly than physiological signals. When evaluated with a diverse group of new subjects, the Wearable System has a cumulative error of 13% across common activities, significantly less than 42% for a smartwatch and 44% for an activity-specific smartwatch. This approach enables accurate physical activity monitoring which could enable new energy balance systems for weight management or large-scale activity monitoring.
... For example, Davoudi et al. [103] examined the validity of Samsung Gear S smartwatch while using ActiGraph accelerometer as the criterion and found that this smartwatch was relatively accurate in assessing individual activity recognition, activity intensity detection, major body movement location detection, and locomotion detection tasks. Pope and colleagues [104,105] compared Apple Watch, Fitbit Surge HR, TomTom Multisport Cardio Watch, and Microsoft Band against ActiGraph accelerometer with two different samples. They observed that smartwatch average/peak heart rate measurements were moderately valid, yet smartwatch energy expenditure measurements were less valid [104]. ...
... They observed that smartwatch average/peak heart rate measurements were moderately valid, yet smartwatch energy expenditure measurements were less valid [104]. Among them, Apple Watch and Microsoft Band had the most accurate EE estimation [105]. More research development is needed in this area. ...
Article
Full-text available
Physical behaviors (e.g., physical activity and sedentary behavior) have been the focus among many researchers in the biomedical and behavioral science fields. The recent shift from hip- to wrist-worn accelerometers in these fields has signaled the need to develop novel approaches to process raw acceleration data of physical activity and sedentary behavior. However, there is currently no consensus regarding the best practices for analyzing wrist-worn accelerometer data to accurately predict individuals’ energy expenditure and the times spent in different intensities of free-living physical activity and sedentary behavior. To this end, accurately analyzing and interpreting wrist-worn accelerometer data has become a major challenge facing many clinicians and researchers. In response, this paper attempts to review different methodologies for analyzing wrist-worn accelerometer data and offer cutting edge, yet appropriate analysis plans for wrist-worn accelerometer data in the assessment of physical behavior. In this paper, we first discuss the fundamentals of wrist-worn accelerometer data, followed by various methods of processing these data (e.g., cut points, steps per minute, machine learning), and then we discuss the opportunities, challenges, and directions for future studies in this area of inquiry. This is the most comprehensive review paper to date regarding the analysis and interpretation of free-living physical activity data derived from wrist-worn accelerometers, aiming to help establish a blueprint for processing wrist-derived accelerometer data.
... Pope et al. tested the EE and HR estimation accuracy of four wearables during exergaming (playing a boxing game on a Nintendo Wii gaming console), with errors in the mean EE ranging from 10% to 40% [8]. In another study, Pope et al. reported mean EE errors between 25% and 50% during walking, jogging, and running, and errors of over 50% when the subjects were at rest [9]. While the aforementioned studies [7], [8] reported reasonably good accuracies for the heart rate estimates provided by wearables, the error in the energy expenditure estimates may limit the usability of applications that rely on calorie consumption information, such as sports performance metering. ...
... Adding heat flux and humidity data to heart rate information yields a notable decrease in error during low-intensity activities, i.e., sitting, standing, and walking. 9 show that the addition of HF and %RH data has indeed a pronounced effect on the MAPE scores below 4000 kcal/day, although they also show a moderate positive effect on levels above 14000 kcal/day. ...
Article
Wearable electronics are often used for estimating the energy expenditure of the user based on heart rate measurement. While heart rate is a good predictor of calorie consumption at high intensities, it is less precise at low intensity levels, which translates into inaccurate results when estimating daily net energy expenditure. In this study, heart rate measurement was augmented with heat flux measurement, a form of direct calorimetry. A physical exercise test on a group of 15 people showed that heat flux measurement can improve the accuracy of calorie consumption estimates especially during rest and low-intensity activity when used in conjunction with heart rate information and vital background parameters of the user.
... Relatedly, research suggests smart fitness watches are not always accurate at estimating the calories expended during moderate exercise. (64) In addition, smart watches often display the absolute number of calories expended during exercise rather than the additional calories expended over and above what would have been expended during rest, which may lead to individuals overconsuming after exercise. Therefore, smartwatch manufacturers may want to consider displaying this as additional information, although research into the effectiveness of this in reducing excessive energy intake after exercise is needed. ...
Article
Increasing food intake or eating unhealthily after exercise may undermine attempts to manage weight, thereby contributing to poor population-level health. This scoping review aimed to synthesise the evidence on the psychology of changes to eating after exercise and explore why changes to eating after exercise occur. A scoping review of peer-reviewed literature was conducted in accordance with the Joanna Briggs Institute guidance. Search terms relating to exercise, eating behaviour, and compensatory eating were used. All study designs were included. Research in children, athletes, or animals was excluded. No country or date restrictions were applied. Twenty-three studies were identified. Ten experimental studies (nine acute, one chronic) manipulated the psychological experience of exercise, one intervention study directly targeted compensatory eating, seven studies used observational methods (e.g. diet diaries, 24-h recall) to directly measure compensatory eating after exercise, and five questionnaire studies measured beliefs about eating after exercise. Outcomes varied and included energy intake (kcal/kJ), portion size, food intake, food choice, food preference, dietary lapse, and self-reported compensatory eating. We found that increased consumption of energy-dense foods occurred after exercise when exercise was perceived as less enjoyable, less autonomous, or hard work. Personal beliefs, exercise motivation, and exercise enjoyment were key psychological determinants of changes to eating after exercise. Individuals may consume additional food to refuel their energy stores after exercise (psychological compensatory eating), or consume unhealthy or energy dense foods to reward themselves after exercise, especially if exercise is experienced negatively (post-exercise licensing), however the population-level prevalence of these behaviours is unknown.
... Most prior work on physical activity tracking was focused on wearable devices such as smartphones [38], smartwatches [23], pedometers [3] or activity trackers [7]. Traditional activity trackers use an accelerometer to recognise user activity and estimate calorie consumption based on personal details such as weight, height and age [14]. ...
Conference Paper
Maintaining certain physical activity levels is important to prevent or delay the onset of many medical conditions such as diabetes, or mental health disorders. Traditional calorie estimation methods require wearing devices, such as pedometers, smart watches or smart bracelets, which continuously monitor user activity and estimate the energy expenditure. However, wearable devices may not be suitable for some patients because they need periodic maintenance and frequent recharging and must be worn all the time. In this paper we investigate the feasibility of device-free human energy expenditure estimation based on RF-sensing, which recognises coarse-grained user activity, such as walking, standing, sitting or resting, by monitoring the impact of a person’s activity on ambient wireless links. The calorie estimation is then based on the Metabolic Equivalent concept, which expresses the energy cost of an activity as a multiple of a person’s basal metabolic rate, with the basal rate obtained from the Harris-Benedict model. The experimental evaluation using low-cost IEEE 802.15.4 transceivers demonstrated that the approach estimated energy expenditure in an indoor environment to within 7.4% to 41.2% of the values reported by a FitBit Blaze bracelet.
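The MET-based calculation described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the original Harris-Benedict coefficients (weight in kg, height in cm, age in years), and the function names and the example MET value are hypothetical.

```python
def harris_benedict_bmr(weight_kg, height_cm, age_yr, sex):
    """Basal metabolic rate in kcal/day (original Harris-Benedict coefficients)."""
    if sex == "male":
        return 66.4730 + 13.7516 * weight_kg + 5.0033 * height_cm - 6.7550 * age_yr
    if sex == "female":
        return 655.0955 + 9.5634 * weight_kg + 1.8496 * height_cm - 4.6756 * age_yr
    raise ValueError("sex must be 'male' or 'female'")


def activity_kcal(met, bmr_kcal_per_day, minutes):
    """Energy cost of an activity: MET value times the basal rate per minute."""
    minutes_per_day = 24 * 60
    return met * (bmr_kcal_per_day / minutes_per_day) * minutes


if __name__ == "__main__":
    # Hypothetical person and activity, for illustration only.
    bmr = harris_benedict_bmr(weight_kg=70, height_cm=175, age_yr=30, sex="male")
    print(f"BMR: {bmr:.1f} kcal/day")
    print(f"10 min at 3.0 METs: {activity_kcal(3.0, bmr, 10):.1f} kcal")
```

In such a scheme the recognised activity class (walking, standing, sitting, resting) would be mapped to a MET value before calling `activity_kcal`; that mapping is not given in the abstract and is left out here.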
Article
In this Review, we explore the state of the art of biomechanical models for estimating energy consumption during terrestrial locomotion. We consider different mechanical models that provide a solid framework to understand movement energetics from the perspective of force and work requirements. Whilst such models are highly informative, they lack specificity for predicting absolute metabolic rates across a range of species or variations in movement patterns. Muscles consume energy when they activate to generate tension, as well as when they shorten to generate positive work. Phenomenological muscle models incorporating steady-state parameters have been developed and are able to reproduce how muscle fibre energy consumption changes under different contractile conditions; however, such models are difficult to validate when scaled up to whole muscle. This is, in part, owing to limited availability of data that relate muscle dynamics to energetic rates during contraction of large mammalian muscles. Furthermore, factors including the compliance of tendinous tissue, dynamic shape changes and motor unit recruitment can alter the dynamics of muscle contractile tissue and potentially improve muscle efficiency under some locomotion conditions. Despite the many challenges, energetic cost estimates derived from musculoskeletal models that simulate muscle function required to generate movement have been shown to reasonably predict changes in human metabolic rates under different movement conditions. However, accurate predictions of absolute metabolic rate are still elusive. We suggest that conceptual models may be adapted based on our understanding of muscle energetics to better predict the variance in movement energetics both within and between terrestrial species.
Article
Objectives: In the modern era, there is heightened interest in understanding energy expenditure during exercise. Consequently, wearable devices such as the Galaxy Watch and Apple Watch have emerged as pivotal tools for daily health monitoring, given their convenience and increasing popularity. This study aimed to compare the calculated energy expenditure derived from the graded exercise test with readings from Galaxy and Apple Watches during a 30-min exercise session among Korean university students. Through this, we anticipate offering both motivation and clear insights into energy expenditure, thereby potentially aiding in weight management strategies for contemporary individuals. Methods: This study involved 27 college students from Korea National University of Transportation in Chungcheongbuk-do, Korea. We utilized COSMED's exercise load respiratory gas analysis system (Quark-CPET, COSMED, Rome, Italy), along with the Galaxy Watch (Galaxy Watch 5, Samsung, Seoul, Korea) and the Apple Watch (Apple Watch Series 5, Apple, Cupertino, USA) for measurements. Energy expenditure was monitored in real time every 5 min throughout the 30-min exercise session. For statistical evaluations, we employed a one-way analysis of variance. Subsequent post-tests utilized the Tukey post-hoc test and Pearson correlation, with a significance level set at p<0.05. Results: Initially, no statistically significant difference emerged between energy expenditure readings from the graded exercise test and those from the Galaxy Watch across all time intervals: 5, 10, 15, 20, 25, and 30 min (p>0.05). Conversely, a notable difference was observed when comparing energy expenditure data from the graded exercise test to that of the Apple Watch for time intervals of 10, 15, 20, 25, and 30 min (p<0.05), although the 5-min interval did not exhibit a significant difference (p>0.05). Furthermore, a robust positive correlation was evident between the energy expenditure values derived from the graded exercise test and those from both the Galaxy Watch (r=0.952, p<0.001) and the Apple Watch (r=0.917, p<0.001). Conclusions: Both devices demonstrated high reliability in calculating energy expenditure. Notably, the Galaxy Watch exhibited a more precise calculation compared to the Apple Watch, with a relative reliability margin 3.5% higher. For individuals, especially those struggling with obesity, precise wearable devices that accurately reflect energy consumption can significantly boost motivation for exercise. Consequently, this study lays a foundation for future advancements in energy expenditure measurement tools, emphasizing enhanced convenience, reliability, and mobility.
Article
Equivalence testing may provide complementary information to more frequently used statistical procedures in that it determines whether physical behaviour outcomes are statistically equivalent to criterion measures. A caveat of this procedure is the predetermined selection of upper and lower bounds of acceptable error around a specified zone of equivalence. With no clear guidelines available to assist researchers, these equivalence zones are arbitrarily selected. A scoping review of articles implementing equivalence testing to determine the validity of physical behaviour outcomes was performed, with the aim of characterizing how this procedure has been implemented and providing recommendations. A literature search of 5 databases initially identified 1153 potential articles, which resulted in the acceptance of 19 studies (20 arms) conducted in children/youth and 40 in adults (49 arms). Most studies were conducted in free-living conditions (children/youth: 13 arms; adults: 22 arms) and employed a ±10% equivalence zone. However, equivalence zones ranged from ±3% to ±25%, with only a sub-set using absolute thresholds (e.g., ±1000 steps/day). If these equivalence zones were increased or decreased by ±5%, 75% (15/20, children/youth) and 71% (35/49, adults) would have exhibited opposing equivalence test outcomes (i.e., equivalent to non-equivalent or vice versa). This scoping review identifies the heterogeneous usage of equivalence testing in studies examining the accuracy of (in)activity measures. In the absence of evidence-based standardized equivalence criteria, presenting the percentage required to achieve statistical equivalence or using absolute thresholds based on a proportion of the standard deviation may be a better practice than arbitrarily selecting zones a priori.
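The recommendation to report "the percentage required to achieve statistical equivalence" can be illustrated with a short sketch. It assumes a symmetric equivalence zone expressed as a percentage of the criterion mean and a confidence interval for the mean difference (device minus criterion) in the same units; the function names and all numbers are hypothetical and are not taken from the review.

```python
def min_equivalence_zone_pct(ci_low, ci_high, criterion_mean):
    """Smallest symmetric zone (as % of the criterion mean) containing the CI."""
    if criterion_mean <= 0:
        raise ValueError("criterion_mean must be positive")
    return max(abs(ci_low), abs(ci_high)) / criterion_mean * 100.0


def outcome_flips(ci_low, ci_high, criterion_mean, zone_pct, shift_pct=5.0):
    """True if widening or narrowing the zone by shift_pct changes the outcome."""
    needed = min_equivalence_zone_pct(ci_low, ci_high, criterion_mean)
    baseline = needed <= zone_pct
    shifted_zones = (zone_pct - shift_pct, zone_pct + shift_pct)
    return any((needed <= zone) != baseline for zone in shifted_zones)


if __name__ == "__main__":
    # Hypothetical 90% CI of the mean difference and criterion mean (METs).
    needed = min_equivalence_zone_pct(-0.15, 0.52, criterion_mean=4.0)
    print(f"Zone required for equivalence: +/-{needed:.1f}%")
    print("Outcome flips at 10% +/- 5%:", outcome_flips(-0.15, 0.52, 4.0, 10.0))
```

Reporting the required percentage alongside the chosen zone lets readers see how close a non-equivalent result was, rather than only a pass or fail against an arbitrary bound.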
Article
Wearable physical activity trackers are a popular and useful method to collect biometric information at rest and during exercise. The purpose of this systematic review was to summarize recent findings of wearable devices for biometric information related to steps, heart rate, and caloric expenditure for several devices that hold a large portion of the market share. Searches were conducted in both PubMed and SPORTdiscus. Filters included: humans, within the last 5 years, English, full-text, and adult 19+ years. Manuscripts were retained if they included an exercise component of 5-min or greater and had 20 or more participants. A total of 10 articles were retained for this review. Overall, wearable devices tend to underestimate energy expenditure compared to criterion laboratory measures, however at higher intensities of activity energy expenditure is underestimated. All wrist and forearm devices had a tendency to underestimate heart rate, and this error was generally greater at higher exercise intensities and those that included greater arm movement. Heart rate measurement was also typically better at rest and while exercising on a cycle ergometer compared to exercise on a treadmill or elliptical machine. Step count was underestimated at slower walking speeds and in free-living conditions, but improved accuracy at faster speeds. The majority of the studies reviewed in the present manuscript employed different methods to assess validity and reliability of wearable technology, making it difficult to compare devices. Standardized protocols would provide guidance for researchers to evaluate research-grade devices as well as commercial devices used by the lay public.
Article
Despite the increased health risks of a sedentary lifestyle, only 49 % of American adults participate in physical activity (PA) at the recommended levels. In an effort to move the PA field forward, we briefly review three emerging areas of PA intervention research. First, new intervention research has focused on not only increasing PA but also on decreasing sedentary behavior. Researchers should utilize randomized controlled trials, common terminology, investigate which behaviors should replace sedentary behaviors, evaluate long-term outcomes, and focus across the lifespan. Second, technology has contributed to an increase in sedentary behavior but has also led to innovative PA interventions. PA technology research should focus on large randomized trials with evidence-based components, explore social networking and innovative apps, improve PA monitoring, consider the lifespan, and be grounded in theory. Finally, in an effort to maximize public health impact, dissemination efforts should address the RE-AIM model, health disparities, and intervention costs.
Article
Actigraphy has become a common method of measuring sleep due to its non-invasive, cost-effective nature. An actigraph (Readiband™) that utilizes automatic scoring algorithms has been used in the research, but is yet to be evaluated for its inter-device reliability. A total of 77 nights of sleep data from 11 healthy adult participants was collected while participants were concomitantly wearing two Readiband™ actigraphs attached together (ACT1 and ACT2). Sleep indices including total sleep time (TST), sleep latency (SL), sleep efficiency (SE%), wake after sleep onset (WASO), total time in bed (TTB), wake episodes per night (WE), sleep onset variance (SOV) and wake variance (WV) were assessed between the two devices using mean differences, 95% levels of agreement, intraclass correlation coefficients (ICC), typical error of measurement (TEM) and coefficient of variation (CV%) analysis. There were no significant differences between devices for any of the measured sleep variables (p>0.05). TST, SE, SL, TTB, SOV and WV all resulted in very high ICC's (>0.90), with WASO and WE resulting in high ICC's between devices (0.85 and 0.80, respectively). Mean differences of −2.1 and 0.2min for TST and SL were associated with a low TEM between devices (9.5 and 3.8min, respectively). SE resulted in a 0.3% mean difference between devices. The Readiband™ is a reliable tool for researchers using multiple devices of this brand in sleep studies to assess basic measures of sleep quality and quantity in healthy adult populations.
Article
Purpose: Statistical equivalence testing is more appropriate than conventional tests of difference to assess the validity of physical activity (PA) measures. This article presents the underlying principles of equivalence testing and gives three examples from PA and fitness assessment research. Methods: The three examples illustrate different uses of equivalence tests. Example 1 uses PA data to evaluate an activity monitor's equivalence to a known criterion. Example 2 illustrates the equivalence of two field-based measures of physical fitness with no known reference method. Example 3 uses regression to evaluate an activity monitor's equivalence across a suite of 23 activities. Results: The examples illustrate the appropriate reporting and interpretation of results from equivalence tests. In the first example, the mean criterion measure is significantly within ±15% of the mean PA monitor. The mean difference is 0.18 METs and the 90% confidence interval of -0.15 to 0.52 is inside the equivalence region of -0.65 to 0.65. In the second example, we chose to define equivalence for these two measures as a ratio of mean values between 0.98 and 1.02. The estimated ratio of mean V̇O2 values is 0.99, which is significantly (P = 0.007) inside the equivalence region. In the third example, the PA monitor is not equivalent to the criterion across the suite of activities. The estimated regression intercept and slope are -1.23 and 1.06. Neither confidence interval is within the suggested regression equivalence regions. Conclusions: When the study goal is to show similarity between methods, equivalence testing is more appropriate than traditional statistical tests of differences (e.g., ANOVA and t-tests).
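The decision rule in the first example reduces to checking whether the 90% confidence interval of the mean difference lies entirely inside the equivalence region. A minimal sketch, using the interval and region reported in the abstract (the function name is hypothetical, and the second call uses a made-up interval for contrast):

```python
def is_equivalent(ci, region):
    """True if the confidence interval lies entirely inside the equivalence region."""
    ci_low, ci_high = ci
    region_low, region_high = region
    if ci_low > ci_high or region_low > region_high:
        raise ValueError("intervals must be given as (low, high)")
    return region_low < ci_low and ci_high < region_high


if __name__ == "__main__":
    # Values from Example 1 in the abstract (METs).
    print(is_equivalent(ci=(-0.15, 0.52), region=(-0.65, 0.65)))
    # Hypothetical wider interval that crosses the upper bound.
    print(is_equivalent(ci=(-0.15, 0.70), region=(-0.65, 0.65)))
```

Checking a 90% interval against the region corresponds to two one-sided tests at the 5% level, which is why equivalence studies report 90% rather than 95% intervals.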
Article
Purpose: To examine the test-retest reliability and validity of ten activity trackers for step counting at three different walking speeds. Methods: Thirty-one healthy participants walked twice on a treadmill for 30 min while wearing 10 activity trackers (Polar Loop, Garmin Vivosmart, Fitbit Charge HR, Apple Watch Sport, Pebble Smartwatch, Samsung Gear S, Misfit Flash, Jawbone Up Move, Flyfit, and Moves). Participants walked at three speeds for 10 min each: slow (3.2 km·h⁻¹), average (4.8 km·h⁻¹), and vigorous (6.4 km·h⁻¹). To measure test-retest reliability, intraclass correlations (ICC) were determined between the first and second treadmill test. Validity was determined by comparing the trackers with the gold standard (hand counting), using mean differences, mean absolute percentage errors, and ICC. Statistical differences were calculated by paired-sample t tests, Wilcoxon signed-rank tests, and by constructing Bland-Altman plots. Results: Test-retest reliability varied with ICC ranging from -0.02 to 0.97. Validity varied between trackers and different walking speeds with mean differences between the gold standard and activity trackers ranging from 0.0 to 26.4%. Most trackers showed relatively low ICC and broad limits of agreement of the Bland-Altman plots at the different speeds. For the slow walking speed, the Garmin Vivosmart and Fitbit Charge HR showed the most accurate results. The Garmin Vivosmart and Apple Watch Sport demonstrated the best accuracy at an average walking speed. For vigorous walking, the Apple Watch Sport, Pebble Smartwatch, and Samsung Gear S exhibited the most accurate results. Conclusion: Test-retest reliability and validity of activity trackers depend on walking speed. In general, consumer activity trackers perform better at an average and vigorous walking speed than at a slower walking speed.
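Mean absolute percentage error, one of the validity measures named above, can be sketched as follows. The abstract does not give the exact formula used, so this assumes the common per-participant definition (absolute error divided by the criterion value, averaged and expressed as a percentage); the function name and step counts are hypothetical.

```python
def mape(estimates, criterion):
    """Mean absolute percentage error of device estimates against a criterion."""
    if len(estimates) != len(criterion) or not criterion:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(c == 0 for c in criterion):
        raise ValueError("criterion values must be non-zero")
    errors = [abs(e - c) / abs(c) for e, c in zip(estimates, criterion)]
    return 100.0 * sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical step counts for three participants over one 10-min stage.
    tracker_steps = [950, 1100, 1000]
    hand_counted_steps = [1000, 1000, 1000]
    print(f"MAPE: {mape(tracker_steps, hand_counted_steps):.1f}%")
```

Because the absolute value is taken before averaging, over- and underestimates do not cancel, which is why MAPE can be large even when the mean difference is close to zero.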
Article
Background: Accurate physical activity monitoring is important for cardiac patients. Novel activity monitoring devices may enable precise measurement of physical activity. This study aimed to validate Fitbit-Flex against Actigraph accelerometer for monitoring physical activity. Design: A validation study with a comparative design. Methods: Cardiac patients and family members participating in community-based exercise programs wore Fitbit-Flex and Actigraph simultaneously over four days to monitor daily step counts and minutes of moderate to vigorous physical activity (MVPA). Results: Participants (N = 48) comprised 52.1% males, with a mean age of 65.6 ± 6.9 years and 58.9% had a cardiac diagnosis. Fitbit-Flex and Actigraph were significantly correlated in males, females, total participants and cardiac patients for step counts (r = .96; r = .95; r = .95; r = .95), though less so for MVPA (r = .81; r = .65, r = .74; r = .71). As step counts increased the differences between Fitbit-Flex and Actigraph also increased. Fitbit-Flex over-estimated step counts in females (556 steps/day), males (1462 steps/day) and total participants (1038 steps/day) as well as for minutes of MVPA in females (4 min/day), males (15 min/day) and total participants (10 min/day). Fitbit-Flex had high sensitivity and specificity in classifying participants who achieved the recommended physical activity guidelines. Conclusion: Fitbit-Flex is accurate in assessing attainment of physical activity guideline recommendations and is useful for monitoring physical activity in cardiac patients. The device does, however, slightly over-estimate step counts and MVPA.