All content in this area was uploaded by Hannah Scott on Jun 19, 2022
Title: The accuracy of the THIM wearable device for estimating sleep onset latency.
Authors:
Hannah Scott PhD1,2, Ashwin Whitelaw2, Alex Canty1, Nicole Lovato PhD2 and Leon Lack
PhD1,2
1 College of Education, Psychology and Social Work, Flinders University, Adelaide
5001, Australia.
2 College of Medicine and Public Health, Adelaide Institute for Sleep Health: A Flinders
Centre of Research Excellence, Flinders University, Adelaide 5001, Australia.
Correspondence:
Hannah Scott
Adelaide Institute for Sleep Health, Flinders University
Box 6, 5 Laffer Drive
Bedford Park SA 5042
Email: hannah.scott@flinders.edu.au
Disclosure statements:
All authors have seen and approved the manuscript.
Financial support: Research costs were partially funded by Re-Time Pty Ltd, the company
that sells THIM. None of the study authors were financially supported by Re-Time for this
project.
Conflicts of interest: LL is a shareholder of Re-Time Pty Ltd. LL and HS have a patent
pending regarding the THIM device. AW, AC and NL have no conflicts of interest to declare.
Number of Tables: 1
Number of Figures: 5
Abstract Word Count: 250
Brief Summary Word Count: 84
Manuscript Word Count: 5045
Abstract
Study Objectives: THIM is a wearable device designed to accurately estimate sleep
onset. This article presents two studies that tested the original (Study 1) and a refined (Study
2) THIM sleep onset algorithms compared to polysomnography (PSG).
Methods: Twelve (Study 1) and twenty (Study 2) individuals slept in the laboratory
on two nights where participants underwent THIM-administered sleep onset trials with
simultaneous PSG recording. Participants attempted to fall asleep whilst using THIM, which
woke them once it determined sleep onset.
Results: In Study 1, there was no significant difference between PSG (Mean, M =
1.94 min, SD = 1.32) and THIM-sleep onset latency (M = 2.05 min, SD = 1.38) on the first or
second night, p > .07. There were moderate correlations between PSG and THIM on both
nights, r(s) > .57, p < .001. On 23.74% of trials, PSG-sleep onset could not be determined
before THIM ended the trial. With a revised THIM algorithm in Study 2, there was no
significant difference between PSG (M = 3.41 min, SD = 2.21) and THIM-sleep onset latency
(M = 3.65 min, SD = 2.18), p = .25, strong correspondence between the two devices, r(s)
> .73, p < .001, narrow limits of agreement on Bland-Altman plots, and significantly fewer
trials where PSG-sleep onset had not occurred (10.24%), p = .04.
Conclusions: THIM showed a high degree of correspondence and agreement with
PSG for estimating sleep onset. Future research will investigate whether THIM is accurate
with an insomnia sample for clinical purposes.
Keywords: sleep onset latency; Intensive Sleep Retraining; wearable device;
consumer sleep technology; polysomnography; actigraphy.
Brief Summary
Monitoring the onset of sleep outside of the laboratory setting is required for many purposes,
yet there are few simple objective methods available. Here, we discuss the accuracy of a
new wearable device called THIM. The revised version of the THIM algorithm showed high
agreement with the gold-standard measure of sleep, polysomnography, on a number of
indices. Further research is required to examine the accuracy of THIM with individuals with
insomnia to inform its clinical utility for administering a brief (24-hour) but effective
behavioural treatment for insomnia, once restricted to the sleep laboratory, in the home
environment.
Introduction
Accurate assessment of sleep onset latency (SOL) is required for a variety of
research and clinical purposes. For instance, Intensive Sleep Retraining (ISR) is a
behavioural treatment for chronic insomnia that involves repeatedly falling asleep and
waking up shortly thereafter over the course of one overnight session 1,2. Additionally, brief
daytime sleeps such as power naps or sleep diagnostic tests like the Multiple Sleep Latency
Test (MSLT) involve achieving a precise amount of sleep 3,4. These three purposes require
the accurate detection of sleep onset so that the individual can be awoken after the
appropriate duration of sleep. Yet, the accurate estimation of sleep onset in the home
environment is difficult, with the accuracy of popular actigraphy-based wearable devices
varying widely across individuals 5. This limits the translation of these purposes beyond the
sleep laboratory. The current article investigated the accuracy of a new wearable device for
estimating SOL, which may be used to implement these purposes outside the laboratory
setting.
THIM is a new consumer sleep device developed by Re-Time Pty. Ltd. that is worn
like a ring 6. To estimate SOL, THIM administers brief, low intensity vibrations at intervals
averaging 30 seconds apart. The individual is required to respond to the vibrations by
tapping their finger. When the individual does not respond to two consecutive vibrations, the
device infers that they have fallen asleep. Thus, the device can estimate sleep onset in real
time shortly after it occurs. THIM can also be programmed to wake the individual after a pre-
specified duration of sleep. THIM was designed to administer ISR and may be capable of
administering power naps and daytime diagnostic tests (e.g. the MSLT) outside of the
laboratory setting, without the need for expensive equipment or trained individuals to set up,
administer or score the data. However, the accuracy of THIM for estimating sleep onset is
currently unknown and must be tested to ensure that it can conduct these applications
appropriately.
THIM uses the stimulus-response method to estimate sleep onset. The scoring
criteria for polysomnography (PSG) were developed in part by examining
electroencephalography (EEG) changes that occur with the cessation of behavioural
responses to external stimuli 7,8. Hence, this behavioural method of estimating sleep onset
corresponds highly with PSG-defined sleep onset, with responses to stimuli typically ceasing
between late-N1 sleep and N2-sleep onset 9,10.
Whilst similar devices using the stimulus-response method are accurate for
estimating SOL, THIM differs from previously tested devices in ways that may affect its
accuracy. Devices tested in previous research have typically administered auditory stimuli
perceived through the auditory perception pathway 13, whereas vibratory stimuli emitted from
THIM are perceived through the somatosensory system 14,15. Whether these pathways show
similar inhibition across the sleep onset period is currently unknown. MacLean and
colleagues (16) tested the discrepancy between PSG-sleep onset and behavioural responses
(depression of a switch) to a hand-held device that administered vibratory stimuli. The
authors found no significant differences between PSG and the hand-held device for
estimating SOL. However, the vibratory stimuli were not calibrated to a minimally perceptible
level: the vibrations were delivered at five standard deviations above each participant’s waking
threshold. Therefore, responsiveness to minimal intensity tactile stimuli - as utilised by THIM
- during the sleep onset period is yet to be tested.
A potential, currently untested limitation of devices that use the stimulus-response
method is the effect of learning on the device’s accuracy. When using THIM, finger tap
responses are elicited frequently in response to vibratory stimuli. Over repeated use, the
finger taps may become an automatic response to stimuli that the individual could produce
without conscious awareness of the stimuli occurring. Under classical conditioning theory,
the finger tap response would become a conditioned response to the vibratory stimuli after
many paired repetitions over time. This would be problematic if the conditioned finger tap
response could occur during deeper stages of sleep, potentially causing THIM to
increasingly overestimate SOL with repeated use.
The current article summarises the development of the THIM device for estimating
SOL in comparison to the gold standard objective measure of sleep, PSG. Two studies will
be presented. The aim of the first study was to test the accuracy of the initial THIM algorithm
for estimating SOL with healthy individuals. The findings informed modifications to the
algorithm, with the aim of the second study to assess the accuracy of the revised THIM
algorithm with a larger independent sample. We also conducted secondary analyses to
determine whether the accuracy of THIM is affected by previous use - indicative of potential
learning effects. Additionally, we examined whether the accuracy of THIM varies between
individuals with good or poor sleep, with a sample that represented the variability of sleep
patterns found in the general population.
Study 1: Method
Participants
Ethics approval was obtained from the Flinders University Social and Behavioural
Research Ethics Committee, South Australia. Potential participants were recruited via
advertisements on community noticeboards and social media. Eligibility criteria were as
follows: self-reported average habitual bedtime between 22:00-00:00 and wake up time
between 06:00-08:00; fluent in English; no diagnosis of a physical or mental health condition;
no active nicotine or illicit substance use, or alcohol (>10 standard drinks p/wk) or caffeine
(>250 mg p/day) dependence; no consumption of medications known to interfere with sleep;
no overnight shift work or trans-meridian travel within the last two months; not pregnant or
lactating. Screening questionnaires comprised the Insomnia Severity Index ([ISI] 17) and
the Pittsburgh Sleep Quality Index ([PSQI] 18) to assess sleep schedules and insomnia
symptomology, as well as a health and lifestyle questionnaire to assess physical and mental
health conditions, medication use, caffeine/alcohol/nicotine consumption, and recent
overseas travel.
Thirteen healthy individuals met eligibility criteria, but one participant withdrew after
participating in Night 1. The final sample comprised twelve individuals; see Table 1 for
participant characteristic information. Scores on the ISI indicated that five participants had
subthreshold levels of insomnia and were categorised as poor sleepers (ISI score ≥ 7), and
seven were good sleepers (ISI score < 7).
<insert Table 1 here>.
Materials
Polysomnography. PSG was recorded using Compumedics Grael 4K PSG:EEG
devices (Compumedics, Victoria, Australia). Six EEG (F3-M2, F4-M1, C3-M2, C4-M1, O1-
M2, O2-M1), reference and ground, right and left electrooculography (EOG), chin
electromyography (EMG), and electrocardiography (ECG) sites were sampled at 256Hz.
PSG data was scored using Profusion Compumedics software (v4) by a qualified,
independent sleep technician. In accordance with AASM scoring criteria 19, PSG-SOL was
defined as the time between the start of the attempt to sleep (beginning of the sleep onset
trial) and the first epoch of any stage of sleep during the trial (most commonly, the beginning
of N1 sleep).
THIM. THIM (firmware v1.0.3) is a small, ring-like device worn on the index finger of
the dominant hand. THIM comes with four different-sized ring bands so that the device can
fit securely onto index fingers of almost all sizes. To set up THIM, the device was connected
via Bluetooth to the accompanying smartphone application (v1.0.1) using an Apple iPhone
5s model (iOS 8.0). Participants started a sleep onset trial by tapping their index finger on
which THIM was placed onto their thumb, twice in quick succession (see Figure 1). During
the trials, the device emitted low intensity, short duration vibratory stimuli at non-regular
intervals (averaging 30 seconds apart). The intensity of the vibrations was individually
calibrated to the minimum level that the participant could consistently respond to whilst
awake using the threshold hunting procedure outlined in the THIM smartphone application.
Participants were required to respond to the vibratory stimuli by tapping their index finger
once onto their thumb, with responses detected by the device’s accelerometer. If participants
failed to respond to two consecutive vibratory stimuli, the device inferred that sleep onset
had occurred and it emitted a high intensity alarm vibration to wake them up, signalling the
end of the trial. Shortly afterwards (approximately 1-2 minutes later), participants attempted
another trial. THIM’s estimate of SOL is the time from the beginning of the trial to slightly
before the time of the first of the two consecutively missed vibratory stimuli.
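The detection rule described above can be illustrated with a minimal Python sketch. It is not the THIM firmware: the function name, the data layout and the example stimulus times are assumptions made for illustration. The size of the offset implied by "slightly before" the first missed stimulus is not specified, so the sketch simply returns the time of that stimulus.

```python
from typing import Optional, Sequence, Tuple


def estimate_sleep_onset(
    stimuli: Sequence[Tuple[float, bool]],
    misses_required: int = 2,
) -> Optional[float]:
    """Return the time (s from trial start) of the first stimulus in the
    first run of `misses_required` consecutive missed stimuli.

    `stimuli` holds (time_s, responded) pairs in chronological order.
    Returns None if no such run occurs, i.e. sleep onset is not inferred.
    """
    run_start = None   # time of the first miss in the current run
    run_length = 0     # number of consecutive misses so far
    for time_s, responded in stimuli:
        if responded:
            # Any valid response resets the run of misses.
            run_start, run_length = None, 0
            continue
        if run_length == 0:
            run_start = time_s
        run_length += 1
        if run_length >= misses_required:
            return run_start
    return None


# Hypothetical trial: irregular intervals averaging roughly 30 s.
example_trial = [
    (28.0, True), (61.0, True), (88.0, False),
    (121.0, True), (149.0, False), (182.0, False),
]
onset_s = estimate_sleep_onset(example_trial)
```

In this hypothetical trial the isolated miss at 88 s is ignored because the next stimulus is answered, and sleep onset is inferred only at the pair of misses beginning at 149 s.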
<insert Figure 1 here>.
To monitor THIM, we mounted a small piezo-electric sensor to the side of the THIM
device using adhesive tape. The output of this sensor was fed into a channel on the PSG device.
From this sensor, we observed four events of interest: vibrations emitted from THIM, finger
taps as responses to the vibrations, as well as the beginning (the double-tap motion) and
end (the high-intensity alarm vibration) of each trial. These four events were scored manually
on the Profusion Compumedics software by two scorers (HS and AW). If the events of
interest on the sensor data were obscured by body movements, the trial was removed from
analysis. The sensor data allowed the PSG and THIM data to be precisely time-locked,
reducing error of measurement. The interrater reliability on 10 randomly selected nights of
data exceeded 95% agreement between the two scorers.
Procedure
Home Testing. Participants completed a sleep diary based on the Consensus Sleep
Diary 20 and wore an actigraphy device (Actiwatch-2, Philips Respironics) every day for one
week to monitor their sleep pattern prior to the first laboratory night. Participants’ average
bedtimes and wake up times were calculated from the sleep diary to inform the timing of the
study protocol. The actigraphy data corroborated the bedtimes and wake up times reported
in the sleep diaries.
Laboratory Night 1. The first night was an adaptation night to help participants
become accustomed to sleeping in the laboratory environment with the sleep monitoring
equipment. Participants went to bed at their typical bedtime and slept overnight whilst
monitored by PSG and THIM. They were awoken at their typical wake up time when both
devices were removed, and participants left the sleep laboratory. Participants continued to
wear the Actiwatch-2 device during the subsequent day to confirm that they did not nap prior
to Night 2.
Laboratory Night 2. Participants arrived at the sleep laboratory at approximately
20:00 and were set up for overnight PSG recording. The THIM device was placed on the
participant’s index finger on their dominant hand along with a piezo-electric sensor secured
to the side of the device. After setting the vibratory stimulus intensity, participants received
instructions from research assistants on how to operate THIM. See the Supplementary
Materials for this procedure.
THIM-administered sleep onset trials began an hour prior to a participant's bedtime
and continued for four hours in total (until 3 hours past habitual bedtime).
Compliance was confirmed by qualified research assistants observing participants via video
recording and the THIM sensor data in real-time. Once THIM determined sleep onset during
the final trial, instead of emitting a high intensity alarm vibration, the device let them sleep
uninterrupted until they spontaneously awoke in the morning. All devices except the
Actiwatch-2 device were removed and participants returned home.
Home Testing. Between Night 2 and Night 3, participants completed Sleep Diaries
and wore the Actiwatch-2 device every day for another week.
Laboratory Night 3. Participants returned to the sleep laboratory to undergo the
same testing protocol as experienced on Laboratory Night 2.
Data Analysis
The mean PSG and THIM estimations of SOL were compared separately for Nights 2
and 3. Cohen’s d was calculated as the mean difference in PSG and THIM estimations of
SOL divided by the pooled standard deviation. The mean discrepancies between PSG and
THIM were calculated for each individual separately. Then, these individual means were
averaged together for each night so that each individual contributed equal weighting to the
overall mean. Positive mean discrepancy values meant that THIM overestimated SOL,
whereas negative values indicated that THIM underestimated SOL compared to PSG.
Paired samples t-tests were then conducted to test whether THIM significantly
underestimated or overestimated SOL compared to PSG, separately for both laboratory
nights. Additionally, the degree of correspondence between PSG and THIM was calculated
across all sleep onset trials using Spearman’s rank correlation coefficients, separately for
Nights 2 and 3.
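As a worked illustration of the two calculations described above, the following Python sketch computes Cohen's d from summary statistics and an equally weighted mean discrepancy. The function names and the per-participant values are hypothetical. The pooled standard deviation is written for equal group sizes, which holds for paired measurements, and discrepancies are taken as THIM-SOL minus PSG-SOL so that positive values indicate overestimation by THIM.

```python
import math
from statistics import fmean


def cohens_d(mean_psg, sd_psg, mean_thim, sd_thim):
    """Mean difference divided by the pooled SD (equal group sizes assumed)."""
    pooled_sd = math.sqrt((sd_psg ** 2 + sd_thim ** 2) / 2)
    return (mean_thim - mean_psg) / pooled_sd


def equally_weighted_mean_discrepancy(discrepancies_by_participant):
    """Average each participant's trials first, then average those means,
    so that participants with more trials do not dominate the result."""
    participant_means = [fmean(d) for d in discrepancies_by_participant.values()]
    return fmean(participant_means)


# Hypothetical trial-level discrepancies in minutes (THIM-SOL minus PSG-SOL).
example = {"P01": [0.5, -0.5, 1.0], "P02": [0.0, 0.2]}
overall = equally_weighted_mean_discrepancy(example)

# Summary statistics as reported in the abstract for Study 1 (minutes).
d_study1 = cohens_d(1.94, 1.32, 2.05, 1.38)
```

With the Study 1 summary statistics quoted in the abstract (PSG: M = 1.94, SD = 1.32; THIM: M = 2.05, SD = 1.38), this formula gives a d of approximately 0.08.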
The level of agreement between PSG and THIM was assessed with Bland-Altman
plots, which show the discrepancy between PSG and THIM-SOL (y axis) against PSG-SOL
(x axis) across all trials on each night 21. This involved calculating the mean difference (bias)
and the limits of agreement (± 1.96 SD of the mean difference) between these measures.
Upper and lower limits of agreement within ± five minutes of PSG were considered
acceptable, as previously defined as an acceptable criterion for the administration of
Intensive Sleep Retraining with a wearable device 5. The r squared value for the linear
regression line and coefficient p value are reported in the Bland-Altman plot figures, as an
indicator of the degree of proportional bias 21. Some datapoints represent many overlapping
values.
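The bias and limits of agreement described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code used in the study: the function name and example data are hypothetical, the sample standard deviation is used, and differences are taken as THIM-SOL minus PSG-SOL.

```python
from statistics import fmean, stdev


def bland_altman(psg_sol, thim_sol, criterion_min=5.0):
    """Return (bias, lower_limit, upper_limit, acceptable) in minutes.

    Limits of agreement are bias +/- 1.96 SD of the differences.
    `acceptable` is True when both limits fall within +/- criterion_min.
    """
    if len(psg_sol) != len(thim_sol):
        raise ValueError("PSG and THIM series must have the same length")
    differences = [t - p for p, t in zip(psg_sol, thim_sol)]
    bias = fmean(differences)
    half_width = 1.96 * stdev(differences)
    lower, upper = bias - half_width, bias + half_width
    acceptable = lower >= -criterion_min and upper <= criterion_min
    return bias, lower, upper, acceptable


# Hypothetical sleep onset latencies in minutes for four trials.
psg = [1.0, 2.0, 3.0, 4.0]
thim = [1.5, 2.0, 3.5, 4.0]
bias, lower, upper, acceptable = bland_altman(psg, thim)
```

The five-minute default mirrors the acceptability criterion stated above; a different application could pass a stricter or looser value for `criterion_min`.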
To examine differences in the accuracy of THIM after repeated use which may
indicate a learning effect, a paired samples t-test was conducted to compare the
discrepancies between PSG and THIM-SOL on Night 2 versus Night 3. Additionally, paired
samples t-tests were conducted to compare differences in the discrepancy between PSG
and THIM-SOL on Night 2 versus Night 3 for each trial (e.g. on the first, second, third trial,
etc.). To examine the impact of participants’ sleep quality on the accuracy of THIM, an
independent samples t-test was conducted to determine whether the discrepancy between
PSG and THIM differed between good or poor sleepers, separately for Night 2 and Night 3.
Study 1: Results
First Sleep Onset Trial Night
On laboratory Night 2, there was no significant difference between the mean PSG-
SOL (M = 1.94 min, SD = 1.32) and mean THIM-SOL (M = 2.05 min, SD = 1.38), t(11) = -
0.88, p = .40, d = .08. The mean discrepancy between PSG and THIM-SOL on this night
was low, M = 0.08 min, SD = 0.49. There was also a significant moderate correlation
between PSG and THIM-SOL across all sleep onset trials, r(s) = .67, p < .001.
The level of agreement between PSG and THIM-SOL on Night 2 is illustrated in
Figure 2. As shown by the narrow limits of agreement, there is little variability in the
discrepancy between PSG and THIM-SOL across the 411 trials. Furthermore, the
discrepancy between PSG and THIM is consistent across trials with increasing latency
duration, as indicated by the blue trendline. Of note are data points above the upper limit of
agreement that seem to depict trials where participants were responding to THIM’s vibratory
stimuli for 5+ mins into PSG-sleep. Closer inspection of these trials revealed that participants
did not remain asleep after the first epoch of PSG-sleep in these trials: participants were
fluctuating between wake and N1 sleep during this time.
<Insert Figure 2 here>.
Second Sleep Onset Trial Night
There was no significant difference between mean PSG-SOL (M = 1.40 min, SD =
0.64) and mean THIM-SOL (M = 2.12 min, SD = 1.71) on laboratory Night 3, t(11) = -2.02, p
= .07. Despite a medium effect size, d = 0.56, the mean discrepancy between PSG and
THIM-SOL on this night was still relatively low, M = 0.57 min, SD = 1.10. Additionally, there
was a significant moderate correlation between PSG and THIM-SOL across all sleep onset
trials, r(s) = .57, p < .001.
Figure 3 is a Bland-Altman plot illustrating the level of agreement between PSG and
THIM-SOL across all Night 3 trials. Similar to Figure 2, the variability in the discrepancy
between PSG and THIM-SOL across 527 trials is low. Figure 3 also shows trials where
participants were responding to THIM’s vibratory stimuli whilst fluctuating between wake and
N1 sleep (points above the upper limit of agreement).
<Insert Figure 3 here>.
Learning Effects
A paired samples t-test indicated that there was no significant difference in the mean
discrepancy between PSG and THIM-SOL on Night 2 compared to Night 3, t(11) = -1.90, p =
.08. There was a medium effect size, d = 0.57. Paired samples t-tests revealed no significant
differences in the discrepancy between PSG and THIM on Night 2 versus Night 3 for any
trial (e.g. on the first, second, third trial, etc.), p > .10. The accuracy of THIM compared to
PSG appears to remain high and does not significantly decrease, even after repeated use.
Good and Poor Sleeper Comparison
An independent samples t-test revealed that there was no significant difference in the
mean discrepancy between PSG and THIM-SOL on Night 2 for good sleepers (M = 0.06
min, SD = 0.44) compared to poor sleepers (M = 0.09 min, SD = 0.60), t(10) = -0.11, p = .92,
d = 0.08. Similarly, there was no significant difference in the mean discrepancy on Night 3
between good sleepers (M = 0.34 min, SD = 0.21) and poor sleepers (M = 0.88 min, SD =
1.75), t(4.08) = -0.68, p = .53, although there was a medium effect size, d = 0.48. Therefore,
the accuracy of THIM does not appear to differ between good and poor sleepers.
THIM False Positive Trials
Due to a slight delay between THIM-sleep onset and the end of the trial, there were
some occasions where THIM underestimated sleep onset but PSG-sleep onset was reached
before THIM ended the trial, as shown in Figures 2 and 3. However, it became apparent that
there was a considerable proportion of sleep onset trials during which PSG-sleep onset had
not occurred before THIM estimated sleep onset and ended the trial. Because a PSG-SOL
datapoint was unavailable for those trials, and it could not be predicted, they were excluded
from the above analyses. On Night 2, PSG-sleep onset had not occurred in an average of
15.42 trials per participant (SD = 16.22; 31.04% of Night 2 trials) in which THIM
had detected sleep onset. Similarly, there was an average of 8.92 ‘false positive’ trials (SD =
9.82, 16.88%) per participant on Night 3. There was no significant difference between Nights
2 and 3 on the number of false positive trials, t(11) = 1.47, p = .17, d = 0.49.
There are several possible reasons for the THIM determination of sleep onset when
participants were still awake according to PSG. One potential explanation is that participants
did not respond to the vibratory stimulus because they did not perceive it. However, this was
not the case for the majority of these false positive trials. Participants did not respond to
either of the last two consecutive vibratory stimuli on only 28.42% of these false positive trials on
Night 2 and 42.00% of these trials on Night 3. In the remaining false positive trials, participants had
responded to one or both of the last two consecutive vibratory stimuli before the trial ended,
but the device had not registered the response. This was true for the majority of false
positive trials on both Night 2 (71.58%) and Night 3 (58.00%).
To register as a legitimate response to vibratory stimuli, finger tap responses had to
meet timing and intensity criteria. In order to exclude any spontaneous, random finger
twitches, a time window following the stimulus was established during which the response
had to occur to meet the valid response criterion. Of the responses that THIM failed to
detect, 42.02% on Night 2 and 48.77% on Night 3 occurred just beyond the time window. Therefore,
a majority of the undetected finger tap responses on Night 2 and Night 3 occurred within the required
time window yet were not registered by THIM. This is presumably because the finger taps
were not vigorous enough to exceed the accelerometer threshold criterion required to
register as a legitimate response.
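The two response criteria described above, a time window and an accelerometer intensity threshold, can be pictured with the following Python sketch. The actual THIM criteria are not reported here, so every name and numeric value below is hypothetical and serves only to show how a genuine tap can go unregistered for either reason.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseCriteria:
    window_s: float       # hypothetical: latest allowed delay after a stimulus
    min_intensity: float  # hypothetical: accelerometer threshold (arbitrary units)


def classify_tap(delay_s: float, intensity: float, criteria: ResponseCriteria) -> str:
    """Classify a finger tap relative to the preceding vibratory stimulus."""
    if delay_s < 0 or delay_s > criteria.window_s:
        return "outside_window"   # too late (or before the stimulus)
    if intensity < criteria.min_intensity:
        return "too_weak"         # in time, but below the intensity threshold
    return "registered"


# Hypothetical criteria and taps, not values taken from the device.
criteria = ResponseCriteria(window_s=3.0, min_intensity=1.0)
in_time_and_firm = classify_tap(1.2, 1.5, criteria)
slightly_late = classify_tap(3.4, 1.5, criteria)
in_time_but_gentle = classify_tap(1.2, 0.6, criteria)
```

Under this framing, widening `window_s` would recover taps that arrive slightly late, and lowering `min_intensity` would recover gentle taps, at the cost of admitting more spontaneous finger movements.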
Study 1: Discussion
The aim of Study 1 was to test the accuracy of THIM for estimating SOL against
PSG. Overall, there was moderate agreement between THIM and PSG, regardless of
sleeper type (good or poor sleeper status) and repeated use (Night 2 versus Night 3).
Having said this, THIM had estimated sleep onset and prematurely ended the trial before
PSG-sleep onset criteria were met on a considerable number of trials. This is an issue for
two reasons. Firstly, we needed to exclude these trials from analysis: 23.74% of trials across
Night 2 and Night 3. This undermined our ability to make strong conclusions about the
accuracy of THIM. Secondly, this issue is problematic for the administration of many
functions, including ISR. If THIM determined that the patient had fallen asleep and ended the
trial when they were still awake, then the trial would be a wasted retraining opportunity as
presumably, sleep onset must occur during the trial to obtain therapeutic benefit.
Consequently, we made recommendations to the manufacturers of THIM, Re-Time
Pty. Ltd., about potential modifications to the THIM algorithm. The recommendations
included reducing the threshold accelerometer intensity required for a legitimate finger tap
and expanding the time window during which such a response could occur to include the full
distribution of reaction times to the vibratory stimuli observed in Study 1. The company
incorporated these modifications into a revised algorithm, which we tested in the second
study to determine whether the issue had been resolved.
Study 2: Method
The study design, materials, study protocol, and data analysis plan of the second
study were identical to the first study, except that we tested the revised version of THIM
(firmware v1.0.4) with a larger, independent sample.
Participants
Participants of the second study were required to meet the same eligibility criteria as
participants in the first study. Twenty healthy individuals met eligibility criteria and consented
to participate. ISI scores at screening indicated that ten participants had subthreshold levels
of insomnia and were categorised as poor sleepers (ISI score ≥ 7), and ten were good
sleepers (ISI score < 7). See Table 1 for participant characteristic information and a
comparison between the Study 1 and Study 2 samples. There were no significant
differences on the participant characteristics between the two samples.
Study 2: Results
First Sleep Onset Trial Night
One PSG recording failed due to technical error and thus, this night’s data are
based on only 19 participants. With the revised THIM algorithm, there was still no significant
difference between PSG (M = 3.41 min, SD = 2.21) and THIM estimations of mean SOL (M
= 3.65 min, SD = 2.18) on laboratory Night 2, t(18) = -1.18, p = .25, d = 0.11. There was a
small mean discrepancy between the two measures, M = 0.24 min, SD = 0.90. There was
also a significant strong correlation between PSG and THIM-SOL across all sleep onset
trials, r(s) = .77, p < .001. As shown in Figure 4, there was strong agreement between PSG
and THIM-SOL across 535 trials.
<Insert Figure 4 here>.
Second Sleep Onset Trial Night
Unlike Night 2, on Night 3 there was a significant difference between PSG (M = 3.93
min, SD = 3.32) and THIM-SOL (M = 4.75 min, SD = 3.85). THIM significantly overestimated
SOL compared to PSG, t(19) = -2.78, p = .01, d = 0.23. However, the effect size and mean
discrepancy between PSG and THIM were still low, M = 0.82 min, SD = 1.31. Additionally,
there was a significant strong correlation between PSG and THIM-SOL across all sleep
onset trials, r(s) = .73, p < .001. Figure 5 continued to show strong agreement between PSG
and THIM across 578 trials, as evidenced by the narrow limits of agreement.
<Insert Figure 5 here>.
Comparison between THIM algorithms
The goal of revising the THIM algorithm was to reduce the number of THIM false
positive trials. With the revised algorithm, there was a mean of 4.05 false positive trials (SD
= 3.76) per participant on Night 2 and 2.53 trials (SD = 2.09) per participant on Night 3, or
10.24% of trials overall. We conducted an independent samples t-test to determine whether
the issue occurred in fewer trials with the revised THIM algorithm compared to the original
algorithm. There was a significantly lower number of false positive trials with the revised
algorithm compared to the original algorithm on Night 2, t(11.75) = 2.39, p = .04, and Night
3, t(11.57) = 2.24, p = .046. The effect sizes were large for Night 2, d = 1.09, and Night 3, d
= 1.04. Considering that the issue occurred in a smaller minority of trials in Study 2, it
appears that the modifications made to the THIM algorithm improved this issue without
substantially increasing the mean discrepancy between THIM and PSG, though the issue
was not entirely resolved.
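The fractional degrees of freedom reported above suggest that the independent samples t-tests used a correction for unequal variances (Welch's test). That is an inference rather than something stated in the analysis plan. Under that assumption, the Night 2 comparison can be sketched from summary statistics in Python. The function name is illustrative, and the group sizes (12 for the original algorithm, 19 for the revised algorithm on Night 2) are taken from the participant numbers reported for each study.

```python
import math


def welch_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard deviations and sample sizes."""
    var_of_mean_a = sd_a ** 2 / n_a
    var_of_mean_b = sd_b ** 2 / n_b
    t = (mean_a - mean_b) / math.sqrt(var_of_mean_a + var_of_mean_b)
    df = (var_of_mean_a + var_of_mean_b) ** 2 / (
        var_of_mean_a ** 2 / (n_a - 1) + var_of_mean_b ** 2 / (n_b - 1)
    )
    return t, df


# False positive trials per participant on Night 2:
# original algorithm (Study 1) versus revised algorithm (Study 2).
t_night2, df_night2 = welch_t_from_summary(15.42, 16.22, 12, 4.05, 3.76, 19)
```

With these inputs the sketch yields a t of about 2.39 on about 11.75 degrees of freedom, in line with the Night 2 values reported above.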
We also conducted independent samples t-tests to determine whether the revised
THIM algorithm (Study 2) had a lower mean discrepancy for estimating SOL than the original
algorithm (Study 1) on Night 2 and Night 3. There were no significant differences in the
mean discrepancy between the original and revised algorithms on Night 2, t(30) =
-1.73, p = .10, or Night 3, t(30) = -0.68, p = .50.
Learning Effects
As in Study 1, there was no significant difference between the mean discrepancy of
PSG and THIM-SOL on Night 2 compared to Night 3, t(18) = -1.84, p = .08, although there
was a medium effect size, d = 0.51. Additional paired samples t-tests revealed no significant
differences in the discrepancy between PSG and THIM on Night 2 versus Night 3 on any
trial, p > .13. Therefore, the accuracy of THIM does not appear to significantly reduce after
repeated use.
Good and Poor Sleeper Comparison
An independent samples t-test showed no significant difference in the mean
discrepancy between PSG and THIM-SOL on Night 2 for good sleepers (M = 0.45 min, SD =
0.88) compared to poor sleepers (M = 0.55 min, SD = 0.68), t(17) = -0.28, p = .78, d = 0.13.
Similarly, there was no significant difference in the mean discrepancy on Night 3 between
good sleepers (M = 0.89 min, SD = 1.65) and poor sleepers (M = 0.87 min, SD = 1.06), t(18)
= 0.03, p = .98, d = 0.01. This is further evidence to suggest that sleeper type does not affect
the accuracy of THIM.
Study 2: Discussion
The aims of both studies were to assess the accuracy of THIM for estimating SOL
compared to PSG. Study 1 tested the original THIM algorithm and Study 2 tested a THIM
algorithm that was modified based on the findings from Study 1. In Study 2, THIM-SOL
showed strong correspondence and agreement with PSG sleep onset (evidenced by the
correlations, mean discrepancy tests and Bland-Altman plots), for both good and poor
sleepers and even after repeated use (Night 2 compared to Night 3). The revised THIM
algorithm also improved an issue found in Study 1 where THIM estimated that sleep onset
had occurred in trials where PSG-sleep onset criteria were not yet met. Whilst this issue still
occurred on approximately 10% of sleep onset trials, this is not thought to be a substantial
issue that would impact the use of THIM for the device’s main purpose of administering
Intensive Sleep Retraining. The Intensive Sleep Retraining procedure involves 30-40 sleep
onsets over the course of treatment, and if only 3-4 trials are unsuccessful as anticipated
from the findings of this study, then individuals should still experience many successful sleep
onset trials across the treatment session. Additionally, the low degree of overestimation of
sleep onset latency means that when THIM wakens the individual, they are unlikely to have
reduced homeostatic sleep drive enough to impact the subsequent sleep onset attempt and
the efficacy of the treatment. Therefore, the revised algorithm was an improvement upon the
original algorithm, and the device appears accurate enough at estimating SOL for the
purpose of administering Intensive Sleep Retraining. Future research would need to test
whether the device is accurate enough for reliably administering power naps and the MSLT,
noting that the number of false positive trials may be an issue for these purposes.
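The expectation of 3-4 unsuccessful trials out of 30-40 follows from the approximate 10% false positive rate. A minimal sketch of that arithmetic is given below, under the simplifying assumptions (not tested in this study) that trials are independent and that the false positive rate is constant across trials. The function names and the cut-off of six false positives are hypothetical and chosen only for illustration.

```python
from math import comb

def expected_false_positives(n_trials, p_false):
    """Expected number of false positive trials under a binomial model."""
    return n_trials * p_false

def prob_at_most(k, n_trials, p_false):
    """P(X <= k) for X ~ Binomial(n_trials, p_false)."""
    return sum(
        comb(n_trials, i) * p_false ** i * (1 - p_false) ** (n_trials - i)
        for i in range(k + 1)
    )

# Assumed inputs: 30 or 40 sleep onset trials, 10% false positive rate
for n in (30, 40):
    expected = expected_false_positives(n, 0.10)
    p_six_or_fewer = prob_at_most(6, n, 0.10)
    print(f"{n} trials: expected {expected:.1f} false positives, "
          f"P(6 or fewer) = {p_six_or_fewer:.3f}")
```

For purposes such as the MSLT, where only a handful of trials are administered, the same model could be used to gauge how often at least one trial would be affected.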
THIM had considerably closer agreement with PSG sleep onset compared to other
wearable devices 22,23. The next generation of actigraphy devices that incorporate information
from additional sensors, such as heart rate variability, appear to have greater accuracy
compared to standard actigraphy devices 24,25. However, THIM shows greater agreement
with PSG for estimating SOL than these multi-sensor devices: an underestimation of 7.48
minutes (SD = 6.64) was found in Fonseca et al.25 and a mean bias of 4 minutes (SD = 9)
was found in de Zambotti et al.24 (see Scott et al. 5 for a review). In fact, THIM produced
comparable accuracy to simplified EEG-based devices 26-28.
THIM also showed closer agreement with PSG sleep onset than similar devices that
use the stimulus-response method of sleep onset estimation 10,11. This may be due to
differences in the stimulus type. THIM uses vibratory stimuli, which are perceived via a
different sensory processing pathway compared to the auditory stimuli utilised by similar
devices 13,15. It was evident from the piezo-electric motion sensor data collected during Study
2 that once participants entered PSG-defined sleep, they ceased responding to the vibratory
stimuli. This suggests that participants either a) did not perceive the vibratory stimulus and
remained totally asleep, or b) stirred from sleep slightly, but the vibratory
stimulus was not salient enough to arouse them sufficiently to produce a finger tap
response. A quantitative EEG analysis comparing brainwave activity before and after a
vibratory stimulus would shed light on whether participants aroused at all to vibratory stimuli
during PSG-defined sleep, which may also elucidate how disruptive the stimulus and finger
tap responses are to the process of falling asleep. Regardless, it appears that the type of
stimulus to which participants respond may impact the accuracy of stimulus-response
devices. Future research could directly compare the use of different types of low intensity
stimuli to determine when each sensory system is inhibited during the sleep onset period.
While this study evaluated the accuracy of THIM for estimating sleep onset latency
across more than 2,000 sleep onset trials, the relatively small sample size provided limited
statistical power for detecting between-subject effects. Another limitation to consider is that
the PSG data was scored by only one qualified sleep technician in the current study. The
interrater reliability of N1-sleep onset in particular is low at approximately 68% and 74% for
the epochs before and at sleep onset, respectively 29. This adds to the error of measurement
in the gold-standard measure that should be considered when interpreting the findings of the
current study. Furthermore, although investigating THIM over two nights is a strength of this
investigation, it is possible that observation over additional nights is necessary to detect
learning effects. This would be important if individuals use the device frequently, such as for
power naps. The use of THIM over more than two occasions should therefore be
investigated in future research to explore its utility for frequent power napping.
Investigating the accuracy of THIM for individuals with insomnia is particularly
important for the administration of ISR because the device may be less accurate with this
population. In line with the neurocognitive model of insomnia 30, individuals with insomnia
may have abnormally sensitive/acute sensory and information processing during the sleep
onset period. Increased sensory responsivity may mean that people with insomnia perceive
vibratory stimuli beyond N1-sleep onset more so than average sleepers. Consequently,
THIM may overestimate SOL to a greater extent for those with insomnia compared to good
sleepers. The current studies did not include individuals with insomnia, but there was no
significant difference in the accuracy of THIM between good and poor sleepers. However,
neither of the two studies presented were adequately powered to detect small differences
between groups that may be relevant. Furthermore, insomnia-related arousal may not be
present for those identified as having poor sleep: this conditioned arousal is theorised to
develop over time 31, whereas poor sleep in general may be episodic in nature 32.
Additionally, ‘poor sleepers’ in this study could have also included those with sleep
maintenance and early morning awakening nocturnal symptoms and not necessarily
individuals with sleep initiation difficulties, which are germane to the clinical utility of THIM for
administering Intensive Sleep Retraining. Therefore, the accuracy of THIM should be
investigated with individuals with sleep onset insomnia specifically in future research.
Additional future research should be conducted to explore the clinical utility of THIM
in other sleep-disordered populations. Whether individuals will comply adequately with the
instructions of tapping in response to THIM’s vibrations, a necessity for the device to
estimate sleep onset appropriately, will need to be investigated in future research (see Lack
et al. 33 for further discussion). Whether THIM can reliably wake people from sleep with
the high intensity alarm vibration is also a topic for further investigation. This may be
particularly problematic for clinical uses with excessively sleepy patients who may be difficult
to wake, such as for administering MSLTs.
This article showcased the development of the THIM algorithm for estimating SOL in
comparison to PSG. The revised algorithm demonstrated strong correspondence and
agreement with PSG, with a considerably lower percentage of false positive trials.
Additionally, repeated use and sleeper type (good or poor sleeper) did not impact the
accuracy of THIM. More research is needed to investigate whether other individual
characteristics affect the accuracy of THIM, particularly a diagnosis of insomnia.
Acknowledgements: The authors would like to acknowledge the contributions of
study participants, Flinders University third-year Psychology placement students, and
Adelaide Institute for Sleep Health staff who assisted with data collection.
Abbreviations
AASM American Academy of Sleep Medicine
ECG electrocardiography
EEG electroencephalography
EMG electromyography
EOG electrooculography
ISI Insomnia Severity Index
ISR Intensive Sleep Retraining
M Mean
MSLT Multiple sleep latency test
N Sample size
N1 Non-rapid eye movement Stage 1
N2 Non-rapid eye movement Stage 2
PSG polysomnography
SD Standard deviation
SOL sleep onset latency
TST total sleep time
Reference List
1. Harris J, Lack L, Wright H, Gradisar M, Brooks A. Intensive Sleep Retraining
treatment for chronic primary insomnia: A preliminary investigation. J Sleep Res.
2007;16(3):276-284.
2. Harris J, Lack L, Kemp K, Wright H, Bootzin R. A randomized controlled trial of
intensive sleep retraining (ISR): A brief conditioning treatment for chronic insomnia.
Sleep. 2012;35(1):49-60.
3. Carskadon MA. Guidelines for the multiple sleep latency test (MSLT): A standard
measure of sleepiness. Sleep. 1986;9(4):519-524.
4. Lovato N, Lack L. The effects of napping on cognitive functioning. Prog Brain Res.
2010;185:155-166.
5. Scott H, Lack L, Lovato N. A systematic review of the accuracy of sleep wearable
devices for estimating sleep onset. Sleep Med Rev. 2019;49:101227.
6. Re-Time PL. Thim - the first wearable device for sleep, which will improve your sleep.
2016; http://thim.io/.
7. Dement W, Kleitman N. Cyclic variations in EEG during sleep and their relation to
eye movements, body motility, and dreaming. EEG Clin Neurophysiol. 1957;9(4):673-690.
8. Loomis AL, Harvey EN, Hobart G. Potential rhythms of the cerebral cortex during
sleep. Science. 1935;81(2111):597-598.
9. Ogilvie RD. The process of falling asleep. Sleep Med Rev. 2001;5(3):247-270.
10. Ogilvie RD, Wilkinson RT, Allison S. The detection of sleep onset: Behavioral,
physiological, and subjective convergence. Sleep. 1989;12(5):458-474.
11. Mair A. The relation between EEG and behavioural sleep onset. Adelaide, Australia,
Honours thesis submitted to Flinders University; 1994.
12. Scott H, Lack L, Lovato N. A pilot study of a novel smartphone application for the
estimation of sleep onset. J Sleep Res. 2018;27(1):90-97.
13. Cohen YE, Bennur S, Christison-Lagay K, Gifford AM, Tsunada J. Functional
organization of the ventral auditory pathway. Adv Exp Med Biol. 2016;894:381-388.
14. Abraira VE, Ginty DD. The sensory neurons of touch. Neuron. 2013;79(4):618-639.
15. Kaas JH. Somatosensory System. In: The Human Nervous System. 2012:1074-1109.
16. MacLean AW, Arnedt T, Biedermann H, Knowles JB. Behavioural responding as a
measure of sleep quality. Sleep Res. 1992;21:105.
17. Morin CM, Belleville G, Bélanger L, Ivers H. The Insomnia Severity Index:
Psychometric indicators to detect insomnia cases and evaluate treatment response.
Sleep. 2011;34(5):601-608.
18. Buysse DJ, Reynolds CF, Monk TH, Berman SR, Kupfer DJ. The Pittsburgh Sleep
Quality Index: A new instrument for psychiatric practice and research. Psychiatry
Res. 1989;28:193-213.
19. American Academy of Sleep Medicine. The AASM manual for the scoring of sleep
and associated events (version 2.5). American Academy of Sleep Medicine; 2018.
20. Carney CE, Buysse DJ, Ancoli-Israel S, et al. The consensus sleep diary:
Standardizing prospective sleep self-monitoring. Sleep. 2012;35(2):287-302.
21. Bland JM, Altman D. Statistical methods for assessing agreement between two
methods of clinical measurement. Lancet. 1986;327(8476):307-310.
22. Cellini N, Buman MP, McDevitt EA, Ricker AA, Mednick SC. Direct comparison of
two actigraphy devices with polysomnographically recorded naps in healthy young
adults. Chronobiol Int. 2013;30(5):691-698.
23. Chae KY, Kripke DF, Poceta JS, et al. Evaluation of immobility time for sleep latency
in actigraphy. Sleep Med. 2009;10(6):621-625.
24. de Zambotti M, Goldstone A, Claudatos S, Colrain IM, Baker FC. A validation study
of Fitbit Charge 2 compared with polysomnography in adults. Chronobiol Int.
2018;35(4):465-476.
25. Fonseca P, Weysen T, Goelema MS, et al. Validation of photoplethysmography-based
sleep staging compared with polysomnography in healthy middle-aged adults.
Sleep. 2017;40(7).
26. Cellini N, McDevitt EA, Ricker AA, Rowe KM, Mednick SC. Validation of an
automated wireless system for sleep monitoring during daytime naps. Beh Sleep
Med. 2015;13:157-168.
27. Kaplan RF, Wang Y, Loparo KA, Kelly MR, Bootzin RR. Performance evaluation of
an automated single-channel sleep-wake detection algorithm. Nature and Science of
Sleep. 2014;6:113-122.
28. Markwald RR, Bessman SC, Reini SA, Drummond SP. Performance of a portable
sleep monitoring device in individuals with high versus low sleep efficiency. J Clin
Sleep Med. 2016;12(1):95-103.
29. Rosenberg RS, Van Hout S. The American Academy of Sleep Medicine inter-scorer
reliability program: Sleep stage scoring. J Clin Sleep Med. 2013;9(1):81-87.
30. Perlis ML, Giles DE, Mendelson WB, Bootzin RR, Wyatt JK. Psychophysiological
insomnia: the behavioural model and a neurocognitive perspective. J Sleep Res.
1997;6(3):179-188.
31. Perlis M, Jungquist C, Smith MT, Posner D. The cognitive behavioral treatment of
insomnia: A treatment manual. Springer; 2005.
https://www.amazon.com/Cognitive-Behavioral-Treatment-Insomnia-Session-ebook/dp/B000PC6BGA.
32. Perlis ML, Vargas I, Ellis JG, et al. The natural history of Insomnia: The incidence of
acute Insomnia and subsequent progression to Chronic Insomnia or recovery in good
sleeper subjects. Sleep. 2019.
33. Lack L, Scott H, Micic G, Lovato N. Intensive Sleep Re-Training: From bench to
bedside. Brain Sci. 2017;7(4).
Figure Titles and Captions List
Figure 1. Illustration of the finger tap motion with the THIM device.
Figure 2. Bland-Altman plot indicating agreement between PSG and THIM-SOL on Night 2
for Study 1 data.
Caption for Figure 2: The solid black line indicates the mean difference, the dotted red lines
indicate the upper and lower limits of agreement and the dotted blue line is the linear
trendline. The R² value and p value represent the linear regression line as indicators of the
degree of proportional bias. Some datapoints represent many overlapping values.
Figure 3. Bland-Altman plot indicating agreement between PSG and THIM-SOL on Night 3
for Study 1 data.
Caption for Figure 3: The solid black line indicates the mean difference, the dotted red lines
indicate the upper and lower limits of agreement and the dotted blue line is the linear
trendline. The R² value and p value represent the linear regression line as indicators of the
degree of proportional bias. Some datapoints represent many overlapping values.
Figure 4. Bland-Altman plot indicating agreement between PSG and THIM-SOL on Night 2
for Study 2 data.
Caption for Figure 4: The solid black line indicates the mean difference, the dotted red lines
indicate the upper and lower limits of agreement and the dotted blue line is the linear
trendline. The R² value and p value represent the linear regression line as indicators of the
degree of proportional bias. Some datapoints represent many overlapping values.
Figure 5. Bland-Altman plot indicating agreement between PSG and THIM-SOL on Night 3
for Study 2 data.
Caption for Figure 5: The solid black line indicates the mean difference, the dotted red lines
indicate the upper and lower limits of agreement and the dotted blue line is the linear
trendline. The R² value and p value represent the linear regression line as indicators of the
degree of proportional bias. Some datapoints represent many overlapping values.
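The quantities named in the Bland-Altman captions above (mean difference, limits of agreement, and the linear trendline used to gauge proportional bias) can be sketched as follows. This is an illustrative sketch only: it assumes the conventional 95% limits of agreement (mean difference ± 1.96 SD) and an ordinary least squares regression of the differences on the pairwise means. The function name `bland_altman` and the example values are hypothetical and are not study data. The p value for the slope is omitted because it requires a t distribution routine beyond the standard library.

```python
import math

def bland_altman(reference, device):
    """Bias, 95% limits of agreement, and proportional-bias regression for paired data."""
    diffs = [d - r for r, d in zip(reference, device)]
    means = [(d + r) / 2 for r, d in zip(reference, device)]
    n = len(diffs)
    bias = sum(diffs) / n
    syy = sum((x - bias) ** 2 for x in diffs)
    sd = math.sqrt(syy / (n - 1))  # sample SD of the differences
    mean_of_means = sum(means) / n
    sxx = sum((m - mean_of_means) ** 2 for m in means)
    sxy = sum((m - mean_of_means) * (x - bias) for m, x in zip(means, diffs))
    # Regress differences on means; a non-zero slope suggests proportional bias
    slope = sxy / sxx if sxx > 0 else 0.0
    r_squared = sxy ** 2 / (sxx * syy) if sxx > 0 and syy > 0 else 0.0
    return {
        "bias": bias,
        "loa_lower": bias - 1.96 * sd,
        "loa_upper": bias + 1.96 * sd,
        "slope": slope,
        "r_squared": r_squared,
    }

# Made-up sleep onset latencies in minutes (reference first, device second)
result = bland_altman([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.5, 4.0])
print(result)
```

With differences defined as device minus reference, a positive bias indicates that the device overestimates sleep onset latency relative to the reference measure.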
Table 1. Descriptive characteristics for participants in Studies 1 and 2.
Characteristic | Study 1 (N = 12) | Study 2 (N = 20) | Study Comparison
Age, mean (SD), y | 24.9 (6.1) | 23.6 (4.9) | t(30) = 0.68, p = .50
Sex, No. (%) | | | χ²(1) = 1.66, p = .20
  Men | 3 (25) | 7 (35) |
  Women | 9 (75) | 13 (65) |
Weekly alcohol consumption, No. (SD) | 0.75 (0.97) | 1.60 (1.79) | t(29.80) = -1.51, p = .14
Daily caffeine consumption, No. (SD) | 1.29 (1.05) | 1.89 (1.47) | t(30) = -1.20, p = .24
Sleep characteristics | Study 1 good sleeper (N = 7) | Study 1 poor sleeper (N = 5) | Study 2 good sleeper (N = 10) | Study 2 poor sleeper (N = 10) | Study Comparison
ISI, mean (SD) | 2.14 (1.57) | 11.00 (3.39) | 2.00 (1.15) | 11.70 (3.86) | t(30) = -0.51, p = .62
PSQI, mean (SD) | 3.26 (1.50) | 7.40 (3.29) | 3.10 (1.73) | 8.30 (3.09) | t(30) = -0.56, p = .58
Habitual Bedtime, mean (SD), min | 22:38 (28.44) | 22:36 (31.64) | 22:45 (64.58) | 23:02 (68.41) | t(28.93) = -1.01, p = .32
Habitual Wake Up Time, mean (SD), min | 07:10 (24.41) | 07:30 (20.42) | 07:27 (61.27) | 07:56 (72.23) | t(26.93) = -1.47, p = .15
Habitual TST, mean (SD), hrs | 8.11 (1.02) | 7.10 (1.52) | 8.05 (0.83) | 7.10 (1.58) | t(30) = 0.24, p = .82
ISI = Insomnia Severity Index, N = sample size, PSQI = Pittsburgh Sleep Quality Index, SD = standard deviation, TST = total sleep time.