Multimodal Affect Detection among Couples for Diabetes Management
George Boateng1, Prabhakaran Santhanam1, Janina Lüscher2, Urte Scholz2, Tobias Kowatsch3
1ETH Zurich, Switzerland, 2University of Zurich, Switzerland, 3University of St. Gallen, Switzerland
Diabetes mellitus Type II (T2DM) is a common chronic disease of the endocrine system in which the
pancreas no longer produces enough insulin to metabolize blood glucose or the body becomes less sensitive
to insulin (SDG, 2015). An estimated 9.4% of the U.S. population has T2DM, including over one in four
adults aged 65 years and older (CDC, 2017). In Switzerland, almost 500,000 people
suffer from T2DM, which is approximately 4.9% of the male Swiss population and 4.2% of the female Swiss
population (SDG, 2015). To manage blood glucose levels and to reduce the risk of diabetes-related
complications (e.g., cardiovascular diseases, vision loss, amputations), patients need to follow medical
recommendations for healthy eating, physical activity, and medication adherence in their everyday life
(CDC, 2017). Evidence suggests that for married adults, illness management is mainly shared with their
spouses (Seidel et al., 2012; Rintala et al., 2013). Social support among spouses is associated with healthier
habits among diabetes patients (Miller and Brown, 2005). Additionally, spousal support has been shown to have
beneficial effects on well-being or affect (feelings) (Prati and Pietrantoni, 2010; Iida et al., 2010). Because
social support and affect are related, detected affect may serve as a proxy for the social support received
from spouses. Considering the health benefits of social support, especially for chronic disease management,
affect detection could be used to inform just-in-time adaptive interventions delivered, for example, by a
digital coach. Such a digital coach could also adapt its communication style based on the detected affect.
Currently, psychologists measure affect through various self-reports such as the PANAS (Watson et al.,
1988). However, these self-reports are not practical for continuous affect measurement in the wild because of
their obtrusive nature. Meanwhile, a lot of work has been done in the area of affect detection.
However, much of this work uses data from controlled settings, such as having actors make specific facial
expressions mimicking certain emotions or read text in a specific emotional tone (Poria et al., 2017). It is not
clear whether the algorithms developed using these data will work well in the naturalistic context of
couples’ interactions in everyday life. Additionally, there are well-developed systems, such as those by
Affectiva, that use data from the face to recognize affect (Affectiva, 2018). However, these systems use only a
single modality and hence will not work well for recognizing couples’ affect in
everyday life when, for example, facial data is not available.
It is not clear how well the affect of couples in everyday life can be detected, despite its potential for
improving couples’ chronic disease management. In our ongoing work, we plan to address the
following research questions:
RQ1: How accurately can affect be predicted using multimodal real-world sensor data from couples?
Challenges include which sensor data to collect, how to fuse the data, which features to extract for
classical machine learning models, which deep learning approaches to use, and which algorithms
produce the best results.
RQ2: How accurately can the affect of couples be detected in real-time in everyday life? Challenges include
how well the algorithm works when certain sensor data, such as voice, is not available; how to keep
prediction latency low; and whether to run the prediction on a remote server, which raises privacy issues, or
on-device, which requires a compact machine learning model and may reduce prediction accuracy.
Answering these research questions would yield three technical contributions:
1) A novel machine learning algorithm that predicts affect to a high degree of accuracy using multimodal
real-world sensor data from couples
2) A mobile system that predicts affect of couples to a high degree of accuracy in real-time in the wild
3) A novel module for the open source assessment and intervention platform MobileCoach
(www.mobile-coach.eu) to be used to predict affect of individuals
Methods: DyMand Study (Dyadic Management of Diabetes)
To answer our research questions, we will run a user study, DyMand, starting in January 2019 and funded
by the Swiss National Science Foundation (CR12I1_166348/1), through which we will collect various sensor
data in the wild along with corresponding self-reports with which to develop our affect detection algorithm.
The goal of this study is to understand the relationship between social support and the health behavior and
well-being (affect) of couples in which one partner has T2DM. In this study, we will have 180
couples (N=180 couples; n=360 individuals), with one partner having T2DM. We will collect sensor and
self-report data from them for 7 days: in the mornings and evenings on weekdays,
and the whole day on weekends, when they spend time together.
Each partner of the participating couples will receive a smartwatch (Polar M600), a smartphone (Nokia 6.1)
and an accelerometer (GT3X+ monitor devices; ActiGraph, Pensacola, FL). The smartwatch will collect the
following sensor data: audio, heart rate, accelerometer, gyroscope, ambient light, physical activity and BLE
signal strength between each couple’s smartwatches. The smartphone will collect video, audio
and ambient light for 3 seconds while the subjects are completing the self-report on the phone. The
ActiGraph will record physical activity information all day. Various studies have used these sensor data for
affect detection (Poria et al., 2017; Timmons et al., 2017; Boateng and Kotz, 2016). We will collect the
smartwatch sensor data for 5 minutes once per hour within the morning and evening hours set by the couples,
after which a self-report is triggered for each partner to rate their affect over the past 5 minutes. We ensure
that there are at least 20 minutes between subsequent data collections to reduce the burden of completing the
self-reports. To optimize the quality of data collected within that hour, we collect data when the couple is
close together and speaking. We will determine closeness using the BLE signal strength between the
smartwatches, and speaking using a voice activity detection algorithm. If
this condition is not met within the hour, we record the last 5 minutes of the hour. After the 5 minutes of
recording, we trigger the Affective Slider, a digital affect measurement tool that measures the arousal and
pleasure dimensions of affect (Betella and Verschure, 2016). Additionally, at the end of the day, we trigger
the Affective Slider and a short form of the PANAS self-report (Mackinnon et al., 1999) for the couples
to report their affect over the whole day.
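The hourly sampling rule described above can be illustrated with a minimal sketch. It is not the study's implementation: the per-minute `Minute` record, the function name `pick_recording_start`, and the RSSI threshold of -70 dBm are hypothetical, and the voice activity detector is assumed to be available as a boolean per minute. For simplicity, the sketch checks the closeness and speaking conditions only at the minute where recording would start.

```python
from dataclasses import dataclass
from typing import List

RECORD_MINUTES = 5      # length of one recording window (from the study design)
MIN_GAP_MINUTES = 20    # minimum gap between subsequent recordings (from the study design)
RSSI_CLOSE_DBM = -70.0  # hypothetical BLE threshold for "close together"


@dataclass
class Minute:
    rssi_dbm: float  # BLE signal strength between the two smartwatches
    speaking: bool   # output of a voice activity detector (assumed given)


def pick_recording_start(hour: List[Minute], minutes_since_last: int) -> int:
    """Return the minute of the hour at which the 5-minute recording starts.

    `hour` is assumed to hold one entry per minute (60 entries), and
    `minutes_since_last` is the time between the end of the previous
    recording and the start of this hour. The first minute at which the
    couple is close and speaking, and the minimum gap has elapsed, is used;
    if no such minute exists, the last 5 minutes of the hour are recorded.
    """
    last_start = len(hour) - RECORD_MINUTES
    for t, minute in enumerate(hour[: last_start + 1]):
        gap_ok = minutes_since_last + t >= MIN_GAP_MINUTES
        if gap_ok and minute.rssi_dbm >= RSSI_CLOSE_DBM and minute.speaking:
            return t
    return last_start  # fallback: record the last 5 minutes of the hour
```

For example, if the couple is never close and speaking during the hour, the function returns minute 55, which corresponds to recording the last 5 minutes. A real deployment would likely require the condition to hold for more than a single minute and would calibrate the RSSI threshold per device.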
To assess the relationship between the sensor data and self-reports, and to predict affect, we plan to
explore classical machine learning and deep learning approaches. For classical machine learning, we will use
a pipeline of preprocessing, feature extraction and selection, and cross-validation, applied to
various algorithms such as random forests and support vector machines. For deep learning, we will explore
convolutional neural networks and recurrent neural networks, along with architectures that
combine both. Additionally, because of the multimodal nature of the data, we will explore
fusion at the feature level, i.e., feeding the different sensor modalities into the same machine learning algorithm;
at the decision level, i.e., training a separate algorithm for each sensor modality and then combining the
individual predictions using, for example, majority voting; or some hybrid of the two (Poria et al., 2017). We
will then pick the best-performing algorithm and optimize it to run in real-time.
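The difference between feature-level and decision-level fusion can be sketched as follows. This is only an illustration under stated assumptions: the modality names, the affect labels, and both function names are hypothetical, and the per-modality predictions are assumed to come from separately trained models that are not shown here. The decision-level function also skips modalities that are unavailable, which is one possible way to cope with missing sensor data.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence


def feature_level_fusion(features: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate per-modality feature vectors into one vector for a single model."""
    fused: List[float] = []
    for modality in sorted(features):  # fixed order keeps feature columns aligned
        fused.extend(features[modality])
    return fused


def decision_level_fusion(predictions: Dict[str, Optional[str]]) -> Optional[str]:
    """Majority vote over per-modality predictions, ignoring missing modalities (None)."""
    votes = [label for label in predictions.values() if label is not None]
    if not votes:
        return None  # no modality produced a prediction
    counts = Counter(votes)
    top = max(counts.values())
    # Ties are broken alphabetically by label, an arbitrary choice for this sketch.
    return min(label for label, n in counts.items() if n == top)


# Feature-level: one fused vector is fed to one model.
fused_vector = feature_level_fusion({"audio": [0.1, 0.2], "heart_rate": [72.0]})

# Decision-level: each modality votes; here the audio model has no input.
fused_label = decision_level_fusion(
    {"audio": None, "heart_rate": "negative", "motion": "negative"}
)
```

A hybrid scheme could, for instance, fuse some modalities at the feature level and then vote across the resulting models; which combination works best is an empirical question.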
Affectiva (2018). [Online]. Available: https://www.affectiva.com/ [Accessed 08/2018].
Betella, A., and Verschure, P.F. (2016). The Affective Slider: A Digital Self-Assessment Scale for the
Measurement of Human Emotions. PLoS One 11, e0148037. doi: 10.1371/journal.pone.0148037.
Boateng, G., and Kotz, D. (2016). StressAware: An app for real-time stress monitoring on the amulet
wearable platform. In MIT Undergraduate Research Technology Conference (URTC), 2016 IEEE (pp. 1-4). IEEE.
Centers for Disease Control and Prevention (2017). National Diabetes Statistics Report, 2017
[Online]. Available: https://www.cdc.gov/diabetes/pdfs/data/statistics/national-diabetes-statistics-report.pdf
[Accessed 25/07/2018].
Iida, M., Parris Stephens, M.A., Rook, K.S., Franks, M.M., and Salem, J.K. (2010). When the going
gets tough, does support get going? Determinants of spousal support provision to type 2
diabetic patients. Personality and Social Psychology Bulletin 36, 780-791. doi:
Mackinnon, A., Jorm, A.F., Christensen, H., Korten, A.E., Jacomb, P.A., and Rodgers, B. (1999). A short
form of the Positive and Negative Affect Schedule: Evaluation of factorial validity and invariance across
demographic variables in a community sample. Personality and Individual Differences 27, 405-416. doi:
Miller, D., and Brown, J.L. (2005). Marital interactions in the process of dietary change for type 2 diabetes.
Journal of Nutrition Education and Behavior 37(5), 226-234.
Poria, S., Cambria, E., Bajpai, R., and Hussain, A. (2017). A review of affective computing: From unimodal
analysis to multimodal fusion. Information Fusion 37, 98-125. doi: 10.1016/j.inffus.2017.02.003.
Prati, G., and Pietrantoni, L. (2010). The relation of perceived and received social support to mental
health among first responders: A meta-analytic review. Journal of Community Psychology 38,
403-417. doi: 10.1002/jcop.20371.
SDG (2015). Diabetes mellitus. Das Wichtigste in Kürze. [Online]. Schweizerische Diabetes-Gesellschaft.
Available: http://www.sdgshop.ch/media/pdf/197-de.pdf [Accessed 25/09/2015].
Timmons, A.C., Chaspari, T., Han, S.C., Perrone, L., Narayanan, S.S., and Margolin, G. (2017). Using
Multimodal Wearable Technology to Detect Conflict among Couples. Computer 50(3), 50-59.
Watson, D., Clark, L.A., and Tellegen, A. (1988). Development and validation of brief measures of positive
and negative affect: The PANAS scales. Journal of Personality and Social Psychology 54(6), 1063.
Multimodal Affect Detection among Couples for Diabetes Management
George Boateng1, Prabhakaran Santhanam1, Janina Lüscher2, Urte Scholz2, Tobias Kowatsch3
1ETH Zurich, Switzerland, 2University of Zurich, Switzerland, 3University of St. Gallen, Switzerland
It is not clear how well the affect of couples in everyday life can be detected, despite its potential for
improving couples’ chronic disease management.
1. Background
Social support from spouses is associated with:
• Healthier habits among diabetes patients
• Beneficial effects on well-being or affect
Affect detection:
• May serve as a proxy for received spousal support
• Can inform just-in-time adaptive interventions
• Lets a digital coach adapt its communication style
Current approaches rely on:
• Self-reports
• Data from controlled settings
• A unimodal data source
3. Research Questions
RQ1: How accurately can affect be predicted using multimodal real-world sensor data from couples?
RQ2: How accurately can the affect of couples be detected in real-time in everyday life?
4. Method
DyMand Study: Collect data from 180 couples in which one partner has Type II diabetes
Data Collection: 7 days, 5 minutes of data, collected once per hour when the couple is close and speaking
Sensor Data: Audio, Video, Accelerometer, Gyroscope, Physical Activity, Heart Rate, Ambient light,
BLE signal strength (from smartwatch, smartphone and ActiGraph)
Self Reports: Affective Slider and PANAS (short form)
Data Analysis: Predict affect using machine learning and deep learning algorithms + multimodal approaches
5. Expected Results
• A novel machine learning algorithm that predicts affect accurately using multimodal real-world sensor data
• A mobile system that predicts affect of couples accurately in real-time in everyday life
Boateng, G., and Kotz, D. (2016). StressAware: An app for real-time stress monitoring on the amulet wearable platform.
In MIT Undergraduate Research Technology Conference (URTC), 2016 IEEE (pp. 1-4). IEEE.
Bolger, N., and Amarel, D. (2007). Effects of social support visibility on adjustment to stress: Experimental evidence.
Journal of Personality and Social Psychology 92, 458-475. doi: 10.1037/0022-3514.92.3.458.
Miller, D., and Brown, J.L. (2005). Marital interactions in the process of dietary change for type 2 diabetes. Journal of
Nutrition Education and Behavior 37(5), 226-234.
Poria, S., Cambria, E., Bajpai, R., and Hussain, A. (2017). A review of affective computing: From unimodal analysis to
multimodal fusion. Information Fusion 37, 98-125. doi: 10.1016/j.inffus.2017.02.003.
Timmons, A.C., Chaspari, T., Han, S.C., Perrone, L., Narayanan, S.S., and Margolin, G. (2017). Using Multimodal
Wearable Technology to Detect Conflict among Couples. Computer 50(3), 50-59.
Black in AI Workshop | NeurIPS | Montréal, Canada | December 7, 2018