Imputing Missing Social Media Data Stream in Multisensor Studies of Human Behavior



Koustuv Saha1, Manikanta D. Reddy1, Vedant Das Swain1, Julie M. Gregg2, Ted Grover3, Suwen Lin4,
Gonzalo J. Martinez4, Stephen M. Mattingly4, Shayan Mirjafari5, Raghu Mulukutla6, Kari Nies3,
Pablo Robles-Granda4, Anusha Sirigiri5, Dong Whi Yoo1, Pino Audia5, Andrew T. Campbell5,
Nitesh V. Chawla4, Sidney K. D’Mello2, Anind K. Dey7, Kaifeng Jiang8, Qiang Liu9, Gloria Mark3,
Edward Moskal4, Aaron Striegel4, and Munmun De Choudhury1
1Georgia Institute of Technology, 2University of Colorado, Boulder, 3University of California, Irvine, 4University of Notre Dame,
5Dartmouth College, 6Carnegie Mellon University, 7University of Washington,8Ohio State University,9University of Texas at Austin
Contact Author:
Abstract—The ubiquitous use of social media enables re-
searchers to obtain self-recorded longitudinal data of individuals
in real-time. Because this data can be collected in an inexpensive
and unobtrusive way at scale, social media has been adopted
as a “passive sensor” to study human behavior. However, such
research is impacted by the lack of homogeneity in the use of
social media, and the engineering challenges in obtaining such
data. This paper proposes a statistical framework to leverage the
potential of social media in sensing studies of human behavior,
while navigating the challenges associated with its sparsity. Our
framework is situated in a large-scale in-situ study concerning
the passive assessment of psychological constructs of 757
information workers, wherein four sensing streams were deployed:
bluetooth beacons, wearable, smartphone, and social media.
Our framework includes principled feature transformation and
machine learning models that predict latent social media features
from the other passive sensors. We demonstrate the efficacy of
this imputation framework via a high correlation of 0.78 between
actual and imputed social media features. With the imputed
features we test and validate predictions on psychological con-
structs like personality traits and affect. We find that adding the
social media data streams, in their imputed form, improves the
prediction of these measures. We discuss how our framework
can be valuable in multimodal sensing studies that aim to gather
comprehensive signals about an individual’s state or situation.
Index Terms—social media, imputation, multisensor, wellbeing
Understanding why and how individuals feel, think, and act
is a key topic of interest among researchers from a variety of
academic disciplines, such as psychiatry, psychology, sociology,
economics, and anthropology [22]. Studies of human behavior
have largely relied on self-reported survey data. In recent
years, several limitations of this approach have been noted:
survey data suffers from subjective assessment as well as
recall and hindsight biases, and surveys are often retrospective
in nature, gathering information only after an event or
experience [51].
A variety of active and passive sensing technologies overcome
such biases by recording psychological states and behavior
in-the-moment [4]. However, such approaches require diverse,
extensive, and rich data from a variety of complementary
sensors to provide comprehensive information about an
individual’s state and context [4], and not all sensing
modalities are always available for an individual. For instance,
active sensing techniques, such as ecological momentary
assessments (EMAs), suffer from compliance issues and are
difficult to implement longitudinally at scale. Many of these
limitations are overcome by passive sensing, such as logging
device use [13], [40], [55]. Yet, despite the dense,
high-fidelity data that they capture, passive sensing paradigms
alone are still challenged by resource and logistical
constraints, and are thus limited to capturing behavioral data
only during the study period [44]. Such drawbacks could be
overcome by leveraging the social media data of an individual.
Social media provides an inexpensive and unobtrusive means
of gathering both present and historical data [36], overcoming
some of the challenges posed by active and passive sensing,
and providing complementary information about an individual
in their natural settings [7], [37]. Further, being self-recorded,
social media data also serves as a verbal sensor to understand
the psychological dynamics of an individual [38].
However, the availability and quality of retrospective data
via social media widely vary depending on social media use
behavior. Passive consumption is often more prevalent than
active engagement, leading to sparsity in data over extended
periods of time. Consequently, studies either focus on a
very active participant cohort, which hurts generalizability
and recruitment and introduces compliance bias, or disregard
those with no or only limited social media data, which hurts
scalability. Additionally, not everybody is on social media,
and its use is typically skewed towards young adults [28].
Yet many sensing studies focus on other demographics where
social media is less prevalent. Further, gathering social media
data presents engineering challenges due to platform-specific
restrictions, which pose significant challenges in
long-term longitudinal studies of human behavior.
This paper, therefore, makes a case for overcoming the
challenges of missing sensing streams (here, social media) in
multimodal studies of human behavior. The premise of this work
is theoretically grounded in the Social Ecological Model [5],
which posits that human behaviors have social underpinnings. It
suggests that behaviors can be deeply embedded in the complex
interplay between an individual, their relationships, the
communities they belong to, and societal factors.
In particular, we examine: How to leverage the potential of
social media data in multimodal sensing studies of human
behavior, while mitigating the limitations of acquiring this
unique data stream? We address this question within the Tesserae
project, a multisensor study that aims to predict psychological
constructs using longitudinal passive sensing data of 757
information workers.
978-1-7281-3888-6/19/$31.00 ©2019 IEEE
Focusing on those participants whose social media data is
not available, this paper proposes a statistical framework to
model the latent dimensions which could have otherwise been
derived, had their social media data stream been available.
Specifically, we impute missing social media features by learn-
ing their observed behaviors from other passive sensor streams
(bluetooth beacons, wearable, and smartphone use). We em-
ploy a range of state-of-the-art machine learning models, such
as linear regressions, ensemble tree-based regression, and deep
neural network based regression. After having demonstrated
that the imputed social media features closely follow actual
social media features of participants (average correlation of
0.78), we evaluate the efficacy of this social media imputation
framework. We compare pairs of statistical models that predict
a range of common (or benchmark) individual difference
variables (psychological constructs like personality, affect, and
anxiety) — one set of models being those that use imputed
social media features alongside other passive sensor features,
and the other set that does not use these imputed signals.
Our findings suggest that the imputed social media features
significantly improve the predictions by 17%.
In summary, this paper shows that our proposed framework
can augment the range of social-ecological signals available
in large-scale multimodal sensing studies by imputing latent
behavioral dimensions when one sensor stream (here, the social
media data stream) is entirely unavailable for certain
participants. We discuss the implications of our work as a
methodological contribution to multimodal sensing studies of
human behavior within the sensing research community.
Social Media as a Passive Sensing Modality. With the ubiq-
uity of smartphones and wearables, passive sensing modalities
enable convenient means to obtain dense and longitudinal
behavioral data at scale [55], [56]. However, such data
collection is prospective: it happens after enrollment, during
the study period. To obtain historical or before-study data,
researchers have recently started to use social media as a
“passive sensor”, which enables unobtrusive collection of
longitudinal and historical self-recorded data of individuals
[36], [37].
Social media provides low-cost, large-scale, non-intrusive
means of data collection. It has the potential to comprehen-
sively reveal naturalistic patterns of mood, behavior, cognition,
and psychological states, both in real-time and across longitu-
dinal time [12]. Relatedly, social media has facilitated analyz-
ing personality traits and their relationship to psychological
and psychosocial well-being, through machine learning and
linguistic analysis [19], [29], [43].
Together, passive sensing modalities advance the vision of
“people-centric sensing” [4], although each of them has its
own limitations. Social media suffers from data sparsity, and
it can function as a “sensor” only for those who use it. This
leads to a common problem that many multimodal sensing studies
of human behavior face [21], [36], [55]: they examine either a
larger pool of participants with fewer sensors, or a smaller
pool of participants who comply with all sensing streams. This
compromises either the combined potential of multiple sensors
or the wide spectrum of individual behaviors. Our work is
motivated by computational approaches to infer latent
behavioral attributes [8], [27], [31]. We model latent
behavioral states as captured by multimodal sensing to impute
the missing sensing stream.
Data Imputation Approaches in Sensing Studies. Data
imputation is the process of replacing missing data with
substituted values [48]. Common ways of handling missing data
include dropping it, substituting mean or median values, and
substituting random values [30]. These approaches are typically
employed during data cleaning and pre-processing, and their
downstream influence on results remains largely understudied,
being overshadowed by the objectives of the studies. A number
of studies have used statistical and machine learning based
modeling techniques to impute missing values [6], [41], [42],
[52]. In an early work, [34] proposed probabilistic approaches
to handle missing data, and recently Jaques et al. used deep
learning to impute missing sensor data and found better mood
prediction results [17].
Although missing data challenges have been studied in the
literature, problems surrounding entirely missing sensing
streams remain understudied. Besides proposing a framework
to impute a missing stream (social media), this paper shows the
effectiveness of this imputation through the lens of predicting
psychological constructs (a problem that has been widely
studied in the multimodal sensing literature) with a variety
of algorithms. We demonstrate the robustness of the imputation
by comparing our findings against permutation tests and
random- and mean-based imputation techniques.
Our dataset comes from the Tesserae study, which recruited
757 participants¹ who are information workers in cognitively
demanding fields (e.g., engineers, consultants, managers) in
the U.S. The participants were enrolled from January 2018
through July 2018. The study was approved by the Institutional
Review Boards at the researchers’ institutions.
The participants responded to self-reported survey
questionnaires and provided their passively sensed behavioral
data through four major sensing streams: bluetooth, wearable,
smartphone agent, and social media [35]. They were provided
with an informed-consent document describing each sensing
stream and the data collected through it. They were required
to consent to each sensing stream individually, and they could
opt out of any stream. The data was de-identified and stored
on secured databases and servers, physically located at one of
the researchers’ institutions, with limited access privileges.
The enrollment process consisted of responding to a set
of initial survey questionnaires related to demographics (age,
gender, education, and income). The participant pool consists
of 350 males and 253 females, with an average age of
34 years (stdev. = 9.34). The majority of the participants
have a college (52%) or master’s degree (35%) level of
education. Participants were additionally required to answer
an initial set of survey questionnaires that measure their
self-reported assessments of personality, cognitive ability,
affect, anxiety, stress, sleep, physical activity, and alcohol
and tobacco use. We outline below the psychological constructs
relevant to the present paper:
¹ Note that this is an ongoing study, and this paper uses sensed data collected
until August 21st, 2018 [23], [24]. A randomly selected set of 154 participants
has been “blinded at source”, and their data is put aside only for external
validation at the end of the study. The rest of the paper only concerns the data
of the remaining 603 “non-blinded” participants in the study.
Personality. The BFI-2 scale [46] measures personality across
five trait dimensions, each on a continuous scale of 1 to 5.
In our dataset, the average value of neuroticism
is 2.46 (std. = 0.78), conscientiousness is 3.89 (std. = 0.66),
extraversion is 3.44 (std. = 0.68), agreeableness is 3.87 (std.
= 0.56), and openness is 3.82 (std. = 0.61).
Affective Measures. The PANAS-X scale [57] measures posi-
tive and negative affect on a continuous scale of 10 to 50 each.
The STAI-Trait scale [47] measures anxiety on a continuous
scale of 20 to 80. In our dataset, positive and negative
affect averages at 34.61 (std.=5.95) and 17.47 (std.=5.34)
respectively, and anxiety averages at 38.11 (std.=9.29).
To passively collect data about participants’ behavior, our
study deployed four major sensor streams:
Bluetooth Beacons. Participants were provided with two static
and two portable bluetooth beacons (Gimbal [3]). The static
beacons were to be placed at their work and home, and
the portable beacons were to be carried continuously (e.g.,
keychains). The beacons track their presence at home and
work, and also help us assess their commute and desk time.
Wearable. Participants were provided with a fitness band
(Garmin Vivosmart [2]), which they would wear throughout
the day. The wearable continually tracks health measures, such
as heart rate, stress, and physical activity in the form of sleep,
footsteps, and calories lost.
Smartphone Application. A smartphone application (also used
in [55]) was installed on the participants’ smartphones (Android
phones and iPhones). This application tracks their phone use,
such as lock behavior and call durations, and uses mobile
sensors to track their mobility and physical activity.
Social Media. Participants authorized access to their social
media data through an Open Authentication (OAuth) based
data collection infrastructure that we developed in-house.
Specifically, we asked permission from participants to provide
their Facebook and LinkedIn data, unless they opted out, or
did not have either of these accounts. We asked consent from
only those participants who had existing Facebook or LinkedIn
accounts from before the study.
Passively Sensed Data. The participants were enrolled over
6 months (February to July 2018) in a staggered fashion,
averaging 111 days of study per participant. Table I reports
the descriptive statistics of the number of days of passively
sensed data that we collected per participant through each of
the sensor streams. Per participant, we have an average of 42
days of data through bluetooth beacons, 108 days of data through
the wearable, and 101 days of data through the phone application.
Out of the 603 non-blinded participants, 475 authorized
access to their Facebook data. This data can be broadly
categorized into two types: updates that were self-composed
(e.g., writing a status update or checking into a certain
location), and updates that participants received as shared
updates on their timeline. Comprehensively, Facebook data
consists of the updates on participants’ timelines, including
textual posts, Facebook apps usage, check-ins at locations,
media updates, and shares of others’ posts. The likes and
comments received on these updates on the participants’
timelines were also collected. Note that as per our IRB
approval, we did not collect any multimedia data or private
messages. Table II summarizes the descriptive statistics of our
Facebook dataset. Temporally, our data dates back to October
2005, and the number of days of data per participant averages
1,898 days, giving us a sense of the historical data that
Facebook allows us to capture.
Table I: Descriptive statistics of # days data collected.

Type          Range     Mdn.  Std.
Study Period  16:205    99    46.7
Bluetooth     1:159     37    32.6
Wearable      5:206     94    46.9
Smartphone    1:206     93    52.4
Social Media  110:4756  2923  1474

Table II: Descriptive statistics of the Facebook dataset.

Type           Mdn.   Std.
Likes Rcvd.    1,139  5,277.85
Comms. Rcvd.   316    1,383.69
Self-posts     137    511.80
Self-comments  55     334.16
Self-Words     2,374  13,718.56
Figure 1: A schematic overview of the feature engineering pipeline
to obtain transformed features from sensor and social media derived
features (The numbers in brackets represent the number of features
in each step in our dataset). Sensor features: (130), (124), (51),
(30); social media features: (5,077), (4,806), (3,716), (200).
We derive 130 features from sensor data and 5,077 features
from social media data. The choice of our features is motivated
by prior work on predicting psychological constructs [55]:
Sensor Raw Features. From the sensor datastreams, we obtain
a variety of features that broadly correspond to heart-rate and
heart-rate variability, stress, fitness, physical activity, mobility,
phone use activity, call use, desk activity, and sleep.
Social Media Features. From the social media dataset, we
obtain a variety of features corresponding to psycholinguistic
attributes [50], open vocabulary n-grams (top 5,000), sen-
timent, and social capital (such as number of check-ins,
engagement and activity with friends, etc.).
We conduct feature selection and transformation to overcome
problems such as multi-collinearity and covariance among the
features, issues that can potentially affect downstream
prediction tasks [10]. Because our features are obtained
from multimodal data streams, there is a high likelihood that
some features are related, redundant, highly variable, or
lacking in predictive power [14], [31]. For example,
the activity and stress-related features captured by our
wearable are both intuitively and theoretically correlated [2].
We adopt three techniques for reducing the number of features
and subsequently transforming them:
1) Selecting Features on Coefficient of Variation: First, we
reduce the feature space on the basis of variance, using the
coefficient of variation (cv), which quantifies the ratio of
the standard deviation to the mean for each feature. We drop
those features that are outliers in cv (beyond a threshold cv
of two standard deviations away from the mean). Six sensor
features occur above the threshold cv of 8.6, and 271 social
media features show a cv greater than the threshold cv of 14.5.
Dropping these features reduces our feature space to 124
sensor-derived features and 4,806 social media-derived features
(Fig. 2 (a&b)).
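The cv-based filter can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the dictionary layout (feature name mapped to per-participant values), the use of the population standard deviation, and the handling of zero-mean features are all assumptions.

```python
import statistics


def coefficient_of_variation(values):
    """Ratio of the standard deviation to the mean of one feature."""
    mean = statistics.fmean(values)
    if mean == 0:
        # Assumption: a zero-mean feature has an unbounded cv.
        return float("inf")
    return statistics.pstdev(values) / abs(mean)


def drop_high_cv_features(features, n_std=2.0):
    """Drop features whose cv lies more than n_std standard deviations
    above the mean cv. Returns the kept features and the threshold."""
    cvs = {name: coefficient_of_variation(vals) for name, vals in features.items()}
    finite = [cv for cv in cvs.values() if cv != float("inf")]
    threshold = statistics.fmean(finite) + n_std * statistics.pstdev(finite)
    kept = {name: vals for name, vals in features.items() if cvs[name] <= threshold}
    return kept, threshold
```

Applied separately to the sensor and the social media feature sets, a filter of this shape yields one threshold per set, analogous to the 8.6 and 14.5 reported above.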
2) Selecting Features on Pairwise Correlations: Correlated
features typically affect or distort machine learning prediction
models by potentially yielding unstable solutions or masking
the interactions between significant features [14]. To select
uncorrelated features among the above 124 sensor and 4,806
social media features, we obtain Pearson’s correlation between
every pair of features. With a threshold absolute value of 0.8,
we drop those features that are highly correlated with another
feature.
Figure 2: Feature selection stage using Coefficient of Variation (cv)
and Correlation (r). Panels: (a) Sensor: cv and (b) SM: cv (x-axis:
coefficient of variation, y-axis: # features); (c) Sensor: r and
(d) SM: r (y-axis: # feature pairs). The greyed-out regions include
those features that are dropped in these analyses.
Figure 3: (a&b) Scree plots of explained variance against the number
of PCA components used to transform features in the sensor and social
media feature spaces (x-axis: # components, y-axis: explained
variance). These plots help us to determine the number of components.
(c&d) Transformed features’ distribution across participants.
Fig. 2 (c&d) plot the correlations of all the 15,376 (124²)
sensor feature pairs and 23,097,636 (4,806²) social media
feature pairs. 73 sensor feature pairs and 1,090 social media
feature pairs show an absolute correlation above 0.8, leading
to the exclusion of 73 sensor features and 1,090 social media
features. At the end of this step, we are left with 51
sensor features and 3,716 social media features.
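One way to realize the correlation-based selection is a greedy pass that keeps a feature only if it is not highly correlated with any feature already kept. The paper does not state which member of a correlated pair is dropped, so the greedy ordering below, like the function names and the treatment of constant features, is an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: a constant feature is treated as uncorrelated.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def drop_correlated_features(features, threshold=0.8):
    """Keep a feature only if |r| with every already-kept feature
    does not exceed the threshold (greedy, in insertion order)."""
    kept = {}
    for name, values in features.items():
        if all(abs(pearson_r(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return kept
```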
3) Transforming Features using Principal Component Analysis:
On the above features, we employ Principal Component Analysis
(PCA) using a singular value decomposition solver [54], where
we select the number of components on the basis of explained
variance [15]. This method reduces the dimensions of the
feature space by transforming features into orthogonal
principal components [18], [59]. Fig. 3 (a&b) show the scree
plots of the explained variance of the principal components in
the feature space. We find that roughly 95% of the variance is
explained by 30 principal components in the sensor feature
space, and by 200 principal components in the social media
feature space. Note that we build the PCA models only on the
training samples, and transform the features in the held-out
samples with these PCA models, so that there is no data leakage
in our statistical framework. Our final feature set consists
of 30 sensor-derived features and 200 social media-derived
features.
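The leakage-free PCA step can be sketched with NumPy's SVD: the mean and components are estimated from training rows only and then reused unchanged on held-out rows. The function names and the 0.95 default are illustrative assumptions, and the sketch does not standardize features because the text does not say whether the study did.

```python
import numpy as np


def fit_pca(train, explained=0.95):
    """Fit PCA on training rows only, keeping the smallest number of
    components whose cumulative explained variance reaches `explained`."""
    mean = train.mean(axis=0)
    _, singular_values, vt = np.linalg.svd(train - mean, full_matrices=False)
    variance_ratio = singular_values ** 2 / np.sum(singular_values ** 2)
    n_components = int(np.searchsorted(np.cumsum(variance_ratio), explained) + 1)
    return mean, vt[:n_components]


def apply_pca(rows, mean, components):
    """Project rows (training or held-out) with an already-fitted PCA."""
    return (rows - mean) @ components.T
```

Within each evaluation fold, `fit_pca` would see only that fold's training participants, and `apply_pca` would be called on both the training and the held-out participants.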
Our feature learning framework broadly addresses the
challenge of a missing social media data stream for 128
participants in the study. Fig. 4 shows a schematic overview of
the prediction models of psychological constructs that are used
to evaluate the effectiveness of imputing the missing social
media transformed features. We briefly describe the three
algorithms that we consistently use throughout the paper.
Linear Regression (LR). Linear regression adopts a linear
approach to model the relationship between the independent
and dependent variables [45]. Wherever applicable, we employ
linear regression with L1/L2 regularization to prevent
overfitting and to avoid bias introduced by the
inter-dependence of independent variables [60].
Figure 4: A schematic overview of the statistical models built to
evaluate the effectiveness of imputation. X denotes sensor
transformed features, X’ social media transformed features, and
Y the predicted constructs; subscript 1 marks participants who
have social media data and subscript 2 those who do not.
Base Models (who have social media data): S1: X1 → Y1; SS1: X1 + X’1 → Y1.
Imputation Model: Imp: X1 → X’1; X’2 = Imp(X2).
Models (who do not have social media data): S2: X2 → Y2; SS2: X2 + X’2 → Y2.
Final Models (all participants): S3: (X1 + X2) → (Y1 + Y2);
SS3: (X1 + X2) + (X’1 + X’2) → (Y1 + Y2).
Gradient Boosted Regression (GBR). Gradient boosting performs
regression with an ensemble of weak prediction models, which
are typically decision trees [11], [25]. It optimizes the cost
function by iteratively choosing a function that points in the
negative gradient direction. In our case, we use gradient
boosting on an ensemble of decision tree regressors, varying
the number of decision trees between 100 and 1000, with each
tree having a maximum depth of 3.
Multilayer Perceptron Regression (MLP). Neural network
regression suits problems where a more conventional regression
model cannot fit a solution. We use the multi-layered
perceptron (MLP) technique, which works in a feed-forward
fashion (no cycles) with multiple internal layers [33]. The
model learns through backpropagation [20] and follows a fully
connected (dense) deep neural network architecture. Wherever
applicable, we use two internal layers and tune the number of
nodes in them between 36 and 216 for our neural network
regression models.
The above three algorithm choices are motivated by the
fact that they cover a broad spectrum of algorithm families:
linear regression, non-linear regression, decision trees,
ensemble learning, neural networks, and deep learning. We
quantify the prediction error for psychological constructs as
the Symmetric Mean Absolute Percentage Error (SMAPE), which is
computed as the mean percentage difference between predicted
and actual values, relative to an average of the two values
[16]. SMAPE values range between 0% and 100%, and lower values
of error indicate better predictive ability. To obtain these,
we first divide the dataset into five equal segments, and then
iteratively train models on four of the segments to predict on
the held-out fifth segment. We average the testing accuracy
metrics across the held-out segments to obtain the pooled
accuracy metrics for the above algorithms. This paper refers to
this technique as the pooled accuracy technique and to the
corresponding outcomes as pooled accuracy or error measures.
Within the training segments, we tune the hyper-parameters
using k-fold cross-validation (k = 5).
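The error metric and the pooled evaluation can be sketched in plain Python. The function names, the contiguous fold assignment, and the `fit_predict` callback are hypothetical. One caveat on normalization: dividing by the average of the two magnitudes bounds each term by 200%, while dividing by their sum bounds it by 100%; the text does not say which variant was used, so the `use_sum` switch is an assumption.

```python
def smape(actual, predicted, use_sum=False):
    """Symmetric mean absolute percentage error, in percent.

    use_sum=False divides by the average of |actual| and |predicted|;
    use_sum=True divides by their sum, which caps the value at 100%.
    """
    scale = 1.0 if use_sum else 2.0
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / scale
        total += 0.0 if denom == 0 else abs(p - a) / denom
    return 100.0 * total / len(actual)


def k_fold_indices(n, k=5):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = [j for j in range(n) if j < bounds[i] or j >= bounds[i + 1]]
        yield train, test


def pooled_smape(y, fit_predict, k=5):
    """Average the held-out SMAPE over k folds.

    fit_predict(train_idx, test_idx) must train on train_idx only and
    return one prediction per index in test_idx.
    """
    errors = []
    for train, test in k_fold_indices(len(y), k):
        predictions = fit_predict(train, test)
        errors.append(smape([y[j] for j in test], predictions))
    return sum(errors) / len(errors)
```

Hyper-parameter tuning would happen inside `fit_predict`, using only the training indices it receives.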
Baseline Prediction with Passively Sensed Data. We first
seek to establish whether the presence of social media features
improves prediction accuracy. On the same set of 475
participants who have social media data, we compare two models
for predicting psychological constructs: 1) S1 uses 30 sensor
features, and 2) SS1 combines 30 sensor features and 200
social media features. Table III reports the relative decrease
in error for SS1 compared to S1. The relative decrease in
error averages 21% for LR, 26% for GBR, and 21% for
MLP. In sum, adding social media features improves the
predictions by an average of 22.4% across all the models and
the psychological constructs.
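The relative decrease in error is simple arithmetic, sketched below. The function name and the example SMAPE values in the test are hypothetical; the three per-algorithm averages are the "Mean" row of the first column group of Table III.

```python
def relative_decrease(error_without, error_with):
    """Percentage by which the error drops when social media features
    are added (positive means the combined model is better)."""
    return 100.0 * (error_without - error_with) / error_without


# Per-algorithm mean decreases for SS1 vs. S1 (LR, GBR, MLP), from Table III.
ss1_means = [20.5, 25.9, 20.8]
overall_mean = sum(ss1_means) / len(ss1_means)
```

Averaging the three unrounded column means reproduces the 22.4% overall figure; averaging the rounded values (21, 26, 21) would give a slightly higher number.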
Figure 5: Correlation distribution between PCA Components and
Predicted PCA Components of Facebook (x-axis: correlation, y-axis:
# components). Panels: (a) LR, mean: 0.22; (b) GBR, mean: 0.78;
(c) MLP, mean: 0.67.
A. Imputing Missing Social Media Features
The baseline prediction suggests that adding social media
features indeed improves the prediction task of the psycholog-
ical constructs. However, about one-quarter of the participants
do not have social media data (see section III). This restricts us
from leveraging a rich feature stream to predict such attributes
for these individuals. To overcome this constraint, we aim at
learning certain latent behaviors that we could have otherwise
inferred if we had access to their social media data.
We impute the social media features using the sensor
features. For this, we build learning models on the sensor
stream of the participants who have social media data to
predict their latent social media dimensions. That is, for
each of the 200 social media features, we build a separate
model that uses the sensor features as the independent
variables to predict that social media feature. We adopt k-fold
cross-validation based hyper-parameter tuning. We use LR, GBR,
and MLP to find the best algorithmic model, and quantify the
pooled accuracy of the prediction models in terms of Pearson’s
correlation (r) between actual and predicted social media
features. Levene’s test between all the actual and predicted
features reveals homogeneity of variance in the feature set
[26]. This statistically indicates that the imputed social
media transformed features are not arbitrarily generated.
Fig. 5 plots the distribution of the pooled Pearson’s corre-
lation (r) between the actual and predicted values of social
media transformed features. We find that the mean correlation
across the components is 0.22 in LR, 0.78 in GBR, and 0.67
in MLP. All of these correlation measures are statistically
significant at p < 0.05. Comparing across the algorithms,
GBR performs the best in predicting the latent social media
dimensions. For the rest of the analyses, we used the GBR
algorithm to impute the social media transformed features.
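The per-component imputation can be sketched as follows. To keep the example dependency-light, ordinary least squares stands in for the gradient boosted regressor that the study selected, so this shows the structure (one model per social media component, trained on participants who have both streams) rather than the study's model. All function and variable names are assumptions.

```python
import numpy as np


def fit_component_models(sensor_train, social_train):
    """Fit one least-squares model per social media component.

    Column j of the returned weight matrix is the model for component j;
    row 0 holds the intercepts.
    """
    design = np.column_stack([np.ones(len(sensor_train)), sensor_train])
    weights, *_ = np.linalg.lstsq(design, social_train, rcond=None)
    return weights


def impute_components(sensor_rows, weights):
    """Predict social media components from sensor features."""
    design = np.column_stack([np.ones(len(sensor_rows)), sensor_rows])
    return design @ weights


def per_component_correlation(actual, predicted):
    """Pearson's r between actual and predicted values, per component."""
    return np.array([
        np.corrcoef(actual[:, j], predicted[:, j])[0, 1]
        for j in range(actual.shape[1])
    ])
```

In the study's setting, `sensor_train` would hold the 30 sensor components and `social_train` the 200 social media components; the mean of `per_component_correlation` on held-out participants corresponds to the per-algorithm means reported in Fig. 5.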
B. Evaluating the Effectiveness of Imputation
On those 128 participants whose social media data we did
not have, we compare two prediction models of psychological
constructs: 1) S2 uses only the sensor features of these
participants, and 2) SS2 combines sensor features and imputed
social media features (as obtained above).
We compare the accuracy metrics of S2 and SS2 to deduce
whether imputing the social media features improves our task of
predicting psychological constructs. Table III compares the
prediction errors (SMAPE) for the three algorithms that we
run in each of the models S2 and SS2. We find that for LR,
the relative decrease in error ranges between 6% (for
openness) and 17% (for positive affect), averaging 11%;
for GBR, it ranges between 16% (anxiety) and 20%
(extraversion), averaging 17%; and for MLP, it ranges between
6% (extraversion) and 21% (anxiety), averaging 14.5%.
Therefore, the imputed
Table III: Relative % decrease in SMAPE in prediction models using
both sensor & social media features from ones using only sensor
features. Positive values mean better prediction in SSn than Sn.

                  SS1 vs. S1          SS2 vs. S2          SS3 vs. S3
Construct         LR    GBR   MLP     LR    GBR   MLP     LR    GBR   MLP
Personality Traits (BFI-2)
Extraversion      10.6  28.4  16.6    8.4   20.1  6.4     12.8  19.5  3.6
Agreeableness     8.3   27.5  30.4    5.9   17.9  17.2    3.2   14.4  20.2
Conscientious.    11.8  26.0  28.2    9.4   17.4  13.5    15.0  21.2  12.1
Neuroticism       11.2  24.9  17.6    7.6   16.9  13.4    6.0   17.5  -13
Openness          10.0  25.1  33.8    6.1   15.6  16.9    5.4   15.3  3.1
Affective Measures
Pos. Affect       33.8  26.2  8.06    16.6  18.1  18.4    8.6   14.5  21.5
Neg. Affect       38.8  24.7  24.04   16.1  15.7  9.7     8.4   11.8  16.4
Anxiety (STAI)    39.4  24.3  7.5     14.1  15.7  20.8    6.4   16.8  34.4
Mean              20.5  25.9  20.8    10.5  17.2  14.5    8.2   16.4  12.3
0 10 20 30 40 50 60
Sensors+Imputed FB
(a) Mean Imp.
0 10 20 30 40 50 60
Sensors+Imputed FB
(b) Random Imp.
Mean Decrease %
# Permutations
Figure 6: (a&b) SMAPE comparing prediction models using sensor
features (S3) vs. those using sensor and (a) mean- and (b) random-
imputed features, (c) Reduction in SMAPE in several permutations
of randomly imputed social media features, as compared to S3.
social media features improved the prediction by an average
of 14% across all models and measures.
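The relative decrease reported in Table III can be sketched as follows. This is a minimal illustration that assumes one common definition of SMAPE (absolute error divided by the mean of the absolute actual and predicted values, in percent); the exact variant used in the study is not restated here, and the function names are hypothetical.

```python
def smape(y_true, y_pred):
    """SMAPE in percent, assuming the |y - y_hat| / ((|y| + |y_hat|) / 2) form."""
    terms = []
    for y, y_hat in zip(y_true, y_pred):
        denom = (abs(y) + abs(y_hat)) / 2
        # Treat a 0/0 term as zero error to avoid division by zero.
        terms.append(0.0 if denom == 0 else abs(y - y_hat) / denom)
    return 100 * sum(terms) / len(terms)

def relative_decrease(smape_sensor_only, smape_with_social):
    """Relative % decrease in error; positive means SSn predicts better than Sn."""
    return 100 * (smape_sensor_only - smape_with_social) / smape_sensor_only
```

For example, if a sensor-only model had a SMAPE of 20 and the model with imputed social media features had a SMAPE of 16, `relative_decrease(20, 16)` would report a 20% improvement.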
Finally, on our entire dataset, we build two Final Models
to evaluate the overarching effectiveness of imputation: 1) S3
incorporates sensor features of all participants, and 2) SS3 additionally
incorporates Facebook features of all participants. In this model, for
those who have Facebook data, we use their Facebook features,
and for the rest, we use their imputed Facebook features.
We compare the prediction accuracy of SS3 and S3;
this gives us an estimate of how this sort of imputation
framework influences the overarching task of predicting psy-
chological constructs in multimodal studies (see Table III).
We find an average improvement in prediction by 8.2% in
LR, 16.4% in GBR, and 12.3% in MLP.
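The feature assembly for the Final Model SS3 described above can be sketched as follows: participants with Facebook data keep their actual features, and the rest receive imputed ones. The dictionary layout and function names are assumptions made for illustration only.

```python
def assemble_social_media_features(observed, imputed):
    """Use observed features where available and imputed ones otherwise.

    `observed` maps participant id -> feature list, or None when the
    participant has no social media data. `imputed` maps participant
    id -> feature list predicted from the other sensors.
    """
    combined = {}
    for pid, features in observed.items():
        combined[pid] = features if features is not None else imputed[pid]
    return combined

def build_ss_rows(sensor, social):
    """Concatenate sensor and social media features for each participant."""
    return {pid: sensor[pid] + social[pid] for pid in sensor}
```

The resulting rows would feed the same prediction algorithms (LR, GBR, MLP) as the sensor-only model, so that any difference in error can be attributed to the added social media features.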
C. Hypothesis Tests for Robustness
After evaluating our imputation models, we measure their
robustness. We compare the effectiveness of our imputed sensing
stream against two other imputation approaches applied to
those 128 participants without social media data.
Mean Imputation. This approach imputes social media features
as the mean value of the corresponding feature sets. We build
prediction models of psychological constructs as described
in the previous subsections. This method draws on prior
studies which adopted similar approaches of imputing missing
features using static measures of central tendencies, such as
mean or median of the feature sets [9].
Randomized Imputation. This approach imputes the social
media transformed features as random values from the
corresponding feature sets. We repeat this randomization
1,000 times, and in each case compare the prediction with Final
Model S3. This method emulates a permutation test [1], and
checks the robustness of the imputation effectiveness by testing
the null hypothesis that randomly imputed sensor streams
are better than those imputed by our statistical framework.
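The two baselines described above can be sketched as follows. This is a minimal illustration that assumes "random values from the corresponding feature sets" means drawing each feature independently from the values observed for that feature; the helper names are hypothetical.

```python
import random

def column(rows, j):
    """Values of feature j across all observed participants."""
    return [row[j] for row in rows]

def mean_impute(observed_rows, n_missing):
    """Give every participant with missing data the per-feature mean."""
    n_features = len(observed_rows[0])
    means = [
        sum(column(observed_rows, j)) / len(observed_rows)
        for j in range(n_features)
    ]
    return [list(means) for _ in range(n_missing)]

def random_impute(observed_rows, n_missing, rng):
    """Draw each feature independently from its observed values."""
    n_features = len(observed_rows[0])
    return [
        [rng.choice(column(observed_rows, j)) for j in range(n_features)]
        for _ in range(n_missing)
    ]
```

Passing a seeded `random.Random` instance as `rng` keeps each of the repeated randomizations reproducible.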
Fig. 6 shows the SMAPE of these models as compared
to that by S3. While our imputation shows an average im-
provement in SMAPE by 16% on the Final Model (S3) (see
Table III), the same improvement for Mean Imputation-based
model is -3.10% and Randomized Imputation-based model is
5.34%, clearly suggesting minimal (or no) improvement in
these two models. Permuting on the randomized imputations
a thousand times, we observe that in terms of prediction error,
our imputations are never outperformed by the randomized
imputations in those thousand permutations. Essentially, this
rejects the null hypothesis that our imputation is only more
effective than randomly generated imputations by chance.
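The permutation comparison can be summarized as an empirical p-value, sketched below. The function name is hypothetical, and counting ties as "at least as good" is an assumption of this sketch rather than a detail stated in the study.

```python
def permutation_p_value(framework_error, random_errors):
    """Share of random imputations whose error is at or below the framework's.

    `framework_error` is the prediction error (e.g., SMAPE) obtained with
    the framework's imputed features; `random_errors` holds one error per
    randomized imputation run.
    """
    at_least_as_good = sum(1 for e in random_errors if e <= framework_error)
    return at_least_as_good / len(random_errors)
```

When no randomized run matches or beats the framework's error, as reported for the thousand permutations here, this quantity is zero.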
In conclusion, our findings suggest that passively sensed
multimodal data streams can be used to not only impute
the latent social media dimensions, but also to augment
these latent features in building better prediction models
that infer psychological constructs. We consistently observe
similar trends in the improvement of prediction accuracies by
integrating the social media features (both actual and imputed)
with the sensor-transformed features.
Theoretical and Practical Implications. This paper proposes
an analytical framework of imputing a missing sensing stream
(here social media) in multimodal sensing studies. We evaluate
the effectiveness of this imputation by predicting psychologi-
cal constructs through a variety of state-of-the-art algorithms.
At a higher level, the imputation framework is grounded on
the Social Ecological Model that construes interdependence
among individuals, their behavior and their surroundings and
environment [39], [53]. This implies its applicability not
only in theory but also in practice (context and activity as
captured and observed through passive sensing modalities).
Our findings reveal the robustness of imputation by comparing
with permutation tests and random- and mean- imputation. We
believe such a framework can potentially be used in studies
where there is similar theoretical grounding (around a focus on
comprehensive social ecological signals), and an opportunity
to infer psychological attributes.
We find that integrating social media features improves the
prediction of psychological constructs. This aligns with prior
work on the potential of social media (both individually as
well as in tandem with other passive sensors) in predicting
these measures [7], [36], [43]. However, social media data may
not be available for all participants. Our proposed imputation
method addresses this gap by computing latent social media
dimensions, which can be used to improve such machine
learning-based prediction tasks of human behavior.
Following our framework, models built on existing datasets that
include multimodal sensing, but lack social media streams for
some participants, can now be retrained for better predictions.
While our study only focuses on predicting psychological
constructs, the same method can be extrapolated to predict
other measures of human behavior as well. Not being limited
to a single algorithm, our framework shows the consistency in
the findings across a variety of algorithm families. It is not con-
strained by the choice of machine learning algorithms, which
typically vary depending on the characteristics of the dataset
and the distribution of the individual difference variables.
We believe that if there are additional sensing streams over
those we consider, their features can be plugged into our
framework. However, it remains interesting to study whether
the additional sensors improve the imputation models. For
instance, sensing technologies that capture conversations [32]
among individuals in social settings would plausibly improve
predicting latent social media features, on the rationale that
it captures another set of dimensions in the social ecological
framework — offline social interactions.
Ethical Implications. We caution against our work being
misused as a methodology to surveil or infer individual be-
haviors. Our work intends to model latent dimensions that can
assist prediction tasks in multimodal sensing studies, by being
internal to the pipeline of the prediction system. However,
these latent dimensions do not necessarily translate to, nor are they
indicative of, actual individual behaviors on social media, and
therefore such inferences cannot be drawn from the imputed
social media features about the individuals.
This paper does not unpack why certain participants did not
share social media data. It could be because they do not use
social media, or because they do not intend to share this data
for privacy reasons. Whether social media features should be
imputed for the second class of individuals is open to
debate. This is because such an imputation approach,
when applied to make predictions of sensitive individual
difference variables and incorporated into larger systems (e.g.,
targeted advertising), can be perceived as a violation of the
very privacy considerations that spurred them to not share
social media data in the first place. We believe these topics
need further discussion among researchers, ethicists, and the
individuals who participate in such studies.
Limitations and Future Work. Our work has limitations,
some of which open up opportunities for future research. Our
findings are limited to imputing a specific type of social media
data – that gathered from Facebook. Imputing social media
streams from other platforms, especially ones where mixed media
sharing is extensive (e.g., Instagram or Tumblr) may present
unique challenges. Because we focus on a specific cohort
of participants who are information workers, whether our
approach would yield similar promising imputation results in
other populations needs to be explored. We, therefore, caution
against making sweeping generalizable claims.
Like any other imputation approach, our methodology is
prone to introducing biases within the dataset [49], [58].
Because the imputation model only learns the latent dimen-
sions from what it has seen, it is unlikely to learn unknown
and deviant behavioral patterns. While such occurrences are
less likely in large-scale multimodal sensing studies
like ours (where the participant pool is diverse), this factor
should be considered in smaller scale studies or when the study
population lacks representativeness and has a greater selection
bias. The present work leverages a specific, albeit broad, set
of commercially available passive sensors. Future
research could investigate how adding other sensing modalities
can improve the imputation of social media features.
This research is supported in part by the Office of the Di-
rector of National Intelligence (ODNI), Intelligence Advanced
Research Projects Activity (IARPA), via IARPA Contract
No. 2017-17042800007. The views and conclusions contained
herein are those of the authors and should not be interpreted as
necessarily representing the official policies, either expressed
or implied, of ODNI, IARPA, or the U.S. Government. The
U.S. Government is authorized to reproduce and distribute
reprints for governmental purposes notwithstanding any copy-
right annotation therein.
[1] Aris Anagnostopoulos, Ravi Kumar, and Mohammad Mahdian. Influ-
ence and correlation in social networks. In KDD, 2008.
[2] Garmin Health API.
[3] Manager REST API., 2018.
[4] Andrew T Campbell, Shane B Eisenman, Nicholas D Lane, Emiliano
Miluzzo, Ronald A Peterson, Hong Lu, Xiao Zheng, Mirco Musolesi,
Kristóf Fodor, and Gahng-Seop Ahn. The rise of people-centric sensing.
IEEE Internet Computing, 12(4), 2008.
[5] Ralph Catalano. Health, behavior and the community: An ecological
perspective. Pergamon Press New York, 1979.
[6] Diane J Catellier, Peter J Hannan, David M Murray, Cheryl L Addy,
Terry L Conway, Song Yang, and Janet C Rice. Imputation of missing
data when measuring physical activity by accelerometry. Medicine and
science in sports and exercise, 37(11 Suppl):S555, 2005.
[7] Munmun De Choudhury, Michael Gamon, Scott Counts, and Eric
Horvitz. Predicting depression via social media. In ICWSM, 2013.
[8] Mariella Dimiccoli, Juan Marín, and Edison Thomaz. Mitigating
bystander privacy concerns in egocentric activity recognition with deep
learning and intentional image degradation. IMWUT, 2018.
[9] A Rogier T Donders, Geert JMG Van Der Heijden, Theo Stijnen, and
Karel GM Moons. A gentle introduction to imputation of missing values.
Journal of clinical epidemiology, 59(10):1087–1091, 2006.
[10] Carsten F Dormann, Jane Elith, Sven Bacher, Carsten Buchmann,
Gudrun Carl, Gabriel Carré, Jaime R García Marquéz, Bernd Gruber, Bruno
Lafourcade, Pedro J Leitão, et al. Collinearity: a review of methods
to deal with it and a simulation study evaluating their performance.
Ecography, 36(1):27–46, 2013.
[11] Jane Elith, John R Leathwick, and Trevor Hastie. A working guide to
boosted regression trees. Journal of Animal Ecology, 2008.
[12] Scott A Golder and Michael W Macy. Diurnal and seasonal mood vary
with work, sleep, and daylength across diverse cultures. Science, 2011.
[13] Agnes Grünerbl, Amir Muaremi, Venet Osmani, Gernot Bahle, Stefan
Oehler, Gerhard Tröster, Oscar Mayora, Christian Haring, and Paul
Lukowicz. Smartphone-based recognition of states and state changes
in bipolar disorder patients. IEEE JBHI, 2015.
[14] Mark Andrew Hall. Correlation-based feature selection for machine
learning. 1999.
[15] Robin K Henson and J Kyle Roberts. Use of exploratory factor analysis
in published research: Common errors and some comment on improved
practice. Educ. Psychol. Meas., 2006.
[16] Rob J Hyndman and Anne B Koehler. Another look at measures of
forecast accuracy. International journal of forecasting, 2006.
[17] Natasha Jaques, Sara Taylor, Akane Sano, and Rosalind Picard. Mul-
timodal autoencoder: A deep learning approach to filling in missing
sensor data and enabling better mood prediction. In ACII, 2017.
[18] Ian Jolliffe. Principal component analysis. In International encyclopedia
of statistical science, pages 1094–1096. Springer, 2011.
[19] Michal Kosinski, David Stillwell, and Thore Graepel. Private traits and
attributes are predictable from digital records of human behavior. 2013.
[20] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning.
nature, 2015.
[21] James Alexander Lee, Christos Efstratiou, and Lu Bai. Osn mood
tracking: exploring the use of online social network activity as an
indicator of mood changes. In Ubicomp: Adjunct, 2016.
[22] William Little, Ron McGivern, and Nathan Kerins. Introduction to
Sociology-2nd Canadian Edition. BC Campus, 2016.
[23] Stephen M Mattingly, Julie M Gregg, Pino Audia, Ayse Elvan Bayrak-
taroglu, Andrew T Campbell, Nitesh V Chawla, Vedant Das Swain,
Munmun De Choudhury, et al. The tesserae project: Large-scale,
longitudinal, in situ, multimodal sensing of information workers. 2019.
[24] Shayan Mirjafari, Kizito Masaba, Ted Grover, Weichen Wang, Pino
Audia, et al. Differentiating higher and lower job performers in the
workplace using mobile sensing. Proc. IMWUT, 2019.
[25] Ananth Mohan, Zheng Chen, and Kilian Weinberger. Web-search
ranking with initialized gradient boosted regression trees. In Proceedings
of the learning to rank challenge, pages 77–89, 2011.
[26] David W Nordstokke and Bruno D Zumbo. A new nonparametric levene
test for equal variances. Psicologica, 2010.
[27] Liangying Peng, Ling Chen, Zhenan Ye, and Yi Zhang. Aroma: A
deep multi-task learning based simple and complex human activity
recognition method using wearable sensors. PACM IMWUT, 2018.
[28] Pew. media, 2018.
[29] Daniele Quercia, Michal Kosinski, David Stillwell, and Jon Crowcroft.
Our twitter profiles, our selves: Predicting personality with twitter.
[30] Quinten AW Raaijmakers. Effectiveness of different missing data
treatments in surveys with likert-type data: Introducing the relative mean
substitution approach. Educ. Psychol. Meas., 1999.
[31] Valentin Radu, Catherine Tong, Sourav Bhattacharya, Nicholas D Lane,
Cecilia Mascolo, Mahesh K Marina, and Fahim Kawsar. Multimodal
deep learning for activity and context recognition. IMWUT, 2018.
[32] Tauhidur Rahman, Alexander Travis Adams, Mi Zhang, Erin Cherry,
Bobby Zhou, Huaishu Peng, and Tanzeem Choudhury. Bodybeat: a
mobile system for sensing non-speech body sounds. In MobiSys, 2014.
[33] Dennis W Ruck, Steven K Rogers, Matthew Kabrisky, Mark E Oxley,
and Bruce W Suter. The multilayer perceptron as an approximation to
a bayes optimal discriminant function. IEEE Trans. Neural Netw.
[34] Hesam Sagha, José del R Millán, and Ricardo Chavarriaga. A
probabilistic approach to handle missing data for multi-sensory activity
recognition. In Workshop on Context Awareness and Information
Processing in Opportunistic Ubiquitous Systems at UbiComp, 2010.
[35] Koustuv Saha, Ayse Elvan Bayraktaroglu, Andrew Campbell, Nitesh V
Chawla, et al. Social media as a passive sensor in longitudinal studies
of human behavior and wellbeing. In CHI Ext. Abstracts, 2019.
[36] Koustuv Saha, Larry Chan, Kaya De Barbaro, Gregory D Abowd, and
Munmun De Choudhury. Inferring mood instability on social media by
leveraging ecological momentary assessments. IMWUT, 2017.
[37] Koustuv Saha and Munmun De Choudhury. Modeling stress with social
media around incidents of gun violence on college campuses. PACM
Human-Computer Interaction, (CSCW), 2017.
[38] Koustuv Saha, Benjamin Sugar, John Torous, Bruno Abrahao, Emre
Kıcıman, and Munmun De Choudhury. A social media study on the
effects of psychiatric medication use. In Proc. ICWSM, 2019.
[39] James F Sallis and Neville Owen. Physical activity and behavioral
medicine, volume 3. SAGE publications, 1998.
[40] Akane Sano and Rosalind W Picard. Stress recognition using wearable
sensors and mobile phones. In ACII, 2013.
[41] Joseph L Schafer. Analysis of incomplete multivariate data. Chapman
and Hall/CRC, 1997.
[42] Joseph L Schafer and Maren K Olsen. Multiple imputation for multi-
variate missing-data problems: A data analyst’s perspective. Multivariate
behavioral research, 33(4):545–571, 1998.
[43] H Andrew Schwartz, Johannes C Eichstaedt, Margaret L Kern, et al.
Personality, gender, and age in the language of social media: The open-
vocabulary approach. PloS one, 8(9):e73791, 2013.
[44] Christie Napa Scollon, Chu-Kim Prieto, and Ed Diener. Experience
sampling: promises and pitfalls, strength and weaknesses. In Assessing
well-being, pages 157–180. Springer, 2009.
[45] George AF Seber and Alan J Lee. Linear regression analysis, volume
329. John Wiley & Sons, 2012.
[46] Christopher J Soto and Oliver P John. The next big five inventory
(bfi-2): Developing and assessing a hierarchical model with 15 facets to
enhance bandwidth, fidelity, and predictive power. Journal of Personality
and Social Psychology, 113(1):117, 2017.
[47] Charles D Spielberger, Fernando Gonzalez-Reigosa, Angel Martinez-
Urrutia, Luiz FS Natalicio, and Diana S Natalicio. The state-trait anxiety
inventory. Revista Interamerican Journal of Psychology, 2017.
[48] Daniel J Stekhoven and Peter Bühlmann. Missforest-non-parametric
missing value imputation for mixed-type data. Bioinformatics, 2011.
[49] Thomas R Sullivan, Amy B Salter, Philip Ryan, and Katherine J Lee.
Bias and precision of the “multiple imputation, then deletion” method
for dealing with missing outcome data. Am. J. Epidemiol, 2015.
[50] Yla R Tausczik and James W Pennebaker. The psychological meaning
of words: Liwc and computerized text analysis methods. J. Lang. Soc.
Psychol., 2010.
[51] Roger Tourangeau, Lance J Rips, and Kenneth Rasinski. The psychology
of survey response. Cambridge University Press, 2000.
[52] Olga Troyanskaya, Michael Cantor, Gavin Sherlock, Pat Brown, Trevor
Hastie, Robert Tibshirani, David Botstein, and Russ B Altman. Missing
value estimation methods for dna microarrays. Bioinformatics, 2001.
[53] Jacqueline A Walcott-McQuigg, Julie Johnson Zerwic, Alice Dan, and
Michele A Kelley. An ecological approach to physical activity in african
american women. Medscape women’s health, 6(6):3–3, 2001.
[54] Michael E Wall, Andreas Rechtsteiner, and Luis M Rocha. Singular
value decomposition and principal component analysis. In A practical
approach to microarray data analysis, pages 91–109. Springer, 2003.
[55] Rui Wang, Fanglin Chen, Zhenyu Chen, Tianxing Li, Gabriella Harari,
Stefanie Tignor, Xia Zhou, Dror Ben-Zeev, and Andrew T Campbell.
Studentlife: assessing mental health, academic performance and behav-
ioral trends of college students using smartphones. In Ubicomp.
[56] Weichen Wang, Gabriella M Harari, Rui Wang, Sandrine R Müller,
Shayan Mirjafari, Kizito Masaba, and Andrew T Campbell. Sensing
behavioral change over time: Using within-person variability features
from mobile sensing to predict personality traits. IMWUT, 2018.
[57] David Watson and Lee Anna Clark. The panas-x: Manual for the positive
and negative affect schedule-expanded form. 1999.
[58] Ian R White and John B Carlin. Bias and efficiency of multiple
imputation compared with complete-case analysis for missing covariate
values. Statistics in medicine, 29(28):2920–2931, 2010.
[59] Svante Wold, Kim Esbensen, and Paul Geladi. Principal component
analysis. Chemom. Intell. Lab. Syst, 1987.
[60] Hui Zou and Trevor Hastie. Regularization and variable selection via
the elastic net. J. Royal Stat. Soc.: Series B, 2005.
... First, the past decade of computational social science research, which has repeatedly showcased how social media postings can provide rich insights about many real-world happenings, whether political, economic, social, or about health and well-being (Golder and Macy, 2011;Lazer et al., 2009Lazer et al., , 2020. Specifically, studies in psycholinguistics and crisis informatics have found promising evidence that the content shared on social media can help us to study mental health responses to crises, ranging from understanding how communities cope with protracted wars (Mark et al., 2012), community violence (De Choudhury et al., 2014;Saha andDe Choudhury, 2017), terrorism (Hoffman, 2018), homicides and mass shootings (Glasgow et al., 2014;Jones et al., 2017;Lin and Margolin, 2014). Second, with the growing adoption of social media among K-12 school communities including students, teachers, school administrators, and parents (Kimmons et al., 2018), social media constitutes a promising opportunity to study psychological states unobtrusively and passively. ...
... We measured the probability (p-value) that the synthetic LC is greater than actual LC, which helps to quantify the statistical significance of our observations against chance or random observations. This method emulates permutation test frameworks applied in the prior work (Das et al., 2020;Saha, 2019a), and tests for the null hypothesis that outcome change around a randomly generated drill date is comparable to outcome change around actual drill date. If this p-value is found to be zero or significantly low (e.g., p < 0.05), then we can deduce that the treatment LC is indeed attributed to the effects of the activeshooter drill. ...
Full-text available
The toll from gun violence in American K-12 schools has escalated over the past 20 years. School administrators face pressure to prepare for possible active shootings, and often do so through drills, which can range from general lockdowns to simulations, involving masked “shooters” and simulated gunfire, and many variations in between. However, the broad and lasting impact of these drills on the well-being of school communities is poorly understood. To that end, this article applies machine learning and interrupted time series analysis to 54 million social media posts, both pre- and post-drills in 114 schools spanning 33 states. Drill dates and locations were identified via a survey, then posts were captured by geo-location, school social media following, and/or school social media group membership. Results indicate that anxiety, stress, and depression increased by 39–42% following the drills, but this was accompanied by increases in civic engagement (10–106%). This research, paired with the lack of strong evidence that drills save lives, suggests that proactive school safety strategies may be both more effective, and less detrimental to mental health, than drills.
... In addition, missing data is often not missing completely at random, meaning that simply ignoring missing data as is done in several studies can lead to a biased sample [50]. One way researchers can mitigate this for behavioral data is by exploiting complementary streams of data to impute the missing data [16], [51]. On the other hand, a large set of time varying features opens an array of potential learning approaches, e.g., moment statistics, regularity, or auto-encoders. ...
Full-text available
As new technology inches into every aspect of our lives, there is no place more likely to dramatically change in the future than the workplace. New passive sensing technology is emerging capable of assessing human behavior with the goal of promoting better cognitive and physical capabilities at work. In this article, we survey recent research on the use of passive sensing in the workplace to assess wellbeing and productivity of the workforce. We also discuss open problems and future directions related to passive sensing in the future workplace.
... Our dataset was generated during COVID-19, and the stress events we captured were linked to stress contributors. Tesserae [13][14][15][16][17][18][19] , is a large multi-university project that studied various aspects of the workplace performance of information workers using wearables. Compared to the Tesserae, our dataset is focused on nurses instead. ...
Full-text available
Advances in wearable technologies provide the opportunity to continuously monitor many physiological variables. Stress detection has gained increased attention in recent years, especially because early stress detection can help individuals better manage health to minimize the negative impacts of long-term stress exposure. This paper provides a unique stress detection dataset that was created in a natural working environment in a hospital. This dataset is a collection of biometric data of nurses during the COVID-19 outbreak. Studying stress "in the wild" in a work environment is complex due to the influence of many social, cultural and individuals experience in dealing with stressful conditions. In order to address these concerns, we captured both the physiological data and associated context pertaining to the stress events. Specific physiological variables that were monitored included electrodermal activity, heart rate, skin temperature, and accelerometer data of the nurse subjects. A periodic smartphone-administered survey also captured the contributing factors for the detected stress events. A database containing the signals, stress events, and survey responses is available upon request.
... In addition, we also target rejecting the null hypothesis that any prediction improvement by our contextualization approach is by chance or any random cluster-label assignment. Drawing on permutation test approaches [3,104], we permute (randomize) the cluster label of all individuals, and repeat our entire pipeline predicting of psychological constructs. We run 1,000 such permutations, and we find that the probability ( -value) of improvement by a random-cluster assignment over contextualized approaches is almost zero across all the measures ( =0.002 for abstraction, =0.001 for positive affect are the only non-zero probabilities). ...
Full-text available
Personalized predictions have shown promises in various disciplines but they are fundamentally constrained in their ability to generalize across individuals. These models are often trained on limited datasets which do not represent the fluidity of human functioning. In contrast, generalized models capture normative behaviors between individuals but lack precision in predicting individual outcomes. This paper aims to balance the tradeoff between one-for-each and one-for-all models by clustering individuals on mutable behaviors and conducting cluster-specific predictions of psychological constructs in a multimodal sensing dataset of 754 individuals. Specifically, we situate our modeling on social media that has exhibited capability in inferring psychosocial attributes. We hypothesize that complementing social media data with offline sensor data can help to personalize and improve predictions. We cluster individuals on physical behaviors captured via Bluetooth, wearables, and smartphone sensors. We build contextualized models predicting psychological constructs trained on each cluster's social media data and compare their performance against generalized models trained on all individuals' data. The comparison reveals no difference in predicting affect and a decline in predicting cognitive ability, but an improvement in predicting personality, anxiety, and sleep quality. We construe that our approach improves predicting psychological constructs sharing theoretical associations with physical behavior. We also find how social media language associates with offline behavioral contextualization. Our work bears implications in understanding the nuanced strengths and weaknesses of personalized predictions, and how the effectiveness may vary by multiple factors. This work reveals the importance of taking a critical stance on evaluating the effectiveness before investing efforts in personalization.
... However, the same ad allocation strategy would not necessarily work effectively on all users, given that every individual is different, and that they have a different lifestyle, behavior, needs, and engagement both offline and online [6]. In fact, online behaviors are also functions of offline context and routines, as well as users' momentary psychological and cognitive states [37,64,91]. Therefore, it is important to embrace and evaluate dynamic and context-centric ad allocation strategies. ...
Conference Paper
Full-text available
Showing ads delivers revenue for online content distributors, but ad exposure can compromise user experience and cause user fatigue and frustration. Correctly balancing ads with other content is imperative. Currently, ad allocation relies primarily on demographics and inferred user interests, which are treated as static features and can be privacy-intrusive. This paper uses person-centric and momentary context features to understand optimal ad-timing. In a quasi-experimental study on a three-month longitudinal dataset of 100K Snapchat users, we find ad timing influences ad effectiveness. We draw insights on the relationship between ad effectiveness and momentary behaviors such as duration, interactivity, and interaction diversity. We simulate ad reallocation, finding that our study-driven insights lead to greater value for the platform. This work advances our understanding of ad consumption and bears implications for designing responsible ad allocation systems, improving both user and platform outcomes. We discuss privacy-preserving components and the ethical implications of our work.
... This paper contextualizes the potential of leveraging pervasive technologies for this new work paradigm to enable new forms of personnel management. Pervasive technologies include ubiquitous technologies such as wearables, bluetooth, and smartphone based sensors, as well as online technologies such as social media and crowd-contributed online platforms -these technologies have shown significant promises for passively understanding wellbeing both longitudinally and at scale [13,21,23,24,25,26,27,28,29,30]. In particular, we draw on some of our recent work to discuss how they can be reconsidered and adapted. ...
Conference Paper
Sensor-based sleep monitoring systems can be used to track sleep behavior on a daily basis and provide feedback to their users to promote health and well-being. Such systems can provide data visualizations to enable self-reflection on sleep habits or a sleep coaching service to improve sleep quality. To provide useful feedback, sleep monitoring systems must be able to recognize whether an individual is sleeping or awake. Existing approaches to infer sleep-wake phases, however, typically assume continuous streams of data to be available at inference time. In real-world settings, though, data streams or data samples may be missing, causing severe performance degradation of models trained on complete data streams. In this paper, we investigate the impact of missing data to recognize sleep and wake, and use regression-and interpolation-based imputation strategies to mitigate the errors that might be caused by incomplete data. To evaluate our approach, we use a data set that includes physiological traces-collected using wristbands-, behavioral data-gathered using smartphones-and self-reports from 16 participants over 30 days. Our results show that the presence of missing sensor data degrades the balanced accuracy of the classifier on average by 10-35 percentage points for detecting sleep and wake depending on the missing data rate. The impu-tation strategies explored in this work increase the performance of the classifier by 4-30 percentage points. These results open up new opportunities to improve the robustness of sleep monitoring systems against missing data.
The transition from high school to college is a taxing time for young adults. New students arriving on campus navigate a myriad of challenges centered around adapting to new living situations, financial needs, academic pressures and social demands. First-year students need to gain new skills and strategies to cope with these new demands in order to make good decisions, ease their transition to independent living and ultimately succeed. In general, first-generation students are less prepared when they enter college in comparison to non-first-generation students. This presents additional challenges for first-generation students to overcome and be successful during their college years. We study first-year students through the lens of mobile phone sensing across their first year at college, including all academic terms and breaks. We collect longitudinal mobile sensing data for N=180 first-year college students, where 27 of the students are first-generation, representing 15% of the study cohort and representative of the number of first-generation students admitted each year at the study institution, Dartmouth College. We discuss risk factors, behavioral patterns and mental health of first-generation and non-first-generation students. We propose a deep learning model that accurately predicts the mental health of first-generation students by taking into account important distinguishing behavioral factors of first-generation students. Our study, which uses the StudentLife app, offers data-informed insights that could be used to identify struggling students and provide new forms of phone-based interventions with the goal of keeping students on track.
The mental health of college students is a growing concern, and gauging the mental health needs of college students is difficult in real time and at scale. To address this gap, researchers and practitioners have encouraged the use of passive technologies. Social media is one such "passive sensor" that has shown potential for assessing mental health. However, the construct validity and in-practice reliability of computational assessments of mental health constructs with social media data remain largely unexplored. Towards this goal, we study how assessments of the mental health of college students using social media data correspond with ground-truth data of on-campus mental health consultations. For a large U.S. public university, we obtained ground-truth data of on-campus mental health consultations between 2011–2016, and collected 66,000 posts from the university’s Reddit community. We adopted machine learning and natural language methodologies to measure symptomatic mental health expressions of depression, anxiety, stress, suicidal ideation, and psychosis on the social media data. Seasonal auto-regressive integrated moving average (SARIMA) models of forecasting on-campus mental health consultations showed that incorporating social media data led to predictions with r = 0.86 and SMAPE = 13.30, outperforming models without social media data by 41%. Our language analyses revealed that social media discussions during months of high mental health consultations consisted of discussions on academics and career, whereas months of low mental health consultations saliently show expressions of positive affect, collective identity, and socialization. This study reveals that social media data can improve our understanding of college students’ mental health, particularly their mental health treatment needs.
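For readers unfamiliar with the SMAPE figure quoted above, the sketch below computes one common form of the symmetric mean absolute percentage error. Several SMAPE variants exist and the abstract does not say which one was used, so the definition here (mean of |F − A| / ((|A| + |F|) / 2), expressed as a percentage) is an assumption, as are the function name and the example numbers.

```python
from typing import Sequence


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric mean absolute percentage error, in percent (0 to 200).

    Assumed variant: each term is |F - A| divided by the mean of |A| and |F|.
    A pair where both values are zero contributes zero error.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2
        if denom == 0:
            continue  # both values are zero: perfect forecast for this point
        total += abs(f - a) / denom
    return 100 * total / len(actual)


# Hypothetical monthly consultation counts and forecasts.
print(round(smape([100, 200], [110, 180]), 2))
```

A lower value indicates a closer forecast; a perfect forecast scores 0.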
In this paper, we present an automatic approach to recognize cooking activities from acceleration and motion data. We rely on a dataset that contains three-axis acceleration and motion data collected with multiple devices, including two wristbands, two smartphones and a motion capture system. The data is collected from three participants while preparing sandwich, fruit salad and cereal recipes. The participants performed several fine-grained activities while preparing each recipe, such as cut and peel. We propose to use a multi-class classification approach to distinguish between cooking recipes and a multi-label classification approach to identify the fine-grained activities. Our approach achieves 81% accuracy in recognizing fine-grained activities and 66% accuracy in distinguishing between different recipes using leave-one-subject-out cross-validation. The multi-class and multi-label classification results are 27 and 50 percentage points higher than the baseline. We further investigate the effect on classification performance of different strategies to cope with missing data and show that imputing missing data with an iterative approach provides a 3 percentage point improvement in identifying fine-grained activities. We confirm findings from the literature that extracting features from multiple sensors achieves higher performance in comparison to using single-sensor features.
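Leave-one-subject-out cross-validation, mentioned above, holds out all samples from one participant per fold so that a model is always tested on a person it has not seen. The sketch below shows only the fold construction, assuming each sample carries a subject identifier; the function name and the identifiers `p1` to `p3` are hypothetical and not taken from the work above.

```python
from typing import Iterator, List, Tuple


def leave_one_subject_out(
    subject_ids: List[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        # Every sample of the held-out subject goes to the test split;
        # everything else is used for training.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical subject label for each of five samples.
subjects = ["p1", "p1", "p2", "p3", "p2"]
for held_out, train_idx, test_idx in leave_one_subject_out(subjects):
    print(held_out, train_idx, test_idx)
# p1 [2, 3, 4] [0, 1]
# p2 [0, 1, 3] [2, 4]
# p3 [0, 1, 2, 4] [3]
```

With three participants this yields three folds, and the reported accuracy would typically be the average over them.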
Conference Paper
Social media serves as a platform to share thoughts and connect with others. The ubiquitous use of social media also enables researchers to study human behavior as the data can be collected in an inexpensive and unobtrusive way. Not only does social media provide a passive means to collect historical data at scale, it also functions as a "verbal" sensor, providing rich signals about an individual's social ecological context. This case study introduces an infrastructural framework to illustrate the feasibility of passively collecting social media data at scale in the context of an ongoing multimodal sensing study of workplace performance (N=757).. sharing vs. those who do not. Our work provides practical experiences and implications for researchers in the HCI field who seek to conduct similar longitudinal studies that harness the potential of social media data.
Understanding the effects of psychiatric medications during mental health treatment constitutes an active area of inquiry. While clinical trials help evaluate the effects of these medications, many trials suffer from a lack of generalizability to broader populations. We leverage social media data to examine psychopathological effects subject to self-reported usage of psychiatric medication. Using a list of common approved and regulated psychiatric drugs and a Twitter dataset of 300M posts from 30K individuals, we develop machine learning models to first assess effects relating to mood, cognition, depression, anxiety, psychosis, and suicidal ideation. Then, based on a stratified propensity-score-based causal analysis, we observe that use of specific drugs is associated with characteristic changes in an individual’s psychopathology. We situate these observations in the psychiatry literature, with a deeper analysis of pre-treatment cues that predict treatment outcomes. Our work bears potential to inspire novel clinical investigations and to build tools for digital therapeutics.
Wearables and mobile devices see the world through the lens of half a dozen low-power sensors, such as, barometers, accelerometers, microphones and proximity detectors. But differences between sensors ranging from sampling rates, discrete and continuous data or even the data type itself make principled approaches to integrating these streams challenging. How, for example, is barometric pressure best combined with an audio sample to infer if a user is in a car, plane or bike? Critically for applications, how successfully sensor devices are able to maximize the information contained across these multi-modal sensor streams often dictates the fidelity at which they can track user behaviors and context changes. This paper studies the benefits of adopting deep learning algorithms for interpreting user activity and context as captured by multi-sensor systems. Specifically, we focus on four variations of deep neural networks that are based either on fully-connected Deep Neural Networks (DNNs) or Convolutional Neural Networks (CNNs). Two of these architectures follow conventional deep models by performing feature representation learning from a concatenation of sensor types. This classic approach is contrasted with a promising deep model variant characterized by modality-specific partitions of the architecture to maximize intra-modality learning. Our exploration represents the first time these architectures have been evaluated for multimodal deep learning under wearable data -- and for convolutional layers within this architecture, it represents a novel architecture entirely. Experiments show these generic multimodal neural network models compete well with a rich variety of conventional hand-designed shallow methods (including feature extraction and classifier construction) and task-specific modeling pipelines, across a wide-range of sensor types and inference tasks (four different datasets). 
Although the training and inference overhead of these multimodal deep approaches is in some cases appreciable, we also demonstrate that on-device mobile and wearable execution is feasible and not a barrier to adoption. This study is carefully constructed to focus on multimodal aspects of wearable data modeling for deep learning by providing a wide range of empirical observations, which we expect to have considerable value in the community. We summarize our observations into a series of practitioner rules-of-thumb and lessons learned that can guide the usage of multimodal deep learning for activity and context detection.
Recent advances in wearable camera technology and computer vision algorithms have greatly enhanced the automatic capture and recognition of human activities in real-world settings. While the appeal and utility of wearable camera devices for human-behavior understanding is indisputable, privacy concerns have limited the broader adoption of this method. To mitigate this problem, we propose a deep learning-based approach that recognizes everyday activities in egocentric photos that have been intentionally degraded in quality to preserve the privacy of bystanders. An evaluation on 2 annotated datasets collected in the field with a combined total of 84,078 egocentric photos showed activity recognition performance with accuracy between 79% and 88% across 17 and 21 activity classes when the images were subjected to blurring (mean filter k=20). To confirm that image degradation does indeed raise the perception of bystander privacy, we conducted a crowd sourced validation study with 640 participants; it showed a statistically significant positive relationship between the amount of image degradation and participants' willingness to be captured by wearable cameras. This work contributes to the field of privacy-sensitive activity recognition with egocentric photos by highlighting the trade-off between perceived bystander privacy protection and activity recognition performance.
Stress constitutes a persistent wellbeing challenge to college students, impacting their personal, social, and academic life. However, violent events on campuses may aggravate student stress, due to the induced fear and trauma. In this paper, leveraging social media as a passive sensor of stress, we propose novel computational techniques to quantify and examine stress responses after gun violence on college campuses. We first present a machine learning classifier for inferring stress expression in Reddit posts, which achieves an accuracy of 82%. Next, focusing on 12 incidents of campus gun violence in the past five years, and social media data gathered from college Reddit communities, our methods reveal amplified stress levels following the violent incidents, which deviate from usual stress patterns on the campuses. Further, distinctive temporal and linguistic changes characterize the campus populations, such as reduced cognition, higher self pre-occupation and death-related conversations. We discuss the implications of our work in improving mental wellbeing and rehabilitation efforts around crisis events in college student populations.
Active and passive sensing technologies are providing powerful mechanisms to track, model, and understand a range of health behaviors and well-being states. Despite yielding rich, dense and high fidelity data, current sensing technologies often require highly engineered study designs and persistent participant compliance, making them difficult to scale to large populations and to data acquisition tasks spanning extended time periods. This paper situates social media as a new passive, unobtrusive sensing technology. We propose a semi-supervised machine learning framework to combine small samples of data gathered through active sensing with large-scale social media data to infer mood instability (MI) in individuals. Starting from a theoretically-grounded measure of MI obtained from mobile ecological momentary assessments (EMAs), we show that our model is able to infer MI in a large population of Twitter users with 96% accuracy and F-1 score. Additionally, we show that our model predicts self-identifying Twitter users with bipolar and borderline personality disorder to exhibit twice the likelihood of high MI, compared to that in a suitable control. We discuss the implications and the potential for integrating complementary sensing capabilities to address complex research challenges in precision medicine.
Assessing performance in the workplace typically relies on subjective evaluations, such as, peer ratings, supervisor ratings and self assessments, which are manual, burdensome and potentially biased. We use objective mobile sensing data from phones, wearables and beacons to study workplace performance and offer new insights into behavioral patterns that distinguish higher and lower performers when considering roles in companies (i.e., supervisors and non-supervisors) and different types of companies (i.e., high tech and consultancy). We present initial results from an ongoing year-long study of N=554 information workers collected over a period ranging from 2-8.5 months. We train a gradient boosting classifier that can classify workers as higher or lower performers with AUROC of 0.83. Our work opens the way to new forms of passive objective assessment and feedback to workers to potentially provide week by week or quarter by quarter guidance in the workplace.
Personality traits describe individual differences in patterns of thinking, feeling, and behaving ("between-person" variability). But individuals also show changes in their own patterns over time ("within-person" variability). Existing approaches to measuring within-person variability typically rely on self-report methods that do not account for fine-grained behavior change patterns (e.g., hour-by-hour). In this paper, we use passive sensing data from mobile phones to examine the extent to which within-person variability in behavioral patterns can predict self-reported personality traits. Data were collected from 646 college students who participated in a self-tracking assignment for 14 days. To measure variability in behavior, we focused on 5 sensed behaviors (ambient audio amplitude, exposure to human voice, physical activity, phone usage, and location data) and computed 4 within-person variability features (simple standard deviation, circadian rhythm, regularity index, and flexible regularity index). We identified a number of significant correlations between the within-person variability features and the self-reported personality traits. Finally, we designed a model to predict the personality traits from the within-person variability features. Our results show that we can predict personality traits with good accuracy. The resulting predictions correlate with self-reported personality traits in the range of r = 0.32, MAE = 0.45 (for Openness in iOS users) to r = 0.69, MAE = 0.55 (for Extraversion in Android users). Our results suggest that within-person variability features from smartphone data have potential for passive personality assessment.
Human activity recognition (HAR) is a promising research topic in ubiquitous and wearable computing. However, traditional methods have two problems: 1) They treat HAR as a single-label classification task and ignore information from other related tasks, which is helpful for the original task. 2) They require hand-designed features, which are heuristic and not tightly related to the HAR task. To address these problems, we propose AROMA (human activity recognition using deep multi-task learning). Human activities can be divided into simple and complex activities, which are closely linked. Simple and complex activity recognition are two related tasks in AROMA. For the simple activity recognition task, AROMA utilizes a convolutional neural network (CNN) to extract deep features, which are task dependent and non-handcrafted. For the complex activity recognition task, AROMA applies a long short-term memory (LSTM) network to learn the temporal context of activity data. In addition, there is a shared structure between the two tasks, and the objective functions of these two tasks are optimized jointly. We evaluate AROMA on two public datasets, and the experimental results show that AROMA is able to yield competitive performance in both simple and complex activity recognition.