Heterogeneous Hyper-Graph Neural Networks for
Context-aware Human Activity Recognition
Wen Ge§, Guanyi Mou§, Emmanuel O. Agu, Kyumin Lee (§ equal contribution)
Computer Science Department, Worcester Polytechnic Institute, Worcester, MA, USA
{wge, gmou, emmanuel, kmlee}@wpi.edu
Abstract—Context-aware Human Activity Recognition (CHAR) is challenging due to the need to recognize the user's current activity from signals that vary significantly with contextual factors such as phone placement and the varied styles with which different users perform the same activity. In this paper, we argue that recognizing context-aware activity visit patterns in realistic in-the-wild data can equivalently be considered a general graph representation learning task. We posit that exploiting underlying graphical patterns in CHAR data can improve CHAR task performance and representation learning. Building on the intuition that certain activities are frequently performed with the phone placed in certain positions, we focus on the context-aware human activity problem of recognizing the <Activity, Phone Placement> tuple. We demonstrate that CHAR data has an underlying graph structure that can be viewed as a heterogeneous hypergraph with multiple types of nodes and hyperedges (edges connecting more than two nodes). Learning <Activity, Phone Placement> representations thus becomes a graph node representation learning problem. After this task transformation, we further propose a novel Heterogeneous HyperGraph Neural Network architecture for Context-aware Human Activity Recognition (HHGNN-CHAR) with three types of heterogeneous nodes (user, phone placement, and activity); connections between all types of nodes are represented by hyperedges. Rigorous evaluation on an unscripted, in-the-wild CHAR dataset demonstrated that our proposed framework significantly outperforms state-of-the-art (SOTA) baselines, including CHAR models that do not exploit graphs and GNN variants that do not incorporate heterogeneous nodes or hyperedges, with overall improvements of 14.04% on Matthews Correlation Coefficient (MCC) and 7.01% on Macro F1 scores.
Index Terms—Context-aware Human Activity Recognition,
heterogeneous graph, hypergraph, graph neural networks
I. INTRODUCTION
Motivation: Context-aware Human Activity Recognition (CHAR) [1] tries to recognize the user's activity while being aware of their current context. CHAR is an important task in Context-aware (CA) [2] Systems, targeting diverse real-life problems [3], [4]. To generate realistic, labeled CHAR datasets for supervised learning, apps installed on the smartphones (or smartwatches) of human volunteers (we use "volunteers", "participants", and "users" interchangeably) continuously gather sensor data (e.g., accelerometer) as the volunteers live their lives while performing various activities. Users are also prompted periodically to provide ground-truth context-aware activity labels.
While many definitions of user context exist in the literature [5], [6], in this paper, since certain activities are frequently performed with the phone placed in certain positions, we focus on the specific CHAR problem of recognizing the <Activity, Phone Placement> tuple. The overarching goal of this paper is
to create a pattern recognition model that can simultaneously
infer a user’s current activity and phone placement from sensor
data gathered from their smartphone. We focus on CHAR on
smartphones, which are now nearly ubiquitously owned.
Prior Work: Prior work has explored improving Human Activity Recognition (HAR) by exploiting underlying graphical structures in activity data. Martin et al. [7] modeled human mobility patterns by using GPS coordinates to build multiple personalized graphs, and then predicted target HAR labels using two Graph Convolution Networks (GCNs) [8]. Mohamed et al. [9] proposed HAR-GCNN, which exploited correlations between chronologically adjacent sensor measurements to predict missing activity labels. Selected activities were used to build a fully connected graph in which each node contained sensor values and the corresponding activity labels. A GCN was employed to learn node embeddings, followed by a sequence of CNNs to predict activity labels. One limitation of HAR-GCNN is that it is incapable of making predictions for nodes with completely missing labels, which are common in in-the-wild HAR datasets.
Novelty: Our work differs from prior work in two ways:
1) The CHAR task was not explored. Prior work [7], [9] recognized activities without factoring in phone placement or inter-user differences in activity performance style, which oversimplifies the problem. Such HAR models would underperform when deployed in the wild [10]–[12].
2) Graphs were built using data from specific sensors and labels. Some sensor data can be privacy sensitive (e.g., GPS), and less than 50% of users are willing to grant access permission [13]. In contrast, our approach derives the graph directly from CHAR data based solely on label co-occurrence information observed in the training set.
Our approach: We propose a generalizable GNN-based approach that improves CHAR performance by exploiting graphical patterns and internal relationships between different entities in CHAR data, without requiring external information. We propose 1) a CHAR graph that has three types of graph entities/nodes (activities, phone placements, and users), with edges defined as the aggregated mean feature values of the instances that share the corresponding labels; 2) a method of encoding CHAR data that transforms the associated recognition task into an equivalent heterogeneous hypergraph representation learning problem, inspired by the user-item interaction graphs commonly utilized in recommender systems [14]–[16]; and 3) a novel deep heterogeneous hypergraph Graph Neural Network (GNN)-based learning model to solve the CHAR task.
Fig. 1. Sample CHAR graph that has heterogeneous nodes and hyperedges.
The obtained CHAR graphical representation is heterogeneous [17] because the activity, phone placement, and user attributes that we define as nodes in our graph are of different types. As participants may perform multiple activities concurrently with a specific phone placement, connecting all activities performed by a given user to the valid phone placement results in hyperedges [18]. Such a one-to-many mapping is formulated as a multi-label problem in which phone placement and activity labels may co-occur. The example CHAR graph shown in Fig. 1 has two hyperedges, e1 and e2, where hyperedge e2 represents User1 putting the Phone On Table while Sitting and Talking at the same time. After graphical encoding and transformation, a CHAR Heterogeneous HyperGraph is obtained. In the original CHAR task, given an input signal representation, a CHAR model predicts the associated context-aware activity labels. In the transformed task, given input data as a new hyperedge and the heterogeneous hypergraph discovered from the CHAR training set, a graph representation learning model can predict the labels as the most likely connecting nodes. Transforming the CHAR task into a graph representation learning task facilitates explicit identification and exploitation of the graphical relationships among the three types of nodes (activity, phone placement, and user) and the corresponding sensor data.
Our contributions in this paper are as follows:
• A method for graphical encoding of CHAR data is proposed, defining three types of CHAR nodes (activity, phone placement, and user) and the corresponding signal representations as edges based on underlying relationships.
• Encoding the CHAR task as a graph representation learning problem is proposed, establishing that the transformed CHAR graph has heterogeneity and hypergraph properties.
• A novel Heterogeneous HyperGraph Neural Network for Context-aware Human Activity Recognition (HHGNN-CHAR) is proposed, which solves the above-mentioned heterogeneous hypergraph representation learning problem in a supervised fashion.
• We rigorously evaluate our proposed HHGNN-CHAR and demonstrate that it outperforms SOTA baselines by 14.04% on Matthews Correlation Coefficient and 7.01% on Macro F1 scores on a realistic, in-the-wild CHAR dataset.
• An extensive ablation study also revealed the non-trivial contributions made by each component of HHGNN-CHAR and the novel adaptations of incorporating graph heterogeneity and hypergraph properties.
To the best of our knowledge, our work is the first to reformulate the CHAR task as a generalized graph learning problem, and we propose a specific GNN framework that significantly outperforms non-GNN SOTA approaches.
II. PROPOSED FRAMEWORK
A. Notations and Specifications
For CHAR, Model Input and Graph Information are inputs to the Model, which generates predictions as Model Output that approximate the actual Model Target. At a high level, during training, model parameters are adjusted to reduce the gap between the Model Output and the Model Target. In subsequent sections, unless explicitly noted, the notation conventions are:
Model Input: The batched input data to models is denoted as $X$; each instance within the batch is $x$. $X_{train}$ and $X_{test}$ stand for the training and testing sets, respectively. The inputs are usually features extracted from smartphone sensor signals, which often undergo pre-processing steps before being consumed by the model.
Graph Information: Graph nodes and edges are denoted as $V$ and $E$, and the incidence matrix (indicating, for each hyperedge, which nodes it connects) as $G_H$ (Eq. 1); Fig. 1 shows an example incidence matrix. The hyperedge weight matrix (the frequency of each hyperedge) is denoted as $G_W$, the initial hyperedge representation as $G_{attr}$, the node degree matrix as $D_V = \sum_{e \in E} G_W(e)\, G_H(v, e)$, and the hyperedge degree matrix as $D_E = \sum_{v \in V} G_H(v, e)$. The input $V$ for each layer in the network is $V_{in}$ while the output is $V_{out}$.

$$G_H(v, e) = \begin{cases} 0, & \text{if } v \notin e, \\ 1, & \text{if } v \in e \end{cases} \qquad (1)$$
Model Target: Batched target labels are denoted as $Y$ and each label as lowercase $y$. Targets are the ground-truth CHAR activity and phone placement labels.
Model Output: $\hat{Y}$ and $\hat{y}$ are model-generated predictions.
The Model: The general model is denoted as $M$ and the model's parameters as $\theta_M$.
Other Specifications: A graph is denoted as $G$, users as $U$, phone placements as $PP$, and activities as $ACT$.
B. Graph Formation / Task Transformation
The Original Task: Given "input data" as smartphone sensor features $X$, collected as the users $U$ perform activities, a CHAR model $M$ ought to generate "predictions" for the $\langle ACT, PP \rangle$ pairs, represented by $\hat{Y}_{ACT}$ and $\hat{Y}_{PP}$, where $\hat{Y}_{ACT} \in \mathbb{R}^{|ACT|}$ and $\hat{Y}_{PP} \in \mathbb{R}^{|PP|}$:

$$\hat{Y} = \langle \hat{Y}_{ACT}, \hat{Y}_{PP} \rangle = \theta_M(X) \qquad (2)$$

Fig. 2. An Overview of our HHGNN-CHAR Framework.
Traditional CHAR approaches directly model $\theta_M$ using machine learning / deep learning techniques such as Multi-Layer Perceptrons (MLPs) or LightGBM, enabling the model to automatically learn the hidden correlations between $X$ and $Y$. However, these approaches lack explicit modeling of 1) the internal relationships within $Y$, especially between $Y_{PP}$ and $Y_{ACT}$; and 2) user-specific information that accounts for inter-user variability in visiting various contexts (i.e., the correlation between $U$ and $Y$), which ultimately impacts performance.
The Transformed CHAR Task: Given the lack of explicit modeling of the correlations among $X$, $Y$, and $U$, we transform the CHAR task into a graphical representation learning problem. Given $(X_{train}, Y_{train})$, we formulate an undirected heterogeneous hypergraph $G_{train}$ with the following relationships:

$$G_{train} = \langle V, E \rangle, \quad V = \{v_i \mid i \in (U, PP, ACT)\}, \quad E = \{e_i \mid e_i \in V \times V^{*}\} \qquad (3)$$
where $*$ denotes possible hyperconnections (connecting more than two nodes). A graph learning model $M_G$ consumes the graph $G$ as input and generates the learned node representations $V_{rep}$ as output. The instance encoding model $M_{enc}$ then transforms a new feature vector $x$ into the same feature dimensions as the node representations, denoted as $x_{enc}$. Intuitively, one may treat $x$ as a hyperedge with unknown connections to nodes. A third decision-making model $M_{dec}$ consumes both the graph node representations and the new hyperedge information, and predicts its node connections:

$$V_{rep} = M_G(G_{train}), \quad x_{enc} = M_{enc}(x), \quad \hat{Y} = \langle \hat{Y}_{ACT}, \hat{Y}_{PP} \rangle = M_{dec}(V_{rep}(PP, ACT),\, x_{enc}) \qquad (4)$$
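To make the construction of $G_{train}$ concrete, here is a hedged sketch of deriving hyperedges and their frequency weights purely from label co-occurrence in the training set; the grouping key and string-based node naming are our assumptions, not specified in the paper:

```python
from collections import Counter

def build_hyperedges(train_labels):
    """One hyperedge per distinct <user, phone placement, activity-set> co-occurrence
    pattern in the training set; its frequency becomes the hyperedge weight G_W."""
    counts = Counter((user, pp, frozenset(acts)) for user, pp, acts in train_labels)
    hyperedges = [sorted({f"U:{u}", f"PP:{p}", *(f"ACT:{a}" for a in acts)})
                  for (u, p, acts) in counts]
    return hyperedges, list(counts.values())

edges, G_W = build_hyperedges([
    ("User1", "On Table", {"Sitting", "Talking"}),   # e2 in Fig. 1
    ("User1", "On Table", {"Sitting", "Talking"}),
    ("User2", "In Pocket", {"Walking"}),
])
# edges[0] == ['ACT:Sitting', 'ACT:Talking', 'PP:On Table', 'U:User1'], G_W[0] == 2
```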
In the following sections, we will explain how each separate
component in the deep learning model is designed.
C. Graph Neural Network Design: HHGNN-CHAR
Our HHGNN-CHAR network (Fig. 2) is composed of three types of fundamental sub-layers: the Heterogeneity Layer, the Hypergraph Layer, and the Context-aware Human Activity Recognition (CHAR) Layer. The Heterogeneity Layer takes in node representations from the constructed/learned Context-aware Human Activity Heterogeneous HyperGraph and encodes heterogeneity into the representations by projecting different types of nodes into their own corresponding hidden spaces. The Hypergraph Layer consumes the outputs of the Heterogeneity Layer and adds hypergraph convolutions plus activations and dropouts to address the hyperedges' characteristics. A combination of the Heterogeneity Layer and the Hypergraph Layer can be viewed as a HHGNN Layer block, which is stacked in order to learn center node representations from more hops of the neighborhood. Lastly, the CHAR Layer receives the fine-tuned node representations from the blocks of HHGNN Layers and the task's instance input, comparing them and deriving the possible nodes corresponding to the input instance.
Heterogeneity Layer: is introduced to explicitly address the different types of nodes within the graph (i.e., Us, PPs, and ACTs). A given $V$, the input vector to the layer, is broken into node groups, which are handled by separate linear parameters and activations; the results are eventually concatenated as $V_{out}$:

$$V_{in} = [V_U, V_{PP}, V_{ACT}], \quad V_{out} = [\sigma l_U(V_U) \oplus \sigma l_{PP}(V_{PP}) \oplus \sigma l_{ACT}(V_{ACT})] \qquad (5)$$

where $\oplus$ represents concatenation, $l$ represents a linear layer, and $\sigma$ represents a non-linear activation function (in practice, LeakyReLU is adopted).
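A minimal PyTorch sketch of Eq. 5 follows, assuming nodes of each type occupy contiguous rows of $V_{in}$ and all types share the same hidden size (both illustrative assumptions):

```python
import torch
import torch.nn as nn

class HeterogeneityLayer(nn.Module):
    """Eq. 5: a separate linear layer + LeakyReLU per node type, results concatenated."""
    def __init__(self, type_sizes: dict, d_in: int, d_out: int):
        super().__init__()
        self.type_sizes = type_sizes   # e.g., {"U": 60, "PP": 4, "ACT": 13}
        self.proj = nn.ModuleDict({t: nn.Linear(d_in, d_out) for t in type_sizes})
        self.act = nn.LeakyReLU()

    def forward(self, V_in: torch.Tensor) -> torch.Tensor:
        chunks, start = [], 0
        for t, n in self.type_sizes.items():   # V_in = [V_U, V_PP, V_ACT] stacked row-wise
            chunks.append(self.act(self.proj[t](V_in[start:start + n])))
            start += n
        return torch.cat(chunks, dim=0)        # V_out

layer = HeterogeneityLayer({"U": 60, "PP": 4, "ACT": 13}, d_in=32, d_out=32)
V_out = layer(torch.randn(77, 32))             # 60 + 4 + 13 = 77 nodes
```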
Hypergraph Layer: is introduced to address the property that each hyperedge can connect more than two nodes. Each Hypergraph Layer is a sequence of a HyperConv [19] sub-layer, an activation sub-layer $\sigma$, and a dropout sub-layer $\beta$:

$$V_{hyperConv} = D_V^{-1} G_H G_W D_E^{-1} G_H^{T} V_{in} \Theta, \quad V_{out} = \beta(\sigma(V_{hyperConv})) \qquad (6)$$

where $\Theta$ is the learnable weight matrix of the convolutions.
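The HyperConv sub-layer of Eq. 6 can be sketched in dense PyTorch as follows; this is a readability-oriented sketch following the formulation of [19], and the paper does not specify a sparse or dense implementation:

```python
import torch
import torch.nn as nn

class HypergraphLayer(nn.Module):
    """Eq. 6: V_out = beta(sigma(D_V^-1 G_H G_W D_E^-1 G_H^T V_in Theta))."""
    def __init__(self, d_in: int, d_out: int, p_drop: float = 0.5):
        super().__init__()
        self.theta = nn.Linear(d_in, d_out, bias=False)    # learnable Theta
        self.act = nn.LeakyReLU()                          # sigma
        self.drop = nn.Dropout(p_drop)                     # beta

    def forward(self, V_in, G_H, G_W):
        # G_H: |V| x |E| incidence matrix; G_W: |E| hyperedge weights.
        D_V = (G_H * G_W).sum(dim=1).clamp(min=1e-12)      # node degrees
        D_E = G_H.sum(dim=0).clamp(min=1e-12)              # hyperedge degrees
        gather = G_H.t() @ self.theta(V_in)                # node -> hyperedge
        scatter = (G_H * G_W / D_E) @ gather               # hyperedge -> node, weighted
        return self.drop(self.act(scatter / D_V.unsqueeze(1)))

layer = HypergraphLayer(32, 32)
V_out = layer(torch.randn(77, 32), G_H=torch.rand(77, 5).round(), G_W=torch.ones(5))
```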
CHAR Layer: Given a graph's learned node representations $V_{rep} \in \mathbb{R}^{(|U|+|PP|+|ACT|) \times d_v}$ and newly arriving input data $X \in \mathbb{R}^{n \times d_x}$ as candidate hyperedges, the CHAR Layer tries to predict the connecting nodes for each hyperedge $x_i$, $i = 1, 2, \ldots, n$, within $X$. First, $X$ is projected into the same dimension as $V$ and the result is activated using non-linear transformations. Then matrix multiplication is performed between the projected input data and the learned node representation embeddings. The final outputs are the node predictions for each hyperedge:

$$X_{PP} = \sigma l_{PP}(X), \quad X_{ACT} = \sigma l_{ACT}(X), \quad X_{PP}, X_{ACT} \in \mathbb{R}^{n \times d_v} \qquad (7)$$

$$\hat{Y}_{PP} = X_{PP} \otimes V_{PP}^{T}, \quad \hat{Y}_{ACT} = X_{ACT} \otimes V_{ACT}^{T} \qquad (8)$$

where $\otimes$ represents the matrix multiplication operation.
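A sketch of Eqs. 7 and 8 as a prediction head; the dimensions $d_x$ and $d_v$ follow the text, and the example sizes (170 features, 4 PP and 13 ACT labels) mirror Sec. III:

```python
import torch
import torch.nn as nn

class CHARLayer(nn.Module):
    """Eqs. 7-8: project instances into the node-embedding space, then score
    each candidate hyperedge against the PP and ACT node representations."""
    def __init__(self, d_x: int, d_v: int):
        super().__init__()
        self.l_pp, self.l_act = nn.Linear(d_x, d_v), nn.Linear(d_x, d_v)
        self.act = nn.LeakyReLU()

    def forward(self, X, V_pp, V_act):
        X_pp = self.act(self.l_pp(X))                # n x d_v          (Eq. 7)
        X_act = self.act(self.l_act(X))              # n x d_v
        return X_pp @ V_pp.t(), X_act @ V_act.t()    # n x |PP|, n x |ACT|  (Eq. 8)

head = CHARLayer(d_x=170, d_v=32)
y_pp, y_act = head(torch.randn(8, 170), V_pp=torch.randn(4, 32), V_act=torch.randn(13, 32))
```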
Putting it all together: A plain GCN [8] can only capture non-heterogeneous, non-hypergraph structure, leading to an inevitable loss of information [20]. Although prior work has explicitly addressed hyperedges [19], the heterogeneity within CHAR data has not previously been considered in a GNN. Our proposed approach therefore combines the Heterogeneity Layer and the Hypergraph Layer to achieve superior performance, and the CHAR Layer is finally used to predict $\hat{Y}$. In a supervised fashion, the model simultaneously learns node representations and predicts node connections for given hyperedges.
III. EXPERIMENTS
A. Experiment Dataset
We evaluated HHGNN-CHAR on the publicly available
Extrasensory [21] dataset, which contains 6,355,350 instances
gathered from 60 participants. Each instance contains 170
features extracted from smartphone sensor data including
accelerometer, gyroscope, location, phone state, audio and
gravity sensors. 17 Extrasensory labels were considered (4
phone placement and 13 activity labels).
1) Dataset Pre-processing: The original dataset contained signals of varied durations, which we sampled using a window length of 3 seconds and a step size of 1.5 seconds, values we experimentally found to be optimal. Rules to fix labeling issues: As a phone can only be carried in one position at a time, phone placement labels are mutually exclusive, and only phone placement labels of 0, 1, or missing are valid. In a pre-processing step, intuitive rules were applied to resolve ambiguous and duplicate labels, and to remove conflicting activity and phone placement labels, such as those that cannot co-occur.
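A sketch of the sliding-window segmentation described above (3 s windows, 1.5 s step); the sampling rate and function name are illustrative assumptions:

```python
import numpy as np

def sliding_windows(signal: np.ndarray, fs: float,
                    win_s: float = 3.0, step_s: float = 1.5) -> np.ndarray:
    """Segment a (T, channels) signal into overlapping windows
    (3 s windows with a 1.5 s step, per Sec. III-A)."""
    win, step = int(win_s * fs), int(step_s * fs)
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

acc = np.random.randn(1200, 3)            # e.g., 30 s of 40 Hz tri-axial accelerometer
windows = sliding_windows(acc, fs=40.0)   # shape: (19, 120, 3)
```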
B. Feature Extraction
Handcrafted features commonly utilized in CHAR [22], [23] were extracted. Features were then normalized using $z = (x - \mu)/s$, where $\mu$ and $s$ are the mean and standard deviation of the features in the training set, before the same transformation was applied to the validation and testing sets; $x$ and $z$ are the original and transformed features, respectively.
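A minimal sketch of this normalization, fitting $\mu$ and $s$ on the training set only and reusing them for validation and test; the zero-variance guard is our addition:

```python
import numpy as np

def fit_scaler(X_train: np.ndarray):
    """Compute mu and s on the training set only; reuse for validation/test."""
    mu, s = X_train.mean(axis=0), X_train.std(axis=0)
    s = np.where(s == 0, 1.0, s)               # guard against constant features
    return lambda X: (X - mu) / s

scale = fit_scaler(np.random.randn(100, 170))  # fit on training features
X_val_z = scale(np.random.randn(20, 170))      # z = (x - mu) / s
```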
TABLE I
MAJOR RESULTS ON HHGNN-CHAR AND BASELINES.

                 Phone Placement      Activity           Overall
Models           MCC     MacF1        MCC     MacF1      MCC     MacF1
ExtraMLP         0.784   0.813        0.636   0.608      0.673   0.659
CRUFT            0.758   0.868        0.616   0.783      0.651   0.804
GCN              0.824   0.907        0.675   0.813      0.712   0.837
LightGBM         0.835   0.913        0.710   0.837      0.741   0.856
HHGNN-CHAR       0.953   0.976        0.808   0.896      0.845   0.916
TABLE II
ABLATION STUDY ON HHGNN-CHAR.

                 Phone Placement      Activity           Overall
Models           MCC     MacF1        MCC     MacF1      MCC     MacF1
Hetero GCN       0.916   0.957        0.754   0.863      0.795   0.887
Hyper GCN        0.705   0.836        0.694   0.826      0.697   0.828
1-layer-HHGNN    0.942   0.970        0.804   0.894      0.839   0.913
HHGNN-CHAR       0.953   0.976        0.808   0.896      0.845   0.916
C. Baseline CHAR Models
We evaluated baseline models including ExtraMLP [22], CRUFT [24], GCN [8], and LightGBM [25], as well as our HHGNN-CHAR and its variants:
• HHGNN-CHAR: our proposed approach using optimal hyperparameter values.
• Hetero GCN: HHGNN without hyperedges, to evaluate the contribution of hyperedges to HHGNN-CHAR performance.
• Hyper GCN: HHGNN without the heterogeneity layer, to evaluate the contribution of the heterogeneity layer.
• 1-layer-HHGNN: demonstrates that using two HHGNN layers in HHGNN-CHAR is optimal.
D. Evaluation Metrics
Considering that the CHAR dataset was extremely imbalanced, the two main evaluation metrics selected were the Matthews Correlation Coefficient (MCC) and the Macro F1 score.
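One way to compute these metrics per label and average them, using scikit-learn; the exact per-label/averaging scheme is our assumption:

```python
import numpy as np
from sklearn.metrics import matthews_corrcoef, f1_score

def char_metrics(Y_true: np.ndarray, Y_pred: np.ndarray):
    """Per-label MCC and Macro F1 for binary multi-label predictions (n x C),
    averaged across the C labels."""
    C = Y_true.shape[1]
    mcc = np.mean([matthews_corrcoef(Y_true[:, c], Y_pred[:, c]) for c in range(C)])
    f1 = np.mean([f1_score(Y_true[:, c], Y_pred[:, c], average="macro") for c in range(C)])
    return float(mcc), float(f1)

Y_true = np.random.randint(0, 2, (100, 17))   # 17 labels, as in Sec. III-A
Y_pred = np.random.randint(0, 2, (100, 17))
print(char_metrics(Y_true, Y_pred))
```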
E. Experimental Setting
The dataset was randomly split into 60% for training, 20% for validation, and 20% for hold-out testing. HHGNN-CHAR was trained on the training set, and optimal hyperparameters were selected using grid search based on the best-performing model on the validation set. Evaluation results were reported on the hold-out testing set. The weighted binary cross entropy loss (Eq. 9) was adopted as our target loss function, where the instance-wise pair weight $\omega_{n,c}$ is inversely proportional to the label frequency and set to 0 if the label is missing. The weights ensure that positive and negative instances contribute equally to the loss. For graph representations, $V$ and $G_{attr}$ were initialized to the aggregated mean values of features and fine-tuned during model training, while the other matrices were kept at their initialization values. All other parameters were randomly initialized with a fixed random seed to enhance reproducibility. The performance of the proposed HHGNN-CHAR was evaluated and reported on three levels: 1) results for each label, 2) average results for the activity and phone placement categories, and 3) the average over all activity and phone placement labels.

$$Loss = -\frac{1}{N} \sum_{n=1}^{N} \sum_{c=1}^{C} \left[ \omega_{n,c} \left( y_{n,c} \log \hat{y}_{n,c} + (1 - y_{n,c}) \log(1 - \hat{y}_{n,c}) \right) \right] \qquad (9)$$
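A PyTorch sketch of Eq. 9; the exact form of the inverse-frequency weight $\omega_{n,c}$ is our assumption, while the masking of missing labels and the balancing of positives and negatives follow the text:

```python
import torch

def weighted_bce(y_hat, y, mask):
    """Eq. 9: weighted binary cross entropy over N instances and C label pairs.
    omega is 0 where the label is missing (mask == 0) and inversely proportional
    to the positive/negative label frequency so both classes contribute equally."""
    pos = (y * mask).sum(0) / mask.sum(0).clamp(min=1.0)      # per-label positive rate
    omega = mask * torch.where(y > 0.5,
                               1.0 / pos.clamp(min=1e-6),
                               1.0 / (1.0 - pos).clamp(min=1e-6))
    p = torch.sigmoid(y_hat).clamp(1e-7, 1 - 1e-7)
    bce = y * torch.log(p) + (1 - y) * torch.log(1 - p)
    return -(omega * bce).sum() / y.shape[0]                  # -1/N * double sum

y = torch.randint(0, 2, (8, 17)).float()
loss = weighted_bce(torch.randn(8, 17), y, mask=torch.ones_like(y))
```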
F. Experiment Results
Results on Extrasensory are reported in Table I and Table II.
Overall Performance: In Table I, the result of the best-performing baseline model in each column is shown in italics. Where the result of HHGNN-CHAR is significantly better than the best baseline result (a gap larger than 0.01), it is marked in bold. HHGNN-CHAR outperformed the best baselines with a 14.04% improvement on MCC and a 7.01% improvement on MacF1 on the Extrasensory dataset.
Performance in each category: Table I also shows consistent improvements for HHGNN-CHAR in both the Phone Placement and Activity categories. Against the best-performing baseline, LightGBM, HHGNN-CHAR had a 14.13% MCC improvement and a 6.90% MacF1 improvement for predicting Phone Placement, and a 13.80% MCC improvement and a 7.05% MacF1 improvement for predicting Activity.
Ablation Study: Comparing HHGNN-CHAR to its variants in Table II shows that the heterogeneous design and the hyperedges each contributed non-trivially. When the hypergraph is converted into a traditional graph (i.e., Hetero GCN) by breaking hyperedges into multiple pair-wise edges, performance degraded relative to HHGNN-CHAR due to information loss [20]. The inferior performance of Hyper GCN compared to HHGNN-CHAR also confirms that node embeddings learned in a unified space do not adequately capture the heterogeneity characteristics of context-aware human activity data [26]. Admittedly, the gap between HHGNN-CHAR and Hyper GCN is the larger of the two, indicating that heterogeneity contributes more to performance than hyperedges; however, this may change on a larger, real-world dataset. Lastly, HHGNN-CHAR's performance decreased after removing a HHGNN layer. Intuitively, two HHGNN layers capture information within two hops of the graph, while a one-layer HHGNN only learns representations from one-hop neighbor nodes. The network likely benefited from the two-layer design due to the complexity of in-the-wild CHAR data.
IV. CONCLUSION
To improve on prior approaches to the challenging Context-aware Human Activity Recognition (CHAR) task, we propose a novel graph learning approach that transforms the original data-label connections into a Heterogeneous Hypergraph that explicitly encodes previously implicit relationships between context-aware activity labels. Consequently, the multi-label CHAR problem becomes a node identification problem given an unknown hyperedge. Transforming real-world datasets yields a graph with heterogeneous nodes and hyperedges. HHGNN-CHAR, a novel heterogeneous hypergraph deep learning model, is proposed for solving the newly transformed problem. To the best of our knowledge, ours is the first effort that formulates the CHAR task as a heterogeneous hypergraph problem without requiring external information. In rigorous experiments, our proposed approach outperformed SOTA baselines by 14.04% on Matthews Correlation Coefficient (MCC) and 7.01% on Macro F1 scores.
In future work, we will evaluate HHGNN-CHAR using subject-level splitting and explore other backbone GNN models.
ACKNOWLEDGMENT
DARPA grant HR00111780032-WASH-FP-031 and NSF
grant CNS-1755536 supported this research.
REFERENCES
[1] T. Rault, A. Bouabdallah, Y. Challal, and F. Marin, "A survey of energy-efficient context recognition systems using wearable sensors for healthcare applications," Perv. & Mob. Comp., 2017.
[2] W.-P. Lee, "Deploying personalized mobile services in an agent-based environment," Exp. Sys. & Appl., 2007.
[3] K.-H. Chen, Y.-W. Hsu, J.-J. Yang, and F.-S. Jaw, "Evaluating the specifications of built-in accelerometers in smartphones on fall detection performance," Inst. Sci. & Tech., 2018.
[4] J. Lindqvist and J. Hong, "Undistracted driving: A mobile phone that doesn't distract," in Proc. HotMobile, 2011.
[5] G. D. Abowd, A. K. Dey, P. J. Brown, N. Davies, M. Smith, and P. Steggles, "Towards a better understanding of context and context-awareness," in Int'l Symp. Handheld and Ubiq. Comp., 1999.
[6] A. K. Dey, G. D. Abowd, and D. Salber, "A conceptual framework and a toolkit for supporting the rapid prototyping of context-aware applications," HCI, 2001.
[7] H. Martin, D. Bucher, E. Suel, P. Zhao, F. Perez-Cruz, and M. Raubal, "Graph convolutional neural networks for human activity purpose imputation," in Spatiotemporal Workshop co-located with NIPS, 2018.
[8] T. N. Kipf and M. Welling, "Semi-supervised classification with graph convolutional networks," arXiv preprint arXiv:1609.02907, 2016.
[9] A. Mohamed, F. Lejarza, S. Cahail, C. Claudel, and E. Thomaz, "HAR-GCNN: Deep graph CNNs for human activity recog. from highly unlabeled mobile sensor data," in Proc. PerCom, 2022.
[10] S. Suh, V. F. Rey, and P. Lukowicz, "Adversarial deep feature extraction network for user independent HAR," in Proc. PerCom, 2022.
[11] F. Attal, S. Mohammed, M. Dedabrishvili, F. Chamroukhi, L. Oukhellou, and Y. Amirat, "Physical HAR using wearable sensors," Sensors, 2015.
[12] M. Berchtold, M. Budde, H. R. Schmidtke, and M. Beigl, "An extensible modular recognition concept that makes AR practical," in AAAI, 2010.
[13] A. Dogrucu, A. Perucic, A. Isaro, D. Ball, E. Toto, E. A. Rundensteiner, E. Agu, R. Davis-Martin, and E. Boudreaux, "Moodable: On feasibility of instantaneous depression assessment using machine learning on voice samples with retrospectively harvested smartphone and social media data," Smart Health, 2020.
[14] X. He, K. Deng, X. Wang, Y. Li, Y. Zhang, and M. Wang, "LightGCN: Simplifying and powering graph convolution network for recommendation," in Proc. ACM SIGIR, 2020.
[15] X. Wang, X. He, M. Wang, F. Feng, and T.-S. Chua, "Neural graph collaborative filtering," in Proc. ACM SIGIR, 2019.
[16] Y. Zhu, Z. Guan, S. Tan, H. Liu, D. Cai, and X. He, "Heterogeneous hypergraph embedding for doc. recomm.," Neurocomputing, 2016.
[17] C. Zhang, D. Song, C. Huang, A. Swami, and N. V. Chawla, "Heterogeneous graph neural network," in Proc. ACM SIGKDD, 2019.
[18] Y. Feng, H. You, Z. Zhang, R. Ji, and Y. Gao, "Hypergraph neural networks," in Proc. AAAI, 2019.
[19] S. Bai, F. Zhang, and P. H. Torr, "Hypergraph convolution and hypergraph attention," Pattern Recognition, 2021.
[20] D. Yang, B. Qu, J. Yang, and P. Cudré-Mauroux, "Revisiting user mobility and social relationships in LBSNs: A hypergraph embedding approach," in The WWW Conf., 2019.
[21] Y. Vaizman, K. Ellis, and G. Lanckriet, "Recognizing detailed human context in the wild from smartphones and smartwatches," IEEE Pervasive Computing, 2017.
[22] Y. Vaizman, N. Weibel, and G. Lanckriet, "Context recognition in-the-wild: Unified model for multi-modal sensors and multi-label classification," ACM IMWUT, 2018.
[23] W. Ge and E. O. Agu, "QCRUFT: Quaternion context recog. under uncertainty using fusion & temporal learning," in Proc. ICSC, 2022.
[24] W. Ge and E. Agu, "CRUFT: Context recog. under uncertainty using fusion and temporal learning," in Proc. ICMLA, 2020.
[25] G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T.-Y. Liu, "LightGBM: A highly efficient gradient boosting decision tree," in Proc. NIPS, 2017.
[26] D. Yang, B. Qu, J. Yang, and P. Cudré-Mauroux, "LBSN2Vec++: Heterogeneous hypergraph embedding for loc.-based soc. nets.," TKDE, 2020.