
# Semantic enhanced Markov model for sequential E-commerce product recommendation


## Abstract

To model sequential relationships between items, Markov models build a transition probability matrix $\mathbf{P}$ of size $n \times n$, where $n$ is the number of states (items) and each entry $p_{(i,j)}$ is the probability of a transition from state $i$ to state $j$. Existing systems such as factorized personalized Markov chains (FPMC) and Fossil either combine sequential information with user preference information or add the concept of high-order Markov chains. However, they suffer from (i) model complexity: an increase in the Markov model's order (number of states) and the separation of sequential pattern and user preference matrices; (ii) a sparse transition probability matrix: few product purchases out of thousands of available products; (iii) ambiguous prediction: multiple states (items) having the same transition probability from the current state; and (iv) lack of semantic knowledge: the transition to the next state (item) depends on probabilities of items' purchase frequency.
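As a rough illustration of the first-order case described above, the following sketch builds a row-normalized transition matrix from a few purchase sequences and then measures its sparsity and the tie among next-item candidates. The item names, the toy sequences and the helper `transition_matrix` are assumptions made for illustration; they are not taken from the paper.

```python
def transition_matrix(sequences, items):
    """Return a row-normalized first-order transition matrix as nested lists."""
    index = {item: k for k, item in enumerate(items)}
    n = len(items)
    counts = [[0] * n for _ in range(n)]
    # Count each observed transition from an item to the item bought next.
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[index[current]][index[nxt]] += 1
    # Turn counts into probabilities; rows with no transitions stay all zero.
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical toy data: four products and three short purchase sequences.
items = ["phone", "case", "charger", "earbuds"]
sequences = [
    ["phone", "case"],
    ["phone", "charger"],
    ["case", "earbuds"],
]

P = transition_matrix(sequences, items)

# Problem (ii): fraction of zero entries in P.
sparsity = sum(p == 0.0 for row in P for p in row) / (len(items) ** 2)

# Problem (iii): several items share the highest probability after "phone".
phone_row = P[items.index("phone")]
best = max(phone_row)
tied = [items[j] for j, p in enumerate(phone_row) if p == best]

print(P[0], sparsity, tied)
```

Even with four items, most entries of the matrix are zero and the row for `phone` cannot separate `case` from `charger`, which is the situation that problems (ii) and (iii) describe.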
To alleviate the sparsity and ambiguous prediction problems, this paper proposes the semantic-enabled Markov model recommendation (SEMMRec) system, which takes customers' purchase history and products' metadata (e.g., title, description and brand) as input and extracts products' sequential and semantic knowledge according to their (i) usage (e.g., products co-purchased or co-reviewed) and (ii) textual features, by finding similarity between products based on their characteristics using distributional hypothesis methods (Doc2vec and TF-IDF), which consider the context of items' usage. Next, this extracted knowledge is integrated into the transition probability matrix $\mathbf{P}$ to generate personalized, sequential and semantically rich next-item recommendations. Experimental results on various E-commerce data sets show improved performance by the proposed model.
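The abstract does not state how the semantic knowledge is integrated into $\mathbf{P}$. The sketch below is one possible reading, not the authors' method: it computes TF-IDF cosine similarity between product titles and linearly interpolates it with the transition probabilities. The weight `alpha`, the helper names, the product titles and the toy matrix are all assumptions, and Doc2vec is left out to keep the example dependency-free.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Plain TF-IDF: term frequency times log(N / document frequency)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def blend(P, S, alpha=0.7):
    """Assumed integration: alpha * P + (1 - alpha) * row-normalized S."""
    blended = []
    for p_row, s_row in zip(P, S):
        s_total = sum(s_row)
        s_norm = [s / s_total if s_total else 0.0 for s in s_row]
        mixed = [alpha * p + (1 - alpha) * s for p, s in zip(p_row, s_norm)]
        total = sum(mixed)
        # Renormalize so that every non-empty row sums to 1 again.
        blended.append([m / total if total else 0.0 for m in mixed])
    return blended


# Hypothetical products and titles.
items = ["phone", "case", "charger"]
titles = [
    "smart phone with usb fast charging",
    "leather case cover",
    "usb fast charging wall charger",
]
# Toy transition matrix with a tie after "phone" and two empty rows.
P = [
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

vecs = tfidf_vectors(titles)
n = len(items)
# Pairwise title similarity, with self-similarity set to zero.
S = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]

P_sem = blend(P, S)
recommended = items[max(range(n), key=lambda j: P_sem[0][j])]
print(recommended)
```

In this toy setup the shared title terms break the tie after `phone` in favour of `charger`, and the previously empty row for `charger` receives probability mass from its similarity to `phone`. A different weighting or integration rule would change both effects.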
International Journal of Data Science and Analytics
https://doi.org/10.1007/s41060-022-00343-y
REGULAR PAPER
Mahreen Nasir¹ · C. I. Ezeife¹
Received: 29 June 2021 / Accepted: 24 June 2022
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2022
Keywords: Recommendation systems · Sequential recommendation · Semantics · Markov model · Collaborative filtering · E-commerce
This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada under an operating Grant (OGP-0194134) and a University of Windsor grant received by Dr. C. I. Ezeife.

✉ Mahreen Nasir: nasir11d@uwindsor.ca
C. I. Ezeife: cezeife@uwindsor.ca

¹ School of Computer Science, University of Windsor,

## 1 Introduction

Domain-driven data mining discovers actionable knowledge and insights from complex data and behaviors. Various frameworks, algorithms, models and evaluation systems for actionable knowledge discovery have been studied in the past [68]. Traditional data-driven pattern mining and knowledge discovery lack actionable outputs. However, in this modern era of big data, it is imperative to discover knowledge and insights from complex data so that business decision makers can take appropriate actions. For instance, big E-commerce platforms like Amazon¹ and AliBaba² strive continuously to discover actionable knowledge (decision-making actions) from their customers' historical trends to better serve their customers' future needs and retain their market share. The past years have seen a significant paradigm shift from traditional data-driven pattern mining to domain-driven actionable knowledge discovery [14,21,22]. During the last decade, several new research problems and challenges emerged where incorporating the domain knowledge into data mining processes and models (e.g., text mining, deep neural networks, graph embedding

¹ https://www.amazon.com/
² https://www.alibaba.com/