Conference Paper

Design and Evaluation of an HPC-based Expert System to speed-up Retail Data Analysis using Residual Networks Combined with Parallel Association Rule Mining and Scalable Recommenders


Abstract

Given the Covid-19 pandemic, the retail industry is shifting many business models to enable more online purchases, which produce large quantities of transaction data (i.e., big data). Data science methods infer from this data seasonal product trends, spikes in purchases, the effectiveness of advertising campaigns, and brand loyalty, but they require extensive processing power, leveraging High-Performance Computing to deal with large transaction datasets. This paper proposes a High-Performance Computing-based expert system architectural design tailored for 'big data analysis' in the retail industry, providing data science methods and tools that speed up data analysis, with conceptual interoperability with commercial cloud-based services. Our expert system leverages an innovative Modular Supercomputer Architecture to enable fast analysis using parallel and distributed algorithms such as association rule mining (i.e., FP-Growth) and recommender methods (i.e., collaborative filtering). It enables the seamless use of accelerators in supercomputers or cloud-based systems to perform automated product tagging (i.e., residual deep learning networks for product image analysis) that automatically obtains colour, shape, and other product features. We validate our expert system and its enhanced knowledge representation with commercial datasets obtained from our ON4OFF research project in a retail case study in the beauty sector.

Conference Paper
Identification of packaged products in retail environments still relies on barcodes, which require active user input and are limited to one product at a time. Computer vision (CV) has already enabled many applications, but has so far been under-discussed in the retail domain, although it allows for faster, hands-free, more natural human-object interaction (e.g. via mixed reality headsets). To assess the potential of current convolutional neural network (CNN) architectures to reliably identify packaged products within a retail environment, we created and open-sourced a dataset of 300 images of vending machines with 15k labeled instances of 90 products. We assessed observed accuracies from transfer learning for image-based product classification (IC) and multi-product object detection (OD) on multiple CNN architectures, and the number of image instances required per product to achieve meaningful predictions. Results show that as few as six images are enough for 90% IC accuracy, but around 30 images are needed for 95% IC accuracy. For simultaneous OD, 42 instances per product are necessary, and far more than 100 instances are needed to produce robust results. Thus, this study demonstrates that even in realistic, fast-paced retail environments, image-based product identification provides an alternative to barcodes, especially for use cases that do not require perfect 100% accuracy.
Article
Pattern mining is a fundamental data mining technique for discovering interesting correlations in a data set. There are several variations of pattern mining, such as frequent itemset mining, sequence mining, and high utility itemset mining. High utility itemset mining is an emerging data science task that aims to extract knowledge based on a domain objective. The utility of a pattern reflects its effectiveness or benefit, which can be calculated based on user priority and domain-specific understanding. The sequential pattern mining (SPM) problem has been widely examined and extended in various directions. Sequential pattern mining enumerates sequential patterns in a collection of sequence data. In recent years, researchers have paid more attention to frequent pattern mining over uncertain transaction datasets, and mining itemsets in big data with the Apache Hadoop and Spark frameworks has received extensive attention. This paper seeks to give a broad overview of the distinct approaches to pattern mining in the Big Data domain. Initially, we investigate the problems involved with pattern mining approaches and associated techniques such as Apache Hadoop, Apache Spark, and parallel and distributed processing. Then we examine major developments in parallel, distributed, and scalable pattern mining, analyze them from the big data perspective, and identify difficulties in designing the algorithms. In particular, we study four varieties of itemset mining, i.e., parallel frequent itemset mining, high utility itemset mining, sequential pattern mining, and frequent itemset mining in uncertain data. This paper concludes with a discussion of open issues and opportunities. It also provides direction for further enhancement of existing approaches.
Article
Market basket analysis is the process of extracting purchasing trends from records in company databases, taking into account the products that customers buy in a single transaction. In this study, a market basket analysis was conducted on five and a half years of data from a large hardware company operating in the retail sector, and related product categories were identified. In determining the association rules, both the Apriori and FP-Growth algorithms were run separately and their usefulness on such a data set was compared. In addition, the data set was divided into Data Set-1 and Data Set-2, and the consistency of the rules was assessed by comparing the correctness of rules extracted from the first data set with rules derived from the second data set, which contains the chronologically subsequent data.
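The association rules that both Apriori and FP-Growth produce are usually ranked by support, confidence, and lift. The following minimal Python sketch shows how these three measures can be computed for a single rule. The function names and the example baskets are hypothetical and invented for illustration; they are not taken from the study.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    cons = support(transactions, consequent)
    # Confidence: how often the consequent appears when the antecedent does.
    confidence = both / ante if ante else 0.0
    # Lift: confidence relative to the consequent's baseline frequency.
    lift = confidence / cons if cons else 0.0
    return both, confidence, lift


# Hypothetical hardware-store baskets, invented for illustration only.
baskets = [
    {"hammer", "nails"},
    {"hammer", "nails", "glue"},
    {"hammer", "tape"},
    {"nails", "glue"},
]
sup, conf, lift = rule_metrics(baskets, {"hammer"}, {"nails"})
```

A lift below 1, as in this toy data, indicates that the two items co-occur less often than independence would predict, which is why lift is commonly reported alongside confidence.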
Article
Collaborative filtering algorithms are important building blocks in many practical recommendation systems. As a result, many large-scale data processing environments include collaborative filtering models for which the Alternating Least Squares (ALS) algorithm is used to compute latent factor matrix decompositions. In this paper, we significantly accelerate the convergence of parallel ALS-based optimization methods for collaborative filtering using a nonlinear conjugate gradient (NCG) wrapper around the ALS iterations. We also provide a parallel implementation of the accelerated ALS-NCG algorithm in the distributed Apache Spark data processing environment and an efficient line search technique requiring only 1 map-reduce operation on distributed datasets. In both serial numerical experiments on a Linux workstation and parallel numerical experiments on a 16-node cluster with 256 computing cores, we demonstrate using the MovieLens 20M dataset that the combined ALS-NCG method requires many fewer iterations and less time to reach high ranking accuracies than standalone ALS. In parallel, ALS-NCG achieves an acceleration factor of 4 or greater in clock time when an accurate solution is desired, with the acceleration factor increasing with greater desired accuracy. Furthermore, the NCG acceleration mechanism is fully parallelizable and scales linearly with problem size on datasets with up to nearly 1 billion ratings. Our NCG acceleration approach is versatile and may be used to accelerate other parallel optimization methods such as stochastic gradient descent and coordinate descent.
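To make the alternating structure of ALS concrete, the sketch below implements plain ALS for the simplest case of rank-1 factors in pure Python. It is an illustration only: it does not include the NCG wrapper or the Spark implementation described in the abstract, and the function names, the regularisation value, and the toy ratings are all assumptions.

```python
def als_rank1(ratings, n_users, n_items, reg=0.1, iters=20):
    """Rank-1 ALS on observed (user, item, rating) triples.

    Alternates two closed-form regularised least-squares updates:
    fix the item factors and solve for each user factor, then swap.
    """
    u = [1.0] * n_users
    v = [1.0] * n_items
    by_user, by_item = {}, {}
    for i, j, r in ratings:
        by_user.setdefault(i, []).append((j, r))
        by_item.setdefault(j, []).append((i, r))
    for _ in range(iters):
        # User step: each u[i] depends only on that user's observed ratings.
        for i, obs in by_user.items():
            num = sum(r * v[j] for j, r in obs)
            den = reg + sum(v[j] ** 2 for j, _ in obs)
            u[i] = num / den
        # Item step: the symmetric update with the user factors held fixed.
        for j, obs in by_item.items():
            num = sum(r * u[i] for i, r in obs)
            den = reg + sum(u[i] ** 2 for i, _ in obs)
            v[j] = num / den
    return u, v


def rmse(ratings, u, v):
    """Root-mean-square error of the rank-1 model on the given triples."""
    err = sum((r - u[i] * v[j]) ** 2 for i, j, r in ratings)
    return (err / len(ratings)) ** 0.5


# Toy ratings (hypothetical), with one user-item pair left unobserved.
observed = [(0, 0, 1.0), (0, 1, 2.0), (0, 2, 3.0), (1, 0, 2.0), (1, 1, 4.0)]
u, v = als_rank1(observed, n_users=2, n_items=3, reg=0.01, iters=50)
train_rmse = rmse(observed, u, v)
```

Because each user update (and each item update) is independent of the others within a step, the two inner loops are the parts that distributed implementations typically parallelise.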
Conference Paper
Recommendation Systems (RSs) are becoming tools of choice for selecting the online information relevant to a given user. Collaborative Filtering (CF) is the most popular approach to building Recommendation Systems and has been successfully employed in many applications. Collaborative Filtering algorithms are a much-explored technique in the fields of Data Mining and Information Retrieval. In CF, past user behavior is analyzed in order to establish connections between users and items, so that an item can be recommended to a user based on the opinions of other users. The underlying assumption is that customers who had similar likings in the past will have similar likings in the future. In the past decades, due to the rapid growth of Internet usage, a vast amount of data has been generated, and this has become a challenge for CF algorithms. CF therefore faces issues with the sparsity of the rating matrix and the growing nature of the data. These challenges are well handled by Matrix Factorization (MF). In this paper we discuss different Matrix Factorization models such as Singular Value Decomposition (SVD), Principal Component Analysis (PCA) and Probabilistic Matrix Factorization (PMF). This paper attempts to present a comprehensive survey of MF models such as SVD to address the challenges of CF algorithms, which can serve as a roadmap for research and practice in this area.
Article
In Data Mining, Association Rule Mining is a standard and well-researched technique for finding interesting relations between variables in large databases. Association rule mining is used as a precursor to other Data Mining techniques such as classification, clustering and prediction. The aim of the paper is to gauge the performance of the Apriori algorithm and the Frequent Pattern (FP) growth algorithm by comparing their capabilities. The evaluation study shows that the FP-growth algorithm is more efficient and scalable than the Apriori algorithm.
Article
Recommendation Systems apply Information Retrieval techniques to select the online information relevant to a given user. Collaborative Filtering is currently the most widely used approach to building Recommendation Systems. CF techniques use user behavior in the form of user-item ratings as their information source for prediction. CF algorithms face major challenges such as the sparsity of the rating matrix and the growing nature of the data. These challenges are well handled by Matrix Factorization. In this paper we attempt to present an overview of the role of different MF models in addressing the challenges of CF algorithms, which can serve as a roadmap for research in this area.
Article
In the increasingly competitive fashion retail industry, companies are constantly adopting strategies focused on adjusting product characteristics to closely satisfy customers’ requirements and preferences. Although the lifecycles of fashion products are very short, the definition of inventory and purchasing strategies can be supported by the large amounts of historical data which are collected and stored in companies’ databases. This study explores the use of a deep learning approach to forecast sales in the fashion industry, predicting the sales of new individual products in future seasons. This study aims to support a fashion retail company in its purchasing operations, and consequently the dataset under analysis is a real dataset provided by this company. The models were developed considering a wide and diverse set of variables, namely products’ physical characteristics and the opinion of domain experts. Furthermore, this study compares the sales predictions obtained with the deep learning approach with those obtained with a set of shallow techniques, i.e. Decision Trees, Random Forest, Support Vector Regression, Artificial Neural Networks and Linear Regression. The model employing deep learning was found to perform well in predicting sales in the fashion retail market; however, for part of the evaluation metrics considered, it does not perform significantly better than some of the shallow techniques, namely Random Forest.
Article
Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers (8x deeper than VGG nets but still having lower complexity). An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to the ILSVRC & COCO 2015 competitions, where we also won 1st place on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
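The core idea of residual learning, that a block outputs F(x) + x rather than an unreferenced mapping, can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions: it uses small dense layers in place of the convolutional layers used in residual networks and omits normalisation and the activation after the addition. All function names are hypothetical.

```python
def relu(xs):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in xs]


def linear(weights, bias, xs):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, xs)) + b
            for row, b in zip(weights, bias)]


def residual_block(xs, w1, b1, w2, b2):
    """Compute y = F(x) + x, where F is two dense layers with a ReLU between."""
    fx = linear(w2, b2, relu(linear(w1, b1, xs)))
    # The skip connection adds the block input back onto the learned residual.
    return [f + x for f, x in zip(fx, xs)]


# With all-zero weights the residual F(x) is zero, so the block is the identity.
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.5, -2.0]
y = residual_block(x, zeros, [0.0, 0.0], zeros, [0.0, 0.0])
```

The zero-weight case shows why such blocks are easier to optimise: representing the identity only requires driving the residual towards zero, rather than fitting an identity mapping through a stack of nonlinear layers.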
Article
In recent years, deep neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarises relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Article
The FP-growth algorithm recursively generates huge numbers of conditional pattern bases and conditional FP-trees when the dataset is huge. In such a case, both the memory usage and the computational cost are high, to the point that the FP-tree cannot fit in the available memory. In this work, we propose a novel parallel FP-growth algorithm, which is designed to run on a computer cluster. To avoid memory overflow, this algorithm finds all the conditional pattern bases of frequent items by the projection method without constructing an FP-tree. It then splits the mining task into a number of independent sub-tasks, executes these sub-tasks in parallel on the nodes, and aggregates the results into the final result. Our algorithm works independently at each node. As a result, it can efficiently reduce the inter-node communication cost. Experiments show that this parallel algorithm not only avoids the memory overflow but also accelerates the computation. In addition, it achieves much better scalability than the FP-growth algorithm.
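The projection idea described above can be sketched as follows: each frequent item gets its own conditional pattern base, built directly from the transactions without an FP-tree, and each base is mined as an independent sub-task. This is a minimal single-machine sketch, not the authors' implementation. The function names are hypothetical, and a thread pool stands in for cluster nodes purely to show that the sub-tasks share no state (threads give no real speed-up for this pure-Python workload).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def prepare(transactions, min_count):
    """Count items, keep the frequent ones, and fix a frequency order."""
    counts = Counter(i for t in transactions for i in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    ranked = sorted(frequent, key=lambda i: (-frequent[i], i))
    order = {item: rank for rank, item in enumerate(ranked)}
    pruned = [[i for i in set(t) if i in order] for t in transactions]
    return frequent, order, pruned


def project(transactions, item, order):
    """Conditional pattern base of `item`: from every transaction containing
    it, keep only the items that precede it in the frequency order."""
    rank = order[item]
    return [[i for i in t if order[i] < rank] for t in transactions if item in t]


def mine(transactions, min_count, suffix=()):
    """Grow frequent itemsets recursively from projected databases."""
    frequent, order, pruned = prepare(transactions, min_count)
    results = {}
    for item in order:
        grown = suffix + (item,)
        results[tuple(sorted(grown))] = frequent[item]
        results.update(mine(project(pruned, item, order), min_count, grown))
    return results


def mine_parallel(transactions, min_count, workers=4):
    """Run one independent sub-task per frequent item and merge the results."""
    frequent, order, pruned = prepare(transactions, min_count)

    def subtask(item):
        found = {(item,): frequent[item]}
        found.update(mine(project(pruned, item, order), min_count, (item,)))
        return found

    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(subtask, order):
            results.update(found)
    return results


# Toy transactions (hypothetical); itemsets must occur in at least 3 of them.
toy = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
itemsets = mine_parallel(toy, min_count=3)
```

Each itemset is produced by exactly one sub-task (the one for its lowest-ranked item), so merging the partial results needs no deduplication or cross-task communication.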
Article
A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data-labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.
Conference Paper
Collaborative filtering (CF) techniques have achieved widespread success in e-commerce. The tremendous growth in the number of customers and products in recent years poses key challenges for recommender systems, which must produce high-quality recommendations and deliver more recommendations per second for millions of customers and products. Thus, improving the scalability and efficiency of collaborative filtering (CF) algorithms becomes increasingly important and difficult. In this paper, we developed and implemented a scaled-up item-based collaborative filtering algorithm on MapReduce, by splitting the three most costly computations in the proposed algorithm into four Map-Reduce phases, each of which can be independently executed on different nodes in parallel. We also proposed efficient partition strategies not only to enable the parallel computation in each Map-Reduce phase but also to maximize data locality and thereby minimize the communication cost. Experimental results showed good scalability and efficiency of the item-based CF algorithm on a Hadoop cluster.
Article
With the recent explosive growth of the Web, recommendation systems have been widely accepted by users. Item-based Collaborative Filtering (CF) is one of the most popular approaches for determining recommendations. A common problem of current item-based CF approaches is that all users have the same weight when computing the item relationships. To improve the quality of recommendations, we incorporate the weight of a user, userrank, into the computation of item similarities and differentials. In this paper, a data model for userrank calculations, a PageRank-based user ranking approach, and a userrank-based item similarities/differentials computing approach are proposed. Finally, the userrank-based approaches improve the recommendation results of the typical Adjusted Cosine and Slope One item-based CF approaches.
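For reference, the standard weighted Slope One predictor that the abstract uses as a baseline can be sketched as below. This is a minimal illustration of unmodified Slope One only; the userrank weighting proposed in the paper is not implemented here, and the function name and toy ratings are assumptions.

```python
def slope_one_predict(ratings, user, target):
    """Weighted Slope One prediction of `user`'s rating for item `target`.

    ratings: dict mapping user -> dict mapping item -> rating.
    Returns None when no co-rated item pair is available.
    """
    num = 0.0
    den = 0
    for item, r_ui in ratings[user].items():
        if item == target:
            continue
        # Average deviation (target - item) over users who rated both items.
        diffs = [r[target] - r[item]
                 for r in ratings.values() if target in r and item in r]
        if not diffs:
            continue
        dev = sum(diffs) / len(diffs)
        # Weight each per-item estimate by the number of co-rating users.
        num += (dev + r_ui) * len(diffs)
        den += len(diffs)
    return num / den if den else None


# Toy ratings (hypothetical), where every user's opinion counts equally.
toy_ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
prediction = slope_one_predict(toy_ratings, "lucy", "A")
```

In this baseline, every co-rating user contributes equally to each deviation, which is exactly the uniform weighting that a user-ranking scheme would replace.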
K. Venkatachari et al., "Market basket analysis using fp-growth and apriori algorithm: A case study of mumbai retail store," BVIMSR's Journal of Management Research, vol. 8, no. 1, pp. 56-63, 2016.
N. Khader et al., "The performance of sequential and parallel implementations of fp-growth in mining a pharmacy database," in Proceedings of the 2015 Industrial and Systems Engineering Research Conference.
M. Chen et al., "An efficient parallel FP-Growth algorithm," in 2009 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, 2009, pp. 283-286.