Conference Paper

Walmart Online Grocery Personalization: Behavioral Insights and Basket Recommendations


Abstract

Food is so personal. Each individual has her own shopping characteristics. In this paper, we introduce personalization for Walmart online grocery. Our contribution is twofold. First, we study the shopping behaviors of Walmart online grocery customers. In contrast to traditional online shopping, grocery shopping demonstrates more repeated and frequent purchases with large orders. Second, we present a multi-level basket recommendation system. Unlike typical recommender systems, which usually concentrate on single-item or bundle recommendations, this system analyzes a customer's shopping basket holistically to understand her shopping tasks. We then use multi-level cobought models to recommend items for each of these tasks. At the stage of selecting particular items, we incorporate both the customer's general and subtle preferences into decisions. We finally recommend a series of items to the customer at checkout. Offline experiments show our system can reach an 11% item hit rate, a 40% subcategory hit rate and a 70% category hit rate. Online tests show it can reach an order hit rate of more than 25%.
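The abstract reports hit rates at three levels (item, subcategory, category) without defining them. The following is a minimal sketch of one plausible reading, assuming a "hit" means that at least one recommended item matches a purchased item once both are mapped to the chosen level. The taxonomy, the sample cases and the function names (`lift`, `hit_rate`) are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical taxonomy for illustration: item -> (subcategory, category).
TAXONOMY: Dict[str, Tuple[str, str]] = {
    "whole_milk": ("milk", "dairy"),
    "skim_milk": ("milk", "dairy"),
    "cheddar": ("cheese", "dairy"),
    "bananas": ("fruit", "produce"),
    "spinach": ("leafy_greens", "produce"),
}

LEVELS = ("item", "subcategory", "category")


def lift(items: Iterable[str], level: str) -> Set[str]:
    """Map items to labels at the requested taxonomy level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    if level == "item":
        return set(items)
    idx = 0 if level == "subcategory" else 1
    return {TAXONOMY[i][idx] for i in items}


def hit_rate(cases: List[Tuple[List[str], List[str]]], level: str) -> float:
    """Fraction of (recommended, purchased) cases sharing a label at `level`."""
    hits = sum(1 for rec, bought in cases if lift(rec, level) & lift(bought, level))
    return hits / len(cases)


# Assumed evaluation cases: (recommended items, items actually purchased).
cases = [
    (["whole_milk"], ["whole_milk", "bananas"]),  # exact item match
    (["skim_milk"], ["whole_milk"]),              # same subcategory only
    (["cheddar"], ["skim_milk"]),                 # same category only
    (["spinach"], ["cheddar"]),                   # no match at any level
]

for level in LEVELS:
    print(level, hit_rate(cases, level))
```

Under this reading the rate can only stay equal or grow as the level gets coarser, which is consistent with the ordering of the reported 11%, 40% and 70% figures.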

... A large body of research in this field comes also from industry. For example, [25] aims at studying shopping behavior and a multilevel basket recommendation system is proposed. Authors point out that customers value greatly the possibility of saving time, thus grocery websites should have basket autocompletion powered by recommender systems. ...
... Grocery, as noted by [25] is both "multi-people" and "multitask". In practical terms, being multi-people means that we cannot assume purchases are results of the tastes of a single person but can be rather a complex compound of different tastes of many people (e.g., a family). ...
... 42,43 Conversely, the potential for targeted online marketing and personalized recommendations have been hypothesized to increase unhealthy food purchases online. 44,45 Although, to our knowledge, the effect of online marketing on unhealthy food selection has not been empirically tested, various simulated, online-grocery, experimental studies have demonstrated that targeted online marketing of healthy foods and provision of nutrition information were promising strategies to improve quality of foods purchased in the virtual environment. [46][47][48] On the other hand, there is plausible evidence that food and beverage industries have disproportionately targeted marketing of unhealthy items to low-income, diverse populations across media markets. ...
Article
Context: Online grocery services are an emerging component of the food system with the potential to address disparities in access to healthy food. Objective: We assessed the barriers and facilitators of equitable access to healthy foods in the online grocery environment, and the psychosocial, purchasing, and dietary behaviors related to its use among low-income, diverse populations. Data sources: Four electronic databases were searched to identify relevant literature; 16 studies were identified. Results: Barriers to equitable access to healthy food included cost and limited availability of online grocery services in food deserts and rural areas. The expansion of online grocery services and the ability to use nutrition assistance benefits online were equity-promoting factors. Perceived low control over food selection was a psychosocial factor that discouraged online grocery use, whereas convenience and lower perceived stress were facilitators. Findings were mixed regarding healthfulness of foods purchased online. Although few studies assessed diet, healthy food consumption was associated with online grocery use. Conclusion: Researchers should assess the impact of online grocery shopping on low-income families' food purchases and diet. Systematic review registration: PROSPERO registration no. CRD: 42021240277.
... We hope to examine and adapt the above techniques to grocery transaction data. We notice that grocery shopping differs from conventional e-commerce applications, largely due to issues of regularity and necessity [23,33]. Such nuances require us to carefully investigate grocery shopping behavior and build domain-specific representations and recommendation algorithms. ...
Conference Paper
We study the problem of representing and recommending products for grocery shopping. We carefully investigate grocery transaction data and observe three important patterns: products within the same basket complement each other in terms of functionality (complementarity); users tend to purchase products that match their preferences (compatibility); and a significant fraction of users repeatedly purchase the same products over time (loyalty). Unlike conventional e-commerce settings, complementarity and loyalty are particularly predominant in the grocery shopping domain. This motivates a new representation learning approach to leverage complementarity and compatibility holistically, as well as a new recommendation approach to explicitly account for users' 'must-buy' purchases in addition to their overall preferences and needs. Doing so not only improves product classification and recommendation performance on both public and proprietary transaction data covering various grocery store types, but also reveals interesting findings about the relationships between preferences, necessity, and loyalty in consumer purchases.
... While online grocery shopping offers potential solutions to many healthy food access challenges, there are potential pitfalls that need to be better understood. For example, with online shopping, retailers have ready access to customer data on purchasing patterns (31) and thus can target marketing to these customers. Once purchased, items usually remain on an individual's past purchasing 'list' and thus are regularly seen by the customer, which could turn an unhealthy 'once in a while' treat into a pervasive prompt for more frequent purchases. ...
Article
Objectives (i) To determine the current state of online grocery shopping, including individuals’ motivations for shopping for groceries online and types of foods purchased; and (ii) to identify the potential promise and pitfalls that online grocery shopping may offer in relation to food and beverage purchases. Design PubMed, ABI/INFORM and Google Scholar were searched to identify published research. Setting To be included, studies must have been published between 2007 and 2017 in English, based in the USA or Europe (including the UK), and focused on: (i) motivations for online grocery shopping; (ii) the cognitive/psychosocial domain; and (iii) the community or neighbourhood food environment domain. Subjects Our search yielded twenty-four relevant papers. Results Findings indicate that online grocery shopping can be a double-edged sword. While it has the potential to increase healthy choices via reduced unhealthy impulse purchases, nutrition labelling strategies, and as a method to overcome food access limitations among individuals with limited access to a brick-and-mortar store, it also has the potential to increase unhealthy choices due to reasons such as consumers’ hesitance to purchase fresh produce online. Conclusions Additional research is needed to determine the most effective ways to positively engage customers to use online grocery shopping to make healthier choices.
... Walmart has begun to implement this idea in the form of a new recommendation at checkout system for its online grocery (Yuan et al. 2016): ...
Preprint
In this paper, we consider a personalized assortment planning problem under inventory constraints, where the type of each arriving customer is defined by a primary item of interest. As long as that item is in stock, the customer adds it to her shopping cart, at which point the retailer can recommend to the customer an assortment of add-ons to go along with her primary item. This problem is motivated by the new "recommendation at checkout" systems that have been deployed at many online retailers, and also serves as a framework which unifies many existing problems in online algorithms (personalized assortment planning, single-leg booking, online matching with stochastic rewards). In our problem, add-on recommendation opportunities are eluded when primary items go out of stock, which poses additional challenges for the development of an online policy. We overcome these challenges by introducing the notion of an inventory protection level in expectation, and derive an algorithm with a 1/4 competitive ratio guarantee under adversarial arrivals.
Chapter
In recommendation systems, items of interest are often classified into categories such as genres of movies. Existing research has shown that diversified recommendations can improve real user experience. However, most existing methods do not consider the fact that users’ levels of interest (i.e., user preferences) in different categories usually vary, and such user preferences are not reflected in the diversified recommendations. We propose an algorithm that considers user preferences for different categories when recommending diversified results, and refer to this problem as personalized recommendation diversification. In the proposed algorithm, a model that captures user preferences for different categories is optimized jointly toward both relevance and diversity. To provide the proposed algorithm with informative training labels and effectively evaluate recommendation diversity, we also propose a new personalized diversity measure. The proposed measure overcomes limitations of existing measures in evaluating recommendation diversity: existing measures either cannot effectively handle user preferences for different categories, or cannot evaluate both relevance and diversity at the same time. Experiments using two real-world datasets confirm the superiority of the proposed algorithm, and show the effectiveness of the proposed measure in capturing user preferences.
Article
Online shopping for grocery and gourmet products differs from other shopping activities because of its routine buy-consume-buy nature. The existing recommendation algorithms of e-commerce websites are suited only to recommending products that are purchased once. To identify and recommend the products that users are likely to buy again and again, a novel recommender algorithm is proposed based on a linguistic decision analysis model. The proposed buyagain recommender algorithm finds the semantic value of the user comments and combines it with the user rating to render recommendations to the user. The efficiency of the buyagain recommender algorithm is evaluated using the grocery and gourmet dataset of Amazon e-commerce websites. The end result shows that the algorithm accurately recommends the products that the user would like to purchase again.
Conference Paper
Many websites offer promotions in terms of bundled items that can be purchased together, usually at a discounted rate. 'Bundling' may be a means of increasing sales revenue, but may also be a means for content creators to expose users to new items that they may not have considered in isolation. In this paper, we seek to understand the semantics of what constitutes a 'good' bundle, in order to recommend existing bundles to users on the basis of their constituent products, as well as the more difficult task of generating new bundles that are personalized to a user. To do so we collect a new dataset from the Steam video game distribution platform, which is unique in that it contains both 'traditional' recommendation data (rating and purchase histories between users and items), as well as bundle purchase information. We assess issues such as bundle size and item compatibility, and show that these features, when combined with traditional matrix factorization techniques, can lead to highly effective bundle recommendation and generation.
Conference Paper
As the amount of recorded digital information increases, there is a growing need for flexible recommender systems which can incorporate richly structured data sources to improve recommendations. In this paper, we show how a recently introduced statistical relational learning framework can be used to develop a generic and extensible hybrid recommender system. Our hybrid approach, HyPER (HYbrid Probabilistic Extensible Recommender), incorporates and reasons over a wide range of information sources. Such sources include multiple user-user and item-item similarity measures, content, and social information. HyPER automatically learns to balance these different information signals when making predictions. We build our system using a powerful and intuitive probabilistic programming language called probabilistic soft logic [1], which enables efficient and accurate prediction by formulating our custom recommender systems with a scalable class of graphical models known as hinge-loss Markov random fields. We experimentally evaluate our approach on two popular recommendation datasets, showing that HyPER can effectively combine multiple information types for improved performance, and can significantly outperform existing state-of-the-art approaches.
Conference Paper
Extracting interesting patterns from large data stores efficiently is a challenging problem in many domains. In the data mining literature, pattern frequency has often been touted as a proxy for interestingness and has been leveraged as a pruning criterion to realize scalable solutions. However, while there exist many frequent pattern algorithms in the literature, all scale exponentially in the worst case, restricting their utility on very large data sets. Furthermore, as we theoretically argue in this article, the problem is very hard to approximate within a reasonable factor with a polynomial-time algorithm. As a counterpoint to this theoretical result, we present a practical algorithm called Localized Approximate Miner (LAM) that scales linearithmically with the input data. Instead of fully exploring the top of the search lattice to a user-defined point, as traditional mining algorithms do, we explore different parts of the complete lattice efficiently. The key to this efficient exploration is the reliance on min-wise independent permutations to collect the data into highly similar subsets of a partition. It is straightforward to implement and scales to very large data sets. We illustrate its utility on a range of data sets, and demonstrate that the algorithm finds more patterns of higher utility in much less time than several state-of-the-art algorithms. Moreover, we realize a natural multi-level parallelization of LAM that further reduces runtimes by up to 193-fold when leveraging 256 CMP cores spanning 32 machines.
Article
This paper presents an interactive hybrid recommendation system that generates item predictions from multiple social and semantic web resources, such as Wikipedia, Facebook, and Twitter. The system employs hybrid techniques from traditional recommender system literature, in addition to a novel interactive interface which serves to explain the recommendation process and elicit preferences from the end user. We present an evaluation that compares different interactive and non-interactive hybrid strategies for computing recommendations across diverse social and semantic web APIs. Results of the study indicate that explanation and interaction with a visual representation of the hybrid system increase user satisfaction and relevance of predicted content.
Article
The World Wide Web is moving from a Web of hyper-linked Documents to a Web of linked Data. Thanks to the Semantic Web spread and to the more recent Linked Open Data (LOD) initiative, a vast amount of RDF data have been published in freely accessible datasets. These datasets are connected with each other to form the so called Linked Open Data cloud. As of today, there are tons of RDF data available in the Web of Data, but only few applications really exploit their potential power. In this paper we show how these data can successfully be used to develop a recommender system (RS) that relies exclusively on the information encoded in the Web of Data. We implemented a content-based RS that leverages the data available within Linked Open Data datasets (in particular DBpedia, Freebase and LinkedMDB) in order to recommend movies to the end users. We extensively evaluated the approach and validated the effectiveness of the algorithms by experimentally measuring their accuracy with precision and recall metrics.
Article
Bundling is pervasive in today's markets. However, the bundling literature contains inconsistencies in the use of terms and ambiguity about basic principles underlying the phenomenon. The literature also lacks an encompassing classification of the various strategies, clear rules to evaluate the legality of each strategy, and a unifying framework to indicate when each is optimal. Based on a review of the marketing, economics, and law literature, this article develops a new synthesis of the field of bundling, which provides three important benefits. First, the article clearly and consistently defines bundling terms and identifies two key dimensions that enable a comprehensive classification of bundling strategies. Second, it formulates clear rules for evaluating the legality of each of these strategies. Third, it proposes a framework of 12 propositions that suggest which bundling strategy is optimal in various contexts. The synthesis provides managers with a framework with which to understand and choose bundling strategies. It also provides researchers with promising avenues for further research.
Conference Paper
We describe a recommender system in the domain of grocery shopping. While recommender systems have been widely studied, this is mostly in relation to leisure products (e.g. movies, books and music) with non-repeated purchases. In grocery shopping, however, consumers will make multiple purchases of the same or very similar products more frequently than buying entirely new items. The proposed recommendation scheme offers several advantages in addressing the grocery shopping problem, namely: 1) a product similarity measure that suits a domain where no rating information is available; 2) a basket sensitive random walk model to approximate product similarities by exploiting incomplete neighborhood information; 3) online adaptation of the recommendation based on the current basket and 4) a new performance measure focusing on products that customers have not purchased before or purchase infrequently. Empirical results benchmarking on three real-world data sets demonstrate a performance improvement of the proposed method over other existing collaborative filtering models.
Article
Purpose – This paper seeks to understand the triggers which influence the adoption (and the discontinuation) of online grocery shopping. Specifically, the research aims to establish the role of situational factors in the process of adoption. Design/methodology/approach – A two-step research process is employed. First, exploratory qualitative research is carried out, with the purpose of gaining an in-depth understanding of consumers' online grocery shopping behaviour. This is followed by a large-scale quantitative survey extending the findings of the qualitative research and validating the role of situational factors in instigating the commencement (and discontinuation) of online grocery buying. Cluster analysis is used to segment consumers based on the importance of specific types of situations. Findings – Both qualitative and quantitative results establish the importance of situational factors, such as having a baby or developing health problems, as triggers for starting to buy groceries online. Many shoppers are found to discontinue online grocery shopping once the initial trigger has disappeared or they have experienced a problem with the service. Practical implications – While situational factors are beyond a marketer's control, they could be used as a basis for marketing communications content and target advertising, for instance, by using magazines directed at new parents. Originality/value – The importance of situational factors as triggers for the adoption of online grocery shopping suggests an erratic adoption process, driven by circumstances rather than by a cognitive elaboration and decision. The adoption of online shopping seems to be contingent and may be discontinued when the initiating circumstances change.
Article
Association rule discovery has emerged as an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent itemsets, and then forming conditional implication rules among them. We present efficient algorithms for the discovery of frequent itemsets, which forms the compute-intensive phase of the task. The algorithms utilize the structural properties of frequent itemsets to facilitate fast discovery. The items are organized into a subset lattice search space, which is decomposed into small independent chunks or sublattices, which can be solved in memory. Efficient lattice traversal techniques are presented which quickly identify all the long frequent itemsets and their subsets if required. We also present the effect of using different database layout schemes combined with the proposed decomposition and traversal techniques. We experimentally compare the new algorithms against the previous approaches, obtaining improvements of more than an order of magnitude for our test databases.
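The two phases named in this abstract, finding frequent itemsets and then forming implication rules among them, can be illustrated with a small sketch. This is a plain level-wise search written for readability, not the lattice decomposition and traversal algorithms the abstract proposes; the sample baskets, thresholds and function names are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({i for t in transactions for i in t})
    result = {}
    current = [frozenset([i]) for i in items]  # level 1: single items
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # Join frequent k-itemsets into (k+1)-item candidates for the next level.
        keys = list(frequent)
        k = len(keys[0]) + 1 if keys else 0
        current = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k})
    return result


def association_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, confidence) triples from frequent itemsets."""
    out = []
    for itemset, count in itemsets.items():
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                # Every subset of a frequent itemset is itself frequent,
                # so its count is always present in `itemsets`.
                confidence = count / itemsets[ante]
                if confidence >= min_confidence:
                    out.append((ante, itemset - ante, confidence))
    return out


# Assumed toy baskets.
baskets = [
    {"milk", "bread"},
    {"milk", "bread", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
itemsets = frequent_itemsets(baskets, min_support=2)
rules = association_rules(itemsets, min_confidence=0.6)
```

With these baskets every single item and every pair is frequent, while the three-item set appears only once and is dropped; each pair then yields two rules with confidence 2/3.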
Article
Over the past two decades, a large amount of research effort has been devoted to developing algorithms that generate recommendations. The resulting research progress has established the importance of the user-item (U-I) matrix, which encodes the individual preferences of users for items in a collection, for recommender systems. The U-I matrix provides the basis for collaborative filtering (CF) techniques, the dominant framework for recommender systems. Currently, new recommendation scenarios are emerging that offer promising new information that goes beyond the U-I matrix. This information can be divided into two categories related to its source: rich side information concerning users and items, and interaction information associated with the interplay of users and items. In this survey, we summarize and analyze recommendation scenarios involving information sources and the CF algorithms that have been recently developed to address them. We provide a comprehensive introduction to a large body of research, more than 200 key references, with the aim of supporting the further development of recommender systems exploiting information beyond the U-I matrix. On the basis of this material, we identify and discuss what we see as the central challenges lying ahead for recommender system technology, both in terms of extensions of existing techniques as well as of the integration of techniques and technologies drawn from other research areas.
Article
Recommender systems have become an important component of modern eCommerce. Recent research on recommender systems has mainly concentrated on improving the relevance or profitability of individual recommended items. But in reality, users are usually exposed to a set of items and they may buy multiple items in a single order. Thus, the relevance or profitability of one item may actually depend on the other items in the set. In other words, the set of recommendations is a bundle with items interacting with each other. In this paper, we introduce a novel problem called the Bundle Recommendation Problem (BRP). By solving the BRP, we are able to find the optimal bundle of items to recommend with respect to a preferred business objective. However, BRP is a large-scale NP-hard problem. We then show that it may be sufficient to solve a significantly smaller version of BRP depending on properties of the input data. This allows us to solve BRP in real-world applications with millions of users and items. Both offline and online experimental results on Walmart.com demonstrate the incremental value of solving BRP across multiple baseline models.
Article
Online grocery shopping has become increasingly popular in recent years. To facilitate the purchase process, many online stores provide a shopping recommendation system for their consumers. So far, generic recommendation systems have mainly considered the preferences of a consumer based on his/her purchase history. Nevertheless, they do not address the right timing to purchase a product from the viewpoint of product replenishment or economic purchasing. Hence, we develop a new recommendation scheme specifically for online grocery shopping by incorporating two additional considerations, i.e., product replenishment and product promotion. We believe that such a scheme should be able to provide a better recommendation list that fits consumer desires, needs, and budget considerations and ultimately boosts transactions.
Article
The increasing proliferation of online shopping and purchasing has naturally led to a growth in the popularity of comparison-shopping search engines, popularly known as “shopbots”. We extend the one-product-at-a-time search approach used in current shopbot implementations to consider purchasing plans for a bundle of items. Our approach leverages bundle-based pricing and promotional deals frequently offered by online merchants to extract substantial savings. Interestingly, our approach can also identify “freebies” that consumers can obtain at no extra cost. We also develop a model to extend the capability of the current recommendation algorithms that are mainly based on collaborative filtering and item-to-item similarity techniques, to incorporate product price and savings as an additional important factor in making recommendations to shoppers. We develop a practical algorithm that can be employed when the number of items is large or when the real-time nature of shopbot applications dictates quick response rates to consumer queries. A detailed experimental analysis with real-world data from major retailers suggests that the proposed models can provide significant savings for bundle purchasing consumers, and frequently identify freebies for consumers. Together the results underscore the potential benefits that can accrue by incorporating our models into current shopbot systems.
Conference Paper
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there exist prolific patterns and/or long patterns. In this study, we propose a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a highly condensed, much smaller data structure, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported new frequent pattern mining methods.
Conference Paper
MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.
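The programming model described here, in which the user supplies a map function and a reduce function and the runtime groups intermediate values by key, can be sketched in a single process. The snippet below is only an illustration of the model, not Google's implementation or API: `run_mapreduce` is a hypothetical stand-in for the distributed runtime, and the basket data and the pair-counting task are assumptions chosen to match the grocery setting of this page.

```python
from collections import defaultdict
from itertools import combinations


def run_mapreduce(records, map_fn, reduce_fn):
    """Single-process stand-in for the runtime: map, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):  # map phase emits (key, value) pairs
            groups[key].append(value)      # shuffle: collect values per key
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_pairs(basket):
    """Emit ((item_a, item_b), 1) for every unordered item pair in a basket."""
    for a, b in combinations(sorted(set(basket)), 2):
        yield (a, b), 1


def reduce_sum(key, values):
    """Sum the counts emitted for one pair."""
    return sum(values)


# Assumed toy baskets.
baskets = [
    ["milk", "bread"],
    ["bread", "milk", "eggs"],
    ["eggs", "bread"],
]
pair_counts = run_mapreduce(baskets, map_pairs, reduce_sum)
print(pair_counts)
```

In a real deployment the map and reduce calls would run on different machines and the grouping step would move data over the network; only the two user-supplied functions would stay the same.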
Article
Recommender systems are an important part of the information and e-commerce ecosystem. They represent a powerful method for enabling users to filter through large information and product spaces. Nearly two decades of research on collaborative filtering have led to a varied set of algorithms and a rich collection of tools for evaluating their performance. Research in the field is moving in the direction of a richer understanding of how recommender technology may be embedded in specific domains. The differing personalities exhibited by different recommender algorithms show that recommendation is not a one-size-fits-all problem. Specific tasks, information needs, and item domains represent unique problems for recommenders, and design and evaluation of recommenders needs to be done based on the user tasks to be supported. Effective deployments must begin with careful analysis of prospective users and their goals. Based on this analysis, system designers have a host of options for the choice of algorithm and for its embedding in the surrounding user experience. This paper discusses a wide variety of the choices available and their implications, aiming to provide both practitioners and researchers with an introduction to the important issues underlying recommenders and current best practices for addressing these issues.