Association for Information Systems
AIS Electronic Library (AISeL)
Design of shopper segmentation systems in retail.
Evidence from 2 heterogeneous retail cases
Department of Management Science and Technology, Athens University of Economics and Business, Athens Greece
ELTRUN, Department of Management Science and Technology, Athens University of Economics & Business, Greece
Athens University of Economics & Business
Athens University of Economics and Business (AUEB)
Proceedings of the 2018 Pre-ICIS SIGDSA Symposium
Designing shopper segmentation systems
2018 Pre-ICIS SIGDSA Symposium on Decision Analytics Connecting People, Data & Things, San Francisco 2018 1
2018 Pre-ICIS SIGDSA Symposium
Design of shopper segmentation systems in
retail. Evidence from 2 heterogeneous retail cases
Anastasia Griva, Cleopatra Bardaki, Katerina Pramatari, George
ELTRUN, the E-Business Research Center, Department of Management Science
& Technology, Athens University of Economics & Business, Athens Greece
47A Evelpidon St. & 33 Lefkados St., 113 62, Athens, Greece
email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Data proliferation in the retail industry enables data-driven segmentation systems that help retailers
embrace more customer-centric strategies. This research is motivated by the abundance of data reflecting
the buying behavior of retail shoppers and uses these data to identify shopping patterns. These patterns
correspond to different shopper segments with specific preferences that may guide tailor-made services. In
this context, we propose a shopper segmentation approach that highlights the shopping intentions that
motivate consumers to visit the stores. Our approach offers a holistic view of consumer shopping
attitudes that looks beyond the consumer’s entire sales history or associations among the purchased
products. We shift attention from the purchased products to the shopping needs that motivate the
shopper’s trips; in particular, we translate the shopping basket per visit into a shopping intention per
visit. We adopt a broader perspective of shopping trips and delve into the product categories in shopping
baskets to reveal the shopping intention behind each basket. While other researchers view shoppers just as
associations of product items, e.g. cereals and milk (e.g. Cil, 2012; Srikant & Agrawal, 1995), or as an
aggregate of visits, e.g. high-spending shoppers (e.g. Aeron et al., 2012; Boone & Roehm, 2002; Han et al.,
2014; Liao et al., 2011; Park et al., 2014), we aim to describe the consumers’ behavior during their visits.
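
The move from purchased items to product categories per visit can be illustrated with a minimal sketch. It is not the system described in this paper: the taxonomy, the item names and the function `basket_to_category_profile` are hypothetical and serve only to show how one basket could be summarized at the category level before any segmentation technique is applied.

```python
from collections import Counter

# Hypothetical product taxonomy (item -> product category); not taken from the paper.
TAXONOMY = {
    "corn flakes": "cereals",
    "whole milk": "dairy",
    "yogurt": "dairy",
    "espresso capsules": "coffee",
    "dish soap": "cleaning",
}


def basket_to_category_profile(basket, taxonomy=TAXONOMY):
    """Return the share of each product category among the items of one visit.

    Items missing from the taxonomy are ignored; an empty basket yields {}.
    """
    counts = Counter(taxonomy[item] for item in basket if item in taxonomy)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: n / total for category, n in counts.items()}


# One shopping visit, described by the items in its basket.
visit = ["corn flakes", "whole milk", "yogurt", "espresso capsules"]
profile = basket_to_category_profile(visit)
print(profile)
```

Category-level profiles of this kind could then be fed to a clustering step, and the resulting visit segments would still need to be interpreted by domain experts before being labelled as shopping intentions.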
We applied and validated our approach on two heterogeneous retail cases to demonstrate its
generalizability. The first concerns sales data from different channels and stores of a major fast-moving
consumer goods (FMCG) retailer. The second concerns sales data from the physical stores of a
Fortune 500 specialty retailer of home improvement and construction products, also known as a
do-it-yourself (DIY) retailer. By applying our segmentation approach to two heterogeneous retailers, we
identified and assessed how different retailer features (e.g. shopping channel/place, product brand),
shopper features (e.g. basket variety, volume) and data features (e.g. data variety, volume) affect the
design and application of shopper segmentation systems. We highlight the features that practitioners and
academics should consider in order to conduct a successful shopper segmentation analysis. We detected
various characteristics (e.g. data variety, basket variety, and shopping channel) that affect both the
data mining results and the translation of the shopper visit segments into shopping intentions.
A closer look at the literature reveals studies, mainly in the marketing domain (e.g. Bradlow et
al., 2017), that discuss several features affecting big data analytics systems in general. However, they do
not present evidence of how these features affected relevant segmentation cases. In the IS literature,
many papers (e.g. Aeron et al., 2012; Boone & Roehm, 2002; Boztuǧ & Reutterer, 2008;
Miguéis et al., 2012; Rust & Huang, 2014) perform shopper segmentation. However, to the best of our
knowledge, their authors describe their own case and not the “bigger picture”, i.e. how system inputs and
features (e.g. data) affect and alter the segmentation process, system and results/outputs; this is only
implied, and they do not discuss how different features affected the segmentation results. In our interdisciplinary study, we
identify all the features that the marketing literature has highlighted for studying consumer behavior and
shopping habits. Thus, this research also aspires to bridge marketing researchers and managers with data
scientists. Both the consumer segmentation analysis and its results should be handled in light of the
“marketing” characteristics of the shoppers and the retailers. In particular, the accumulated experience and
intuition of marketing managers are necessary for a reliable, meaningful interpretation of the shopper
Figure 1 summarizes the features affecting each phase of shopper segmentation. As shown, the translation
layer is the one affected by most of the features. At this phase, to extract wisdom from the results, we
need the opinion of experts who know the market. Experts consider not only tangible, quantitative features
(e.g. value, volume) to identify shoppers’ missions and motives, but also intangible elements such as
their domain knowledge and accumulated experience. Likewise, we could claim that variety is the most
important feature, since it affects all the phases of our approach, from outlier elimination and product
taxonomy calibration to the identification of the unit of analysis and the translation of the results into
insights. In closing, we should mention that the price feature did not affect our segmentation. First, it was
not available in all our cases; second, even in the FMCG case, where it was available, it did not influence
our results. Hence, we partially confirm the existing literature, which holds that the price feature plays an
important role for more particular products, e.g. cars.
Figure 1. Shopper Segmentation system
Moreover, the two cases revealed that the units of analysis used in the literature, i.e. product items in a
single visit or all of a shopper’s visits, are not sufficient or applicable in every retail context; there are
cases where we should examine groups of “x” sequential visits. The value of “x” differs according to the
domain from which the data are derived. As we found and as other research supports (Wolf & McQuitty, 2011),
a shopper usually visits a retail store that sells home improvement products many times and purchases few
materials each time. We devise and test a new unit of analysis in which we examine groups of x consecutive
visits. This intermediate unit of analysis is dictated by the particularity of some retail domains that demand
many store visits within small time windows.
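
As an illustration of this intermediate unit of analysis, the sketch below merges each shopper's visits into blocks of x consecutive visits. It rests on assumptions that the paper does not specify: blocks do not overlap, visits are ordered by date only, and the last block may hold fewer than x visits. The function name `group_consecutive_visits` and the sample data are hypothetical.

```python
from collections import defaultdict


def group_consecutive_visits(visits, x):
    """Merge each shopper's visits into non-overlapping blocks of x consecutive visits.

    visits: iterable of (shopper_id, visit_date, categories) tuples, where
            visit_date is an ISO date string and categories is a set of
            product categories bought during that visit.
    Returns a dict mapping shopper_id to a list of merged category sets,
    one per block. The last block may contain fewer than x visits.
    """
    if x < 1:
        raise ValueError("x must be a positive integer")

    by_shopper = defaultdict(list)
    for shopper_id, visit_date, categories in visits:
        by_shopper[shopper_id].append((visit_date, categories))

    grouped = {}
    for shopper_id, rows in by_shopper.items():
        rows.sort(key=lambda row: row[0])  # ISO date strings sort chronologically
        blocks = []
        for start in range(0, len(rows), x):
            merged = set()
            for _, categories in rows[start:start + x]:
                merged.update(categories)
            blocks.append(merged)
        grouped[shopper_id] = blocks
    return grouped


# Hypothetical DIY visits: many trips, few materials per trip.
visits = [
    ("s1", "2018-03-03", {"screws", "lumber"}),
    ("s1", "2018-03-01", {"lumber"}),
    ("s1", "2018-03-04", {"paint"}),
    ("s2", "2018-03-02", {"lighting"}),
]
units = group_consecutive_visits(visits, x=2)
print(units)
```

A time-window variant (for example, all visits within a fixed number of days) would be an equally plausible reading of the same idea; the appropriate value of x, or of the window, depends on the retail domain.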
The value of such a system becomes evident when considering the consumer-oriented business
decisions it can support. Our approach/system could evolve into a tool for designing innovative
marketing campaigns, bundled promotions and cross-coupon programs for product categories that
belong to the same shopping visit segment. Likewise, we can create offline and online product catalogues.
For instance, we have detected professional women who visit a DIY store to purchase woodwork
products. Thus, to promote a new collection, it could be more effective to send them product catalogues
that meet their specific preferences, instead of catalogues including all the products. Additionally, the
extracted knowledge could be valuable for advertising purposes, e.g. advertisements for breakfast products.
It might also be used to dictate a redesigned store layout where product categories in the same visit
segment are positioned in nearby store aisles and shelves. This way shoppers could locate products more
easily and buy more in less time. Further, the store manager could reengineer store operations management
and replenishment strategies by ordering groups of products based on the identified visit segments (Griva
et al., 2018). Last, predicting future behaviors and missions based on historical data can support several
operations, e.g. product replenishment and the handling of out-of-stock situations.
Future research may address some limitations of this study, e.g. cases where the purpose of the visit is to
return items or to buy a gift. Also, we can use data derived from alternative technologies (e.g. Radio
Frequency Identification (RFID), Global Positioning System (GPS)) to evaluate the proposed approach, for
instance data that indicate the shoppers’ in-store movements and the product categories they interact with
during a visit. Then, by comparing the visit segments resulting from point-of-sale (POS) data and from IoT
(Internet of Things) data, we can identify selling gaps. From a technical perspective, we can apply more
data mining techniques and compare the resulting visit segments. Other techniques, e.g. graph mining, could
also be examined to further analyze each resulting segment and cope with the difficulty of identifying more
detailed segments in
Keywords: Shopper Segmentation, Retail Analytics, Shopper Behavior, Cluster Analysis, Data Mining
We would like to thank the Wharton Customer Analytics Initiative (https://wcai.wharton.upenn.edu) for
providing the dataset regarding the Fortune 500 specialty retailer. This research has been supported by the
European Commission under the H2020 project Transforming Transport
(https://transformingtransport.eu/), Grant agreement no. 731932.
Aeron, H., Kumar, A., & Moorthy, J. 2012. "Data mining framework for customer lifetime value-based
segmentation", Journal of Database Marketing and Customer Strategy Management, (19:1), pp. 17-
Boone, D. S., & Roehm, M. 2002. "Retail segmentation using artificial neural networks", International
Journal of Research in Marketing, (19:3), pp. 287-301.
Boztuǧ, Y., & Reutterer, T. 2008. "A combined approach for segment-specific market basket analysis",
European Journal of Operational Research, (187:1), pp. 294-312.
Bradlow, E. T., Gangwar, M., Kopalle, P., & Voleti, S. 2017. "The Role of Big Data and Predictive Analytics
in Retailing", Journal of Retailing, (93:1), pp. 79-95.
Cil, I. 2012. "Consumption universes based supermarket layout through association rule mining and
multidimensional scaling", Expert Systems with Applications, (39:10), pp. 8611-8625.
Griva, A., Bardaki, C., Pramatari, K., & Papakiriakopoulos, D. 2018. "Retail business analytics: Customer
visit segmentation using market basket data", Expert Systems with Applications, (100:2018), pp. 1-
Gupta, S., Hanssens, D., Hardie, B., Kahn, W., Kumar, V., Lin, N., Sriram, S. 2006. "Modeling customer
lifetime value", Journal of Service Research, (9:2), pp. 139-155.
Han, S., Ye, Y., Fu, X., & Chen, Z. 2014. "Category role aided market segmentation approach to convenience
store chain category management", Decision Support Systems, (57:1), pp. 296-308.
Liao, S., Chen, Y., & Hsieh, H. 2011. "Mining customer knowledge for direct selling and marketing", Expert
Systems with Applications, (38:5), pp. 6059-6069.
Miguéis, V. L., Camanho, A. S., & Falcão e Cunha, J. 2012. "Customer data mining for lifestyle
segmentation", Expert Systems with Applications, (39:10), pp. 9359-9366.
Park, C. H., Park, Y.-H., & Schweidel, D. A. 2014. "A multi-category customer base analysis", International
Journal of Research in Marketing, (31:3), pp. 266-279.
Rust, R. T., & Huang, M.-H. 2014. "The Service Revolution and the Transformation of Marketing Science",
Marketing Science, (33:2), pp. 206-221.
Srikant, R., & Agrawal, R. 1995. "Mining Generalized Association Rules", in VLDB ’95 Proceedings of the
21th International Conference on Very Large Data Bases, Umeshwar Dayal, Peter M. D. Gray,
Shojiro Nishio (eds.), Zurich, Switzerland, pp. 407-419.
Wolf, M., & McQuitty, S. 2011. "Understanding the do-it-yourself consumer: DIY motivations and
outcomes", AMS Review, (1), pp. 154-170.