All content in this area was uploaded by Theresa Maria Rausch on Feb 14, 2022
Published in International Journal of Electronic Business
https://doi.org/10.1504/IJEB.2022.10044963
Gotta Buy ‘Em All? Online Shopping Cart Abandonment Among New and Existing
Customers
Theresa Maria Rausch, Benedikt M. Brand
University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany
Abstract
For e-retailers, shopping cart abandonment rates are essential measures of their success within the
e-market. Extant behavioural literature determined factors triggering cart abandonment, whereas
another stream of literature explored customers' online purchase behaviour with clickstream data
drawing on different segmentations such as mobile versus desktop shoppers. Nevertheless, research
still lacks an understanding of cart abandonment with unbiased user generated behaviour. This
study fills a research gap by determining factors resulting in cart abandonment based on clickstream
data. Since particularly new and existing customers need to be addressed differently, the study
identifies drivers for both. The findings indicate that mobile shoppers exhibit a higher likelihood
of abandoning their cart, which even intensifies for new customers. For existing customers, the
odds of completing the purchase decrease with every additional item in the customers' cart, and
new customers are more likely to abandon the cart with an increasing number of cart page
impressions.
Keywords: online shopping cart abandonment; purchase behavior; clickstream data; customer
segmentation; e-commerce; user generated behavior
Reference to this paper should be made as follows: Rausch, T.M. and Brand, B.M. (2022)
‘Gotta buy ‘em all? Online shopping cart abandonment among new and existing customers’, Int.
J. Electronic Business, Vol. 17, No. 2, pp.109–134.
1 Introduction
To strengthen a company’s position within its competitive environment, marketers need to
understand essential determinants of purchase and further, non-purchase behavior. The latter might
be even more crucial for preventing marketers from misallocating their financial budgets. Mostly,
non-purchase behavior can be observed among customers placing items in their shopping cart for
reasons other than immediate purchase. This phenomenon is known as shopping cart abandonment
and is particularly apparent in the online context of e-commerce: it is the behavioral outcome of
consumers placing (an) item(s) in their online shopping cart without making a purchase by
completing the checkout process during that online session (Huang et al., 2018; Kukar-Kinney and
Close, 2010). As approximately two-thirds of online shoppers abandon the shopping process after
filling their shopping cart (Overby and Lee, 2006), a huge potential lies in analyzing the reasons
for shopping cart abandonment. Thus, extant literature investigated the behavioral perspective of
online shopping cart abandonment by identifying inhibitors to the purchase process: financial risks
and concerns about delivery and return policies (Kukar-Kinney and Close, 2010), the usage of
shopping carts as organization tools or for entertainment purposes (Kukar-Kinney and Close,
2010), and inhibitors at the checkout stage like perceived transaction inconvenience and privacy
intrusion (Rajamma et al., 2009) are – inter alia – the main factors leading to online shopping cart
abandonment.
Particularly within the online context, the ability to track consumers’ activities allows companies
to collect unbiased information about consumers’ behavior. The detailed records of past usage
behaviors comprised by log files and resulting clickstream data can be analyzed by marketers to
gain valuable insights: clickstream data frequently have been modeled to derive implications for
website design or advertising efforts (Chatterjee et al., 2003; Montgomery et al., 2004) and further,
to predict consumers’ future behaviors e.g. regarding purchase (Bucklin and Sismeiro, 2003;
Mallapragada et al., 2016; Moe and Fader, 2004b; 2004a).
While the antecedents of online shopping cart abandonment are well understood by behavioral
literature and clickstream data has been studied by methodological research to analyze consumers’
behavior, research still lacks clickstream models analyzing shopping cart abandonment, which
represents a recently postulated literature gap (Mallapragada et al., 2016). Additionally, despite
the richness of clickstream data, prior literature has not yet focused on information gained from
heterogeneous customer segments by distinguishing between new and existing customers. While
some studies shed light on behavioral differences between e.g. mobile and desktop online shopping
(Ransbotham et al., 2019; Raphaeli et al., 2017; Singh and Swait, 2017), questions about potential
disparities between new and existing customers remain unanswered within literature. Behavioral
research could draw on these new insights and could further investigate behavioral characteristics
of these two groups to gather additional knowledge aside from our observable clickstream insights.
Hence, within this study, we intend to answer the following research questions: Why do existing
versus new customers abandon their online shopping cart? Are there differences in their
abandonment behavior based on observable clickstream variables? Thereby, our paper contributes
to literature by using unbiased clickstream data to investigate the observable drivers of online
shopping cart abandonment, whereas preceding shopping cart abandonment literature focused on
single, unobservable behavioral aspects. This study thus extends and combines the research fields
of both shopping cart abandonment as well as clickstream data. From a practical perspective, this
segmentation can be deemed reasonable since preceding research found new customers to perceive
for instance security concerns and trust issues regarding the unknown online shop (Belanger et al.,
2002; Liao and Cheung, 2001; Ranganathan and Ganapathy, 2002) and thus, they need to be
addressed specifically to be attracted. In contrast, existing customers drawing on past shopping
experiences (Cho et al., 2006) have to be treated differently in order to turn them into loyal ones.
Therefore, it needs to be investigated which factors enhance online shopping cart abandonment
based on real-world clickstream data, how these factors differ between new and existing customers,
and which consequences arise for online shop managers and website designers. Based on our
findings, they can adapt their marketing strategies, the shopping experience, or the purchasing
process to the underlying customer type within their online shop and thus increase their sales.
To address this research gap, this study utilizes clickstream data of a leading German online retailer
to identify different drivers of online shopping cart abandonment for new and existing customers.
We model online shopping cart abandonment as a binary classification problem and use logit modeling
to identify the crucial drivers of online shopping cart abandonment among new and existing customers.
The remainder of this paper is organized as follows: the subsequent section describes the related
work on online shopping cart abandonment and clickstream data. Then, we outline the
methodology comprising a preliminary data analysis and the modeling approach. Next, we present
the findings and discuss the theoretical contribution as well as managerial implications and outline
limitations and guidance for future research. The final section draws a conclusion.
2 Related Work
2.1 Online Shopping Cart Abandonment
The online shopping cart abandonment phenomenon causes substantial losses of turnover for online
retailers (Huang et al., 2018; Rajamma et al., 2009) resulting in a weakened position within their
competitive environment. Therefore, extant marketing literature addressed this problem by
drawing on a behavioral perspective to identify and understand essential determinants of online
shopping cart abandonment summarized in Table 1.
Table 1: Prior Findings on Online Shopping Cart Abandonment.

| Authors | Finding(s) | Implication(s) for This Study |
|---|---|---|
| Park and Kim (2003) | Interface quality, product and service information quality, security perception, and site awareness determine consumers’ site loyalty and thus, their purchase behavior (n=602) | Website design features influence consumers’ purchase decision and further, shopping cart abandonment |
| Cho et al. (2006) | Consumers' confusion by information overload, high value-consciousness, negative past experiences, intention to conduct price comparisons, and unreliable websites might trigger online shopping cart abandonment (n=245) | Various motives inherent to potentially different consumer groups affect online shopping cart abandonment |
| Rajamma et al. (2009) | Increased perceived transaction inconvenience and high perceived risk serve as inhibitors at the checkout stage and enhance online shopping cart abandonment (n=707) | Findings seem to be applicable to new customers unfamiliar with the checkout process |
| Kukar-Kinney and Close (2010) | Perceived privacy intrusion and security concerns propel consumers to buy offline (n=255) | Shopping carts as entertainment value, as an organization tool, as the wait for sale, and the concerns about costs appear to be antecedents of shopping cart abandonment |
| Close and Kukar-Kinney (2010) | Items are added to online shopping cart for reasons other than immediate purchase (n=289) | Shopping carts are used for entertainment, as organization tool, as the wait for sale, and for obtaining additional information on products |
| Huang et al. (2018) | Intrapersonal and interpersonal conflicts disturb consumers' emotions during mobile shopping and in turn, lead to shopping cart abandonment (n=232) | Device utilized for shopping online impacts purchase behavior and thus, shopping cart abandonment behavior |

As shopping cart abandonment is affected by security concerns (Huang et al., 2018; Kukar-Kinney
and Close, 2010; Park and Kim, 2003), previous experiences (Cho et al., 2006), and interpersonal
conflicts (disagreement between oneself and others) (Huang et al., 2018), it becomes obvious that
these behavioral patterns might vary by person (Ding et al., 2015). Hence, such differences in
online shopping cart abandonment would require large sample sizes to reveal precise and
representative results for relevant subgroups (such as new and existing customers). However,
extant literature analyzing shopping cart abandonment applied structural equation modeling, which
inherently relies on a limited number of data points. Consequently, its ability
to explain online shopping cart abandonment in specific subgroups is also limited (Close and
Kukar-Kinney, 2010). Furthermore, information collected for structural equation modeling
is affected by the methodological limitations of five- or seven-point Likert scales, by self-reporting
biases (Woodside, 2013), and by modeling approaches that generally assume homogeneity (Sarstedt et al.,
2011). Therefore, larger datasets avoiding the mentioned limitations would allow segmenting all
consumer information by appropriate variables (such as new and existing customers), resulting in
more homogeneous subgroups and thereby enabling more precise implications (Olbrich and
Holsing, 2011). Customers who gained positive experience with a specific online shop provider
and thereby developed a certain amount of trust and/or loyalty towards the retailer (i.e., existing
customers) are more likely to have fewer security concerns than potential customers visiting an
online shop for the first time (i.e., new customers), leading to a lower likelihood of online
shopping cart abandonment. These differences could then be examined (e.g., regarding the device
used (Huang et al., 2018)) to reveal deeper insights into the individual consumer segments and
derive more precise implications for each segment. Hence, we analyze online shopping cart
abandonment using clickstream data as no prior study has made use of the advantages inherent to
large data sets.
2.2 Clickstream Data
Drawing on a more holistic perspective of online shopping behavior, further literature shifted away
from explanatory behavioral approaches to data-driven methods predicting online purchase
behavior in general. Typically, such predictions are based on clickstream data (Moe and Fader,
2004b; Sismeiro and Bucklin, 2004; van den Poel and Buckinx, 2005). Clickstream data model the
navigation path a customer takes through the online shop (Montgomery, 2001; Montgomery et al.,
2004) and can be extracted from log files which register all requests and information transferred
between the customer’s computer and the company’s commercial web server (Bucklin and
Sismeiro, 2003; Senecal et al., 2005).
In the early stages of clickstream analysis research, literature developed models for predicting
purchase probability (Moe and Fader, 2004a; Montgomery et al., 2004) or visiting behavior (Moe
and Fader, 2004b) by analyzing previous website visiting behavior and purchases (Moe and Fader,
2004a; 2004b). Moreover, clickstream data has been frequently utilized by researchers to predict
not only purchase behavior, but other similar outcome variables as well. For instance, Bucklin and
Sismeiro (2003) as well as Danaher et al. (2006) investigated drivers
affecting the length of time spent viewing a website and the visitors’ decision to continue browsing
or to exit the website. In the latter study, visit duration was additionally analyzed with regards to
different user demographics, where significant interaction effects were observed.
While those early applications of clickstream analysis built their investigation primarily upon visits
and browsing behavior, the majority of the subsequent studies explored variables more closely
related to actual products (van den Poel and Buckinx, 2005), product type (Chang et al., 2004;
Mallapragada et al., 2016), product recommendations (Olbrich and Holsing, 2011; Senecal et al.,
2005), product search refinement tools (Chen and Yao, 2017), product search behavior (Lockwood
et al., 2006; Johnson et al., 2004; Moe, 2006) and the purchase process (Kalczynski et al., 2006;
Sismeiro and Bucklin, 2004).
Unlike previous clickstream data analysis, research started to implement more detailed clickstream
variables and demographic variables (van den Poel and Buckinx, 2005), and analyzed the data
based on segments (Danaher et al., 2006; Li et al., 2002). In these studies, demographics (such as
age, gender) served as segmenting variables (Danaher et al., 2006; Li et al., 2002), and showed to
have a significant effect on visits and time spent on social commerce platforms (Kumar et al.,
2019). Other studies found correlations between demographic information and visiting strategies;
however, such correlations could occur by chance, whereas variables such as behavioral variables
are able to explain visiting strategies causally (Phang et al., 2010).
Even though Moe and Fader (2004a) drew on a taxonomy of online shoppers’ behavior (Moe,
2003) by distinguishing between four groups (directed buyers, search/deliberation visitors, hedonic
browsers, knowledge-building visitors), they developed only one model on an individual level basis
which takes these differently behaving groups into account.
In contrast to literature revealing segments within clickstream data by drawing on demographics,
research revealed characteristic shopper segments based on browsing (Su and Chen, 2015) and
search behavior (Lockwood et al., 2006). Based on segmentation criteria (such as household size,
geographical region, etc.; Mallapragada et al., 2016), and type of shopper (established vs. ordinary;
Olbrich and Holsing, 2011), drivers for increasing online purchases were analyzed. While the
former study highlighted the need for online retailers to treat visitors differently in order to attract
new and to keep loyal customers (Mallapragada et al., 2016), the latter one found established
members of social shopping communities to be more profitable than ordinary users (Olbrich and
Holsing, 2011). Hence, both studies indicate the need for dealing differently with new and existing
customers.
Further, clickstream data was utilized to – inter alia – reduce shopping cart abandonment by varying
the website design (Rich and Wilson, 2010). Rich and Wilson (2010) suggested different website
design implementations to reduce shopping cart abandonment on a B2B website and analyzed
differences between a control group (not exposed to the website changes) and an experimental
group (exposed to website changes). Nevertheless, their study is limited to purely descriptive
results comparing disparities between both groups (each n=450) and by focusing on one B2B
website only. In contrast, we aim to analyze several thousand observations of online shoppers
abandoning their shopping cart, which increases the generalizability of our findings. Besides, we did not
conduct an experiment on the effect of website design changes (harboring the danger of biases),
but collected clickstream data of an existing online shop to analyze reasons for abandoning a
shopping cart.
3 Data and Methodology
3.1 Preliminary Data Analysis and Preprocessing
The aggregated clickstream data were gathered from a leading German online retailer, which primarily
distributes apparel for women and, further, jewelry and home furnishings. The data were created by
extracting actual customers’ chronological online shop activities from sequential log files;
in a first step, all observations or activities during the customers’ sessions were then aggregated.
Thereby, a session is a period of sustained web browsing or a sequence of the user’s page
viewings until the user exits the online shop (Montgomery et al., 2004). We thus obtained
customer-level data.
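The aggregation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the retailer's actual pipeline: the log row layout (customer identifier, page type, items added) and the sample rows are hypothetical, and the output names PIS, PIS_SC, and QUANTITY merely follow the variable names of Table 2.

```python
from collections import defaultdict

# Hypothetical log rows in chronological order: (customer_id, page_type, items_added).
# Field layout and values are illustrative assumptions, not the retailer's schema.
LOG_ROWS = [
    ("c1", "product", 0),
    ("c1", "cart", 1),
    ("c1", "cart", 0),
    ("c2", "product", 0),
    ("c2", "product", 2),
]


def aggregate_by_customer(rows):
    """Collapse individual page requests into one record per customer."""
    totals = defaultdict(lambda: {"PIS": 0, "PIS_SC": 0, "QUANTITY": 0})
    for customer_id, page_type, items_added in rows:
        record = totals[customer_id]
        record["PIS"] += 1                 # every request counts as a page impression
        if page_type == "cart":
            record["PIS_SC"] += 1          # shopping cart page impression
        record["QUANTITY"] += items_added  # items placed in the shopping cart
    return dict(totals)


customer_level = aggregate_by_customer(LOG_ROWS)
```

Each key of `customer_level` then corresponds to one customer-level observation.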
Table 2: Explanatory Variables.

| Variable | Description | References |
|---|---|---|
| SCA_i | Dependent dummy variable capturing customer’s shopping cart abandonment: y_i = 1 if the i-th customer abandoned; y_i = 0 otherwise. | (Huang et al., 2018) |
| LOGIN_i | Dummy variable capturing whether the customer logged into his/her account: x_i = 1 if the i-th customer logged in; x_i = 0 otherwise. | (Suchacka and Chodak, 2017) |
| LOGIN_CHECKOUT_i | Dummy variable capturing whether the customer logged into his/her account during checkout process: x_i = 1 if the i-th customer logged in during checkout; x_i = 0 otherwise. | (Rajamma et al., 2009) |
| PIS_i | Metric variable capturing the number of overall page impressions during the customer’s sessions | (Bucklin and Sismeiro, 2003; van den Poel and Buckinx, 2005; Bucklin et al., 2002; Moe, 2003; Suchacka and Chodak, 2017) |
| PIS_SC_i | Metric variable capturing the number of shopping cart page impressions during the customer’s sessions | (Mallapragada et al., 2016) |
| QUANTITY_i | Metric variable capturing the number of items in the shopping cart during the customer’s sessions | |
| NEW_CUST_i | Dummy variable capturing new customers: x_i = 1 if the i-th customer is new; x_i = 0 otherwise. | |
| MOBILE_CUST_i | Dummy variable capturing customers that access the online shop via mobile phone: x_i = 1 if the i-th customer accessed via mobile phone; x_i = 0 otherwise. | (Huang et al., 2018; Raphaeli et al., 2017; Yang, 2005; Yang and Forney, 2013) |
| WEB_CUST_i | Dummy variable capturing customers that access the online shop via computer: x_i = 1 if the i-th customer accessed via computer; x_i = 0 otherwise. | (Huang et al., 2018; Raphaeli et al., 2017; Yang, 2005; Yang and Forney, 2013) |

We further modeled different explanatory variables for each customer as listed in Table 2. Some
of the variables were developed based on preceding behavioral research findings. Particularly,
inhibitors at the checkout stage were found to be the main factors leading to online shopping cart
abandonment (Rajamma et al., 2009) and thus, two variables were modeled indicating whether
customer i signed into his/her account at all (LOGIN_i) and, more specifically, signed
into his/her account during the checkout process (LOGIN_CHECKOUT_i). The latter might be a strong
indicator for customers overcoming barriers at the checkout stage and therefore completing the
purchase process. However, proceeding to the second step of the checkout process does not
necessarily lead to a purchase and vice-versa purchasing does not necessarily require a login since
customers can complete their purchase as a guest without signing in.
The number of page impressions is frequently considered to be a meaningful predictor by extant
literature (Sismeiro and Bucklin, 2004; van den Poel and Buckinx, 2005; Bucklin et al., 2002).
Hence, the number of overall page impressions (PIS_i) was modeled for each customer. Since most
customers view their shopping cart before they proceed to the checkout process, we further
captured the number of shopping cart page views for each customer (PIS_SC_i). Customers who do
not view their shopping cart might utilize the cart for entertainment purposes or as an organization
tool (Close and Kukar-Kinney, 2010; Kukar-Kinney and Close, 2010) without intending to
complete the purchase process.
We captured the number of items in the customer’s shopping cart (QUANTITY_i) and whether the
customer is new (NEW_CUST_i), which was determined by their login process and/or account
creation. The latter can be concluded from e.g. Rajamma et al.’s (2009) findings proving that
increased perceived transaction inconvenience like long registration forms enhances shopping cart
abandonment which seems particularly applicable to new customers. Moreover, building upon
evidence from prior literature (Moe and Fader, 2004b), it is assumed that new customers exhibit
different online shopping behavioral patterns (including lower purchase propensity) compared with
existing customers. Therefore, we will hereinafter focus on online shopping cart abandonment by
distinguishing between new and existing customers.
Since prior behavioral research investigated mobile shopping cart abandonment (Huang et al.,
2018; Raphaeli et al., 2017), we expect the device the customer is using to access the online shop
to impact online shopping cart abandonment. Thus, we modeled whether customer i is accessing
the online shop via mobile phone (MOBILE_CUST_i) or computer (WEB_CUST_i).
We modeled the dependent variable - shopping cart abandonment (SCAi) - as a dummy variable
using the information about customer’s compiled and ordered shopping carts:
$$
\mathrm{SCA}_i =
\begin{cases}
1 & \text{if number of compiled shopping carts} > 0 \text{ and number of ordered shopping carts} = 0; \\
0 & \text{if number of compiled shopping carts} > 0 \text{ and number of ordered shopping carts} > 0.
\end{cases}
$$
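As a minimal sketch of this labeling rule, the function below derives the dummy from the two cart counts. The function name, its arguments, and the example customers are hypothetical; returning `None` for customers without a compiled cart is an assumption that mirrors the later exclusion of customers who never add an item to their cart.

```python
def sca_label(compiled_carts, ordered_carts):
    """Return 1 for abandonment, 0 for purchase, None if no cart was compiled."""
    if compiled_carts <= 0:
        return None  # no compiled cart: outside the binary problem (assumption)
    return 1 if ordered_carts == 0 else 0


# Hypothetical customers given as (compiled carts, ordered carts).
examples = [(2, 0), (1, 1), (0, 0)]
labels = [sca_label(compiled, ordered) for compiled, ordered in examples]
```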
After preprocessing, the data comprise 121,680 observations from February 1, 2019 to February
7, 2019, i.e., one week. Thus, we considered each weekday once. There was neither a public holiday nor
any other special occasion during the observed week, so that situational variables do not affect the data (Kumar
et al., 2019). Since we want to focus on shopping cart abandoners and purchasing customers as a
binary prediction problem, we filtered out customers not adding any items to their shopping cart,
i.e., so-called hedonic browsers (Moe and Fader, 2004a). We further removed incorrectly captured
observations and thus, 11,586 customer-level observations (9.52%) remained.
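Table 3: Descriptive Statistics.
Abandonments = Observations of Shopping Cart Abandonments (n=7,858); Purchasers = Observations of Purchasers (n=3,728).

| Variable | Abandonments: Mean | Abandonments: SD | Abandonments: Median | Abandonments: Min | Abandonments: Max | Purchasers: Mean | Purchasers: SD | Purchasers: Median | Purchasers: Min | Purchasers: Max |
|---|---|---|---|---|---|---|---|---|---|---|
| PIS | 21.4 | 25.6 | 12 | 1 | 347 | 47.7 | 36.8 | 37 | 4 | 325 |
| PIS_SC | .9 | 2.1 | 0 | 0 | 43 | 3.1 | 3.5 | 2 | 0 | 53 |
| QUANTITY | 2.7 | 3.2 | 2 | 1 | 66 | 3.4 | 3.4 | 2 | 1 | 38 |

| Variable | Abandonments: Counts | Abandonments: Proportion | Purchasers: Counts | Purchasers: Proportion |
|---|---|---|---|---|
| LOGIN | 7,352 | 93.56% | 3,699 | 99.22% |
| LOGIN_CHECKOUT | 514 | 6.54% | 1,401 | 37.58% |
| NEW_CUST | 624 | 7.94% | 212 | 5.69% |
| MOBILE_CUST | 3,765 | 47.91% | 1,151 | 30.87% |
| WEB_CUST | 3,227 | 41.07% | 2,023 | 54.27% |

Note: SD = Standard Deviation, Min = Minimum, Max = Maximum.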
Our data contain 7,858 (67.82%) observations of shopping cart abandonments (or non-purchasers
respectively) and 3,728 (32.18%) observations of purchasers. This distribution of purchasers and
online shopping cart abandonments mirrors prior findings (Close and Kukar-Kinney, 2010),
according to which consumers purchase online approximately 39% of the times they visit an online
shop. Considering the descriptive statistics in Table 3, we find that there is a larger proportion of
customers signing in to their account during the checkout process among the purchasers (37.58%)
than among the shopping cart abandonments (6.54%). Generally, customers place their order after
logging in to their account (99.22%) and do not order as a guest.
Furthermore, the number of purchasers’ overall page impressions (and shopping cart page viewings
respectively) is on average 2.2 times (and 3.4 times respectively) as high as that of non-purchasers.
Purchasers add more items to their shopping cart (3.4 items) than non-purchasers (2.7 items).
Comparing mobile online shoppers with customers using a computer, online shopping cart
abandonment occurs more frequently among mobile shoppers (47.91% or 3,765 observations
respectively) compared to customers shopping via computer (41.07% or 3,227 observations
respectively). This is further reflected by a higher number of customers completing their purchases
via computer (54.27% or 2,023 observations respectively) compared to shoppers via mobile
(30.87% or 1,151 observations respectively). Table A.1 in the Appendix presents the correlation
coefficients of all variables.
To gather further insights into the effect of new and existing customers on online shopping cart
abandonment, we split our dataset: the first segment comprises 836 observations of
new customers, whereas the other segment consists of 10,750 observations of existing customers.
Table 4 depicts the descriptive statistics of the new customers’ segment. The data comprise 624
(74.64%) observations of abandonments and 212 (25.36%) observations of purchasers. Among the
abandoners, there is a larger proportion of mobile customers (321 observations or 51.44%
respectively) compared to the purchasers (75 observations or 35.38% respectively) and in turn, a
larger proportion of purchasers via computer (108 observations or 50.94% respectively) than
abandoners via computer (229 observations or 36.70% respectively). Further, purchasers exhibit a
higher number of page impressions (46.1) compared to abandoners (40.9) on average. Table A.2
in the Appendix shows the correlation coefficients of the variables.
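The segment sizes and abandonment shares can be checked with a few lines of arithmetic. The sketch below assumes only the counts reported in the text and in Tables 4 and 5 (624 and 212 for new customers, 7,234 and 3,516 for existing customers); the dictionary layout and function name are illustrative.

```python
# Counts of abandoners and purchasers per segment, as reported in Tables 4 and 5.
SEGMENTS = {
    "new": {"abandoners": 624, "purchasers": 212},
    "existing": {"abandoners": 7234, "purchasers": 3516},
}


def abandonment_share(segment):
    """Share of abandoners within a segment, in percent."""
    total = segment["abandoners"] + segment["purchasers"]
    return 100 * segment["abandoners"] / total


# Both segments together should reproduce the full sample size.
total_observations = sum(
    s["abandoners"] + s["purchasers"] for s in SEGMENTS.values()
)
shares = {name: round(abandonment_share(s), 2) for name, s in SEGMENTS.items()}
```

Under these counts, the two segments add up to the 11,586 customer-level observations, and the share of abandoners is higher in the new customer segment than in the existing customer segment.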
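Table 4: Descriptive Statistics of New Customers’ Segment.
Abandonments = Observations of Shopping Cart Abandonments (n=624); Purchasers = Observations of Purchasers (n=212).

| Variable | Abandonments: Mean | Abandonments: SD | Abandonments: Median | Abandonments: Min | Abandonments: Max | Purchasers: Mean | Purchasers: SD | Purchasers: Median | Purchasers: Min | Purchasers: Max |
|---|---|---|---|---|---|---|---|---|---|---|
| PIS | 40.9 | 32.7 | 30 | 3 | 232 | 46.1 | 37.5 | 33 | 5 | 234 |
| PIS_SC | 2.5 | 3.0 | 2 | 0 | 27 | 2.6 | 2.9 | 2 | 0 | 16 |
| QUANTITY | 2.8 | 3.0 | 2 | 1 | 37 | 2.8 | 2.8 | 2 | 1 | 17 |

| Variable | Abandonments: Counts | Abandonments: Proportion | Purchasers: Counts | Purchasers: Proportion |
|---|---|---|---|---|
| LOGIN | 118 | 18.91% | 183 | 86.32% |
| LOGIN_CHECKOUT | 103 | 16.51% | 169 | 79.72% |
| MOBILE_CUST | 321 | 51.44% | 75 | 35.38% |
| WEB_CUST | 229 | 36.70% | 108 | 50.94% |

Note: SD = Standard Deviation, Min = Minimum, Max = Maximum.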
Table 5 and Table A.3 in the Appendix summarize the descriptive statistics and correlation matrix
respectively of the existing customers’ segment. As existing customers were identified via their
login into their account and in order to achieve a model as parsimonious as possible, we dropped
the LOGIN variable. As in the new customers’ segment, there is a larger proportion of mobile shopping cart abandoners
(3,444 observations or 47.61% respectively) than mobile purchasers (1,076 observations or 30.60%
respectively) among the existing customers and vice-versa, a larger proportion of purchasers via
computer (1,915 observations or 54.47% respectively) than abandoners via computer (2,998
observations or 41.44% respectively). Further, purchasers view more pages (47.8 impressions on average) and
more specifically, exhibit a higher number of shopping cart page views (3.2 impressions on
average) than abandoners (19.8 and .8 impressions on average). Purchasers add more items to their
shopping cart (3.4 items on average) than abandoners (2.8 items on average).
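Table 5: Descriptive Statistics of Existing Customers’ Segment.
Abandonments = Observations of Shopping Cart Abandonments (n=7,234); Purchasers = Observations of Purchasers (n=3,516).

| Variable | Abandonments: Mean | Abandonments: SD | Abandonments: Median | Abandonments: Min | Abandonments: Max | Purchasers: Mean | Purchasers: SD | Purchasers: Median | Purchasers: Min | Purchasers: Max |
|---|---|---|---|---|---|---|---|---|---|---|
| PIS | 19.8 | 24.3 | 11 | 1 | 347 | 47.8 | 36.8 | 37 | 4 | 325 |
| PIS_SC | .8 | 1.9 | 0 | 0 | 43 | 3.2 | 3.6 | 2 | 0 | 53 |
| QUANTITY | 2.8 | 3.2 | 2 | 1 | 66 | 3.4 | 3.4 | 2 | 1 | 38 |

| Variable | Abandonments: Counts | Abandonments: Proportion | Purchasers: Counts | Purchasers: Proportion |
|---|---|---|---|---|
| LOGIN_CHECKOUT | 411 | 5.68% | 1,232 | 35.04% |
| MOBILE_CUST | 3,444 | 47.61% | 1,076 | 30.60% |
| WEB_CUST | 2,998 | 41.44% | 1,915 | 54.47% |

Note: SD = Standard Deviation, Min = Minimum, Max = Maximum.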
3.2 Model Selection
Similar to preceding purchase behavior literature (van den Poel and Buckinx, 2005; Punj, 2011;
Emmanouilides and Hammond, 2000; Rajamma et al., 2009), we used a logit model to predict
shopping cart abandonment. We chose logit modeling as it can be considered conceptually simple
and suitable for meaningful interpretations (in contrast to e.g. high-performance machine learning
approaches), it yields robust results and thus, is frequently utilized in marketing literature (Batsell,
1980; Bucklin and Gupta, 1992; Guadagni and Little, 1983; Malhotra, 1982; 1984). The binary
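logit model assesses the a posteriori probability of the outcome shopping cart abandonment
$y_i = 1$ for each customer $i = 1, \dots, n$ with independent variables $x_i$ and parameters $\beta$, i.e.,
$$
P(y_i = 1 \mid x_i) = \frac{\exp(x_i^\top \beta)}{1 + \exp(x_i^\top \beta)}.
$$
In our case, the set of possible independent variables is given by
$\{\mathrm{LOGIN}_i, \mathrm{LOGIN\_CHECKOUT}_i, \mathrm{PIS}_i, \mathrm{PIS\_SC}_i, \mathrm{QUANTITY}_i, \mathrm{NEW\_CUST}_i, \mathrm{MOBILE\_CUST}_i, \mathrm{WEB\_CUST}_i\}$.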
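To illustrate how a binary logit model turns predictor values into an abandonment probability, the following minimal sketch evaluates the logistic function for one customer. The coefficient values are hypothetical and chosen for illustration only; they are not the estimates of this study, and the intercept is passed separately from the slope coefficients as a design choice of the sketch.

```python
import math


def abandonment_probability(x, beta, intercept):
    """P(y_i = 1 | x_i) under a binary logit model."""
    linear = intercept + sum(b * v for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-linear))


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coefficient)


# Hypothetical coefficients for two dummy predictors (illustration only).
beta = [0.5, -1.0]
p = abandonment_probability([1, 0], beta, intercept=0.0)
odds = p / (1.0 - p)
```

In this sketch, the odds `p / (1 - p)` equal `odds_ratio(0.5)`, because only the first dummy is switched on and the intercept is zero. A positive coefficient therefore raises the odds of abandonment, and a negative one lowers them.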
Since our main aim is to assess the contribution or impact of each variable for online shopping cart
abandonment in order to derive implications for e-commerce businesses by collating new versus
existing customers, we compared different model formulations similar to previous purchase
behavior literature (Bucklin and Sismeiro, 2003; Mallapragada et al., 2016; van den Poel and
Buckinx, 2005). We selected the best specification by sequentially adding predictors (i.e., forward
variable selection) and testing their contribution to the model performance measures, i.e.,
Nagelkerke’s Pseudo-R² (Nagelkerke, 1991), -2 log-likelihood, and Akaike Information Criterion
(AIC) (Akaike, 1973). Table 6 summarizes the results from fitting the different logit models for
various specifications of the independent variables, ranging from intercept only to full
specification. From Model S1 to Model S2, we can observe a steady increase (or decrease
respectively) of the Pseudo-R² (or -2 log-likelihood respectively). Albeit Model S7 provides the
best AIC value, Model S9 (i.e., the full model) is the best-fitting model with respect to the
Pseudo-R² and -2 log-likelihood results. Nevertheless, its improvements compared to Model S8 are
marginal and according to the likelihood ratio test, Model S9 does not constitute an improvement
over Model S8. Thus, we consider Model S8 as a baseline model for further analysis due to parsimony
reasons. The baseline model can be represented by
No critical levels of multicollinearity in the baseline model were found since the correlation
coefficients were all below .80 and the variance inflation factors (VIFs) were all well below 3.00.
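The two multicollinearity thresholds named above are related. In general, the VIF of predictor j is 1 / (1 - R_j²), where R_j² stems from regressing predictor j on all other predictors. The sketch below covers only the simplified special case of a single other predictor, in which R_j² equals the squared pairwise correlation; the function name is hypothetical and the general multi-predictor case is not implemented here.

```python
def vif_two_predictors(r):
    """VIF of a predictor that is correlated with one other predictor at level r."""
    return 1.0 / (1.0 - r ** 2)


# VIF implied by a pairwise correlation at the .80 threshold.
vif_at_threshold = vif_two_predictors(0.80)
```

In this special case, a pairwise correlation of .80 corresponds to a VIF of roughly 2.78, so correlations below .80 are in line with VIFs below 3.00.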
To gain further insights into varying effects of subgroups, we set up models for both the new and
existing customer segment respectively. We conducted the same procedure for variable selection
as described above for the baseline model, see Table 7 and 8: For both segments, we tested each
variable’s contribution to the goodness of fit measures. Considering the new customer segment,
Model N7 yields the best values of Pseudo-R and AIC. Albeit Model N8 provides the best -2 log-
likelihood results, it does not consitute an improvement to Model N7 with respect to the likelihood
ratio test. We further tested the goodness of fit measures if the variables LOGIN_CHECKOUT,
PIS, and PIS_SC were excluded from Model N7 since their contribution to the investigated
measures was marginal but every considered constellation would be inferior to Model N7.
Accordingly, we choose Model N7 for further analysis of the new customer segment
Although the variables LOGIN and LOGIN_CHECKOUT exhibit high correlation (see Table A.2
in the Appendix), the VIFs of all considered variables are around or below 3.00. The model yields
a Pseudo-R² of .504.
Turning to the existing customer segment, there is a steady increase in Pseudo-R² and a steady
decrease in -2 log-likelihood and AIC from Model E1 to Model E6.
Further, since Model E7 does not constitute an improvement over Model E6 regarding the
goodness-of-fit measures, we consider Model E6 for further investigation. The model can be written as
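Assuming the standard logit form and the six terms of Model E6 listed in Table 8, the model for existing customers can be sketched as follows. The symbols $p_i$ and $\beta_0, \dots, \beta_5$ are notation assumed here and need not match the original typesetting:

```latex
\ln\!\left(\frac{p_i}{1 - p_i}\right)
  = \beta_0
  + \beta_1\,\mathrm{LOGIN\_CHECKOUT}_i
  + \beta_2\,\ln(\mathrm{PIS}_i)
  + \beta_3\,\ln(\mathrm{PIS\_SC}_i + 1)
  + \beta_4\,\ln(\mathrm{QUANTITY}_i)
  + \beta_5\,\mathrm{MOBILE\_CUST}_i
```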
All variables exhibit a VIF around or below 2.00 and thus, multicollinearity does not pose a
problem. Our final model for the existing customer segment exhibits a Pseudo-R² of .450.
We now discuss the parameter estimation results for the best-fitting baseline Model S8 and the
Models N7 and E6 for both subgroups of new and existing customers respectively.
Table 6: Logit Model Specifications: Baseline Model.

| Variable | Model S1 | Model S2 | Model S3 | Model S4 | Model S5 | Model S6 | Model S7 | Model S8 | Model S9 |
|---|---|---|---|---|---|---|---|---|---|
| INTERCEPT | .7457 (.0199) | 2.8592 (.1909) | 2.8592 (.1909) | 4.2324 (.2121) | 6.5500 (.2304) | 6.4505 (.2301) | 5.9574 (.2684) | 5.7446 (.2695) | 5.7653 (.2775) |
| LOGIN | | -2.1723 (.1920) | -1.7688 (.1925) | -2.4011 (.2085) | -2.7856 (.2023) | -2.7924 (.2011) | -2.3008 (.2441) | -2.2742 (.2442) | -2.2740 (.2442) |
| LOGIN_CHECKOUT | | | -2.0932 (.0569) | -1.6891 (.0601) | -1.2819 (.0628) | -1.2018 (.0636) | -1.2781 (.0677) | -1.2697 (.0680) | -1.2695 (.0679) |
| ln(PIS) | | | | -.0266 (.0009) | -.7181 (.0382) | -.7187 (.0386) | -.7197 (.0386) | -.7291 (.0389) | -.7299 (.0389) |
| ln(PIS_SC+1) | | | | | -.8643 (.0478) | -1.0107 (.0522) | -1.0126 (.0522) | -.9684 (.0525) | -.9675 (.0526) |
| ln(QUANTITY) | | | | | | .2857 (.0386) | .2907 (.0387) | .3001 (.0389) | .2999 (.0389) |
| NEW_CUST | | | | | | | .4943 (.1394) | .4615 (.1394) | .4617 (.1395) |
| MOBILE_CUST | | | | | | | | .4601 (.0518) | .4414 (.0789) |
| WEB_CUST | | | | | | | | | -.0239 (.0761) |
| Pseudo-R² (Nagelkerke) | 0 | .029 | .201 | .405 | .433 | .438 | .439 | .446 | .446 |
| ΔPseudo-R² | | +.029 | +.172 | +.204 | +.018 | +.005 | +.001 | +.007 | 0 |
| -2 log-likelihood | 14,556.53 | 14,314.86 | 12,758.89 | 10,597.68 | 10,252.11 | 10,195.44 | 10,183.10 | 10,103.59 | 10,103.49 |
| Δ-2 log-likelihood | | -241.6 | -155.97 | -2,161.21 | -345.57 | -56.67 | -12.34 | -79.51 | -.1 |
| AIC | 14,559 | 14,319 | 12,765 | 10,606 | 10,262 | 10,207 | 10,197 | 10,120 | 10,121 |
| ΔAIC | | -24 | -1,554 | -2,159 | -344 | -55 | -10 | +3 | +1 |
| Chi-Square | 0 | 241.66*** | 1,556.00*** | 2,161.20*** | 345.57*** | 5.00*** | 12.34*** | 79.52*** | .10 |

*** = p < .001, ** = p < .01, * = p < .05.
Note: Values for Variables are Estimates and Standard Errors in Parentheses. Likelihood Ratio Test Results (Chi-Square Values) Refer to the Respective Preceding Model.
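As a plausibility check on Table 6, Nagelkerke's Pseudo-R² can be recomputed from the -2 log-likelihood of the intercept-only model and of the fitted model. The sketch below assumes that the 11,586 observations mentioned in the Conclusion form the estimation sample of the baseline model; the function name is illustrative.

```python
import math

def nagelkerke_r2(neg2ll_null, neg2ll_model, n_obs):
    """Nagelkerke's Pseudo-R² from -2 log-likelihood values (Nagelkerke, 1991)."""
    # Cox-Snell R²: 1 - (L_null / L_model)^(2/n), written in terms of -2 log-likelihoods.
    cox_snell = 1.0 - math.exp(-(neg2ll_null - neg2ll_model) / n_obs)
    # Upper bound of the Cox-Snell R²: 1 - L_null^(2/n).
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n_obs)
    return cox_snell / max_cox_snell

# Model S1 (intercept only) versus Model S8 (baseline), values from Table 6.
# n_obs = 11,586 is an assumption taken from the sample size stated in the Conclusion.
r2_s8 = nagelkerke_r2(neg2ll_null=14556.53, neg2ll_model=10103.59, n_obs=11586)
print(f"Nagelkerke Pseudo-R² for Model S8: {r2_s8:.3f}")
```

Under this assumption the recomputed value agrees with the .446 reported for Model S8.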
Table 7: Logit Model Specifications: New Customers’ Segment.

| Variable | Model N1 | Model N2 | Model N3 | Model N4 | Model N5 | Model N6 | Model N7 | Model N8 |
|---|---|---|---|---|---|---|---|---|
| INTERCEPT | 1.0796 (.0795) | 2.8592 (.1909) | 2.8592 (.1909) | 3.1070 (.4972) | 3.3367 (.5315) | 3.2788 (.5307) | 3.0586 (.5424) | 3.1890 (.6151) |
| LOGIN | | -3.2980 (.2245) | -2.7902 (.4178) | -2.7634 (.4205) | -2.7893 (.4214) | -3.0270 (.4352) | -2.6920 (.4416) | -2.6846 (.4418) |
| LOGIN_CHECKOUT | | | -.5642 (.3921) | -.5694 (.3924) | -.5289 (.3938) | -.5334 (.3986) | -.9142 (.4138) | -.9193 (.4139) |
| ln(PIS) | | | | -.0748 (.1376) | -.2264 (.1786) | -.1778 (.1804) | -.2335 (.1834) | -.2388 (.1836) |
| ln(PIS_SC+1) | | | | | .2784 (.2084) | .7520 (.2484) | .8017 (.2495) | .8097 (.2496) |
| ln(QUANTITY) | | | | | | -.6643 (.1845) | -.6278 (.1878) | -.6320 (.1878) |
| MOBILE_CUST | | | | | | | .8027 (.2204) | .6838 (.3417) |
| WEB_CUST | | | | | | | | -.1517 (.3334) |
| Pseudo-R² (Nagelkerke) | 0 | .467 | .469 | .470 | .472 | .488 | .504 | .504 |
| ΔPseudo-R² | | +.467 | +.002 | +.001 | +.002 | +.016 | +.016 | +0 |
| -2 log-likelihood | 946.76 | 628.59 | 626.53 | 626.24 | 624.45 | 611.36 | 597.65 | 597.44 |
| Δ-2 log-likelihood | | -318.17 | -2.06 | -.29 | -1.79 | -13.09 | -13.71 | -.21 |
| AIC | 948.76 | 632.59 | 632.53 | 634.24 | 634.45 | 623.36 | 611.65 | 613.44 |
| ΔAIC | | -316.17 | -.06 | +1.71 | +.21 | -11.09 | -11.71 | +1.79 |
| Chi-Square | 0 | 318.16*** | 2.06 | .29 | 1.79 | 13.09*** | 13.71*** | .21 |

*** = p < .001, ** = p < .01, * = p < .05.
Note: Values for Variables are Estimates and Standard Errors in Parentheses. Likelihood Ratio Test Results (Chi-Square Values) Refer to the Respective Preceding Model.
Table 8: Logit Model Specifications: Existing Customers’ Segment.

| Variable | Model E1 | Model E2 | Model E3 | Model E4 | Model E5 | Model E6 | Model E7 |
|---|---|---|---|---|---|---|---|
| INTERCEPT | .7215 (.0206) | 1.0944 (.0242) | 4.7690 (.1026) | 3.8530 (.1087) | 3.7382 (.1105) | 3.5607 (.1127) | 3.5848 (.1320) |
| LOGIN_CHECKOUT | | -2.1922 (.0619) | -1.4533 (.0668) | -1.3466 (.0683) | -1.2658 (.0691) | -1.2522 (.0693) | -1.2520 (.0693) |
| ln(PIS) | | | -1.2336 (.0309) | -.7318 (.0395) | -.7333 (.0399) | -.7413 (.0401) | -.7423 (.0402) |
| ln(PIS_SC+1) | | | | -.9262 (.0496) | -1.0826 (.0540) | -1.0416 (.0543) | -1.0406 (.0544) |
| ln(QUANTITY) | | | | | .3125 (.0402) | .3210 (.0403) | .3209 (.0403) |
| MOBILE_CUST | | | | | | .4298 (.0538) | .4081 (.0818) |
| WEB_CUST | | | | | | | -.0277 (.0788) |
| Pseudo-R² (Nagelkerke) | 0 | .180 | .405 | .439 | .445 | .450 | .450 |
| ΔPseudo-R² | | +.180 | +.225 | +.034 | +.006 | +.005 | +0 |
| -2 log-likelihood | 13,589.81 | 12,106.65 | 9,895.57 | 9,523.50 | 9,460.44 | 9,396.15 | 9,396.03 |
| Δ-2 log-likelihood | | -1,489.16 | -2,211.08 | -372.07 | -63.06 | -64.29 | -.12 |
| AIC | 13,592 | 12,111 | 9,901.6 | 9,531.5 | 9,470.4 | 9,408.2 | 9,410 |
| ΔAIC | | -1,481 | -2,209.4 | -370.1 | -161.1 | -62.2 | +1.8 |
| Chi-Square | | 1,483.2*** | 2,211.10*** | 372.07*** | 63.06*** | 64.29*** | 0.12 |

*** = p < .001, ** = p < .01, * = p < .05.
Note: Values for Variables are Estimates and Standard Errors in Parentheses. Likelihood Ratio Test Results (Chi-Square Values) Refer to the Respective Preceding Model.
4 Results
Table 9 provides a summary of the best-fitting baseline model based on Pseudo-R², -2 log-likelihood,
and the likelihood ratio test. We further present the variables’ odds ratios. If an odds
ratio is above 1, the variable increases the odds of online shopping cart abandonment,
and vice versa for values smaller than 1. If the odds ratio is exactly 1, the variable has no
impact on online shopping cart abandonment.
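The relation between the estimates and the odds ratios is simply the exponential function. As a minimal sketch, the odds ratios of the baseline model can be reproduced from the estimates reported in Table 9 (the dictionary and variable names are illustrative):

```python
import math

# Estimates of the baseline model (Model S8) as reported in Table 9.
BASELINE_ESTIMATES = {
    "LOGIN": -2.2742,
    "LOGIN_CHECKOUT": -1.2697,
    "ln(PIS)": -0.7291,
    "ln(PIS_SC+1)": -0.9684,
    "ln(QUANTITY)": 0.3001,
    "NEW_CUST": 0.4615,
    "MOBILE_CUST": 0.4601,
}

# The odds ratio of a logit coefficient is exp(coefficient).
odds_ratios = {name: math.exp(beta) for name, beta in BASELINE_ESTIMATES.items()}

for name, ratio in odds_ratios.items():
    effect = "raises" if ratio > 1 else "lowers"
    print(f"{name:<15} odds ratio {ratio:.2f} -> {effect} the odds of abandonment")
```

Taking the reciprocal of an odds ratio below 1 gives the corresponding factor for the odds of purchasing, e.g., 1/.28 ≈ 3.57 for LOGIN_CHECKOUT.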
Drawing on the odds ratios above 1, for customers shopping via mobile phone, the odds
of online shopping cart abandonment are 1.58 times as high as for customers shopping via other
devices. Similarly, for new customers, the odds of shopping cart abandonment are 1.59
times as high as for existing customers. Further, the number of items in the customer’s shopping
cart has a significant impact on shopping cart abandonment: because QUANTITY enters the model in
logarithmic form, every 2.72-fold (e-fold) increase in the number of items in the
customer’s shopping cart raises the odds of abandonment by 35%.
In contrast, the odds of making a purchase increase by approximately 108% with every 2.72-fold
increase in page impressions and by approximately 163% with every 2.72-fold increase in
impressions of the shopping cart page (more precisely, in PIS_SC + 1). The PIS variable yielded
the highest improvement regarding Pseudo-R², -2 log-likelihood, and AIC and therefore provides
high explanatory power (see Table 6).
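Because PIS, PIS_SC, and QUANTITY enter the model as natural logarithms, their odds ratios describe multiplicative rather than additive changes in the underlying counts: multiplying the count by a factor k multiplies the odds by k raised to the power of the coefficient. A short sketch for ln(QUANTITY) in the baseline model (the function name is illustrative):

```python
import math

def odds_multiplier(beta, fold):
    """Factor applied to the odds when a log-transformed predictor is multiplied by `fold`."""
    # exp(beta * ln(fold)) simplifies to fold ** beta.
    return fold ** beta

BETA_LN_QUANTITY = 0.3001  # estimate for ln(QUANTITY) in Table 9

e_fold = odds_multiplier(BETA_LN_QUANTITY, math.e)  # quantity multiplied by about 2.72
doubling = odds_multiplier(BETA_LN_QUANTITY, 2)     # quantity doubled

print(f"e-fold increase in items: odds of abandonment x {e_fold:.2f}")
print(f"doubling the items:       odds of abandonment x {doubling:.2f}")
```

The e-fold case reproduces the reported odds ratio of 1.35; a doubling of the number of items corresponds to a smaller increase of roughly 23%.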
Furthermore, the odds of making a purchase are approximately 10 times higher if the
customer logs in to his/her account. More specifically, if the customer logs in to his/her account
during checkout, the odds of making a purchase are 3.57 times higher, i.e., it is likely that the
customer completes the purchase process if he/she logs in during checkout. This becomes even
more apparent considering the descriptive statistics in Table 3: among all purchasers, 37.58%
logged in to their account during checkout. Further, with a Pseudo-R² improvement of .172 (see
Table 6), the variable yielded a major improvement of the model compared to the null model and
thus, is assumed to provide high explanatory power.
Table 9: Best-Fitting Baseline Logit Model.

| Variable | Estimate | Standard Error | Wald’s z | Significance | Odds Ratio |
|---|---|---|---|---|---|
| INTERCEPT | 5.7446 | .2695 | 21.32 | < .001 | 312.51 |
| LOGIN | -2.2742 | .2442 | -9.31 | < .001 | .10 |
| LOGIN_CHECKOUT | -1.2697 | .0680 | -18.68 | < .001 | .28 |
| ln(PIS) | -.7291 | .0389 | -18.77 | < .001 | .48 |
| ln(PIS_SC+1) | -.9684 | .0525 | -18.44 | < .001 | .38 |
| ln(QUANTITY) | .3001 | .0389 | 7.72 | < .001 | 1.35 |
| NEW_CUST | .4615 | .1394 | 3.31 | < .001 | 1.59 |
| MOBILE_CUST | .4601 | .0518 | 8.88 | < .001 | 1.58 |
| Pseudo-R² (Nagelkerke) | .446 | | | | |
| -2 log-likelihood | 10,103.59 | | | | |
| AIC | 10,120 | | | | |
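To illustrate how the baseline model translates into predictions, the following sketch computes the predicted abandonment probability for a single observation from the Table 9 estimates. It assumes that the binary variables are coded 0/1 and that PIS, PIS_SC, and QUANTITY are raw counts (with PIS and QUANTITY at least 1); the example observation is hypothetical and not taken from the data.

```python
import math

# Estimates of the baseline model as reported in Table 9.
BASELINE = {
    "INTERCEPT": 5.7446,
    "LOGIN": -2.2742,
    "LOGIN_CHECKOUT": -1.2697,
    "LN_PIS": -0.7291,
    "LN_PIS_SC_PLUS_1": -0.9684,
    "LN_QUANTITY": 0.3001,
    "NEW_CUST": 0.4615,
    "MOBILE_CUST": 0.4601,
}

def abandonment_probability(login, login_checkout, pis, pis_sc, quantity,
                            new_cust, mobile_cust):
    """Predicted probability of cart abandonment (assumes 0/1 dummies, raw counts)."""
    eta = (BASELINE["INTERCEPT"]
           + BASELINE["LOGIN"] * login
           + BASELINE["LOGIN_CHECKOUT"] * login_checkout
           + BASELINE["LN_PIS"] * math.log(pis)
           + BASELINE["LN_PIS_SC_PLUS_1"] * math.log(pis_sc + 1)
           + BASELINE["LN_QUANTITY"] * math.log(quantity)
           + BASELINE["NEW_CUST"] * new_cust
           + BASELINE["MOBILE_CUST"] * mobile_cust)
    return 1.0 / (1.0 + math.exp(-eta))  # inverse logit

def odds(probability):
    return probability / (1.0 - probability)

# Hypothetical observation: logged-in existing customer, 10 page impressions,
# 1 cart page impression, 2 items, no login during checkout.
session = dict(login=1, login_checkout=0, pis=10, pis_sc=1, quantity=2, new_cust=0)
p_other = abandonment_probability(mobile_cust=0, **session)
p_mobile = abandonment_probability(mobile_cust=1, **session)

print(f"other device: {p_other:.3f}, mobile phone: {p_mobile:.3f}")
print(f"ratio of odds: {odds(p_mobile) / odds(p_other):.2f}")
```

The ratio of the two odds equals the odds ratio of MOBILE_CUST regardless of the other values, whereas the difference in predicted probabilities depends on where the observation sits on the logistic curve.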
As stated earlier, we extended the baseline model by splitting up the dataset into subgroups of new
and existing customers. The logit model for the former group is presented in Table 10.
The odds of shopping cart abandonment among new customers are 2.23 times as high if the
customer accesses the shop via his/her mobile phone compared to new customers accessing the
online shop via tablet or computer. Similarly, for every 2.72-fold increase in impressions of the
shopping cart page, the odds of online shopping cart abandonment increase by approximately 123%
for new customers.
The odds of making a purchase among new customers are raised by 14.29 times if the customer logs
in to his/her account (i.e., creates an account). The explanatory power of the LOGIN variable is
further reflected by its substantial Pseudo-R² enhancement of .467 (see Table 7). Moreover, the
odds of completing the purchasing process among new customers rise by 2.50 times if the
customer logs in to his/her account during checkout (i.e., creates an account during checkout). For
every 2.72-fold increase in the number of items in new customers’ shopping cart, the odds of
completing the purchase become approximately 89% higher.
Table 10: Best-Fitting Logit Model for New Customers’ Segment.

| Variable | Estimate | Standard Error | Wald’s z | Significance | Odds Ratio |
|---|---|---|---|---|---|
| INTERCEPT | 3.0586 | .5424 | 5.63 | < .001 | 21.30 |
| LOGIN | -2.6920 | .4416 | -6.10 | < .001 | .07 |
| LOGIN_CHECKOUT | -.9142 | .4138 | -2.21 | .027 | .40 |
| ln(PIS) | -.2335 | .1834 | -1.27 | .203 | .79 |
| ln(PIS_SC+1) | .8017 | .2495 | 3.21 | .001 | 2.23 |
| ln(QUANTITY) | -.6278 | .1878 | -3.34 | < .001 | .53 |
| MOBILE_CUST | .8027 | .2204 | 3.64 | < .001 | 2.23 |
| Pseudo-R² (Nagelkerke) | .504 | | | | |
| -2 log-likelihood | 597.65 | | | | |
| AIC | 611.65 | | | | |
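The Wald statistics and significance levels in Table 10 follow from dividing each estimate by its standard error and referring the result to the standard normal distribution. A minimal sketch, assuming two-sided tests and using the intercept's standard error as listed for Model N7 in Table 7 (the function name is illustrative):

```python
import math

def wald_test(estimate, std_error):
    """Wald z statistic and two-sided p-value under the standard normal distribution."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# (estimate, standard error) pairs for selected variables of Model N7.
MODEL_N7 = {
    "INTERCEPT": (3.0586, 0.5424),
    "LOGIN_CHECKOUT": (-0.9142, 0.4138),
    "ln(PIS)": (-0.2335, 0.1834),
}

results = {name: wald_test(est, se) for name, (est, se) in MODEL_N7.items()}
for name, (z, p) in results.items():
    print(f"{name:<15} z = {z:6.2f}, p = {p:.3f}")
```

For LOGIN_CHECKOUT and ln(PIS) this reproduces the reported significance levels of .027 and .203, which shows that ln(PIS) is not significant in the new customer segment at conventional levels.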
Table 11 depicts the existing customers’ segment logit model. Similarly to new customers, the odds
of online shopping cart abandonment are 1.54 times as high if the existing customer shops
via his/her mobile phone. However, in contrast to new customers, the odds of abandoning the
online shopping cart rise by approximately 38% for every 2.72-fold increase in the number of items
in the existing customers’ cart.
If the existing customer logs in to his/her account during checkout, the odds of making a purchase
become 3.49 times higher. Besides, for every 2.72-fold increase in page impressions and, more
specifically, in impressions of the shopping cart page, the odds of
completing the transaction rise by approximately 108% and 186% respectively. For the existing
customers’ segment, the variable for the number of page impressions yielded the highest
Pseudo-R² improvement and hence can be considered to provide considerable explanatory power for
existing customers’ shopping cart abandonment.
Table 11: Best-Fitting Logit Model for Existing Customers’ Segment.

| Variable | Estimate | Standard Error | Wald’s z | Significance | Odds Ratio |
|---|---|---|---|---|---|
| INTERCEPT | 3.5607 | .1127 | 31.60 | < .001 | 35.19 |
| LOGIN_CHECKOUT | -1.2522 | .0693 | -18.07 | < .001 | .29 |
| ln(PIS) | -.7413 | .0401 | -18.47 | < .001 | .48 |
| ln(PIS_SC+1) | -1.0416 | .0543 | -19.19 | < .001 | .35 |
| ln(QUANTITY) | .3210 | .0403 | 7.96 | < .001 | 1.38 |
| MOBILE_CUST | .4298 | .0538 | 7.99 | < .001 | 1.54 |
| Pseudo-R² (Nagelkerke) | .450 | | | | |
| -2 log-likelihood | 9,396.15 | | | | |
| AIC | 9,408.2 | | | | |
5 Discussion
5.1 Theoretical Contribution
The findings extend and complement prior literature on online shopping cart abandonment. Since
extant research focused on the behavioral perspective, we investigated cart abandonment with real
unbiased online shop clickstream data to build upon prior findings and gain further insights. To the
best of our knowledge, the underlying research is the first study to fill a research gap within online
shopping cart abandonment literature regarding different customer segments by investigating the
drivers of cart abandonment for both new and existing customers, whereas prior purchase behavior
research focused on other segmentations such as comparing ordinary users with community
members (Olbrich and Holsing, 2011), household size, geographical region, and race
(Mallapragada et al., 2016), or demographic data (Phang et al., 2010; Kumar et al., 2019). With
our research question, we complement existing shopping cart abandonment literature by providing
more granular insights on the factors of online shopping cart abandonment of the two segments
existing versus new customers, which has not been investigated before. Specifically, we make the
following contributions to existing research:
There is empirical evidence that online shopping cart abandonment occurs more frequently for both
new and existing customers accessing the online shop via mobile phone. Prior shopping cart
research focused on the impact of mobile shopping on shopping cart abandonment in general
(Huang et al., 2018), but did not explore the effect of mobile shopping on different customer
segments. We thus show that using mobile phones for online shopping impacts the probability of
completing the purchase negatively regardless of the length of customer relationship. Nevertheless,
our findings indicate that the impact is stronger for new than for existing customers resulting in
large profit losses. This might be due to perceived security and privacy concerns of mobile
shopping (Yang, 2005; Yang and Forney, 2013) or mobile shopping specific attributes in general
that deter the completion of the transaction (Huang et al., 2018).
Overall, the findings indicate that new customers exhibit a decreasing probability of completing
the purchase process with every additional shopping cart page impression. Extant literature focused
on the impact of overall page views in a purchase behavior context (Bucklin and Sismeiro, 2003;
van den Poel and Buckinx, 2005; Moe, 2003), but did not investigate the effect of specific page
types such as the shopping cart page across different customer segments. Based on the results, it
can be assumed that the longer new customers reflect on their shopping cart, i.e., the more often they
view the shopping cart page, the more likely they might perceive trust issues or risks regarding the
unknown online shop (Belanger et al., 2002; Liao and Cheung, 2001; Ranganathan and Ganapathy,
2002) or might perceive the registration process as inconvenient (Rajamma et al., 2009; Srinivasan
et al., 2002; Girard et al., 2003).
Moreover, the results showed that with an increasing number of items in the existing customers’
shopping cart the likelihood of completing the transaction decreases. Apparently, existing
customers perceive increasing negative financial consequences with a higher number of items in
their shopping cart and hence, do not complete the transaction. Additionally, this might indicate
that existing customers misuse the shopping cart and further, their account as an organization tool
to store products within their account to rediscover the products quickly and maintain the option
of purchasing them in a later session (Close and Kukar-Kinney, 2010; Kukar-Kinney and Close,
2010). In contrast, new customers’ probability of purchasing increases with more products placed
in the shopping cart, which might be due to delivery costs per order.
If the customer logs in to his/her account during the checkout process, the customer is more likely
to make a purchase. Only 4.44% of all observations abandoned their shopping cart after logging in
at the checkout stage indicating that the drivers of shopping cart abandonment mostly occur before
the customer proceeds to the checkout process.
5.2 Practical Implications
The results from our models provide several implications for e-commerce managers and online
shop designers by identifying drivers of online shopping cart abandonment for new as well as
existing customers. Figure 1 summarizes our results regarding each variable’s impact on the
likelihood of online shopping cart abandonment.
Figure 1: Variables’ impact on the likelihood of online shopping cart abandonment.
For both customer segments, the likelihood of online shopping cart abandonment increases if
customers access the shop via their mobile phone. Thus, e-commerce businesses are recommended
to provide a mobile phone compatible version of their online shop or even a suitable application
since mobile users value convenience and accessibility when shopping via mobile phone (Holmes
et al., 2013; Yang and Kim, 2012; Wang et al., 2015). This implies facilitating conditions in terms
of technological infrastructure (Yang, 2010), like page sizes being adjusted to mobile phone
screens, intuitive shopping navigation, and fast loading times. Further, mobile shoppers can be
triggered to complete their purchase by offering free shipping or mobile coupons. Such unexpected
contextual offers can help to overcome mobile shoppers’ internal conflicts regarding mobile
shopping as identified by Huang et al. (2018). Furthermore, mobile shopping cart abandonment
might be due to perceived security and privacy concerns of mobile shopping (Yang, 2005; Yang
and Forney, 2013). E-Commerce businesses might implement an official certificate assuring the
website’s security and an encrypted connection on the shopping cart page to lower perceived risks
regarding mobile shopping. Also, mobile shoppers might use their shopping cart as an organization
tool or a wish list ‘on the go’ in order to store and quickly rediscover certain products and hence,
complete their purchase during subsequent sessions. Overall, the effect of mobile shopping on cart
abandonment even intensifies if the mobile shopper is new: mobile shopping multiplies the odds of
shopping cart abandonment by 2.23 for new customers, compared to 1.54 for existing customers.
Consequently, online
shop operators are currently losing large amounts of sales and the opportunity to acquire new
customers which might turn into loyal customers by not providing appropriate mobile versions of
their online shops.
Further, with an increasing number of items in the shopping cart, the probability of completing the
purchase diminishes for existing customers. Customers might perceive an increasing intrapersonal
conflict regarding total costs and financial impact with a growing number of items in the shopping
cart. Therefore, e-commerce businesses are recommended to offer the possibility of installment
payment and alternative payment methods for their existing customers instead of immediate
payment to create a lag between ordering process and payment. Hence, the financial impact of the
order might not be considered as severe as in the case of immediate payment. Besides, this might
further indicate that existing customers utilize their shopping cart and their account as an
organization tool or wish list more often, in case they are uncertain about which product to buy and
would therefore like to compare similar product options. Also, existing customers might use the
shopping cart for storing products within their account to rediscover the products quickly and
maintain the option of purchasing them in a later session. Contrary to existing customers, new
customers’ probability of online shopping cart abandonment decreases with an increasing number
of items in their shopping cart. This further supports our assumption of existing customers utilizing
their online shopping cart and thus, their account as an organization tool to store products for a
later purchase. Besides, while existing customers who order regularly are likely to be part of a paid
subscription with monthly fees and no shipping costs (such as Amazon Prime), new customers
might be less interested in ordering articles one at a time and instead prefer to buy all items in a
single order to avoid additional shipping costs. The described paid subscription program is also
employed at the analyzed online shop. Therefore, online shop operators could offer new customers a
free, time-limited trial of such programs.
In contrast to existing customers, new customers are rather likely to abandon their shopping cart
with an increasing number of shopping cart page impressions. Compared with existing customers,
inexperienced new customers might hesitate more about proceeding to checkout the longer they
reflect on their shopping cart, i.e., with every additional impression of their shopping cart.
They might perceive security risks or trust issues concerning the online shop or might
perceive the registration process as inconvenient or annoying the longer they can reconsider the
purchasing process, i.e., the more frequently they view the shopping cart page (Park and Kim,
2003; Rajamma et al., 2009). Thus, e-commerce managers should, for example, offer ordering with a guest
account and avoid long registration forms. This is further highlighted by our finding that the
probability of completing the transaction among new customers rises in case they are logging in to
their account, i.e., in case they are creating an account. Therefore, the creation of an account should
be simplified with short registration forms requiring only basic customer information.
Generally, customers logging in to their account during checkout are likely to complete the
transaction and only few observations abandoned their shopping cart after logging in at the
checkout stage. For existing customers already familiar with the checkout process of the online
shop, the likelihood for completing a purchase increases even more dramatically (3.49 times
higher) when logging in to their account during checkout than for new customers (2.50 times
higher). To reduce the remaining barriers at this stage, the login procedure should be designed to be
as short and smooth as possible, for instance by offering a ‘remember my password’
function. Further, online retailers should make their checkout process as simple as possible and
reduce it to essentials. For example, Amazon enabled 1-Click ordering for mobile shoppers to shorten
the checkout process dramatically.
5.3 Limitations and Future Research
Our research is subject to limitations, which might stimulate further research. First, we were only
able to model a limited set of useful explanatory variables. With respect to extant literature (Bucklin
and Sismeiro, 2003; Moe and Fader, 2004a; van den Poel and Buckinx, 2005), we expect e.g.
demographic variables, historical purchase behavior, or the time customers spend in the online
shop to be informative variables. Further segmentations differentiating between male and female
customers or young and old customers would yield deeper insights. Future research could model
such variables in case a broader extent of information about the customers is available.
Further, just-browsing customers were excluded from the investigation. A possible direction for
future research could be to conduct a multi-class classification by differentiating between
purchasers, abandoners, and just-browsing customers.
Besides, our investigation was restricted to the dataset of one e-commerce website only. Applying
our model on clickstream information of other e-commerce businesses and a longer period of
observations would lead to an extended generalizability of our results. Also, our research is limited
to the data of a fashion online shop. Data of other e-commerce businesses offering different
products such as utilitarian goods might generate other results and conclusions.
6 Conclusion
This paper analyzed clickstream data of a German online retailer comprising 11,586 real
customer observations to investigate drivers of online shopping cart abandonment. To this end,
we proposed and modeled different explanatory variables, partially based on findings from previous
literature, and used logit modeling to determine each variable’s impact on online shopping cart
abandonment in both the new and the existing customers’ segment. Our findings indicate that mobile
shoppers exhibit a higher likelihood of abandoning their shopping cart. This relation intensifies for
new customers. For existing customers, the odds of completing the purchase decrease with every
additional item in the customers’ shopping cart, whereas new customers are rather likely to
abandon the shopping cart with an increasing number of shopping cart page impressions.
Appendix: Correlation Matrices
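Table A.1: Correlation Matrix of Full Sample.

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| 1 SCA | 1.00 | | | | | | | | |
| 2 LOGIN | -.13*** | 1.00 | | | | | | | |
| 3 LOGIN_CHECKOUT | -.39*** | .10*** | 1.00 | | | | | | |
| 4 PIS | -.30*** | -.05*** | .29*** | 1.00 | | | | | |
| 5 PIS_SC | -.36*** | -.06*** | .25*** | .70*** | 1.00 | | | | |
| 6 QUANTITY | -.09*** | -.00 | .01 | .30*** | .49*** | 1.00 | | | |
| 7 NEW_CUST | .04*** | -.79*** | .12*** | .10*** | .09*** | -.01 | 1.00 | | |
| 8 MOBILE_CUST | .16*** | -.04*** | .08*** | -.10*** | -.12*** | -.06*** | .03*** | 1.00 | |
| 9 WEB_CUST | -.12*** | .03*** | .06*** | .06*** | .10*** | .05*** | -.03*** | -.78*** | 1.00 |

*** = p < .001, ** = p < .01, * = p < .05.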
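Table A.2: Correlation Matrix of New Customers' Segment.

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| 1 SCA | 1.00 | | | | | | | |
| 2 LOGIN | -.61*** | 1.00 | | | | | | |
| 3 LOGIN_CHECKOUT | -.59*** | .93*** | 1.00 | | | | | |
| 4 PIS | -.09* | .14*** | .12*** | 1.00 | | | | |
| 5 PIS_SC | -.01 | .06 | .04 | .68*** | 1.00 | | | |
| 6 QUANTITY | -.01 | -.06 | -.07* | .42*** | .65*** | 1.00 | | |
| 7 MOBILE_CUST | -.14*** | -.08* | -.01 | -.01 | -.03 | -.05 | 1.00 | |
| 8 WEB_CUST | -.13*** | .09* | .03 | .00 | .06 | .04 | -.78*** | 1.00 |

*** = p < .001, ** = p < .01, * = p < .05.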
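Table A.3: Correlation Matrix of Existing Customers' Segment.

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 SCA | 1.00 | | | | | | |
| 2 LOGIN_CHECKOUT | -.38*** | 1.00 | | | | | |
| 3 PIS | -.41*** | .30*** | 1.00 | | | | |
| 4 PIS_SC | -.39*** | .26*** | .70*** | 1.00 | | | |
| 5 QUANTITY | -.09*** | .02* | .30*** | .49*** | 1.00 | | |
| 6 MOBILE_CUST | .16*** | -.09*** | -.11*** | -.11*** | -.07*** | 1.00 | |
| 7 WEB_CUST | -.12*** | .07*** | .07*** | .07*** | .05*** | -.78*** | 1.00 |

*** = p < .001, ** = p < .01, * = p < .05.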
References
Akaike, H. (1973) ‘Information Theory and an Extension of the Maximum Likelihood Principle’, Proceedings of the
2nd International Symposium on Information Theory, pp.267–281.
Batsell, R.R. (1980) ‘Consumer Resource Allocation Models at the Individual Level’, Journal of Consumer
Research, Vol. 7, No. 1, pp.78–87.
Belanger, F., Hiller, J.S. and Smith, W.J. (2002) ‘Trustworthiness in electronic commerce: the role of privacy,
security, and site attributes’, Journal of Strategic Information Systems, Vol. 11, 3-4, pp.245–270.
Bucklin, R.E. and Gupta, S. (1992) ‘Brand Choice, Purchase Incidence, and Segmentation: An Integrated Modeling
Approach’, Journal of Marketing Research, Vol. 29, No. 2, pp.201–215.
Bucklin, R.E., Lattin, J.M., Ansari, A., Bell, D. and Coupey, E. (2002) ‘Choice and the Internet: From Clickstream to
Research Stream’, Marketing Letters, Vol. 13, No. 3, pp.245–258.
Bucklin, R.E. and Sismeiro, C. (2003) ‘A Model of Web Site Browsing Behavior Estimated on Clickstream Data’,
Journal of Marketing Research, Vol. 40, No. 3, pp.249–267.
Chang, P., Mendonça, D. and Im, I. (2004) ‘Inside the customer: modeling cognition during online shopping’,
AMCIS 2004 Proceedings, p.429.
Chatterjee, P., Hoffman, D.L. and Novak, T.P. (2003) ‘Modeling the Clickstream: Implications for Web-Based
Advertising Efforts’, Marketing Science, Vol. 22, No. 4, pp.520–541.
Chen, Y. and Yao, S. (2017) ‘Sequential search with refinement: Model and application with click-stream data’,
Management Science, Vol. 63, No. 12, pp.4345–4365.
Cho, C.-H., Kang, J. and Cheon, H.J. (2006) ‘Online Shopping Hesitation’, CyberPsychology & Behavior, Vol. 9,
No. 3, pp.261–274.
Close, A.G. and Kukar-Kinney, M. (2010) ‘Beyond buying: Motivations behind consumers' online shopping cart
use’, Journal of Business Research, Vol. 63, 9-10, pp.986–992.
Danaher, P.J., Mullarkey, G.W. and Essegaier, S. (2006) ‘Factors affecting web site visit duration: A cross-domain
analysis’, Journal of Marketing Research, Vol. 43, No. 2, pp.182–194.
Ding, A.W., Li, S. and Chatterjee, P. (2015) ‘Learning user real-time intent for optimal dynamic web page
transformation’, Information Systems Research, Vol. 26, No. 2, pp.339–359.
Emmanouilides, C. and Hammond, K. (2000) ‘Internet usage: Predictors of active users and frequency of use’,
Journal of Interactive Marketing, Vol. 14, No. 2, pp.17–32.
Girard, T., Korgaonkar, P. and Silverblatt, R. (2003) ‘Relationship of Type of Product, Shopping Orientations, and
Demographics with Preference for Shopping on the Internet’, Journal of Business and Psychology, Vol. 18,
No. 1, pp.101–120.
Guadagni, P.M. and Little, J.D.C. (1983) ‘A Logit Model of Brand Choice Calibrated on Scanner Data’, Marketing
Science, Vol. 2, No. 3, pp.203–238.
Holmes, A., Byrne, A. and Rowley, J. (2013) ‘Mobile shopping behaviour: insights into attitudes, shopping process
involvement and location’, International Journal of Retail & Distribution Management, Vol. 42, No. 1, pp.25–
39.
Huang, G.-H., Korfiatis, N. and Chang, C.-T. (2018) ‘Mobile shopping cart abandonment: The roles of conflicts,
ambivalence, and hesitation’, Journal of Business Research, Vol. 85, pp.165–174.
Johnson, E.J., Moe, W.W., Fader, P.S., Bellman, S. and Lohse, G.L. (2004) ‘On the Depth and Dynamics of Online
Search Behavior’, Management Science, Vol. 50, No. 3, pp.299–308.
Kalczynski, P.J., Senecal, S. and Nantel, J. (2006) ‘Predicting on-line task completion with clickstream complexity
measures: A graph-based approach’, International Journal of Electronic Commerce, Vol. 10, No. 3, pp.121–141.
Kukar-Kinney, M. and Close, A.G. (2010) ‘The determinants of consumers’ online shopping cart abandonment’,
Journal of the Academy of Marketing Science, Vol. 38, No. 2, pp.240–250.
Kumar, A., Salo, J. and Li, H. (2019) ‘Stages of User Engagement on Social Commerce Platforms: Analysis with the
Navigational Clickstream Data’, International Journal of Electronic Commerce, Vol. 23, No. 2, pp.179–211.
Li, S., Liechty, J.C. and Montgomery, A.L. (2002) ‘Modeling category viewership of web users with multivariate
count models’.
Liao, Z. and Cheung, M.T. (2001) ‘Internet-based e-shopping and consumer attitudes: an empirical study’,
Information & Management, Vol. 38, No. 5, pp.299–306.
Lockwood, A., Jones, J. and Zhang, J. (2006) ‘Does Search Matter? Using Clickstream Data to Examine the
Relationship between Online Search and Purchase Behavior’, ICIS 2006 Proceedings, p.58.
Malhotra, N.K. (1984) ‘The Use of Linear Logit Models in Marketing Research’, Journal of Marketing Research,
Vol. 21, No. 1, pp.20–31.
Malhotra, N.K. (1982) ‘Information Load and Consumer Decision Making’, Journal of Consumer Research, Vol. 8,
No. 4, pp.419–430.
Mallapragada, G., Chandukala, S.R. and Liu, Q. (2016) ‘Exploring the Effects of “What” (Product) and “Where”
(Website) Characteristics on Online Shopping Behavior’, Journal of Marketing, Vol. 80, No. 2, pp.21–38.
Moe, W.W. (2006) ‘An Empirical Two-Stage Choice Model with Varying Decision Rules Applied to Internet
Clickstream Data’, Journal of Marketing Research, Vol. 43, No. 4, pp.680–692.
Moe, W.W. (2003) ‘Buying, Searching, or Browsing: Differentiating between Online Shoppers Using In-Store
Navigational Clickstream’, Journal of Consumer Psychology, Vol. 13, 1/2, pp.29–39.
Moe, W.W. and Fader, P.S. (2004b) ‘Capturing evolving visit behavior in clickstream data’, Journal of Interactive
Marketing, Vol. 18, No. 1, pp.5–19.
Moe, W.W. and Fader, P.S. (2004a) ‘Dynamic Conversion Behavior at E-Commerce Sites’, Management Science,
Vol. 50, No. 3, pp.326–335.
Montgomery, A.L. (2001) ‘Applying Quantitative Marketing Techniques to the Internet’, Interfaces, Vol. 31, No. 2,
pp.90–108.
Montgomery, A.L., Li, S., Srinivasan, K. and Liechty, J.C. (2004) ‘Modeling Online Browsing and Path Analysis
Using Clickstream Data’, Marketing Science, Vol. 23, No. 4, pp.579–595.
Nagelkerke, N.J.D. (1991) ‘A Note on a General Definition of the Coefficient of Determination’, Biometrika,
Vol. 78, No. 3, pp.691–692.
Olbrich, R. and Holsing, C. (2011) ‘Modeling consumer purchasing behavior in social shopping communities with
clickstream data’, International Journal of Electronic Commerce, Vol. 16, No. 2, pp.15–40.
Overby, J.W. and Lee, E.J. (2006) ‘The effects of utilitarian and hedonic online shopping value on consumer
preference and intentions’, Journal of Business Research, Vol. 59, 10-11, pp.1160–1166.
Park, C.-H. and Kim, Y.-G. (2003) ‘Identifying key factors affecting consumer purchase behavior in an online
shopping context’, International Journal of Retail & Distribution Management, Vol. 31, No. 1, pp.16–29.
Phang, C.W., Kankanhalli, A., Ramakrishnan, K. and Raman, K.S. (2010) ‘Customers’ preference of online store
visit strategies: an investigation of demographic variables’, European Journal of Information Systems, Vol. 19,
No. 3, pp.344–358.
Punj, G. (2011) ‘Effect of Consumer Beliefs on Online Purchase Behavior: The Influence of Demographic
Characteristics and Consumption Values’, Journal of Interactive Marketing, Vol. 25, No. 3, pp.134–144.
Rajamma, R.K., Paswan, A.K. and Hossain, M.M. (2009) ‘Why do shoppers abandon shopping cart?
Perceived time, risk, and transaction inconvenience’, Journal of Product & Brand Management, Vol. 18, No. 3,
pp.188–197.
Ranganathan, C. and Ganapathy, S. (2002) ‘Key dimensions of business-to-consumer web sites’, Information &
Management, Vol. 39, No. 6, pp.457–465.
Ransbotham, S., Lurie, N.H. and Liu, H. (2019) ‘Creation and Consumption of Mobile Word of Mouth: How Are
Mobile Reviews Different?’, Marketing Science, Vol. 38, No. 5, pp.773–792.
Raphaeli, O., Goldstein, A. and Fink, L. (2017) ‘Analyzing online consumer behavior in mobile and PC devices: A
novel web usage mining approach’, Electronic Commerce Research and Applications, Vol. 26, pp.1–12.
Rich, M. and Wilson, R.D. (2010) ‘Using clickstream data to enhance business‐to‐business web site performance’,
Journal of Business & Industrial Marketing.
Sarstedt, M., Becker, J.-M., Ringle, C.M. and Schwaiger, M. (2011) ‘Uncovering and Treating Unobserved
Heterogeneity with FIMIX-PLS: Which Model Selection Criterion Provides an Appropriate Number of
Segments?’, Schmalenbach Business Review, Vol. 63, No. 1, pp.34–62.
Senecal, S., Kalczynski, P.J. and Nantel, J. (2005) ‘Consumers' decision-making process and their online shopping
behavior: a clickstream analysis’, Journal of Business Research, Vol. 58, No. 11, pp.1599–1608.
Singh, S. and Swait, J. (2017) ‘Channels for search and purchase: Does mobile Internet matter?’, Journal of
Retailing and Consumer Services, Vol. 39, pp.123–134.
Sismeiro, C. and Bucklin, R.E. (2004) ‘Modeling Purchase Behavior at an E-Commerce Web Site: A
Task-Completion Approach’, Journal of Marketing Research, Vol. 41, No. 3, pp.306–323.
Srinivasan, S.S., Anderson, R. and Ponnavolu, K. (2002) ‘Customer loyalty in e-commerce: an exploration of its
antecedents and consequences’, Journal of Retailing, Vol. 78, No. 1, pp.41–50.
Su, Q. and Chen, L. (2015) ‘A method for discovering clusters of e-commerce interest patterns using click-stream
data’, Electronic Commerce Research and Applications, Vol. 14, No. 1, pp.1–13.
Suchacka, G. and Chodak, G. (2017) ‘Using association rules to assess purchase probability in online stores’,
Information Systems and e-Business Management, Vol. 15, No. 3, pp.751–780.
van den Poel, D. and Buckinx, W. (2005) ‘Predicting online-purchasing behaviour’, European Journal of
Operational Research, Vol. 166, No. 2, pp.557–575.
Wang, R.J.-H., Malthouse, E.C. and Krishnamurthi, L. (2015) ‘On the Go: How Mobile Shopping Affects Customer
Purchase Behavior’, Journal of Retailing, Vol. 91, No. 2, pp.217–234.
Woodside, A.G. (2013) ‘Moving beyond multiple regression analysis to algorithms: Calling for adoption of a
paradigm shift from symmetric to asymmetric thinking in data analysis and crafting theory’, Journal of Business
Research, Vol. 66, No. 4, pp.463–472.
Yang, K. (2010) ‘Determinants of US consumer mobile shopping services adoption: implications for designing
mobile shopping services’, Journal of Consumer Marketing, Vol. 27, No. 3, pp.262–270.
Yang, K. (2005) ‘Exploring factors affecting the adoption of mobile commerce in Singapore’, Telematics and
Informatics, Vol. 22, No. 3, pp.257–277.
Yang, K. and Forney, J.C. (2013) ‘The moderating role of consumer technology anxiety in mobile shopping
adoption: Differential effects of facilitating conditions and social influences’, Journal of Electronic Commerce
Research, Vol. 14, No. 4, pp.334–347.
Yang, K. and Kim, H.-Y. (2012) ‘Mobile shopping motivation: an application of multiple discriminant analysis’,
International Journal of Retail & Distribution Management, Vol. 40, No. 10, pp.778–789.
Figure 1: Variables’ Impact on the Likelihood of Online Shopping Cart Abandonment.
[Figure not reproduced. Variables shown: LOGIN_CHECKOUT, PIS, PIS_SC, QUANTITY, MOBILE_CUST. Series: New Customers, Existing Customers.]