
Assessing Stationarity in Web Analytics: A study of Bounce Rates

Marios Poulos*
Faculty of Information Science and Informatics, Department of Archives and Library
Science, Ionian University, Ioannou Theotoki 72, 49100, Corfu, Greece.
Nikolaos Korfiatis
Norwich Business School, University of East Anglia. Elizabeth Fry Building, NR47J,
Norwich, United Kingdom
Sozon Papavlassopoulos
Faculty of Information Science and Informatics, Department of Archives and Library
Science, Ionian University, Ioannou Theotoki 72, 49100, Corfu, Greece.
This paper should be cited as:
Poulos, M., Korfiatis, N., & Papavlassopoulos, S. (2020). Assessing Stationarity in Web
Analytics: A study of Bounce Rates. Expert Systems (Forthcoming)
* Corresponding Author, E-mail: mpoulos@ionio.gr; Ioannou Theotoki 72, 49100, Corfu, Greece
Abstract
Evidence-based methods for evaluating marketing interventions such as A/B testing have
become standard practice. However, the pitfalls associated with the misuse of this decision-
making instrument are not well understood by managers and analytics professionals. In this
study, we assess the impact of stationarity on the validity of samples from conditioned time
series, which are abundant in web metrics. Such a prominent metric is the bounce rate, which
is prevalent in assessing engagement with web content as well as the performance of marketing
touchpoints. In this study, we show how to control for stationarity using an algorithmic
transformation to calculate the optimum sampling period. This distance is based on a novel
stationary ergodic process that considers that a stationary series presents reversible symmetric
features and is calculated using a dynamic time warping (DTW) algorithm in a self-correlation
procedure. This study contributes to the expert and intelligent systems literature by
demonstrating a robust method for subsampling time series data, which are critical in decision
making.
Keywords: Time Series, Numerical Analysis, Stationary Process, Bounce Rate, DTW
1. Introduction
The proliferation of analytical methods and tools has led to a critical paradigm shift in
managerial decision making, highlighting the importance of evidence-based evaluations of the
impact of interventions across a spectrum of business practices (e.g., marketing). As in other
areas, such as medicine, evidence-based methods in management practice (Marr, 2010; Pfeffer
& Sutton, 2006) seek to evaluate not only whether the effect of an intervention is observable
but also the reliability and validity of the results presented by the evaluation criterion used. The
expert and intelligent systems literature has highlighted the need for correct data input across
a variety of methods and application domains. A very prominent case (as regards measurable
economic significance) is the evaluation of the impact of interventions in presentation/interface
elements in various marketing functions, such as e-commerce and display advertising. The
former has given rise to so-called customer-driven development (Edvardsson et al., 2012),
in which real customers (or users) evaluate features of a particular medium under realistic
marketing mix conditions.
Such a problem considers evaluating the performance of media use and consumption
in terms of easy-to-understand metrics to guide budget allocation (Danaher & Rust, 1996).
Considering a typical application scenario of an online retailer or an advertising agency,
performance metrics capture consumer engagement with the medium and its effectiveness in
attracting consumers’ attention. A very prominent metric, which is the focus of this study, is
the bounce rate, which is defined as the ratio of single-page user sessions to the total sessions
within a given time duration (Sculley et al., 2009). A high bounce rate can lead to a poor
retailer/advertiser return on investment (ROI) and suggests that users may have a poor
experience once they land on a particular page through a referral link (e.g., by clicking on an
ad or by finding the page through a web search). The former is commonly referred to as a
marketing touchpoint.
A typical way to address such a deficiency is to intervene in the interface elements to
find the combination that leads to the best performance metric (e.g., the lowest bounce rate or
the highest click-through rate) and measure the effect of this intervention. This approach is
known as A/B testing and considers splitting the visitor traffic into two streams, which are
assigned to the baseline condition (A) or its alteration (B). When considering multiple
alterations wherein the comparison of the difference considers more than two groups, M/N or
multivariate testing is performed (Kohavi et al., 2009). The evaluation of the effect of these
interventions is performed
through a typical test of the mean differences using either two-sample parametric tests or
ANOVA when considering various alternative interventions. This capability is integrated into
web analytics tools, which are prevalently used to guide decision making by analytics
professionals. With the ever-increasing dimensionality of the test features and attributes of
testing, expert input has become limited and biased (Sauter, 2014).
A typical example of such a bias is the decision concerning the duration of A/B tests
and whether enough statistical power has been accrued to declare a winner or best-performing
configuration. Deciding the length of a test is critical since, in the case of the worst performance
in the post-hoc period of such a test, opportunity costs arise from lost conversions. Seasonal
and cyclical variations of demand have been demonstrated to affect several aspects of economic
activity, with online shopping being no exception.
In this study, we aim to address a typical question that is abundant in this type of test:
"How long should we sample a session in order to extract results that capture an
adequate level of periodicity for an effect to be observed?" A definitive and unbiased
answer would allow analytics professionals (e.g., those active in search engine optimization)
to safely evaluate the economic significance of their intervention, avoiding Type I and Type II
errors that typically accompany such undertakings and may result from inadequate sampling.
Such a challenge, while approachable by a set of standard statistical practices, has the
characteristic of considering the evaluation of a metric that is of a longitudinal rather than a
cross-sectional nature. The former assumes that the time series of the evaluation criterion used
in a typical A/B testing scenario corresponds to the aggregated metrics of an entire source and
is free of any precondition, and the condition of the time series is inherent in its data structure
(e.g., the way the metric is calculated). In this case, the time series is also referred to as a
conditioned time series (Hamilton, 1994). In some embodiments, a source contains a number
of conditioned time series, such as metrics, including visits, page views, bounce rates,
pages/visits, new visits, average time on site, etc. (Vaughan & Yang, 2013).
Considering that such time-series data are relatively large or high-frequency,
approaches related to sampling and periodicity pose a challenge to standard analytics tools
(Varian, 2014). From a statistical viewpoint, the problem that we are looking to address here
is more specifically discussed in the work of Downing, Fedorov, Lawkins, Morris, and
Ostrouchov (2000). Because of size, there is the assumption that the dataset cannot be analysed
at once and should be analysed in segments. The strategy adopted in our study considers the
segmentation of a large data series into a series of segments of arbitrary length and then an
examination of one part of the division at a time to allow unequal segments to reach an optimal
segment length. In this way, the variation of the stationarity per period is investigated to
ascertain whether there is a stable periodical pattern of this variation, which in turn, can be a
guiding heuristic of sample size. Building on previous work (Poulos, 2016), our methodology
provides a simple but robust approach to dealing with the segmentation and periodicity
estimation of time series data representing conditioned metrics. Considering that such metrics
are abundant in web analytics and marketing practices, our work also has practical implications.
Our study responds to several points of interest already outlined in the literature, such
as that of Mortenson, Doherty, and Robinson (2015), regarding the integration of operational
and computational intelligence methods with the emerging field of data analytics and in
particular high-frequency data from digital trails of customer activity. From the perspective
that sampling periods can alter the significance of marketing interventions, such as those
measured in A/B or M/N testing scenarios, our paper also contributes to the practice of web
analytics by incorporating research with real-world data captured through analytics tools that
are considered standard in the industry (Google Analytics). To this end, this paper is structured
as follows. Section 2 discusses related work and the background of the bounce rate definition
and the use of A/B and M/N testing methodology in evaluating the significance of marketing
interventions. We provide an analytical formulation and explanation of the algorithmic process
in Section 3, where the problem of identifying the optimal sampling period for a conditioned
time series is discussed. A benchmark evaluation using data from an online retailer is discussed
in Section 4, along with implications for practice in Section 5. The paper concludes with
Section 6, discussing limitations and future research directions.
2. Related work
2.1 Bounce rates
Bounce rates represent a significant benchmark for the assessment of the engagement value of
interactions (so-called touchpoints) in various areas of content authoring and advertising
(Murthy & Mantrala, 2005). In their simplest form, bounce rates can be defined as the ratio of
extremely short-lived sessions (generally defined as single-page sessions) established either by
direct entry (when the user types the URL into the browser) or by referral entry (by clicking on
a hyperlink) and its correspondent landing. Several established industry tools, such as Google
Analytics (Clifton, 2012; Plaza, 2011), define bounce rates as sessions in which either
immediate back-button clicks have been initiated once the user loads the page or as abandoned
clickstreams in which no further action has been taken after the user initiates a session.
Considering the universe of n sessions initiated on a display space (e.g., website,
banner, etc.) with each session corresponding to an event time clickstream of k length:
$$S_t = \{t_{i=1}, \ldots, t_{i=k}\} \tag{1}$$
the bounce rate (BR) is defined as the ratio of sessions in which the depth of the clickstream
is singular (k = 1) to the overall number of sessions, that is:
$$BR = \frac{\left|\{S_t : k = 1\}\right|}{n} \tag{2}$$
Due to its simplicity, the bounce rate has been a standard benchmark for evaluation of
the performance of entry points (or referrals) in web analytics. In the case of display or
sponsored search advertising, bounce rates can be used to measure the performance of an ad
and provide input for decision making in advertising budget allocation (Jeziorski & Moorthy,
2017). For example, if a landing page (the part of the website to which the click-through action
leads) has a bounce rate of 80%, this suggests that only 20% of the users that clicked on the ad
or sponsored search result were engaged with the action encapsulated in the landing page.
Considering that click-through rates are linearly dependent on the cost per click (which, in the
case of sponsored search results, varies and is the result of an auction), then an 80%
abandonment of the landing page corresponds to a significant loss of the investment provided
in the advertising budget.
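As a minimal sketch of the definition above, the bounce rate can be computed from the clickstream depth k of each session. The function name `bounce_rate` and the ten session depths are hypothetical and serve only to reproduce the 80% example in the text.

```python
from typing import Sequence


def bounce_rate(clickstream_depths: Sequence[int]) -> float:
    """Share of sessions whose clickstream has depth k = 1 (single-page sessions)."""
    if not clickstream_depths:
        raise ValueError("at least one session is required")
    bounced = sum(1 for k in clickstream_depths if k == 1)
    return bounced / len(clickstream_depths)


# Hypothetical clickstream depths for ten sessions landing on the same page.
depths = [1, 1, 3, 1, 1, 1, 2, 1, 1, 1]
rate = bounce_rate(depths)
print(f"bounce rate: {rate:.0%}")  # 8 of the 10 sessions have depth 1
```

Under these assumed numbers, only two of the ten paid clicks lead to any further engagement with the landing page.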
Nevertheless, while optimizing bounce rates is an obvious approach, several
practitioners consider high percentages to be the results of induced demands that can be driven
by other factors and not necessarily by user attention (e.g., accidental landings, technical errors,
user interruptions, etc.). Industry reports suggest that an average bounce rate of 40% is nominal
for particular sectors (e.g., retailing), and as such, more resources should be directed toward
the optimization of user trajectories regarding k ≥ 2 actions in the clickstream (eCommerce
Europe, 2016). Furthermore, due to its inherent behavioural nature, the bounce rate depends on
the targeting that the ad initiates. Entries initiated through sponsored search advertising (e.g.,
Google Adwords) tend to have lower bounce rates than do entries initiated through display
advertisements (e.g., banners) due to the inherent information targeting that the advertising
mechanism uses (Yang & Ghose, 2010).
In the academic literature, researchers have associated increased bounce rates with the
engaging nature of the informational content contained in the website or the visual attributes
of the content (Lindgaard, Fernandes, Dudek, & Brown, 2006), including audio features (e.g.,
in the case of disruption). However, our understanding of bounce rate characteristics and
whether they can be predicted is somewhat limited (Wells, Valacich, & Hess, 2011), and
content optimization techniques, such as A/B testing, have become prevalent as standard tools
in the industry.
2.2 A/B testing and sample size
In its simplest form, an A/B test is a randomized controlled experiment technique that involves
the experimental evaluation of an overall evaluation criterion, or OEC (e.g., the performance of
an alteration of a web page) against a baseline. From an analytical point of view, it considers a
hypothesis test of two samples, with the null hypothesis corresponding to the baseline variant,
resembling a between-subjects design from an experimental point of view. It has been adopted
by content designers and marketing analysts for the evaluation of different stages of the
purchase funnel in e-commerce scenarios (Hoban & Bucklin, 2015). Typically, content
designers select a feature that has a level of uncertainty regarding its effect on a performance
metric (e.g., bounce rates, click-through rates, etc.). Then, a new page is created (Version B),
and a visitor is randomly assigned to either page A (or the baseline), which is the unaltered
version of the website, or page B, which represents the altered version of the page. The subject
assignment procedure is performed through a randomized mechanism (a so-called splitter),
which is typically executed on a server using a cookie assignment to the visitor. This procedure
is performed to ensure that for the duration of the experiment, repeat visits are assigned to the
same version of the page.
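A splitter of this kind can be sketched as follows. This is an illustration under stated assumptions, not the mechanism of any particular analytics tool: the function name `assign_variant`, the experiment label, and the use of a hashed cookie identifier are all hypothetical. Hashing the identifier gives an assignment that is effectively random across visitors but stable for repeat visits.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str = "landing-page-test") -> str:
    """Map a visitor cookie id to page A (baseline) or page B (altered version)."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer and split the traffic 50/50.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "A" if bucket == 0 else "B"


# A repeat visit with the same cookie id is served the same version.
first_visit = assign_variant("cookie-1234")
repeat_visit = assign_variant("cookie-1234")
print(first_visit == repeat_visit)
```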
Since the evaluation of the altered version against the baseline is performed with a
parametric test, assumptions of normality are followed for all parameters of the problem,
including confidence intervals and statistical power. For several categories of web analytics
metrics, for which the underlying distribution is not normal (e.g., binomial or Poisson),
appropriate non-parametric tests are used. For example, if we consider the evaluation of the
effect of an intervention on click-through rates, which have been shown to follow a binomial
distribution (REF), Fisher’s exact test is used, while non-parametric tests, such as the Mann-
Whitney U-test, are dominant when no assumptions about the underlying distribution are made.
The standard guiding principle behind the reliability of the test is the statistical significance of
the difference between the sample means and the statistical power with which the
difference in the selected metric can be detected. Several researchers have studied the issue
from both a statistical and a probabilistic perspective (Brodersen, Gallusser, Koehler, Remy,
& Scott, 2015; Varian, 2016), and alternative corrections and criteria
have been proposed and adopted from the experimental literature. For example, Gibbs
sampling may be appropriate for the selection of sample intervals for A/B testing if no direct
data are available about the probability distribution of the chosen OEC.
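For a proportion-type OEC such as the bounce rate, one possible parametric comparison is a pooled two-proportion z-test. The sketch below is an illustration only: the text does not name this specific test, and the function name and the session counts are hypothetical.

```python
import math


def two_proportion_z_test(bounces_a, sessions_a, bounces_b, sessions_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = bounces_a / sessions_a
    p_b = bounces_b / sessions_b
    pooled = (bounces_a + bounces_b) / (sessions_a + sessions_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sessions_a + 1 / sessions_b))
    z = (p_a - p_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: baseline A bounces 400/1000 times, altered B 350/1000 times.
z, p_value = two_proportion_z_test(400, 1000, 350, 1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

Such a test says nothing about whether the sampled period is representative of the underlying time series, which is the question the remainder of the paper addresses.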
Regardless of the evaluation approach, questions regarding the optimal sampling size
and length are still debatable and subject to the sensitivity of the selected test, and the
assumptions regarding the underlying distribution. Our aim in this study is not to delve into the
mechanism used to compare the differences between the two samples but to direct our attention
toward the issue of sub-sample selection to evaluate the OEC in the context of A/B testing.
This issue is directly related to the question of the experimental duration and its time series
specific nature. Building on prior work concerning time series stationarity detection (Poulos,
2016), our approach considers the extraction of the stationarity degree to guarantee equal
likelihoods of activity captured by the OEC across the testing sample.
2.3 Our contribution
The problem that we tackle is that the underlying assumption of the random assignment
achieved with a split generator in an A/B testing scenario may not be enough to safeguard the
validity of the test result, and as such, a more robust approach based on the time series
characteristics of the targeted metric is needed.
This problem is of high economic significance for users of an advertising network and,
in particular, retailers, since it is costly at two levels. First, the direct advertising cost involves
the cost-per-click (CPC) associated with a bounced visit, and second and most importantly, lost
opportunity results from missed activity of a potential client. Arguably, the problem of
assessing the usability performance of a web space (e.g., an e-commerce site) considers not
only the bounce rate but also the overall trackable activity until the point of checkout (and
hence other elements of the purchase funnel, which can lead to an abandonment of the
clickstream). However, concerning the question of decisions related to budget allocation (e.g.,
for sponsored-search or display advertising), the returns of these decisions may be harmful if
the optimization strategy does not consider an accurate estimation of the time dependence.
Inherent sources of error in this case, such as stationarity, have been known to influence the
reliability of time-dependent metrics (Sculley et al., 2011), and our intention in this study is to
address this issue by introducing an analytical process.
The method is based on an algorithm (Poulos, 2016) that detects the sampling
stability of a time series. The sampling stability is expressed by the discovery of some dominant
periodicity extracted from the change of the stationarity degree within a particular time series
segment. Therefore, the algorithmic contribution of the study could be applied beyond the
bounce rate issue. The details of this contribution are discussed in section 5.1.
3. Analytic formulation
3.1 Preliminaries
The extraction of the stationarity degree is based on previous work (Poulos, 2016;
Sharifdoost, Mahmoodi, & Pasha, 2009), in which a discrete time stationary process
$\{M_i\}$ with $i = 1, \ldots, n$ is defined as time reversible for every positive integer n if the
following equation is satisfied:
$$(M_1, M_2, \ldots, M_n) = (M_n, M_{n-1}, \ldots, M_1) \tag{3}$$
Then, it is considered that a discrete time series $M_i$ with $i = 0, \ldots, n$ produces a
mirror time series $N_i$, which can be described as:
$$N_i = M_{n-i}, \quad i = 0, \ldots, n \tag{4}$$
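Thus, taking into account Equation 4, the degree of stationarity is based on the
following formulation, where $d(\cdot, \cdot)$ denotes a dissimilarity measure between the time
series M and its mirror N:
$$\mathrm{error} = d(M, N) \tag{5}$$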
If error = 0, then the time series consists of a stationary process based on the error
estimation of the dissimilarity measure between the discrete time series M and the reversible
(mirror) series N. Then, using Euclidean and dynamic time warping (DTW) techniques, the local
dissimilarity function f is defined between any pair of elements $M_i$ and $N_j$, with the
shortcut:
$$d(i, j) = f(M_i, N_j) \ge 0, \quad i, j = 1, \ldots, n \tag{6}$$
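The reversibility idea can be sketched in a few lines: a series is compared with its own mirror, and an error of zero indicates a perfectly reversible (symmetric) segment. The sketch assumes the Euclidean distance as the dissimilarity measure; the function names `mirror` and `reversal_error` are hypothetical.

```python
import math


def mirror(series):
    """Time-reversed copy of the series (the mirror series N)."""
    return list(reversed(series))


def reversal_error(series):
    """Euclidean distance between a series M and its mirror N."""
    reflected = mirror(series)
    return math.sqrt(sum((m - n) ** 2 for m, n in zip(series, reflected)))


symmetric = [0.2, 0.5, 0.9, 0.5, 0.2]   # reads the same in both directions
trending = [1.0, 2.0, 3.0]              # differs from its mirror [3, 2, 1]
print(reversal_error(symmetric))        # 0.0 for a perfectly symmetric segment
print(reversal_error(trending))         # sqrt(8), since (1-3)^2 + 0 + (3-1)^2 = 8
```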
Then, if the path is the lowest cost path between the two series, the corresponding
dynamic time warping (DTW) technique (Salvador & Chan, 2007) provides the warping
curve $\phi(k)$, $k = 1, 2, \ldots, T$, as:
$$\phi(k) = \big(\phi_x(k), \phi_y(k)\big), \quad \text{with } \phi_x(k), \phi_y(k) \in \{1, 2, \ldots, n\} \tag{7}$$
The warping functions $\phi_x(k)$ and $\phi_y(k)$ remap the time indices of M and N, respectively.
Given $\phi$ and following Cortez, Rio, Rocha, and Sousa (2012), the average accumulated
distortion between the warped time series M and N is calculated as follows:
$$d_\phi(M, N) = \sum_{k=1}^{T} \frac{d\big(\phi_x(k), \phi_y(k)\big)\, m_\phi(k)}{M_\phi} \tag{8}$$
where $m_\phi(k)$ is a per-step weighting coefficient and $M_\phi$ is the corresponding normalization
constant, which ensures that the accumulated distortions are comparable along different paths.
To ensure reasonable warps, constraints are usually imposed on $\phi$. The basic idea
underlying DTW is to find the optimal alignment $\phi$ such that:
$$D(M, N) = \min_{\phi} d_\phi(M, N) \tag{9}$$
Therefore, one picks the deformation of the time axes of M and N that brings the two
time series as near to each other as possible.
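The minimization in Equation 9 is usually solved by dynamic programming. The following is a simplified sketch, not the implementation used in the study: it assumes the absolute difference as the local dissimilarity f, unit step weights, no normalization constant, and no constraints on the warping path. The function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(m, n):
    """Accumulated cost of the cheapest warping path between series m and n."""
    rows, cols = len(m), len(n)
    # cost[i][j] holds the cheapest accumulated cost of aligning m[:i] with n[:j].
    cost = [[math.inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = abs(m[i - 1] - n[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in m only
                                     cost[i][j - 1],      # advance in n only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[rows][cols]


series = [1, 2, 3]
print(dtw_distance(series, [1, 2, 2, 3]))             # 0.0: the repeated 2 is absorbed by warping
print(dtw_distance(series, list(reversed(series))))   # 4.0: distance to the mirror series
```

In the self-correlation setting of the paper, the second call is the relevant one: the series is aligned against its own mirror.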
3.2 Procedural definition
Graphically, this algorithm is described in Figure 1. We provide a more detailed analytical
overview and the algorithmic steps below.
[Insert Figure 1 here]
Step 1. Let us consider the matrix M, which contains the hourly bounce rate data set with the
size (1 × R), R ∈ ℕ, and the windows $\bigcup_{j=1}^{i} M(j, x) \subset M$, where
$$\bigcup_{j=1}^{i} M(j, x) = M(x + j : x + j + i), \quad 0 < x < R - i \tag{10}$$
Index j corresponds to the number of repetitions of the algorithm in the same window
length each time, with a unit step of sliding. Additionally, the indicator i is the selected size of
the investigated window, which is constant for each experiment, and x is the beginning point
of the series. Then, the corresponding mirror data set is:
$$\bigcup_{j=1}^{i} N(j, x) = N(x + j + i : x + j), \quad 0 < x < R - i \tag{11}$$
Subsequently, the extraction of the stationarity value according to Equation 7 is
depicted in the following square matrix
$$Z\big(M(j, x), N(j, x)\big) =
\begin{bmatrix}
Z_{1,1} & Z_{1,2} & \cdots & Z_{1,i-1} & Z_{1,i} \\
Z_{2,1} & Z_{2,2} & \cdots & Z_{2,i-1} & Z_{2,i} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
Z_{j-1,1} & Z_{j-1,2} & \cdots & Z_{j-1,i-1} & Z_{j-1,i} \\
Z_{j,1} & Z_{j,2} & \cdots & Z_{j,i-1} & Z_{j,i}
\end{bmatrix} \tag{12}$$
Step 2. Then, the matrix
$$[A] = \bigcup_{i,j=1}^{i} Z_i\big(M(j, x), N(j, x)\big)$$
is produced, along with a second matrix using the same procedure
$$[B] = \bigcup_{i,j=1}^{n} Z_i\big(M(j, y), N(j, y)\big) \tag{13}$$
where 0 < y < R − n, to construct a correlated pair of matrices.
Step 3. Thereafter, to smooth the data of the matrices $\bigcup_{i=1}^{i}[A_i]$ and
$\bigcup_{i=1}^{i}[B_i]$, a cumulative moving average (CMA) procedure is applied as follows:
$$CMA_{n+1} = CMA_n + \frac{A_{n+1} - CMA_n}{n + 1}, \quad 1 \le n \le i \tag{14}$$
and
$$CMB_{n+1} = CMB_n + \frac{B_{n+1} - CMB_n}{n + 1}, \quad 1 \le n \le i \tag{15}$$
Then, the new matrices are:
$$[MA] = \left[ \frac{1}{n}\sum_{i=1}^{n} A_i,\ \frac{1}{n}\sum_{i=2}^{n+1} A_i,\ \ldots,\ \frac{1}{n}\sum_{i=k-(n-1)}^{k} A_i \right]$$
with [MB] obtained from [B] in the same way.
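Step 4. Then,
$$\bigcup_{c=1}^{p} F_c = \{\, MA'_i = 0 \,\} \quad \text{and} \quad \bigcup_{c=1}^{p} G_c = \{\, MB'_i = 0 \,\}$$
(the points at which the first derivative vanishes) are calculated to extract the local
maxima points of the graphs corresponding to the matrices [MA] and [MB].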
Step 5. Consequently, the differences between adjacent elements of the [F] and [G] matrices
are calculated, i.e.,
$$T_1 = \bigcup_{c=1}^{p-1} \left( F_{c+1} - F_c \right), \quad T_2 = \bigcup_{c=1}^{p-1} \left( G_{c+1} - G_c \right) \tag{16}$$
Step 6. Then, the mean values $\bar{T}_1$ and $\bar{T}_2$ of the matrices [T1] and [T2] are determined.
Step 7. Then, the matrix $W = [\bar{T}_1, \bar{T}_2]$ is determined, and the standard error of the
mean of the matrix [W] is calculated as follows:
$$s_{error} = \frac{1}{\sqrt{2}} \sqrt{\frac{\sum_{i=1}^{2} \left( W_i - \bar{W} \right)^2}{2 - 1}} \tag{17}$$
Step 8. Finally, using a two-tailed t-test with df = 1 for $\bar{W}$, the following equation is obtained:
$$\begin{cases}
\text{lower\_limit} = \bar{W} - s_{error} \cdot t_{value} \\
\text{upper\_limit} = \bar{W} + s_{error} \cdot t_{value}
\end{cases}
\;\Rightarrow\; ll \le \bar{W} \le ul \tag{18}$$
where ll and ul are the lower and upper limits, respectively.
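The eight steps can be sketched end to end as follows. This is a hedged illustration, not the authors' implementation: all function names are hypothetical, the Euclidean distance stands in for DTW, a single error value per sliding window is an assumed reading of how the (1 × i) matrices [A] and [B] arise, and local maxima are taken as points strictly higher than both neighbours. The synthetic series and the peak positions are made up for illustration.

```python
import math


def mirror_error(window):
    """Euclidean distance between a window and its time-reversed mirror."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(window, reversed(window))))


def stationarity_profile(series, x, i):
    """Steps 1-2 (assumed reading): one error value per window j = 1..i of length i."""
    return [mirror_error(series[x + j : x + j + i]) for j in range(1, i + 1)]


def cumulative_moving_average(values):
    """Step 3: CMA_{n+1} = CMA_n + (v_{n+1} - CMA_n) / (n + 1)."""
    smoothed, cma = [], 0.0
    for n, value in enumerate(values):
        cma += (value - cma) / (n + 1)
        smoothed.append(cma)
    return smoothed


def local_maxima(values):
    """Step 4: positions that are strictly higher than both neighbours."""
    return [k for k in range(1, len(values) - 1)
            if values[k - 1] < values[k] > values[k + 1]]


def mean_peak_gap(peaks):
    """Steps 5-6: mean difference between adjacent peak positions."""
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return sum(gaps) / len(gaps)


def standard_error_of_pair(t1_mean, t2_mean):
    """Step 7: mean and standard error of the mean of W = [t1_mean, t2_mean]."""
    w_bar = (t1_mean + t2_mean) / 2
    variance = ((t1_mean - w_bar) ** 2 + (t2_mean - w_bar) ** 2) / (2 - 1)
    return w_bar, math.sqrt(variance) / math.sqrt(2)


def limits(w_bar, s_error, t_value):
    """Step 8: lower and upper limits around the mean."""
    return w_bar - s_error * t_value, w_bar + s_error * t_value


# Steps 1-4 on a synthetic hourly series with a daily cycle (illustrative data only).
series = [0.4 + 0.1 * math.sin(2 * math.pi * hour / 24) for hour in range(400)]
profile = stationarity_profile(series, x=100, i=60)
peaks = local_maxima(cumulative_moving_average(profile))

# Steps 5-8 on hand-made peak positions whose mean gaps are 28 and 29.
t1_mean = mean_peak_gap([2, 30, 58])
t2_mean = mean_peak_gap([1, 30, 59])
w_bar, s_error = standard_error_of_pair(t1_mean, t2_mean)
lower, upper = limits(w_bar, s_error, t_value=31.821)
print(w_bar, s_error, lower, upper)
```

With mean gaps of 28 and 29, the sketch gives a mean of 28.5 and a standard error of 0.5, which illustrates how two matched segments are reduced to a single error value.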
4. Experimental part
4.1 Data and methods
The experiment considered a dataset sourced from an online retailer active in the segment of
consumer electronics¹. The retailer’s objective was to evaluate the performance of a search
engine optimization intervention that was carried out to improve the overall bounce rate that
the e-shop exhibits when visitors land on the website by clicking on an organic (non-sponsored)
search result through Google or secondary search providers.
We gained access to the retailer’s Google Analytics account and extracted data from
the main landing page, which listed entry points for the different categories (e.g., digital
cameras, laptops, etc.). Hourly data were obtained using the API provided by the Google
Analytics backend and exported to CSV files for further processing. The resulting input data
matrix corresponded to the click-stream for an approximate two-year period and had a sample
size n=18288 visitor sessions. During this period, the retailer’s website remained unchanged
concerning visual cues and interface characteristics. We used the default computation for the
bounce rates from Google Analytics and performed some preliminary analysis to ensure that
during the period to be analysed, there was no technical failure (downtime) of the website that
would interrupt the continuity of the time series. The graphical representation of the variation
of the bounced sessions in our dataset is shown in Figure 2.
[Insert Figure 2 here]
Having acquired the data and prepared the input data series, we proceed with the
implementation of the analytic procedure as described in Section 3.2. For clarity, we refer to
the points of the time series by their index value, which is set from 1 to the maximum length
of the data matrix (n = 18288). We outline the numerical computation of the steps that we used
in the steps that follow.
[Insert Figure 3 here]
Step 1. For a random value x = 11813 with i = 60 and taking j = 1, 2, 3, ..., 60, a matrix
M(60, 11813) is constructed according to Equation 10 (see the blue line in Figure 3 and
Table 1). In the same way, the mirror N of matrix M is produced:
$$\bigcup_{j=1}^{i} N(j, x) = N(x + j + i : x + j), \quad 0 < x < R - i$$
according to Equation 11 (see the red line in Figure 3 and Table 1). Then, the
degree of stationarity $D_{11}$ (see Equation 12) is computed from the dataset as $D_{11}$ = 0.0209.
Similarly, the other values of the matrix, with corresponding dimensions (60 × 60), are
obtained.
Step 2. Thereafter, the matrix
$$[A] = \bigcup_{i,j=1}^{i} D_i\big(M(j, x), N(j, x)\big)$$
of size (1 × 60) is obtained. In the same way, the matrix
$$[B] = \bigcup_{i,j=1}^{n} D_i\big(M(j, y), N(j, y)\big)$$
of size (1 × 60) is obtained.
Step 3. Using a cumulative procedure with a 5-point (n = 5) moving average, the matrices
$\bigcup_{i=1}^{i}[MA_i]$ and $\bigcup_{i=1}^{i}[MB_i]$ are obtained.
Step 4. According to Equation 11, the local maxima points of the graphs of the matrices [MA]
and [MB] are calculated in the new matrices [F] and [G] (see Table 1).
Step 5. Consequently, the differences between the adjacent elements of matrices [F] and [G]
are calculated in the new matrices [T1] and [T2].
Step 6. Then, the mean values $\bar{T}_1$ and $\bar{T}_2$ of the matrices [T1] and [T2] are determined
(see Table 1, column: Mean).
Step 7. Then, the matrix $W = [\bar{T}_1, \bar{T}_2]$ is determined, and the standard error of the
mean of the matrix is calculated (see Table 1, column: Error).
Step 8. According to Equation 18, for a 98% confidence interval (2% significance level, i.e., a
p-value of 0.02) with t = 31.821 and $\bar{W}$ = 25.2767 (see Table 1, cell: Mean Error), the limits
are obtained as follows:
$$\begin{cases}
\text{lower\_limit} = 25.2767 - 0.0222 \times 31.821 \\
\text{upper\_limit} = 25.2767 + 0.0222 \times 31.821
\end{cases}
\;\Rightarrow\; 24.5703 \le 25.2767 \le 25.9831 \tag{19}$$
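The critical value used in Step 8 can be checked without statistical tables, because Student's t distribution with one degree of freedom is the standard Cauchy distribution, whose quantile function has a closed form. The sketch below recomputes the critical value and the limits of Equation 19 from the reported mean and standard error; the function name `t_critical_df1` is hypothetical.

```python
import math


def t_critical_df1(confidence):
    """Two-tailed critical value of Student's t with df = 1 (standard Cauchy)."""
    alpha = 1.0 - confidence
    # Cauchy quantile: F^{-1}(p) = tan(pi * (p - 1/2)), evaluated at p = 1 - alpha / 2.
    return math.tan(math.pi * (0.5 - alpha / 2.0))


t_value = t_critical_df1(0.98)
w_bar, s_error = 25.2767, 0.0222   # values reported in the text
lower = w_bar - s_error * t_value
upper = w_bar + s_error * t_value
print(round(t_value, 3), round(lower, 4), round(upper, 4))
```

The recomputed limits agree with 24.5703 and 25.9831 to four decimal places.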
4.2 Results
According to the experimental procedure and taking into account the results presented in Table
1 and the resulting transformation of the time series depicted in Figure 4, an apparent
periodicity in the output of the applied processing is observed. This observation is based on
the distances between the positions of the local maxima points and addresses the central
question of this study, namely, finding the necessary sample size that captures the observed
periodicity of the time series at a statistically significant level.
[Insert Table 1 here]
The results of the experimental procedure (Step 8) provide a sample size s = 25, which
can be interpreted as follows: the variation of the stationarity degree has stable maxima that
recur periodically every 25 data points (bounce rate samples). Considering that our data
represent hourly bounce rates, the benchmark data suggest a window of 25 hours for the
evaluation of interventions for content optimization.
[Insert Figure 4 here]
In more detail, the above calculations are achieved via Equations 4-18 using the mean
difference between the matched pairs (M.D.B.M.P) technique. The matched data pairs were
obtained in a random way according to Equation 13 and had a scalable range from 60-200 with
a step increment of 5; that is, 30 matched data pairs were created in total (see Table 1).
Therefore, for each matched pair (for example, the values at length 60, which are depicted
between the data x = [11813, 11873] and y = [5391, 5451]), the local peaks are calculated (see
Figure 1, start). Then, as the mean matched data pairs are determined, the mean value of the
distance between each local peak value is obtained. In the start case, the number of peaks for
the pair of x and y data sets is two (2), and the mean distances are 28 and 29, respectively.
Therefore, the M.D.B.M.P is calculated (see Equation 17) from the difference between the above
means, which, in Table 1, is depicted as the error (e = 0.5). In the same way, the M.D.B.M.P
results are calculated through the last data set (see Figure 1, end). Additionally, in the
Appendix, the graphical transformations of the above calculations are depicted for the 30
matched data pairs.
5. Discussion
5.1 Theoretical Implications
This study contributes to the expert and intelligent systems literature by demonstrating a robust
method for subsampling time series data, which are critical in decision making. In particular,
the application of the stationarity detection algorithm as demonstrated in previous work
(Poulos, 2016) allows for evaluating more complex problems in business practice such as
measuring website prominence (Papavlasopoulos, 2019; Poulos, Papavlasopoulos,
Kostagiolas, & Kapidakis, 2017) as well as dimensionality reduction in text analytics (Poulos,
2017).
The particular implication for researching patterns in high-frequency time series data
can also be applied to patterns of web queries such as those in Google Trends. This extraction
of the periodical non-stationarity features of time series can complement existing approaches
for novelty detection in scientific literature utilizing the patterns of prominent keywords
appearing in scientific publications (Papavlasopoulos, 2019). While the application in this
context concerns consumer activity, it confirms previous results that proxy a visitor’s activity
using the search queries and the related keywords that have been utilized. This method is
implemented via the same algorithm, with the only exception being that parameter M is fed
with a multidimensional data structure (see Equation 10). While this study aims to assess when
periodicity can distort the outcomes of a marketing intervention, its results are in line with
those of Papavlasopoulos (2019), who investigates when a keyword time series gives
non-stationarity peaks. As such, asserting the condition of a non-stationary categorical time
series yields goodness of fit for the prediction problem.
A further implication that can be investigated in future studies comes from the
aggregation of individual time series using grouping factors such as product category and
brand. Poulos et al. (2017) demonstrated that the stationarity of aggregated time series of
search keywords from Google Trends can be asserted, using an example of publishing houses
and their corresponding publications. In a similar manner, the data type for parameter M
(Equation 10) needs to be modified to represent 2-dimensional groupings.
The algorithmic process presented here can also be used in the context of text analytics
(Poulos, 2017), where the possible relationship between the syntactic property of a text sample
and the stationary variation of the time series that produces the text can be asserted. This can
inform additional dimensions, such as recommendations based on semi-structured
data such as online reviews (Korfiatis & Poulos, 2013).
Therefore, application of Equations 1-13 to the data type (M) yields the new modified
time series A and B (see Equations 12 and 13), which, in turn, leads to the technique for
calculating the periodicity of various types of high-frequency data such as those described above
(see Equations 14-18).
5.2 Implications for practice
Trust in evidence-based methods for evaluating marketing interventions, such as A/B testing,
is gaining momentum among both managers and marketers. However, the pitfalls associated with
misuse of this decision-making instrument are not well understood by managers and analytics
experts, since the prevalence of software tools provides an out-of-the-box solution that may
not be optimal (Dmitriev, Frasca, Gupta, Kohavi, & Vaz, 2016). Anecdotal examples of
negative results induced by Type I and II errors are known in the industry, and careful
consideration of the time-dependent properties of marketing metrics (e.g., stationarity) by
decision-makers is important. Making a sound choice between alternative interventions
guided by customer-driven interactions is an important example of analytical maturity
(Davenport & Harris, 2007) and is independent of organizational size. Several examples of
A/B testing scenarios consider interventions on web spaces owned by small- and medium-sized
companies. As such, being able to reliably ascertain the impact of these interventions on
conditioned time series can also give a competitive advantage in capturing consumer attention.
However, as Kohavi et al. (2012) state, experimentation is not a panacea for everyone, and its
assumptions should be well understood when interpreting results of high economic
significance. This study proposes an experimental method intended to overcome the aforementioned
unsafe decision assumptions of the A/B method. To this end, the data were
processed with the algorithm in Equations 4-18 using matched data pairs as described in section 4.2,
which yields a statistical significance test. The analysis of the results derived a 25-hour sample time
period for the data set. Furthermore, this study places a large emphasis on
examining the nature of the time series and the stage from which the data are retrieved. Practical
considerations, such as the assumptions that accompany time series data retrieved at the
initial stage or the impact of any demand peaks (e.g., due to marketing campaigns running in
parallel), can be validated through the procedure outlined here.
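Why the length of the sampling window matters can be illustrated with a synthetic series. The sketch below is purely illustrative and uses made-up numbers: a noise-free "bounce rate" with a 25-sample cycle and no intervention effect at all. The function names, the sinusoidal shape, and the amplitude are assumptions introduced for the example and do not come from the dataset of this study.

```python
import math
from statistics import mean

PERIOD = 25  # assumed cycle length, in hourly samples


def bounce_rate(t):
    """Synthetic, noise-free bounce rate with a fixed cycle (illustrative only)."""
    return 0.5 + 0.1 * math.sin(2 * math.pi * t / PERIOD)


def window_mean(start, length):
    """Mean of the synthetic metric over `length` consecutive hourly samples."""
    return mean(bounce_rate(t) for t in range(start, start + length))


# Two samples that each cover only part of a cycle, at different phases:
# their means differ although nothing was changed between them.
partial_gap = abs(window_mean(0, 12) - window_mean(12, 13))

# Two samples that each cover one full cycle, at arbitrary offsets:
# the periodic component averages out and the gap vanishes.
full_gap = abs(window_mean(0, PERIOD) - window_mean(7, PERIOD))

print(f"partial windows: {partial_gap:.3f}, full-cycle windows: {full_gap:.3f}")
```

In this toy setting, a comparison based on partial windows would suggest a difference where none exists, whereas windows spanning whole cycles do not.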
As discussed in the previous section, data sets from Google Trends and bounce rates
could yield this degree of periodicity. This consideration is based on the assumption that the
nature of the data is reflected in the local peaks of the transformed time series produced by
the stationarity process.
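One way to read the degree of periodicity off Table 1 is to average the mean peak spacings over all segment lengths. The snippet below does this with the µ(X) and µ(Y) columns copied from the table; the variable names are illustrative, and interpreting a spacing as a number of hours assumes the hourly sampling indicated in the caption of Figure 2.

```python
from statistics import mean

# µ(X) and µ(Y) from Table 1, ordered by segment length l = 60, 65, ..., 200.
MU_X = [
    29, 26.5, 27, 25, 25.5, 26, 25, 25.33, 26.33, 25.33,
    25.67, 25, 25.25, 25, 24.75, 25.5, 25.2, 25, 24.8, 24.6,
    24.67, 24.67, 24.83, 24.5, 24.67, 24.57, 24.57, 24.86, 24.57,
]
MU_Y = [
    28, 27, 25.5, 25.5, 26.5, 25.5, 25.33, 25.67, 25.33, 25,
    25.67, 25.75, 25.25, 25, 25, 25.25, 24.8, 25, 24.6, 24.8,
    24.83, 24.67, 24.67, 24.67, 24.67, 24.71, 24.57, 24.57, 24.57,
]

# Average spacing between local peaks across both series and all lengths.
overall = mean(MU_X + MU_Y)
period_hours = round(overall)
print(f"mean peak spacing: {overall:.2f} samples, about {period_hours} hours")
```

The average stays close to 25 samples, in line with the 25-hour period reported above, and the individual values settle as l grows.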
6. Conclusions, Limitations and Future Research
In this paper, the potential to extract periodical stationarity exhibited in a conditioned time
series of bounce rates was investigated and evaluated using a benchmark dataset. Controlling
for stationarity is a significant problem in analytics and forecasting, in which a time series is
analysed for the levels of differences. Using the appropriate transformations with a new
algorithm for calculating the stationary distance, our approach can be useful in the evaluation
of marketing interventions, such as those in A/B testing scenarios. This distance is based on a
novel stationary ergodic process, which rests on the consideration that a stationary series
presents reversible symmetric features, and is calculated using the dynamic time warping
(DTW) algorithm in a self-correlation procedure. The results of the benchmark test performed
in the experimental part of this paper show a clear and consistent periodicity, obtained
by measuring the differences in the positions of local maxima
points during the segmentation of the conditioned series.
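The idea of comparing a series with its own reverse through DTW can be sketched as follows. This is the textbook dynamic-programming form of DTW with an absolute-difference cost, written in plain Python; it is not the stationarity distance defined by the equations of this paper, and the function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with |a_i - b_j| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def reversal_distance(series):
    """DTW distance between a series and its time-reversed copy."""
    return dtw_distance(series, series[::-1])


symmetric = [1, 3, 5, 3, 1]  # reads the same in both directions
trending = [1, 2, 3, 4, 5]   # clearly direction-dependent

d_sym = reversal_distance(symmetric)
d_trend = reversal_distance(trending)
print(d_sym, d_trend)
```

A series with reversible symmetric features is indistinguishable from its mirror under this measure, while a trending series is not; the quadratic cost of this naive form is what linear-time variants (Salvador & Chan, 2007) aim to reduce.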
While our approach was operationalized for a conditioned time series, our method does
not take into account causal influences from other time-dependent processes that may affect
the behaviour of the evaluated metric (in our case, bounce rates) or psychological factors related
to shopping cart abandonment (Huang et al., 2018). Such a case could arise when transitions
between stages are considered (e.g., bounces after the second click). In this respect, our analysis is
agnostic to important user characteristics, such as repeated visits and view-through
conversions, which require a higher-order data structure than that considered in this study.
In addition, our analysis places a large emphasis on finding the necessary
sample size that satisfies the observed periodicity of a time series of bounce rates
in an e-commerce scenario. Future work on other types of conditioned time series represented
in web analytics, such as page views, pages/visits, percentages of new visits, and average times
on sites, is also important, as is work on demand patterns in supply chains (Zissis et al., 2015). This
work will involve studying more sophisticated time series processing and template matching
techniques as well as understanding the distributional characteristics of these metrics.
Data accessibility
No data are provided together with the manuscript.
Notes
*
For this study, the retailer has requested to remain anonymous.
References
Brodersen, K. H., Gallusser, F., Koehler, J., Remy, N., & Scott, S. L. (2015). Inferring causal
impact using Bayesian structural time-series models. The Annals of Applied Statistics, 9,
247–274. doi:10.1214/14-aoas788
Clifton, B. (2012). Advanced web metrics with Google analytics. Indianapolis, Indiana: John
Wiley & Sons.
Cortez, P., Rio, M., Rocha, M., & Sousa, P. (2012). Multi-scale internet traffic forecasting using
neural networks and time series methods. Expert Systems, 29, 143–155.
doi:10.1111/j.1468-0394.2010.00568.x
Danaher, P. J., & Rust, R. T. (1996). Determining the optimal return on investment for an
advertising campaign. European Journal of Operational Research, 95, 511–521.
doi:10.1016/0377-2217(95)00319-3
Davenport, T. H., & Harris, J. G. (2007). Competing on analytics: The new science of winning.
Boston, MA: Harvard Business Press.
Dmitriev, P., Frasca, B., Gupta, S., Kohavi, R., & Vaz, G. (2016). Pitfalls of long-term online
controlled experiments. In 2016 IEEE international conference on big data (big data)
(pp. 1367–1376). Washington, DC, USA: IEEE.
Downing, D. J., Fedorov, V. V., Lawkins, W. F., Morris, M. D., & Ostrouchov, G. (2000). Large
data series: Modeling the usual to identify the unusual. Computational Statistics & Data
Analysis, 32, 245–258. doi:10.1016/s0167-9473(99)00079-1
eCommerce Europe. (2016). E-commerce benchmark and retail report. Retrieved from
https://www.ecommerce-europe.eu/app/uploads/2016/06/Ecommerce-Benchmark-Retail-Report-2016.pdf
Edvardsson, B., Kristensson, P., Magnusson, P., & Sundström, E. (2012). Customer integration
within service development: A review of methods and an analysis of insitu and exsitu
contributions. Technovation, 32, 419–429. doi:10.1016/j.technovation.2011.04.006
Hamilton, J. D. (1994). Time series analysis. Princeton, NJ: Princeton University Press.
Hoban, P. R., & Bucklin, R. E. (2015). Effects of internet display advertising in the purchase
funnel: Model-based insights from a randomized field experiment. Journal of Marketing
Research, 52, 375–393. doi:10.1509/jmr.13.0277
Huang, G. H., Korfiatis, N., & Chang, C. T. (2018). Mobile shopping cart abandonment: The
roles of conflicts, ambivalence, and hesitation. Journal of Business Research, 85,
165–174.
Jeziorski, P., & Moorthy, S. (2017). Advertiser prominence effects in search advertising.
Management Science, 64, 1365–1383. doi:10.1287/mnsc.2016.2677
Kohavi, R., Deng, A., Frasca, B., Longbotham, R., Walker, T., & Xu, Y. (2012). Trustworthy
online controlled experiments: Five puzzling outcomes explained. In Proceedings of the
18th ACM SIGKDD international conference on knowledge discovery and data mining
(pp. 786–794), Beijing, China: ACM.
Kohavi, R., Longbotham, R., Sommerfield, D., & Henne, R. M. (2009). Controlled experiments
on the web: Survey and practical guide. Data Mining and Knowledge Discovery, 18(1),
140–181. doi:10.1007/s10618-008-0114-1
Korfiatis, N., & Poulos, M. (2013). Using online consumer reviews as a source for demographic
recommendations: A case study using online travel reviews. Expert Systems with
Applications, 40, 5507–5515.
Lindgaard, G., Fernandes, G., Dudek, C., & Brown, J. (2006). Attention web designers: You
have 50 milliseconds to make a good first impression! Behaviour & Information
Technology, 25(2), 115–126. doi:10.1080/01449290500330448
Marr, B. (2010). The intelligent company: Five steps to success with evidence-based
management. New York, NY: John Wiley & Sons.
Mortenson, M. J., Doherty, N. F., & Robinson, S. (2015). Operational research from Taylorism
to Terabytes: A research agenda for the analytics age. European Journal of Operational
Research, 241, 583–595. doi:10.1016/j.ejor.2014.08.029
Murthy, P., & Mantrala, M. K. (2005). Allocating a promotion budget between advertising and
sales contest prizes: An integrated marketing communications perspective. Marketing
Letters, 16(1), 19–35. doi:10.1007/s11002-005-1138-6
Papavlasopoulos, S. (2019). Scientometrics analysis in Google trends. Journal of Scientometric
Research, 8(1), 27–37. doi:10.5530/jscires.8.1.5
Pfeffer, J., & Sutton, R. I. (2006). Evidence-based management. Harvard Business Review,
84(1), 62.
Plaza, B. (2011). Google analytics for measuring website performance. Tourism Management,
32, 477–481. doi:10.1016/j.tourman.2010.03.015
Poulos, M. (2016). Determining the stationarity distance via a reversible stochastic process.
PLoS One, 11(10), e0164110. doi:10.1371/journal.pone.0164110
Poulos, M. (2017). Definition text's syntactic feature using stationarity control. In 2017 8th
International conference on information, intelligence, systems & applications (IISA) (pp.
1–5), Larnaca, Cyprus: IEEE.
Poulos, M., Papavlasopoulos, S., Kostagiolas, P., & Kapidakis, S. (2017). Prediction of the
popularity from Google trends using stationary control: The case of STM publishers. In
2017 Fourth international conference on mathematics and computers in sciences and in
industry (MCSI) (pp. 159–163), Corfu, Greece: IEEE.
Salvador, S., & Chan, P. (2007). Toward accurate dynamic time warping in linear time and
space. Intelligent Data Analysis, 11(5), 561–580. doi:10.3233/ida-2007-11508
Sauter, V. L. (2014). Decision support systems for business intelligence. Hoboken, NJ: John
Wiley & Sons.
Sculley, D., Malkin, R. G., Basu, S., & Bayardo, R. J. (2009). Predicting bounce rates in
sponsored search advertisements. In Proceedings of the 15th ACM SIGKDD international
conference on knowledge discovery and data mining (pp. 1325–1334), Paris, France:
ACM.
Sculley, D., Otey, M. E., Pohl, M., Spitznagel, B., Hainsworth, J., & Zhou, Y. (2011). Detecting
adversarial advertisements in the wild. In Proceedings of the 17th ACM SIGKDD
international conference on knowledge discovery and data mining (pp. 274–282), San
Diego, California, USA: ACM.
Sharifdoost, M., Mahmoodi, S., & Pasha, E. (2009). A statistical test for time reversibility of
stationary finite state markov chains. Applied Mathematical Sciences, 52, 2563–2574.
Varian, H. R. (2014). Big data: New tricks for econometrics. Journal of Economic Perspectives,
28(2), 3–28. doi:10.1257/jep.28.2.3
Varian, H. R. (2016). Causal inference in economics and marketing. Proceedings of the National
Academy of Sciences U S A, 113, 7310–7315. doi:10.1073/pnas.1510479113
Vaughan, L., & Yang, R. (2013). Web traffic and organization performance measures:
Relationships and data sources examined. Journal of Informetrics, 7, 699–711.
doi:10.1016/j.joi.2013.04.005
Wells, J. D., Valacich, J. S., & Hess, T. J. (2011). What signal are you sending? How website
quality influences perceptions of product quality and purchase intentions. MIS Quarterly,
35, 373–396. doi:10.2307/23044048
Yang, S., & Ghose, A. (2010). Analyzing the relationship between organic and sponsored search
advertising: Positive, negative, or zero interdependence? Marketing Science, 29,
602–623. doi:10.1287/mksc.1090.0552
Zissis, D., Ioannou, G., & Burnetas, A. (2015). Supply chain coordination under discrete
information asymmetries and quantity discounts. Omega, 53, 21–29.
Tables
Table 1. Results of the experimental procedure. l corresponds to the sampling length and N(Xi) and N(Yi) are the
numbers of peak values of X and Y, respectively, with μ and e corresponding to the mean and standard error,
respectively.
l     Range of x       Range of y       N(Xi)   µ(X)    N(Yi)   µ(Y)    e
60    [11813,11873]    [5391,5451]      2       29      2       28      0.5
65    [16154,16219]    [586,651]        3       26.5    3       27      0.25
70    [7459,7529]      [6487,6557]      3       27      3       25.5    0.75
75    [13014,13089]    [13519,13594]    3       25      3       25.5    0.25
80    [12060,12140]    [12830,12910]    3       25.5    3       26.5    0.5
85    [4693,4778]      [11555,11640]    3       26      3       25.5    0.25
90    [11137,11227]    [2765,2855]      4       25      4       25.33   0.17
95    [2023,2118]      [8473,8568]      4       25.33   4       25.67   0.17
100   [16316,16416]    [5787,5887]      4       26.33   4       25.33   0.5
105   [9950,10055]     [3805,3910]      4       25.33   4       25      0.17
110   [12772,12882]    [4337,4447]      4       25.67   4       25.67   0
115   [8602,8717]      [11885,12000]    5       25      5       25.75   0.38
120   [15146,15266]    [16308,16428]    5       25.25   5       25.25   0
125   [14293,14418]    [4323,4448]      5       25      5       25      0
130   [13843,13973]    [4140,4270]      5       24.75   5       25      0.13
135   [15798,15933]    [5950,6085]      5       25.5    5       25.25   0.13
140   [3343,3483]      [4269,4409]      6       25.2    5       24.8    0.2
145   [10473,10618]    [8046,8191]      6       25      6       25      0
150   [5979,6129]      [14125,14275]    6       24.8    6       24.6    0.1
155   [9950,10105]     [9346,9501]      6       24.6    6       24.8    0.1
160   [15593,15753]    [4860,5020]      7       24.67   7       24.83   0.08
165   [13554,13719]    [13492,13657]    7       24.67   7       24.67   0
170   [6810,6980]      [10165,10335]    7       24.83   7       24.67   0.08
175   [1358,1533]      [966,1141]       7       24.5    7       24.67   0.08
180   [1768,18048]     [1011,1191]      7       24.67   7       24.67   0
185   [16719,16904]    [2326,2511]      8       24.57   8       24.71   0.07
190   [17000,17190]    [15200,15390]    8       24.57   8       24.57   0
195   [214,409]        [6035,6230]      8       24.86   8       24.57   0.14
200   [2904,3104]      [14218,14418]    8       24.57   8       24.57   0
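A quick consistency check on Table 1 is to confirm that every index range spans exactly l samples. The sketch below copies the l, x-range, and y-range columns as they appear in the table (with missing closing brackets treated as extraction damage) and lists the cells that fail the check; the names `ROWS`, `width`, and `mismatches` are illustrative.

```python
# (l, range of x, range of y) as listed in Table 1.
ROWS = [
    (60, (11813, 11873), (5391, 5451)),
    (65, (16154, 16219), (586, 651)),
    (70, (7459, 7529), (6487, 6557)),
    (75, (13014, 13089), (13519, 13594)),
    (80, (12060, 12140), (12830, 12910)),
    (85, (4693, 4778), (11555, 11640)),
    (90, (11137, 11227), (2765, 2855)),
    (95, (2023, 2118), (8473, 8568)),
    (100, (16316, 16416), (5787, 5887)),
    (105, (9950, 10055), (3805, 3910)),
    (110, (12772, 12882), (4337, 4447)),
    (115, (8602, 8717), (11885, 12000)),
    (120, (15146, 15266), (16308, 16428)),
    (125, (14293, 14418), (4323, 4448)),
    (130, (13843, 13973), (4140, 4270)),
    (135, (15798, 15933), (5950, 6085)),
    (140, (3343, 3483), (4269, 4409)),
    (145, (10473, 10618), (8046, 8191)),
    (150, (5979, 6129), (14125, 14275)),
    (155, (9950, 10105), (9346, 9501)),
    (160, (15593, 15753), (4860, 5020)),
    (165, (13554, 13719), (13492, 13657)),
    (170, (6810, 6980), (10165, 10335)),
    (175, (1358, 1533), (966, 1141)),
    (180, (1768, 18048), (1011, 1191)),
    (185, (16719, 16904), (2326, 2511)),
    (190, (17000, 17190), (15200, 15390)),
    (195, (214, 409), (6035, 6230)),
    (200, (2904, 3104), (14218, 14418)),
]


def width(bounds):
    """Number of samples spanned by an index range [start, end]."""
    start, end = bounds
    return end - start


# Collect every (l, series) cell whose range does not span exactly l samples.
mismatches = [
    (l, name)
    for l, x_range, y_range in ROWS
    for name, bounds in (("x", x_range), ("y", y_range))
    if width(bounds) != l
]
print(mismatches)
```

With the values as printed, only the x range at l = 180 fails the check, which points to a typographical slip in that cell; the intended bounds cannot be recovered from the table alone.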
Index of Figures
Figure 1. Flow of data processing
Figure 2. Hourly bounces for our dataset. The horizontal axis represents the index ×10^2
Figure 3. Graphical depiction of matrix M (blue line) with its mirror N (red line).
Figure 4. Identification of the start (a) and end (b) of decomposition for the experimental
dataset. Arrows indicate peak points for the original (x) and the reverse (y) time series.
Figure 1
Figure 2
Figure 3
Figure 4
Appendix: Graphical Transformation for the experimental section
The segmentation of the resulting time series and the identification of the local maxima, as presented
in Table 1, are performed with a step size of s = 5. For each step, the resulting length (l) transforms
the data series and its reverse, as depicted in the subsequent panels.
Transformation sequence for steps 60 ≤ l ≤ 105
l = 60 l = 65 l = 70 l = 75 l = 80
l = 85 l = 90 l = 95 l = 100 l = 105
Transformation sequence for steps 110 ≤ l ≤ 155
l = 110 l = 115 l = 120 l = 125 l = 130
l = 135 l = 140 l = 145 l = 150 l = 155
Transformation sequence for steps 160 ≤ l ≤ 200
l = 160 l = 165 l = 170 l = 175 l = 180
l = 185 l = 190 l = 195 l = 200
... Similarly, it is essential to control and reduce the bounce rate (Hasan et al., 2009), which indicates whether Internet users, once they land on a specific website, continue to interact with that website; engage in activities such as browsing other subdomains; visit different sections, contents, or sections; and participate in proposed activities (e.g., signing up for newsletters, downloading documents, accessing videos, participating in surveys, providing suggestions, or performing other actions (Poulos et al., 2020). The "bounce" (a user's return to the website of origin, usually to the search engine) occurs when users interpret that what they are looking for on the website of an institution or organization does not match what they want or expect to find (Chaffey & Patron, 2012;Poulos et al., 2020;Welling & White, 2006). ...
... Similarly, it is essential to control and reduce the bounce rate (Hasan et al., 2009), which indicates whether Internet users, once they land on a specific website, continue to interact with that website; engage in activities such as browsing other subdomains; visit different sections, contents, or sections; and participate in proposed activities (e.g., signing up for newsletters, downloading documents, accessing videos, participating in surveys, providing suggestions, or performing other actions (Poulos et al., 2020). The "bounce" (a user's return to the website of origin, usually to the search engine) occurs when users interpret that what they are looking for on the website of an institution or organization does not match what they want or expect to find (Chaffey & Patron, 2012;Poulos et al., 2020;Welling & White, 2006). Citizens' disaffection in surfing the Internet to search for information from institutions should be avoided by making the information clear and concise and supporting it with attractive texts, images, or videos (Bowden, 2009;Goldfarb, 2002;Halbheer et al., 2014). ...
Article
Society as a whole is undergoing digital transformation. Public and private institutions face the challenge of digitalization, including the unequal development of their presence in the digital ecosystem. The main objective of this research is to present a model for measuring the digital presence of public and private institutions, that is, institutions’ digital development index. The main variables for measuring web traffic are analyzed by identifying the factors that best explain the digital presence of the selected institutions and subsequently establishing an index for the classification of the institutions and the comparison and evaluation of their presence. Results show the medium–high digital development of the institutions analyzed. This study suggests that institutions’ web managers should measure their digital marketing actions.
... At first, the impact of DeFi social media analytics on their own digital performance should be measured. This leads to the first 2 research hypotheses (H1, H2), where the analytic metric of DeFi's organic traffic [48] and website bounce rate [49] are used as dependent factors. Regarding the digital performance metrics of supply chain companies, their connection with the variation in DeFi platforms' social media is next to be examined. ...
Article
Full-text available
Emerging technologies in the digital context can favor industrial sector firms in their aim to improve their performance. Digitalization is mainly expressed through the utilization of big data that originate from various sources. Blockchain technology has led to the extended adoption of capitalization of Decentralized Finance (DeFi) services, such as cryptocurrency trade platforms. Supply chain firms, in their quest to exploit any means and collaborations available to promote their services, could place advertisements on DeFi's social media profiles to boost their financial performance. Social media analytics, as a part of the big data family, are an emerging tool for promoting a firm's digital transformation, based on the plethora of customer behavioral data they provide. This study aims to examine whether the social media analytics of DeFi platforms are capable of affecting their website visibility, as well as the financial performance of supply chain firms. To do so, the authors collected data from the social media profiles of the most-known DeFi platforms and web analytics from the most significant supply chain firms' websites. For this purpose, proper statistical analysis, Fuzzy Cognitive Mapping, Hybrid Modeling, and Cognitive Neuromarketing models were adopted. Throughout the present research, it has been discerned that from an increase in the social media analytics of DeFi platforms, their website visibility increases, while the organic and paid traffic costs of supply chain firms decrease. Supply chain firms' website customers tend to increase at the same time.
... Closely associated with the assessment of customers' engagement and performance of marketing touchpoints, bounce rate is a strong indicator that online users do not interact with a Fintech company's website or social media content (Drivas et al., 2021;Poulos et al., 2020). During periods of high uncertainty, FVC needs helpful human-centered content that raises emotional awareness (Yaghtin et al., 2022) to develop brand engagement and built trust on digital platforms (Hollebeek and Macky, 2019). ...
Article
Purpose The paper’s main goal is to examine the relationship between the video marketing of financial technologies (Fintechs) and their vulnerable website customers’ brand engagement in the ongoing coronavirus disease 2019 (COVID-19) crisis. Design/methodology/approach To extract the required outcomes, the authors gathered data from the five biggest Fintech websites and YouTube channels, performed multiple linear regression models and developed a hybrid (agent-based and dynamic) model to assess the performance connection between their video marketing analytics and vulnerable website customers’ brand engagement. Findings It has been found that video marketing analytics of Fintechs’ YouTube channels are a decisive factor in impacting their vulnerable website customers’ brand engagement and awareness. Research limitations/implications By enhancing video marketing analytics of their YouTube channels, Fintechs can achieve greater levels of vulnerable website customers’ engagement and awareness. Higher levels of vulnerable customers’ brand engagement and awareness tend to decrease their vulnerability by enhancing their financial knowledge and confidence. Practical implications Fintechs should aim to increase the number of total videos on their YouTube channels and provide videos that promote their customers’ knowledge of their services to increase their brand engagement and awareness, thus reducing their vulnerability. Moreover, Fintechs should be aware not to over-post videos because they will be in an unfavorable position against their competitors. Originality/value This research offers valuable insights regarding the importance of video marketing strategies for Fintechs in promoting their vulnerable website customers’ brand awareness during crisis periods.
... Making an appointment is easy, just click on this link', might have possibly led to more visitors clicking on the link to the page on STI-testing and to more impact of this web-based intervention. A/B tests available in Matomo could be used to test whether such an addition has Evaluating use of web-based interventions impact on the percentage of visitors transferring to the STI-test page (Poulos et al., 2020). ...
Article
Full-text available
With the current increase in web-based interventions, the question of how to measure, and consequently improve engagement in such interventions is gaining more importance. Modern day web analytics tools make it easy to monitor use of web-based interventions. However, in this article, we propose that it would be more meaningful to first examine how the developers envisioned the use of the intervention to establish behavior change (i.e. intended use), before looking into how the intervention is ultimately used with web analytics (i.e. actual use). Such an approach responds to the regularly expressed concern that behavioral interventions are often poorly described, leading to less meaningful evaluations as it is not clear what exactly is being evaluated. Using a page on chlamydia prevention (104 557 pageviews in 2020) from a Dutch sexual health intervention (Sense), we demonstrate the value of acyclic behavior change diagrams (ABCDs) as a method to visualize intended use of an intervention. ABCDs show at a glance how behavior change principles are applied in an intervention and target determinants of behavior. Based on this ABCD, we investigate actual use of the intervention, using web analytics tool Matomo. Despite being intended to stimulate STI-testing, only 14% of the 35 347 transfers from this page led to the STI-testing page and a high bounce rate (79%) and relatively high exit rate were reported (69%). Recommendations to further interpret the data are given. This real-life example demonstrates the potential of combining ABCDs and Matomo as methods to gain insight into use of web-based interventions.
... Finally, for email address registration, verification simply consisted of sending automated messages and checking whether they were opened, click rates and other metrics. The bounce rate was measured to estimate fraudulent email addresses, aggregating soft bounces and hard bounces (Poulos et al., 2020). While hard bounces occur when the e-mail indicator is incorrect and/or the user's name before the @ is false, soft bounces occur when, for example, a user cannot receive emails because their inbox is full, the sender's address has been blocked as spam, or the mail server is temporarily down (Maaß et al., 2021). ...
Article
Full-text available
The digital environment, which includes the Internet and social networks, is propitious for digital marketing. However, the collection, filtering and analysis of the enormous, constant flow of information on social networks is a major challenge for both academics and practitioners. The aim of this research is to assist the process of filtering the personal information provided by users when registering online, and to determine which user profiles lie the most, and why. This entailed conducting three different studies. Study 1 estimates the percentage of Spanish users by stated sex and generation who lie the most when registering their personal data by analysing a database of 5,534,702 participants in online sweepstakes and quizzes using a combination of error detection algorithms, and a test of differences in proportions to measure the profiles of the most fraudulent users. Estimates show that some user profiles are more inclined to make mistakes and others to forge data intentionally, the latter being the majority. The groups that are most likely to supply incorrect data are older men and younger women. Study 2 explores the main motivations for intentionally providing false information, and finds that the most common reasons are related to amusement, such as playing pranks, and lack of faith in the company's data privacy and security measures. These results will enable academics and companies to improve mechanisms to filter out cheaters and avoid including them in their databases.
... Bounce rate mewakili tolok ukur yang signifikan untuk penilaian nilai keterlibatan interaksi yang di berbagai konten dalam website [2]. Ini yang menjadi permasalahan sebuah website karena metriks ini mampu menunjukkan seberapa banyak pengguna berinteraksi dengan website. ...
Article
Full-text available
Gapura Studio merupakan salah satu pengembang aplikasi dan konsultan teknologi yang ada di Kota Yogyakarta. Sebagai pengembang aplikasi, perusahaan ini menjual produknya melalui situs web yang dimiliki (online) . Gapura Studio merasa bahwa performa situs web yang dimiliki kurang baik sehingga berpengaruh dengan proses salah satunya adalah kontrol interaksi di situs web yang mengakibatkan pengguna yang meminimalkan perintah. Dalam penyelidikan menyelesaikan masalah tersebut dengan melakukan interaksi di situs web Gapura Studio dengan terlebih dahulu membuat arsitektur perusahaan . Penelitian ini menghasilkan cetak birupengembangan sistem informasi manajemen perusahaan dan juga perancangan chatbot berdasarkan proses bisnis yang dijabarkan dalam proses perancangan arsitektur perusahaan untuk mengatasi permasalahan interaksi pada situs web Gapura Studio.
Thesis
This thesis addresses the misalignment of learning with mobiles approaches as they are applied to rural communities of adult learners in East Africa. Most models of learning with mobiles do not work well for rural adult learners: they predominantly focus on the capabilities of the technology and not the available affordances, a crucial oversight in communities where smart phone and internet access is limited. Existing models are also misaligned with dialogic indigenous traditions of learning: they tend to function as derivatives of formal classroom environments and do not account for the pedagogical needs of rural adult learners accustomed to non-formal small group dialogic education rooted in the social sphere. This misalignment frames the key research question at the foundation of this report: Can learning with mobiles approaches adapt to the technological and pedagogical needs of adult learners in rural East Africa and enhance non-formal dialogic education? I approach this question through a Design Based Research methodology involving a mixed-method research design. By utilising the subsistence farmer network of my research partner The International Small Group and Tree Planting Program, I worked with 3,216 rural adults to complete a survey and conduct semi-structured interviews to thematically frame the intersecting dimensions of technological affordances, mobile learning pedagogy, and non- formal dialogic learning. This thematic analysis guided the iterative development of a mobile learning platform used by rural learners across Kenya, Tanzania, and Uganda. Four iterative design cycles of this platform provided insights as to how mobile technology can support small group-based dialogic education within a rural East African context. 
Analyses of these insights using a pre-post survey with 136 learners, learner data from the 640 users of the mobile learning platform, and Kearney and Burden’s iPAC framework for mobile pedagogy ultimately demonstrate that it is possible to adapt a learning with mobiles approach to meet the technological and pedagogical needs of rural learners. These findings are generalised into a series of Design Principles and a corresponding Techno-Pedagogical framework which incorporates a technological affordance and pedagogical perspective on learning with mobiles for non-formal small group dialogic education. The Design Principles and accompanying framework address the identified misalignment of mobile learning platforms in rural communities of East Africa and will assist learning with mobiles researchers and practitioners operating in similar contexts throughout the Global South.
Chapter
Recently, the Tanzanian government has started making m-government initiatives. However, little is known about the factors and conditions surrounding m-government adoption in Tanzania. Consequently, some m-government services have been successfully adopted while others are still struggling (having a low level of adoption). This study investigates critical success factors (CSFs) that led m-government services belonging to the same family to have varying degrees of adoption level. The study employs a set of web analytics tools that monitored and analyzed the traffic data of the selected three m-government services. The results show that inspecting the web analytics data from multiple viewpoints and varying levels of detail gives insights on the CSFs towards the adoption of m-government services. The findings suggest that perceived usefulness, user needs, and usability favor the adoption of one m-government service over the other.
Article
Though several industry reports have suggested that the rate of shopping cart abandonment is high in the mobile channel, the reasons for such abandonment remain relatively unexplored. Drawing on the cognition-affect-behavior (CAB) paradigm, this study aims to provide a conceptual framework explaining why consumers hesitate to use mobile channels for shopping and thus abandon their mobile shopping carts. Results from two studies show that mobile shopping cart abandonment is positively influenced by emotional ambivalence, a result of consumers' conflicting thoughts. More specifically, emotional ambivalence amplifies consumers' hesitation at the checkout stage, leading to cart abandonment. However, if hesitant consumers are satisfied with the choice process during shopping, they are less likely to give up their mobile shopping carts. Based on the findings, this mobile channel study provides practical and theoretical implications for marketers and e-cart abandonment researchers, respectively.
Conference Paper
Online controlled experiments (e.g., A/B tests) are now regularly used to guide product development and accelerate innovation in software. Product ideas are evaluated as scientific hypotheses, and tested on web sites, mobile applications, desktop applications, services, and operating system features. One of the key challenges for organizations that run controlled experiments is to select an Overall Evaluation Criterion (OEC), i.e., the criterion by which to evaluate the different variants. The difficulty is that short-term changes to metrics may not predict the long-term impact of a change. For example, raising prices likely increases short-term revenue but also likely reduces long-term revenue (customer lifetime value) as users abandon. Degrading search results in a Search Engine causes users to search more, thus increasing query share short-term, but increasing abandonment and thus reducing long-term customer lifetime value. Ideally, an OEC is based on metrics in a short-term experiment that are good predictors of long-term value. To assess long-term impact, one approach is to run long-term controlled experiments and assume that long-term effects are represented by observed metrics. In this paper we share several examples of long-term experiments and the pitfalls associated with running them. We discuss cookie stability, survivorship bias, selection bias, and perceived trends, and share methodologies that can be used to partially address some of these issues. While there is clearly value in evaluating long-term trends, experimenters running long-term experiments must be cautious, as results may be due to the above pitfalls more than the true delta between the Treatment and Control. We hope our real examples and analyses will sensitize readers to the issues and encourage the development of new methodologies for this important problem.
Article
The problem of controlling stationarity involves an important aspect of forecasting, in which a time series is analyzed in terms of levels or differences. In the literature, non-parametric stationary tests, such as the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests, have been shown to be very important; however, they are affected by problems with the reliability of lag and sample size selection. To date, no theoretical criterion has been proposed for the lag-length selection for tests of the null hypothesis of stationarity. Their use should be avoided, even for the purpose of so-called 'confirmation'. The aim of this study is to introduce a new method that measures the distance by obtaining each numerical series from its own time-reversed series. This distance is based on a novel stationary ergodic process, in which the stationary series has reversible symmetric features, and is calculated using the Dynamic Time-warping (DTW) algorithm in a self-correlation procedure. Furthermore, to establish a stronger statistical foundation for this method, the F-test is used as a statistical control and is a suggestion for future statistical research on resolving the problem of a sample of limited size being introduced. Finally, as described in the theoretical and experimental documentation, this distance indicates the degree of non-stationarity of the time series.
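The core idea in the abstract above, a DTW distance between a series and its own time-reversed copy, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dtw_distance` and `reversal_distance` are hypothetical, the absolute-difference local cost and the absence of any normalisation or warping window are assumptions, and the F-test control mentioned in the abstract is omitted.

```python
def dtw_distance(a, b):
    """Textbook O(n*m) dynamic time warping distance.

    Assumption: the local cost is the absolute difference |a_i - b_j|;
    the paper may use a different cost or a constrained warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def reversal_distance(series):
    """DTW distance between a series and its time-reversed copy.

    Hypothetical helper: a value near zero suggests the series is close
    to time-reversible; larger values suggest stronger asymmetry.
    """
    x = [float(v) for v in series]
    return dtw_distance(x, x[::-1])


symmetric = [1, 2, 3, 2, 1]   # identical to its own reversal
trending = [0, 1, 2, 3, 4]    # monotone trend, far from its reversal
print(reversal_distance(symmetric), reversal_distance(trending))
```

Under these assumptions a series that reads the same forwards and backwards scores exactly zero, while a trending series scores strictly higher. Values are only comparable between series of similar length and scale, because the sketch applies no normalisation.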
Article
An important problem in econometrics and marketing is to infer the causal impact that a designed market intervention has exerted on an outcome metric over time. This paper proposes to infer causal impact on the basis of a diffusion-regression state-space model that predicts the counterfactual market response in a synthetic control that would have occurred had no intervention taken place. In contrast to classical difference-in-differences schemes, state-space models make it possible to (i) infer the temporal evolution of attributable impact, (ii) incorporate empirical priors on the parameters in a fully Bayesian treatment, and (iii) flexibly accommodate multiple sources of variation, including local trends, seasonality and the time-varying influence of contemporaneous covariates. Using a Markov chain Monte Carlo algorithm for posterior inference, we illustrate the statistical properties of our approach on simulated data. We then demonstrate its practical utility by estimating the causal effect of an online advertising campaign on search-related site visits. We discuss the strengths and limitations of state-space models in enabling causal attribution in those settings where a randomised experiment is unavailable. The CausalImpact R package provides an implementation of our approach.
Article
Search advertising is the ordered list of advertisements that appears when a user searches for something in an online search engine. By construction, these ads differ in prominence: ads higher up the list are more prominent than ads lower down the list. However, search ads also differ in prominence in another way: prominence of advertiser. This paper examines how these two types of prominence interact in determining the click-through rate (CTR) of these ads. Using individual-level click-stream data from Microsoft’s Live Search platform and measures of advertiser prominence from Alexa.com, we find that ad position and advertiser prominence are substitutes. Specifically, in searches for camera brands, a retailer not in the top 100 of Alexa rankings has a 30%–50% higher CTR in position 1 than in position 2, whereas a retailer in the top 100 of Alexa rankings has only a 0%–13% higher CTR for the same position improvement. Qualitatively similar results are obtained for several other search strings. These findings demonstrate, first, that advertiser brand matters even for search ads, and, second, the way it matters is the opposite of what is usually assumed in the theoretical literature on search advertising. The online appendix is available at https://doi.org/10.1287/mnsc.2016.2677. This paper was accepted by Matthew Shum, marketing.
Article
This is an elementary introduction to causal inference in economics written for readers familiar with machine learning methods. The critical step in any causal analysis is estimating the counterfactual: a prediction of what would have happened in the absence of the treatment. The powerful techniques used in machine learning may be useful for developing better estimates of the counterfactual, potentially improving causal inference.
Article
This study examines the effects of Internet display advertising using cookie-level data from a field experiment at a financial tools provider. The experiment randomized assignment of cookies to treatment (firm ads) and control conditions (charity ads), enabling the authors to handle different sources of selection bias, including targeting algorithms and browsing behavior. They analyze display ad effects for users at different stages of the company's purchase funnel (i.e., nonvisitor, visitor, authenticated user, and converted customer) and find that display advertising positively affects visitation to the firm's website for users in most stages of the purchase funnel, but not for those who previously visited the site without creating an account. Using a binary logit model, the authors calculate marginal effects and elasticities by funnel stage and analyze the potential value of reallocating display ad impressions across users at different stages. Expected visits increase almost 10% when display ad impressions are partially reallocated from nonvisitors and visitors to authenticated users. The authors also show that results from the controlled experiment data differ significantly from those computed using standard correlational approaches.