The validity of the SERVQUAL
and SERVPERF scales
A meta-analytic view of 17 years of research
across ﬁve continents
Franc¸ois A. Carrillat
Department of Marketing, University of Texas at Arlington, Arlington,
Texas, USA, and
Jay P. Mulki
Marketing Group, Northeastern University, Boston, Massachusetts, USA
Purpose – The purpose of this paper is to investigate the difference between SERVQUAL’s and SERVPERF’s predictive validity of service quality.
Design/methodology/approach – Data from 17 studies containing 42 effect sizes of the relationships between SERVQUAL or SERVPERF and overall service quality (OSQ) are meta-analyzed.
Findings – Overall, SERVQUAL and SERVPERF are equally valid predictors of OSQ. Adapting the
SERVQUAL scale to the measurement context improves its predictive validity; conversely, the
predictive validity of SERVPERF is not improved by context adjustments. In addition, measures of service quality gain predictive validity when used in less individualistic cultures, non-English-speaking countries, and industries with an intermediate level of customization (hotels, rental cars, or banks).
Research limitations/implications – No studies using non-adapted scales were conducted outside of the USA, making it impossible to disentangle the impact of scale adaptation vs contextual differences on the moderating effect of language and culture. More comparative studies on the usage of adapted vs non-adapted scales outside the USA are needed before settling this issue.
Practical implications – SERVQUAL scales need to be adapted to the study context more so than SERVPERF scales. Owing to their equivalent predictive validity, the choice between SERVQUAL and SERVPERF should be dictated by diagnostic purposes (SERVQUAL) vs the need for a shorter instrument (SERVPERF).
Originality/value – Because of the high statistical power of meta-analysis, these findings could be considered a major step toward ending the debate over whether SERVPERF is superior to SERVQUAL as an indicator of OSQ.
Keywords Services, SERVQUAL, Culture, Quality
Paper type Research paper
Over the years, marketing researchers have reached consensus on several issues related
to the domain of services. First, as the economy has become mostly service-based,
researchers now consider the marketing discipline as being service dominated.
Received 9 January 2006
Revised 4 February 2007
Accepted 17 May 2007
International Journal of Service Industry Management
Vol. 18 No. 5, 2007
© Emerald Group Publishing Limited
Consumers in OECD countries spend more on services than on tangible goods (Martin, 1999). Indeed, service activities constitute about 70 percent of OECD countries’ GDP (OECD, 2005),
and this trend is expected to continue in the coming decade. The globalization of services
marketing has presented both academics and practitioners with challenges and opportunities in this area (Javalgi et al., 2006). Reflecting this changing emphasis, services marketing has become a well-established field of academic inquiry and now represents an alternative
paradigm to the marketing of goods (Lovelock and Gummesson, 2004).
Researchers also agree that a central topic in service research is service quality (SQ),
which is a critical determinant of business performance as well as ﬁrms’ long-term
viability (Bolton and Drew, 1991; Gale, 1994). This is because SQ leads to customer
satisfaction which in turn has a positive impact on customer word-of-mouth,
attitudinal loyalty, and purchase intentions (Gremler and Gwinner, 2000). The view
that SQ results from customers’ evaluation of the service encounter prevails in the
literature (Cronin and Taylor, 1992; Parasuraman et al., 1985). Under this perspective,
researchers further agree that SQ is best represented as an aggregate of the discrete
elements from the service encounter such as reliability, responsiveness, competence,
access, courtesy, communication, credibility, security, understanding, and tangible
elements of the service offer (Cronin and Taylor, 1992; Dabholkar et al., 2000;
Parasuraman et al., 1985).
On the other hand, the question of the operationalization of SQ has continued to evoke
discussion. This discussion has been primarily centered on two important issues. The
ﬁrst relates to the debate of whether SERVQUAL or SERVPERF should be used for
measuring SQ (Cui et al., 2003; Hudson et al., 2004; Jain and Gupta, 2004; Kettinger and
Lee, 1997; Mukherje and Nath, 2005; Quester and Romaniuk, 1997). SERVQUAL,
grounded in the Gap model, measures SQ as the calculated difference between customer
expectations and performance perceptions of a service encounter (Parasuraman et al.,
1988, 1991). Cronin and Taylor (1992) challenged this approach and developed the
SERVPERF scale, which captures customers’ performance perceptions directly rather than through a comparison with their expectations of the service encounter. In spite of recent attempts in
the literature toward settling this issue, the SERVQUAL-SERVPERF debate has never
been so relevant. In fact, numerous authors have supported the view that SERVPERF is
a better alternative than SERVQUAL (Babakus and Boller, 1992; Brady et al., 2002;
Brown et al., 1993; Zhou, 2004) while, on the other hand, SERVQUAL has enjoyed and
continues to enjoy widespread acceptance as a measure of SQ (Chebat et al., 1995; Furrer
et al., 2000; Zeithaml and Bitner, 2003). In addition, the Web of Science reveals that the original SERVQUAL paper published in 1988, as well as the subsequent 1991 scale refinement paper, have both received more than 46 percent of their total citations within
the last ﬁve years. The same is true of SERVPERF, which also received more than
46 percent of its citations within the last ﬁve years. This indicates that Cronin and
Taylor’s (1994) conceptual arguments in favor of SERVPERF, while they may have contributed to SERVPERF’s popularity, have not reduced SERVQUAL’s usage among scholars. In addition, it suggests that the multilevel scale, offered by Brady and Cronin (2001) as a reconciling perspective, has not moved researchers away from either SERVQUAL or SERVPERF. Therefore, shedding light on whether one scale is better
than the other remains a very important question to be answered.
The second issue centers on the trade-off between the generalizability and the specificity level of the SERVQUAL and SERVPERF scales (Asubonteng et al., 1996).
A scale can be applied in more diversiﬁed contexts as its items become more abstract
(Babakus and Boller, 1992; Dabholkar et al., 2000). However, this limits the scale’s
ability to capture speciﬁc context elements (Babakus and Boller, 1992; Dabholkar et al.,
2000). There is a general acceptance of the need to modify scale items to suit study
context. However, empirical investigation regarding the impact of item adaptation on
scale validity (i.e. when original SERVQUAL/SERVPERF items versus modiﬁed items
are used) has not been undertaken. In addition, research is needed to assess the
appropriateness of the SERVQUAL/SERVPERF scales when they are used outside
the USA. This is because differences in national culture or language not only require modification of items but also create distortions in how respondents perceive the construct under investigation (Herk et al., 2005).
The above discussion raises several important research questions. First, are
SERVQUAL and SERVPERF adequate predictors of SQ? And, as proposed by Cronin
and Taylor (1992), is SERVPERF a better predictor of SQ than SERVQUAL? Second, is
there an improvement in the predictive validity of the SERVQUAL and SERVPERF
measures when the scale items are adapted to the study context? Third, does the
predictive power of SERVQUAL and SERVPERF depend on national culture or scale
language? Finally, is the predictive validity of SERVQUAL and SERVPERF inﬂuenced
by the type of industry in which the study is conducted?
The current study addresses these research questions by meta-analyzing empirical SQ
research. Meta-analysis is appropriate for addressing these research questions because it
systematically integrates ﬁndings across studies, controls for statistical artifacts, and
provides very robust answers about relationships among variables (Arthur et al., 2001;
Hunter and Schmidt, 2004). Our meta-analytic framework relies on 42 effect sizes from 17
empirical studies conducted across ﬁve continents spanning 17 years.
Previous research has already attempted to compare SERVQUAL and SERVPERF
(Brady et al., 2002; Cronin and Taylor, 1992; Cui et al., 2003; Hudson et al., 2004; Jain and
Gupta, 2004; Kettinger and Lee, 1997; Quester and Romaniuk, 1997). However, considering these studies individually provides dispersed evidence that might add, rather than subtract, ambiguity surrounding the measurement debate. For instance, Jain and
Gupta (2004) as well as Kettinger and Lee (1997) found that SERVPERF was more
strongly correlated to overall service quality (OSQ) than SERVQUAL whereas Quester
and Romaniuk (1997) reported that SERVQUAL exhibited a stronger relationship with
OSQ than SERVPERF. In some cases, studies comparing SERVQUAL and SERVPERF
focus on dimensionality issues without considering predictive validity (Cui et al., 2003;
Hudson et al., 2004). Furthermore, the aforementioned studies rely on one or two samples at most, which prevents them from drawing robust conclusions and from testing the impact of contingency factors such as country, language, or industry. Therefore,
the current research constitutes a signiﬁcant contribution to the service literature
because it provides answers to the SERVQUAL/SERVPERF validity debate tackled by
Cronin and Taylor (1992, 1994), Brady et al. (2002), and Parasuraman et al. (1994).
In addition, because meta-analysis is based on the accumulation of empirical evidence
over the years, it allows investigating moderating factors by comparing sub-groups of
studies that share a similar characteristic – e.g. the country where the sample was
drawn (Lipsey and Wilson, 2001).
This paper is organized as follows. First, a review of the literature is presented
and hypotheses are developed. Second, a description of the meta-analytic
procedure is provided. Third, results, as well as implications and suggestions for
further research, are discussed.
Both SERVQUAL and SERVPERF’s operationalizations relied on the conceptual
deﬁnition that SQ is an attitude toward the service offered by a ﬁrm resulting from a
comparison of expectations with performance (Parasuraman et al., 1985, 1988; Cronin
and Taylor, 1992). However, SERVQUAL directly measures both expectations and performance perceptions whereas SERVPERF only measures performance
perceptions. SERVPERF uses only performance data because it assumes that
respondents provide their ratings by automatically comparing performance
perceptions with performance expectations. Thus, SERVPERF assumes that directly
measuring performance expectations is unnecessary.
Research comparing the predictive validity of SERVQUAL with SERVPERF has
been based on assessing which of the two measures is a better predictor of OSQ. OSQ
has been used as the criterion because it is a global representation of the quality of the
service offered by an organization (Cronin and Taylor, 1992, 1994; Jain and Gupta,
2004; Kettinger and Lee, 1997; Quester and Romaniuk, 1997). In their comparison of
SERVQUAL with SERVPERF, Cronin and Taylor (1992) built their argument for the
superiority of SERVPERF over SERVQUAL by empirically showing that SERVPERF
is a better predictor of OSQ than SERVQUAL. Also, Parasuraman et al. (1988) assessed
the construct validity of SERVQUAL by evaluating whether the scale was an adequate
predictor of OSQ. In view of this, the predictive validity of SERVQUAL and
SERVPERF is assessed by meta-analyzing extant empirical research on the strength of
the relationship between each scale and OSQ.
The predictive validity of SERVQUAL and SERVPERF
SERVQUAL and SERVPERF are based on rigorous scale development procedures
(Parasuraman et al., 1988, 1991) and have been widely used by researchers. Therefore,
it is expected that both the SERVQUAL and SERVPERF measures of SQ will be
strongly related to OSQ. The literature on scale development does not speciﬁcally point
to a particular correlation value with a criterion against which the predictive validity of
a scale can be assessed. However, it is possible to turn to less formal guidelines
formulated by researchers. According to Cohen’s (1992) rule of thumb, a “small” effect
size is observed when the correlation is 0.10, a “medium” effect size is obtained when
the correlation is 0.30, and a “large” effect size corresponds to a correlation of 0.50.
These guidelines have been previously used to qualify the strength of meta-analytic
correlations (Jaramillo et al., 2005). Therefore, the following is hypothesized:
H1. The correlation between SERVQUAL or SERVPERF and OSQ will be strong
and above 0.50.
The disconﬁrmation vs performance-only debate
In Parasuraman et al.’s (1985) “disconﬁrmation” perspective, the SQ construct is seen as
an attitude resulting from customers’ comparison of their expectations about the service
encounter with their perceptions of the service encounter. The SERVQUAL instrument
operationalizes this construct as the difference between expected and actual (perceived)
performance (Parasuraman et al., 1988, 1991). Alternatively, SERVPERF is based on the
“performance only” perspective and operationalizes SQ as customers’ evaluations of
the service encounter. As a result, SERVPERF uses only the performance items of the
SERVQUAL scale (Brady et al., 2002; Cronin and Taylor, 1992, 1994).
In discussing the relative merits of each scale, the debate has been primarily
centered on predictive validity and speciﬁcally on whether SERVQUAL or SERVPERF
better captures SQ. First, some researchers have argued that SERVPERF is a better
measure because it does not depend on ambiguous customers’ expectations.
Arguments in favor of SERVPERF are based on the notion that performance
perceptions are already the result of customers’ comparison of the expected and actual
service (Babakus and Boller, 1992; Oliver and DeSarbo, 1988). Therefore, performance-only measures should be preferred to avoid redundancy. Second, as Teas (1993)
points out, Parasuraman et al.’s (1991) conceptualization of SQ is inconsistent with its
operationalization. Teas (1993) argues that, since Parasuraman et al. (1991) deﬁne
expectations as a type of attitude, customer expectations must be considered as ideal
points. Hence, the Gap model implication that superior perceptions of SQ occur when
performance increasingly exceeds expectations is theoretically inconsistent. The
classical attitudinal perspective suggests that positive attitudes are formed when
evaluations of an object are close to an expected ideal point. Therefore, SQ should peak
when perceptions equal expectations (Teas, 1993).
Parasuraman et al. (1994) defended SERVQUAL by demonstrating that there was
virtually no difference in predictive power between SERVQUAL and SERVPERF.
Although, discussions have continued on whether disconﬁrmation-based measures are
superior to performance-only based measures (Dabholkar et al., 2000; Hudson et al.,
2004; Jain and Gupta, 2004), the above discussed arguments point toward the
superiority of SERVPERF over SERVQUAL. Thus:
H2. The relationship between SQ and OSQ is stronger when SQ is measured with
SERVPERF than with SERVQUAL.
Any scale represents a compromise between relevance and the extent to which it can be
applied in a wide array of contexts (Babakus and Boller, 1992). Scale modiﬁcation is
done by adding, deleting or rewording items to ensure suitability for a particular
research context. SERVQUAL and SERVPERF scale modifications have led to questions regarding:
. the universal versus context-specific character of the scales; and
. whether changes to fit a specific context result in better predictive validity.
It is important to mention that in their original development, SERVQUAL and
SERVPERF were purported to be universal measures of SQ because the scale
development process relied on samples from multiple industries (Cronin and Taylor,
1992; Parasuraman et al., 1988). However, Parasuraman et al. (1988) recognize that
SERVQUAL can be adapted to the speciﬁc research needs of a particular organization.
As Rossiter (2002) indicates, the speciﬁcities of the measurement context play an
important role in construct validity.
Researchers are particularly concerned about the effect of environmental factors on
the validity of SQ scales (Babin et al., 2004). In fact, researchers have failed to replicate
the ﬁve original dimensions of the SERVQUAL/SERVPERF scales, namely tangibility,
reliability, responsiveness, assurance, and empathy (White and Schneider, 2000).
Based on this, researchers have noted that SQ scales need to be adapted to the study
context (Carman, 1990). For instance, tangibility might not be relevant for a cable
company because the customer might never see the facilities of the service provider,
whereas it may be critical for a healthcare facility customer. In their study on the
photography industry, Dabholkar et al. (2000) dropped items related to physical
facilities (tangibility) from the original SERVQUAL because customers did not have to
visit the company’s site; however, they added items related to “salespeople pressure”
that are absent from SERVQUAL. The above discussion suggests that context adapted
versions of SERVQUAL and SERVPERF, hereinafter referred to as MQUAL and
MPERF, will have a better predictive validity than non-modiﬁed versions (QUAL or
PERF, respectively). Thus:
H3a. The relationship between SQ and OSQ will be stronger when SQ is measured
with MQUAL rather than with QUAL.
H3b. The relationship between SQ and OSQ will be stronger when SQ is measured
with MPERF rather than with PERF.
Studies using SERVQUAL and SERVPERF have been conducted across more than 17
countries and on each and every continent. The use of these scales in an international
context raises a legitimate concern about validity across borders because research has
shown that cultural values inﬂuence customer responses on measures of SQ (Laroche
et al., 2004; Zhou, 2004). According to Herk et al. (2005), research conducted
internationally can be affected both by construct bias (i.e. the construct studied differs
across countries) and item bias (i.e. items are distorted when used internationally). For
instance, Sultan et al. (2000) found signiﬁcant differences across US and European
passengers on their expectations and performance perceptions of airlines SQ. In
addition, Mattila (1999) found that Western customers are more likely than their Asian
counterparts to rely on tangible cues from the physical environment, which evidences
that the tangibility dimension of SERVQUAL is more important for them.
Researchers have found that cultural differences can also create item bias.
Steenkamp and Baumgartner (1998) show that both:
(1) the metric invariance (i.e. the interpretation of the distance between the scale points); and
(2) the scalar invariance (i.e. whether scale latent means have systematic biases)
of items become uncertain when scales are used across cultures.
In fact, Diamantopoulos et al. (2006) found that international differences in response
styles (i.e. item wording, type of scale, etc.) generate item bias. Therefore, we propose
that SERVQUAL and SERVPERF are likely to be affected by construct and item biases
when used in international settings.
In order to account for cultural differences, it was decided to rely on Hofstede’s (1997)
individualism/collectivism (IDV) measure of national culture. IDV is useful and
parsimonious for explaining cross-cultural differences in attitudes and behaviours. Also,
IDV has satisfactory reliability and uni-dimensionality (Cano et al., 2004; Triandis, 1995).
Research indicates that IDV may affect perceptions of OSQ and its dimensions. For
instance, Furrer et al. (2000) argue that, in high individualistic cultures, consumers tend
to be independent, have an ethic of self-responsibility and demand a higher level of SQ.
Furrer et al. (2000) also note that individualistic consumers prefer to maintain a
signiﬁcant distance between themselves and the service provider. In addition, their
study results show that consumers with a high degree of individualism considered
“responsiveness” and “tangibles” dimensions as more important compared to
consumers from collectivistic cultures. Individualistic customers tend to focus on their own benefits and interests, and expect the service providers to do their best in catering to
their needs (Donthu and Yoo, 1998). Thus, individualistic customers pay careful
attention to the service provided and are not likely to accept lower SQ. Donthu and Yoo’s
(1998) study showed that individualistic customers have higher OSQ expectations, as well as higher empathy and assurance expectations from their service providers, compared to
customers from collectivistic societies. SERVQUAL and SERVPERF were developed in
the USA, a country with the highest IDV level (Hofstede, 1997). In view of this, the
existing dimensions of the SERVQUAL and SERVPERF scales should match more
closely with the expectations of consumers from individualistic countries. As a result, it
is expected that the predictive validity of SERVQUAL and SERVPERF will be diminished in countries
with a lower IDV level:
H4a. The strength of the relationship between SERVQUAL or SERVPERF and
OSQ decreases as the degree of individualism of the country decreases.
It is generally known that language translation can exacerbate cultural bias. Even when scales are carefully translated and closely checked by experts (Witkowski and Wolfinbarger, 2002; Zhou, 2004), the absence of a concept in a language prevents perfect accuracy in scale translation (Herk et al., 2005). Thus, scale
translation can result in higher measurement error which attenuates relationships
among constructs (Hunter and Schmidt, 2004). Therefore, the following is hypothesized:
H4b. The strength of the relationship between SERVQUAL or SERVPERF and
OSQ is stronger when SERVQUAL or SERVPERF is administered in English
than when translated.
Type of services
It is expected that SERVQUAL or SERVPERF will perform differently depending on
the industry in which they are used. This is because the relevance of the scale
dimensions depends on the study setting (White and Schneider, 2000). Many
categorizations of services have been proposed in the literature (Bitner, 1992; Lovelock,
1983; Silvestro et al., 1992). Among the numerous service classiﬁcations, Silvestro
et al.’s (1992) production perspective has emerged as integrative of other service
typologies. Silvestro et al. (1992) divide service providers into the following three
groups that range from lower to higher intensity of customer processing:
(1) Professional services (PS) (i.e. low customer-processing intensity) include services provided by lawyers, business consultants, or field engineering. Some characteristics of this group are: few transactions, highly customized and process-oriented services, and long customer contact times. Value is added by front office service employees who rely extensively on their own judgment to perform the service.
(2) Service shops (SS) (i.e. intermediate customer processing intensity) such as
hotels, rental cars, or banks. This group has an intermediate level of
customization and judgment from service employees. Value added is generated
in both the back and front ofﬁces.
(3) Mass services (MS) (i.e. high customer-processing intensity) such as those provided by retailers, transportation, or confectionery companies. This group has many customer transactions, few contact opportunities, and limited customization. Value added comes from the back office, and service employees use little judgment.
According to Silvestro et al. (1992), as the intensity of customer processing decreases, the emphasis on process rather than product intensifies. The process elements of a service are by nature intangible while the product elements are more tangible (Zeithaml and Bitner, 2003). Therefore, less customer-processing-oriented service industries will have more intangible service offers. Because SERVQUAL is purported to measure the service aspects of the quality of the customer experience, it is expected to perform better as customer-processing intensity decreases and intangibility increases. Thus:
H5. The strength of the relationship between SERVQUAL and OSQ decreases as
the service category moves from PS to SS and to MS.
All studies containing an effect size that measures the strength of the relationship between SQ (SERVQUAL, SERVPERF) and OSQ were eligible for inclusion. Valid statistics included Pearson’s correlation coefficients (r) or any other statistics that could be converted to r, such as F-values, t-values, and p-values. Only studies published in 1988 or after and available before May 30, 2005 were included in this meta-analysis. This timeframe is used since SERVQUAL was first published in 1988.
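The paper does not spell out the conversion formulas; a minimal sketch of the standard conversions to r (as given in meta-analytic handbooks such as Lipsey and Wilson, 2001) might look as follows. The p-value route uses a large-sample normal approximation, and all numeric inputs below are illustrative rather than taken from the studies themselves.

```python
from math import sqrt
from statistics import NormalDist

def r_from_t(t, df):
    """Convert an independent-groups t statistic to a correlation r."""
    return t / sqrt(t * t + df)

def r_from_f(f, df_error):
    """Convert a one-numerator-df F statistic to r via t = sqrt(F)."""
    return r_from_t(sqrt(f), df_error)

def r_from_p(p_two_tailed, n):
    """Large-sample approximation: two-tailed p -> z -> r = z / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - p_two_tailed / 2)
    return z / sqrt(n)

# Illustrative values (not from the meta-analyzed studies):
r1 = r_from_t(2.0, 100)    # about 0.196
r2 = r_from_f(4.0, 100)    # same as r_from_t(2.0, 100)
```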
The following procedure was used to obtain an ample collection of studies reporting
the desired effect sizes. First, an electronic search of the following databases was
conducted: Science Direct, Emerald, and ProQuest (ABI/INFORM Global and dissertation
abstracts). Second, a manual examination of the articles identiﬁed from the
computer-based searches was carried out. Third, manual searches of leading
marketing and service journals were conducted. To contact marketing researchers, a
call for working papers, forthcoming articles, conference papers, and unpublished
research was posted on ELMAR-AMA (~5,000 subscribers). The search process
yielded a total of 17 studies containing 42 effect sizes resulting from studying 9,880
respondents (Table I).
Meta-analyses can be conducted using either a fixed-effect (FE) or a random-effect (RE) model (Hunter and Schmidt, 2004). A FE model assumes that the same population parameter underlies the observed effect sizes in all the studies, whereas the RE model allows for
Study Scale Country IDV score Language Service type n r
Angur et al. (1999) QUAL USA 91 English Mass 143 0.70
Angur et al. (1999) PERF USA 91 English Mass 143 0.72
Babakus and Boller (1992) PERF USA 91 English Shop 520 0.66
Bojanic (1991) PERF USA 91 English Pro 32 0.57
Brady et al. (2002) MPERF USA 91 English *1548 0.62
Cronin and Taylor (1992) PERF USA 91 English *660 0.60
Cronin and Taylor (1992) QUAL USA 91 English *660 0.54
Dabholkar et al. (2000) MQUAL USA 91 English Mass 397 0.78
Dabholkar et al. (2000) MPERF USA 91 English Mass 397 0.65
Freeman and Dart (1993) MQUAL Canada 80 English Pro 217 0.63
Jabnoun and Al-Tamimi (2003) MQUAL UAE 38 Non-English Mass 462 0.82
Lam (1995) PERF Hong Kong 25 English Mass 214 0.82
Lam (1995) QUAL Hong Kong 25 English Mass 214 0.69
Lam (1997) PERF Hong Kong 25 English Pro 82 0.71
Lee et al. (2000) MQUAL USA 91 English Shop 196 0.75
Lee et al. (2000) MQUAL USA 91 English Pro 128 0.59
Lee et al. (2000) MPERF USA 91 English Mass 197 0.72
Lee et al. (2000) MPERF USA 91 English Pro 128 0.71
Lee et al. (2000) MPERF USA 91 English Shop 196 0.81
Lee et al. (2000) MQUAL USA 91 English Mass 197 0.47
Mehta et al. (2000) MPERF Singapore 20 Non-English Shop 161 0.63
Mehta et al. (2000) MPERF Singapore 20 Non-English Shop 161 0.75
Mittal and Lassar (1996) MQUAL USA 91 English Pro 123 0.79
Mittal and Lassar (1996) QUAL USA 91 English Pro 123 0.77
Mittal and Lassar (1996) MQUAL USA 91 English Pro 110 0.86
Mittal and Lassar (1996) QUAL USA 91 English Pro 110 0.85
Table I. Coding of effect sizes included in the meta-analysis
Pariseau and McDaniel (1997) MQUAL USA 91 English Mass 39 0.71
Quester and Romaniuk (1997) PERF Australia 90 English Pro 182 0.55
Quester and Romaniuk (1997) QUAL Australia 90 English Pro 182 0.51
Smith (1999) MQUAL UK 89 English Pro 177 0.38
Smith (1999) MPERF UK 89 English Pro 177 0.36
Wal et al. (2002) QUAL South Africa 65 English Shop 583 0.08
Witkowski and Wolﬁnbarger (2002) MQUAL Germany 67 Non-English Mass 101 0.63
Witkowski and Wolﬁnbarger (2002) MQUAL USA 91 English Shop 86 0.62
Witkowski and Wolﬁnbarger (2002) MQUAL USA 91 English Shop 75 0.62
Witkowski and Wolﬁnbarger (2002) MQUAL Germany 67 Non-English Shop 114 0.56
Witkowski and Wolﬁnbarger (2002) MQUAL Germany 67 Non-English Mass 132 0.54
Witkowski and Wolﬁnbarger (2002) MQUAL USA 91 English Mass 81 0.59
Witkowski and Wolﬁnbarger (2002) MQUAL USA 91 English Pro 103 0.59
Witkowski and Wolﬁnbarger (2002) MQUAL Germany 67 Non-English Pro 105 0.58
Witkowski and Wolﬁnbarger (2002) MQUAL USA 91 English Shop 105 0.57
Witkowski and Wolﬁnbarger (2002) MQUAL Germany 67 Non-English Shop 119 0.57
Notes: QUAL = original SERVQUAL; MQUAL = modified SERVQUAL; PERF = original SERVPERF; MPERF = modified SERVPERF; type of service industry based on Silvestro et al. (1992); r = observed effect size; *these studies relied on multiple industries spanning across service types and were not included in this moderator analysis
variation of the population parameter
across studies. Credibility intervals (i.e. the
distribution of population parameter values) were computed in addition to conﬁdence
intervals (i.e. the range of the true population value) (Hunter and Schmidt, 2004).
Hunter and Schmidt’s (2004) RE model was used as it accounts for both random and
systematic variance and has been shown to yield very accurate credibility intervals in
simulation studies (Hall and Brannick, 2002). Also, both the observed mean
correlations (r) and the corrected mean correlations (r_c) were estimated by following Arthur et al.’s (2001) procedure to account for measurement error.
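The core of this correction, disattenuating each observed r for unreliability in the predictor and criterion and pooling by sample size, can be sketched as follows. This is a simplified illustration, not a reproduction of Arthur et al.'s (2001) full procedure; the reliability values in the example are hypothetical, while the (r, n) pairs are taken from Table I.

```python
from math import sqrt

def disattenuate(r, rxx, ryy):
    """Correct an observed correlation r for measurement error,
    given predictor reliability rxx and criterion reliability ryy."""
    return r / sqrt(rxx * ryy)

def weighted_mean_r(effect_sizes):
    """Sample-size-weighted mean correlation across studies.
    effect_sizes: list of (r, n) tuples."""
    total_n = sum(n for _, n in effect_sizes)
    return sum(r * n for r, n in effect_sizes) / total_n

# Pooling two effect sizes from Table I: Angur et al. (1999), r = 0.70,
# n = 143, and Babakus and Boller (1992), r = 0.66, n = 520.
r_bar = weighted_mean_r([(0.70, 143), (0.66, 520)])  # about 0.67

# Hypothetical reliabilities of 0.85 for both scales:
r_corrected = disattenuate(r_bar, 0.85, 0.85)
```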
Test of moderators
When estimating the signiﬁcance of nominal moderator variables with two categories,
we relied on the “standard method” as advised by Schenker and Gentleman (2001) and
implemented in a recent marketing meta-analysis (Jaramillo et al., 2005). The “standard
method” consists of building only one interval around the difference between the
two-point estimates by adding and subtracting the appropriate z-value multiplied by
the square root of the sum of the squared SE of each point estimate. If that interval does
not include zero, the difference between the two point estimates is statistically
signiﬁcant. The standard method is preferred to comparisons of conﬁdence intervals
since testing of moderating hypotheses has greater statistical power (Jaramillo et al.,
2005; Schenker and Gentleman, 2001). Note that since all the moderator hypotheses were directional, the z-value used for computing the interval around the difference between the point estimates corresponded to a 90 percent confidence level, to generate an alpha level of 0.05 as in a one-tailed test (Jaramillo et al., 2005).
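The steps of the "standard method" described above can be sketched as follows; the standard errors in the usage example are hypothetical, chosen only to illustrate the mechanics.

```python
from math import sqrt
from statistics import NormalDist

def standard_method(est1, se1, est2, se2, alpha=0.10):
    """Schenker and Gentleman's (2001) 'standard method': build one
    interval around the difference between two point estimates by
    adding and subtracting z * sqrt(se1^2 + se2^2). Returns
    (lower, upper, significant); the difference is significant when
    the interval excludes zero."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # 1.645 for a 90% interval
    diff = est1 - est2
    half_width = z * sqrt(se1 ** 2 + se2 ** 2)
    lower, upper = diff - half_width, diff + half_width
    return lower, upper, not (lower <= 0.0 <= upper)

# Comparing two mean correlations (hypothetical standard errors of 0.05):
lower, upper, significant = standard_method(0.75, 0.05, 0.68, 0.05)
```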
When testing for continuous moderators, or nominal moderators with more than two
categories, the weighted regression approach of Lipsey and Wilson (2001) was adopted.
This procedure consists of regressing the disattenuated effect sizes on independent variables (continuous or dummy coded) with w_i (the inverse variance component, which gives more weight to effect sizes coming from homogeneous distributions) as the weight
for each observation. The moderation effect of IDV is tested using weighted regression
analysis. Weighted regression analysis is adequate to test the moderating effect of IDV
since it is a continuous variable (Cano et al., 2004; Lipsey and Wilson, 2001).
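For a single continuous moderator such as IDV, the weighted regression step can be sketched with closed-form weighted least squares. This is an illustrative simplification of Lipsey and Wilson's (2001) procedure, which accommodates multiple predictors; the inputs in the example are hypothetical, and in practice w_i would be the inverse variance of each effect size.

```python
def weighted_regression(x, y, w):
    """Weighted least squares with one predictor: regress effect
    sizes y on moderator x using weights w (e.g. inverse variances).
    Returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar)
              for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical example: effect sizes regressed on IDV scores,
# weighted by inverse-variance weights.
intercept, slope = weighted_regression(
    x=[25, 67, 91], y=[0.79, 0.58, 0.68], w=[200, 150, 400])
```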
Table II presents the results of the meta-analysis. The overall strength of the relationship between SERVQUAL and OSQ is larger than 0.50 (r = 0.58; r_c = 0.68; CI 90 percent = 0.50-0.66). The average SERVPERF-OSQ correlation is also larger than 0.50 (r = 0.64; r_c = 0.75; CI 90 percent = 0.52-0.77). Since the lower bound values of the 90 percent confidence intervals for both SERVQUAL and SERVPERF are above 0.50, the grand mean correlations can be interpreted as large (Cohen, 1992). This indicates
that both SERVQUAL and SERVPERF are valid measures of SQ, thus bringing support
for H1. The presence of moderators of the SERVQUAL-OSQ and SERVPERF-OSQ
relationships is evidenced in statistically signiﬁcant Q-statistics (Table II). The Q-statistic
is distributed as a
with k21 degrees of freedom and is compared to the corresponding
statistic. A signiﬁcant Q-statistic demonstrates that the effect size distribution
is heterogeneous and indicates that the population varies systematically according to
some factors other than subject level sampling and measurement errors (Lipsey and
Table II. Meta-analysis results and categorical moderators for the relationship between SQ and OSQ

                          k      N      r    r_c    Q    90% CI      Credibility   Variance      Fail-safe
                                                        (r_c)       interval      explained (%)  N
Overall                  42  9,880  0.61  0.71   288  0.56-0.66   0.44-0.98       67.9          2,982
SERVQUAL                 27  5,082  0.58  0.68   241  0.50-0.66   0.33-1.03       70.9          1,836
  QUAL (original)         7  2,015  0.46  0.54    56  0.27-0.66   0.11-0.96       71.6            378
  MQUAL (modified)       20  3,067  0.66  0.77    57  0.60-0.72   0.57-0.97       64.4          1,540
SERVPERF                 15  4,798  0.64  0.75    38  0.52-0.77   0.34-1.15       68.7          1,125
  PERF (original)         7  1,751  0.65  0.73    11  0.59-0.71   0.62-0.84       64.2            511
  MPERF (modified)        8  2,965  0.64  0.75    29  0.57-0.70   0.64-0.87       42.9            600
English speaking         34  8,525  0.60  0.70   260  0.54-0.66   0.42-0.97       67.8          2,380
Non-English speaking      8  1,355  0.69  0.79    19  0.60-0.77   0.63-0.96       61.6            632

Notes: k = number of effect sizes; N = total sample size; r = attenuated mean effect size;
r_c = disattenuated (i.e. corrected) mean effect size; Q is compared to χ² critical values
(from 12.59 upward, depending on degrees of freedom) at the 95 percent and at the
90 percent level; variance explained = percentage of variance explained by sampling and
measurement artifacts; fail-safe N = number of studies with an effect size of zero (r = 0)
needed to reduce the mean effect size (r_c) to 0.01
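The homogeneity test behind the Q-statistics in Table II can be sketched as follows (Python; the effect sizes are hypothetical, and sample size is used as a simple precision weight, one common convention):

```python
import numpy as np

def q_statistic(r, n):
    """Q = sum of weighted squared deviations from the weighted mean
    effect size; under homogeneity, Q ~ chi-square with k - 1 df."""
    r, w = np.asarray(r, float), np.asarray(n, float)
    r_bar = np.sum(w * r) / np.sum(w)  # weighted mean effect size
    return float(np.sum(w * (r - r_bar) ** 2)), len(r) - 1

# Seven hypothetical study correlations and their sample sizes:
q, df = q_statistic([0.46, 0.30, 0.55, 0.62, 0.41, 0.58, 0.35],
                    [310, 150, 420, 275, 180, 390, 290])
# Compare q to the chi-square critical value for df = 6
# (12.59 at the 95 percent level, as noted under Table II).
heterogeneous = q > 12.59
```

A Q exceeding the critical value, as for most rows of Table II, signals that moderators beyond sampling and measurement error are at work.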
H2 posited that the relationship between SERVPERF and OSQ is stronger than
the SERVQUAL-OSQ relationship. However, a comparison of the strength of these
relationships reveals no significant difference. As shown in Table II, although the
mean SERVPERF-OSQ correlation (r_c = 0.75) is larger than the SERVQUAL-OSQ
correlation (r_c = 0.68), the difference is not statistically significant. In effect, the
90 percent confidence interval for the difference between the two point estimates
(r_c = 0.75 and r_c = 0.68) includes zero (CI_90% = -0.06 to 0.19), indicating that
there is no significant difference between the predictive validity of SERVQUAL versus
SERVPERF (Schenker and Gentleman, 2001).
H3a and H3b stated that the modified SERVQUAL or SERVPERF scales would be
more strongly related to OSQ than the original scales. The observed difference between
the predictive validity of the original SERVQUAL and its modified version was
statistically significant (QUAL r_c = 0.54 vs MQUAL r_c = 0.77;
CI_90% = 0.06 to 0.40). This suggests that the predictive validity of SERVQUAL
increases when it is adapted to the study context. However, the observed difference
between the predictive validity of the original version of SERVPERF and its modified
version was not statistically significant (PERF r_c = 0.73 vs MPERF r_c = 0.75;
difference = 0.02, CI_90% = -0.04 to 0.09). This suggests that the predictive validity of
SERVPERF does not change when the scale is modified.
According to H4a and H4b, the predictive validity of SERVQUAL on OSQ decreases:
.as the individualism of the country sample decreases; and
.when the study is conducted in a non-English speaking country.
A weighted regression with the disattenuated correlations between SQ and OSQ as the
dependent variable, and IDV as the independent variable, revealed that a country's
individualism negatively impacts the predictive validity of SERVQUAL (B = -0.001,
p < 0.05), which is contrary to what was hypothesized in H4a (Table III). In addition, the
mean effect size for English speaking countries was smaller than the mean effect size for
non-English speaking countries (non-English speaking r_c = 0.79 vs English speaking
r_c = 0.70; CI_90% = 0.04 to 0.16); thus, not providing support for H4b (Table II).
According to H5, when moving from lower to higher levels of customer processing
intensity, the predictive validity of SERVQUAL on OSQ should decrease. H5 implied
that the SERVQUAL-OSQ relationship should be strongest for PS, followed by SS, and
weakest for MS. As shown in Table III, the strongest SERVQUAL-OSQ relationships
are for SS (B_SS = 0.096, p < 0.05), followed by PS (base level), and then MS
(B_MS = -0.12, p < 0.05). Hence, H5 is not supported.

Table III. Results for continuous moderators

Predictor        B        SE        z
IDV           -0.001    0.0004   -2.68*
SS (B_SS)      0.096    0.035     2.78*
MS (B_MS)     -0.12     0.037    -3.28*

Notes: *Significant at α = 0.05; when weighted regression is applied in a meta-analytic
study, although the B estimates are accurate, their standard errors need to be adjusted;
Lipsey and Wilson (2001) indicate that the standard errors of the B coefficients must be
divided by the square root of the mean square residual of the regression model in order
to yield the z-values used for significance testing; professional services (PS) is the base
level; B_SS corresponds to service shops and B_MS to mass services
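The dummy coding behind this test can be illustrated as follows (Python). The SS and MS coefficients are those reported in Table III, but the PS baseline intercept is hypothetical, since the paper does not report it:

```python
# Professional services (PS) is the base level; SS and MS enter the
# weighted regression as 0/1 dummies, so each coefficient is the
# shift in predicted effect size relative to PS.
B0 = 0.70                  # hypothetical PS baseline (not reported)
B_SS, B_MS = 0.096, -0.12  # coefficients from Table III

predicted_rc = {
    "PS": B0,            # base level
    "SS": B0 + B_SS,     # service shops: strongest relationship
    "MS": B0 + B_MS,     # mass services: weakest relationship
}
```

Whatever the true intercept, the ordering SS > PS > MS follows directly from the signs of the two dummy coefficients, which is why H5 fails.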
The study results have important implications because they question isolated ﬁndings
from earlier studies. In spite of the discussions and several arguments provided by
researchers about the superiority of SERVPERF over SERVQUAL (Cronin and Taylor,
1992, 1994), the results of this meta-analysis suggest that both scales are adequate and
equally valid predictors of OSQ. Because of the high statistical power of meta-analysis
(Cohn and Becker, 2003), these findings can be considered a major step toward ending
the debate over whether SERVPERF is superior to SERVQUAL as an indicator of OSQ.
As Parasuraman et al. (1994) pointed out, the use of performance-only (SERVPERF)
vs the expectation/performance difference scale (SERVQUAL) should be governed by
whether the scale is used for a diagnostic purpose or for establishing theoretically
sound models. We believe that the SERVQUAL scale is of greater interest to
practitioners because of its richer diagnostic value. By comparing customer
expectations of service versus perceived service across dimensions, managers can
identify service shortfalls and use this information to allocate resources to improve SQ
(Parasuraman et al., 1994).
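This diagnostic use can be illustrated as follows (Python, with hypothetical dimension-level mean ratings on a 7-point scale; the five dimensions are SERVQUAL's standard ones):

```python
# Hypothetical mean ratings across SERVQUAL's five dimensions.
expectations = {"tangibles": 5.8, "reliability": 6.5, "responsiveness": 6.2,
                "assurance": 6.0, "empathy": 5.9}
perceptions = {"tangibles": 5.9, "reliability": 5.6, "responsiveness": 5.4,
               "assurance": 6.1, "empathy": 5.7}

# SERVQUAL gap: perception minus expectation; negative = shortfall.
gaps = {d: round(perceptions[d] - expectations[d], 2) for d in expectations}
shortfalls = sorted((d for d in gaps if gaps[d] < 0), key=gaps.get)
# 'shortfalls' lists the dimensions needing resources, worst first.
```

With these illustrative numbers, reliability and responsiveness show the largest negative gaps and would be the first targets for resource allocation.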
Our ﬁndings also reveal that the need to adapt the measure to the context of the
study is greater when SERVQUAL rather than SERVPERF is used. In effect, the
original versions of SERVQUAL had a signiﬁcantly lower OSQ predictive validity
than the modiﬁed versions. However, both the original and modiﬁed versions of
SERVPERF had the same level of OSQ predictive validity. This has important
implications for both practitioners and academics. Practitioners using SERVQUAL for
OSQ diagnostic purposes need to spend greater effort in modifying the scale for
context than SERVPERF users.
Our results also show an interesting pattern. Since SERVQUAL and SERVPERF
were originally developed in the USA, we expected that the predictive validity of these
instruments would be higher when used in countries with national cultures and
languages similar to the US. However, results show that the predictive validity of
SERVQUAL and SERVPERF on OSQ was higher for non-English speaking countries
and for countries with lower levels of individualism. A closer examination of the
sample used in our study revealed that all studies conducted in non-English speaking
countries as well as those conducted in less individualistic countries relied on modiﬁed
versions of the SERVQUAL scale. Hence, scale modiﬁcation rather than cultural
context could be driving the results. Since there were no studies conducted outside the
US using non-modified scales, it was not possible to isolate the effect of national culture
and language. Further research is needed to address this important issue. An
interesting avenue would be an experimental design where respondents outside the US,
would be given a modiﬁed scale (i.e. adapted to the industry context) and others would
be given the original items; this would allow teasing apart the effects of culture and
scale adaptation on the scale’s validity.
Finally, results suggest that the predictive validity of SERVQUAL on OSQ is highest in
medium customer processing intensity contexts with an intermediate degree of
intangibility (SS) followed by low customer processing intensity (PS) and high
customer processing intensity (MS).
A plausible explanation for this ﬁnding is that SERVQUAL was developed as a
scale generalizable across service contexts. Hence, predictive validity peaks in the
category that represents a compromise between the emphasis on process and product
(i.e. service shop). Another reason could be the varying degree of importance of the
service used in the analysis to the customer. Additional research is needed for a better
understanding of this result.
With the growing proliferation of self-service technology (SST) encounters, factors
that contribute to satisfaction and dissatisfaction in the SST customer interaction have
drawn considerable interest from researchers and practitioners (Meuter et al., 2000).
Further research could explore the degree of predictive validity of SERVQUAL on
OSQ in SST customer interactions.
Like any other meta-analysis, this study is subject to the ﬁle drawer problem which
prevents the true effect size from being uncovered (Lipsey and Wilson, 2001). However,
as shown in Table II, the fail-safe N statistic reveals that several hundred studies
unaccounted for, with an effect size of zero, would be necessary to nullify the effect
sizes computed. This strengthens the conﬁdence in the results obtained. Finally, in this
study, SERVQUAL and SERVPERF were only assessed through their predictive
validity of OSQ. A future meta-analysis could employ additional validation techniques.
For example, meta-analysis can be used to construct a broader nomological network
that includes constructs related to SQ such as customer satisfaction, customer loyalty,
purchase intention, and word-of-mouth (Zeithaml, 2000). Researchers could then assess
whether using SERVQUAL or SERVPERF affects the effect of SQ on the above constructs.

References
Angur, M.G., Nataraajan, R. and Jahera, J.S. Jr (1999), “Service quality in the banking industry:
an assessment in a developing economy”, The International Journal of Bank Marketing,
Vol. 17 No. 3, pp. 116-25.
Arthur, W. Jr, Bennett, W. Jr and Huffcutt, A.I. (2001), Conducting Meta-Analysis Using SAS,
Lawrence Erlbaum Associates, Mahwah, NJ.
Asubonteng, P., McCleary, K.J. and Swan, J.E. (1996), “SERVQUAL revisited: a critical review of
service quality”, Journal of Services Marketing, Vol. 10 No. 6, pp. 62-70.
Babakus, E. and Boller, G.W. (1992), “An empirical assessment of the SERVQUAL scale”,
Journal of Business Research, Vol. 24 No. 3, pp. 253-68.
Babin, B., Chebat, J-C. and Michon, R. (2004), “Perceived appropriateness and its effect on quality,
affect and behavior”, Journal of Retailing & Consumer Services, Vol. 11 No. 5, pp. 287-98.
Bitner, M.J. (1992), “Servicescapes: the impact of physical surroundings on customers and
employees”, Journal of Marketing, Vol. 56 No. 2, pp. 57-71.
Bojanic, D.C. (1991), “Quality measurement in professional services ﬁrms”, Journal of
Professional Services Marketing, Vol. 7 No. 2, pp. 27-36.
Bolton, R.N. and Drew, J.H. (1991), “A multistage model of customers’ assessments of service
quality and value”, Journal of Consumer Research, Vol. 17 No. 4, pp. 375-84.
Brady, M.K. and Cronin, J.J. Jr (2001), “Some new thoughts on conceptualizing perceived service
quality: a hierarchical approach”, Journal of Marketing, Vol. 65 No. 3, pp. 34-49.
Brady, M.K., Cronin, J.J. Jr and Brand, R.R. (2002), “Performance-only measurement of service
quality: a replication and extension”, Journal of Business Research, Vol. 55 No. 1, pp. 17-31.
Brown, T.J., Churchill, G.A. Jr and Peter, P.J. (1993), “Improving the measurement of service
quality”, Journal of Retailing, Vol. 68 No. 1, pp. 127-39.
Cano, C.R., Carrillat, F.A. and Jaramillo, F. (2004), “A meta-analysis of the relationship between
market orientation and business performance: evidence from ﬁve continents”,
International Journal of Research in Marketing, Vol. 21 No. 2, pp. 179-200.
Carman, J.M. (1990), “Consumer perceptions of service quality: an assessment of the SERVQUAL
dimensions”, Journal of Retailing, Vol. 66 No. 1, pp. 33-55.
Chebat, J-C., Filiatrault, P., Gelinas-Chebat, C. and Vaninsky, A. (1995), “Impact of waiting
attribution and consumer’s mood on perceived quality”, Journal of Business Research,
Vol. 34 No. 3, pp. 191-6.
Cohen, J. (1992), “A power primer”, Psychological Bulletin, Vol. 112 No. 1, pp. 155-9.
Cohn, L.D. and Becker, B.J. (2003), “How meta-analysis increases statistical power”,
Psychological Methods, Vol. 8 No. 3, pp. 243-53.
Cronin, J.J. Jr and Taylor, S.A. (1992), “Measuring service quality: a reexamination and an
extension”, Journal of Marketing, Vol. 56 No. 3, pp. 55-67.
Cronin, J.J. Jr and Taylor, S.A. (1994), “SERVPERF versus SERVQUAL: reconciling
performance-based and perceptions-minus-expectations measurement of service quality”,
Journal of Marketing, Vol. 58 No. 1, pp. 125-31.
Cui, C.C., Lewis, B.R. and Park, W. (2003), “Service quality measurement in the banking sector in
Korea”, International Journal of Bank Marketing, Vol. 21 No. 4, pp. 191-201.
Dabholkar, P.A., Shepherd, C.D. and Thorpe, D.I. (2000), “A comprehensive framework for
service quality: an investigation of critical conceptual and measurement issues through a
longitudinal study”, Journal of Retailing, Vol. 76 No. 2, pp. 139-73.
Diamantopoulos, A., Reynolds, N.L. and Simintiras, A.C. (2006), “The impact of response styles
on the stability of cross-national comparisons”, Journal of Business Research, Vol. 59 No. 8,
Donthu, N. and Yoo, B. (1998), “Cultural inﬂuences on service quality expectations”, Journal of
Service Research, Vol. 1 No. 2, pp. 178-86.
Freeman, K.D. and Dart, J. (1993), “Measuring the perceived quality of professional business
services”, Journal of Professional Services Marketing, Vol. 9 No. 1, pp. 27-47.
Furrer, O., Liu, B.S-C. and Sudharshan, D. (2000), “The relationships between culture and service
quality perceptions: basis for cross-cultural market segmentation and resource allocation”,
Journal of Service Research, Vol. 2 No. 4, pp. 355-71.
Gale, B.T. (1994), Managing Customer Value: Creating Quality and Service that Customers can
See, The Free Press, New York, NY.
Gremler, D.D. and Gwinner, K.P. (2000), “Customer-employee rapport in service relationships”,
Journal of Service Research, Vol. 3 No. 1, pp. 82-104.
Hall, S.M. and Brannick, M.T. (2002), “Comparison of two random-effects methods of
meta-analysis”, Journal of Applied Psychology, Vol. 87 No. 2, pp. 377-89.
Herk, H.V., Poortinga, Y.H. and Verhallen, T.M.M. (2005), “Equivalence of survey data: relevance
for international marketing”, European Journal of Marketing, Vol. 39 Nos 3/4, pp. 351-64.
Hofstede, G. (1997), Cultures and Organizations: Software of the Mind, McGraw-Hill, Berkshire.
Hudson, S., Hudson, P. and Miller, G.A. (2004), “The measurement of service quality in the tour
operating sector: a methodological comparison”, Journal of Travel Research, Vol. 42 No. 3,
Hunter, J.E. and Schmidt, F.L. (2004), Methods of Meta-Analysis: Correcting Error and Bias in
Research Findings, 2nd ed., Sage, Thousand Oaks, CA.
Jabnoun, N. and Al-Tamimi, H.A.H. (2003), “Measuring perceived service quality at UAE
commercial banks”, The International Journal of Quality & Reliability Management, Vol. 20
Nos 4/5, pp. 458-72.
Jain, S.K. and Gupta, G. (2004), “Measuring service quality: SERVQUAL vs SERVPERF scales”,
The Journal for Decision Makers, Vol. 29 No. 2, pp. 25-37.
Jaramillo, F., Carrillat, F.A. and Locander, W.B. (2005), “A meta-analytic comparison of
managerial ratings and self-evaluations”, Journal of Personal Selling & Sales Management,
Vol. 25 No. 4, pp. 315-29.
Javalgi, R.R.G., Martin, C.L. and Young, R.B. (2006), “Marketing research, market orientation and
customer relationship management: a framework and implications for service providers”,
Journal of Services Marketing, Vol. 20 No. 1, pp. 12-23.
Kettinger, W.J. and Lee, C.C. (1997), “Pragmatic perspectives on the measurement of information
systems service quality”, MIS Quarterly, Vol. 21 No. 2, pp. 223-41.
Lam, S.S.K. (1995), “Measuring service quality: an empirical analysis in Hong Kong”,
International Journal of Management, Vol. 12 No. 2, pp. 182-8.
Lam, S.S.K. (1997), “SERVQUAL: a tool for measuring patients’ opinions of hospital service
quality in Hong Kong”, Total Quality Management, Vol. 8 No. 4, pp. 152-4.
Laroche, M., Ueltschy, L.C., Abe, S., Cleveland, M. and Yannopoulos, P.P. (2004), “Service quality
perceptions and customer satisfaction: evaluating the role of culture”, Journal of
International Marketing, Vol. 12 No. 3, pp. 58-85.
Lee, T., Lee, Y. and Yoo, D. (2000), “The determinants of perceived service quality and its
relationship with satisfaction”, The Journal of Services Marketing, Vol. 14 No. 3, pp. 217-31.
Lipsey, M.W. and Wilson, D.B. (2001), Practical Meta-Analysis, Sage, Thousand Oaks, CA.
Lovelock, C.H. (1983), “Classifying services to gain strategic marketing insights”, Journal of
Marketing, Vol. 47 No. 3, pp. 9-21.
Lovelock, C.H. and Gummesson, E. (2004), “Whither services marketing? In search of a new
paradigm and fresh perspectives”, Journal of Service Research, Vol. 7 No. 1, pp. 20-41.
Martin, C.L. (1999), “The history, evolution and principles of services marketing: poised for the
new millennium”, Marketing Intelligence & Planning, Vol. 17 No. 7, pp. 324-8.
Mattila, A.S. (1999), “The role of culture in the service evaluation process”, Journal of Service
Research, Vol. 1 No. 3, pp. 250-61.
Meuter, M.L., Ostrom, A.L., Roundtree, R.I. and Bitner, M.J. (2000), “Self-service technologies:
understanding customer satisfaction with technology-based service encounters”, Journal
of Marketing, Vol. 64 No. 3, pp. 50-64.
Mehta, S.C., Ashok, K.L. and Han, S.L. (2000), “Service quality in retailing: relative efﬁciency of
alternative measurement scales for different product-service environments”, International
Journal of Retail & Distribution Management, Vol. 28 No. 2, pp. 62-72.
Mittal, B. and Lassar, W.M. (1996), “The role of personalization in service encounters”, Journal of
Retailing, Vol. 72 No. 1, pp. 95-110.
Mukherjee, A. and Nath, P. (2005), “An empirical assessment of comparative approach to service
quality measurement”, Journal of Services Marketing, Vol. 19 No. 3, pp. 174-84.
OECD (2005), available at: http://ocde.P4.Siteinternet.Com/publications/doiﬁles/012005061t009.Xls
Oliver, R.L. and DeSarbo, W.S. (1988), “Response determinants in satisfaction judgments”,
Journal of Consumer Research, Vol. 14 No. 4, pp. 495-507.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1985), “A conceptual model of service
quality and its implications for future research”, Journal of Marketing, Vol. 49 No. 4,
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1988), “SERVQUAL: a multiple-item scale for
measuring consumer perception of service quality”, Journal of Retailing, Vol. 64 No. 1,
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1991), “Reﬁnement and reassessment of the
SERVQUAL scale”, Journal of Retailing, Vol. 67 No. 4, pp. 420-51.
Parasuraman, A., Zeithaml, V.A. and Berry, L.L. (1994), “Reassessment of expectations as a
comparison standard in measuring service quality: implications for further research”,
Journal of Marketing, Vol. 58 No. 1, pp. 111-24.
Pariseau, S.E. and McDaniel, J.R. (1997), “Assessing service quality in schools of business”,
International Journal of Quality & Reliability Management, Vol. 14 No. 3, pp. 204-18.
Quester, P.G. and Romaniuk, S. (1997), “Service quality in the Australian advertising industry:
a methodological study”, Journal of Services Marketing, Vol. 11 No. 3, pp. 180-92.
Rossiter, J.R. (2002), “The C-OAR-SE procedure for scale development in marketing”,
International Journal of Research in Marketing, Vol. 19 No. 4, pp. 305-35.
Schenker, N. and Gentleman, J.F. (2001), “On judging the signiﬁcance of difference by
examining the overlap between conﬁdence intervals”, The American Statistician, Vol. 55
No. 3, pp. 182-6.
Silvestro, R., Fitzgerald, L. and Johnston, R. (1992), “Towards a classiﬁcation of
services processes”, International Journal of Services Industry Management, Vol. 3 No. 2,
Smith, A.M. (1999), “Some problems when adopting Churchill’s paradigm for the development of
service quality measurement scales”, Journal of Business Research, Vol. 46 No. 2,
Steenkamp, J-B.E.M. and Baumgartner, H. (1998), “Assessing measurement invariance in
cross-national consumer research”, Journal of Consumer Research, Vol. 25 No. 1, pp. 78-90.
Sultan, F. and Simpson, M.C. Jr (2000), “International service variants: airline passenger
expectations and perceptions of service quality”, Journal of Services Marketing, Vol. 14
No. 3, pp. 188-96.
Teas, K.R. (1993), “Expectations, performance evaluation, and consumers’ perception of quality”,
Journal of Marketing, Vol. 57 No. 4, pp. 18-34.
Triandis, H.C. (1995), Individualism and Collectivism, Westview Press, Boulder, CO.
van der Wal, R.W.E., Pampallis, A. and Bond, C. (2002), “Service quality in a cellular
telecommunications company: a South African experience”, Managing Service Quality,
Vol. 12 No. 5, pp. 323-35.
White, S.S. and Schneider, B. (2000), “Climbing the commitment ladder: the role of expectations
disconﬁrmation on customers’ behavioral intentions”, Journal of Service Research, Vol. 2
No. 22, pp. 240-53.
Witkowski, T.H. and Wolﬁnbarger, M.F. (2002), “Comparative service quality: German and
American ratings across service settings”, Journal of Business Research, Vol. 55 No. 11,
Zeithaml, V.A. (2000), “Service quality, proﬁtability, and the economic worth of customers: what
we know and what we need to learn”, Journal of the Academy of Marketing Science, Vol. 28,
Zeithaml, V.A. and Bitner, M.J. (2003), Services Marketing: Integrating Customer Focus across the
Firm, 3rd ed., Irwin McGraw-Hill, Boston, MA.
Zhou, L. (2004), “A dimension-speciﬁc analysis of performance-only measurement of service
quality and satisfaction in China’s retail banking”, The Journal of Services Marketing,
Vol. 18 Nos 6/7, pp. 534-46.