How to perform and report an impactful analysis using partial least squares:
Guidelines for confirmatory and explanatory IS research
Jose Benitez a,b,⁎, Jörg Henseler c,d, Ana Castillo b, Florian Schuberth c

a Rennes School of Business, Rennes, France
b Department of Management, School of Business, University of Granada, Granada, Spain
c Department of Design, Production and Management, Faculty of Engineering Technology, University of Twente, Enschede, the Netherlands
d Nova Information Management School, Universidade Nova de Lisboa, Lisbon, Portugal

⁎ Corresponding author at: Rennes School of Business, Rennes, France. E-mail addresses: jose.benitez@rennes-sb.com, joseba@ugr.es (J. Benitez), j.henseler@utwente.nl (J. Henseler), anacastillo@ugr.es (A. Castillo), f.schuberth@utwente.nl (F. Schuberth).
ARTICLE INFO

Keywords: Partial least squares path modeling; Guidelines; Model validation; Composite model; Confirmatory and explanatory information systems research

ABSTRACT
Partial least squares path modeling (PLS-PM) is an estimator that has found widespread application in causal information systems (IS) research. Recently, the method has been subject to many improvements, such as consistent PLS (PLSc) for latent variable models, a bootstrap-based test for overall model fit, and the heterotrait-to-monotrait ratio of correlations for assessing discriminant validity. Scholars who would like to rigorously apply PLS-PM need updated guidelines for its use. This paper explains how to perform and report empirical analyses using PLS-PM, including the latest enhancements, and illustrates its application with a fictive example on the business value of social media.
1. Introduction
Structural equation modeling (SEM) has become an important statistical tool in the social and behavioral sciences. It is capable of modeling nomological networks by expressing theoretical concepts through constructs and connecting these constructs via a structural model to study their relationships [1]. In doing so, random measurement errors can be taken into account, and empirical evidence for postulated theories can be obtained by means of statistical testing.

Two kinds of estimators for SEM can be distinguished: covariance-based and variance-based estimators. While covariance-based estimators minimize the discrepancy between the empirical and model-implied variance–covariance matrices of the observable indicators to obtain the model parameter estimates, variance-based estimators create linear combinations of the indicators as stand-ins for the theoretical concepts and subsequently estimate the model parameters. A widely used variance-based estimator is partial least squares path modeling (PLS-PM). Originally developed by Herman O.A. Wold [2] to analyze high-dimensional data in a low-structure environment, PLS-PM has become a full-fledged estimator for SEM over the past decade [3]. Consequently, PLS-PM has been applied in various fields of business administration research, such as strategy [4], marketing [5], operations management [6], human resource management [7], finance [8], tourism [9], and family business [10].
For decades, PLS-PM has been the predominant estimator for structural equation models in the field of information systems (IS) (e.g., [11–15]). IS research usually incorporates complex research problems and questions that require the conceptualization and operationalization of theoretical concepts and the investigation of their relationships. Current literature suggests two types of theoretical concepts: concepts from behavioral sciences and concepts from design science [16]. Theoretical concepts from behavioral research are assumed to cause the observable indicators and their relationships, i.e., the theoretical concept is the common cause of the observable indicators [17]. Typically, these concepts are operationalized by a measurement model. Extant literature suggests two types of measurement models: the reflective [1] and the causal–formative measurement model [18]. Both types of measurement models assume a causal relationship between the indicators and their construct, i.e., the latent variable. In contrast, theoretical concepts from design science, so-called artifacts, are human-made creations that are shaped and built from their ingredients to serve a certain goal [19]. Due to the constructivist nature of this type of theoretical concept, recent literature suggests operationalizing artifacts by the composite model [16]. In contrast to the measurement models, in the composite model, the indicators do not cause the construct but combine to compose it. To highlight this aspect and to emphasize the difference from the latent variable, we refer to constructs that are composed of indicators as emergent variables [20,21]. In summary, IS scholars can operationalize a theoretical concept in three different ways in their model: the reflective measurement model and the causal–formative measurement model (usually employed for behavioral concepts), and the composite model (usually employed for artifacts). See Table 1 in Henseler (2017) for a detailed explanation of the different types of operationalization.
In recent years, PLS-PM has become the subject of scholarly debate. Proponents called PLS-PM a "silver bullet" [38], while opponents criticized PLS-PM's inconsistency for latent variable models and the absence of a test for overall model fit (e.g., [39]). This debate has stimulated the development of several enhancements to PLS-PM. These include consistent PLS (PLSc) to consistently estimate linear and non-linear latent variable models [40,41]; a bootstrap-based test to statistically assess overall model fit [42]; measures of overall model fit, such as the standardized root mean squared residual (SRMR), based on heuristic rules to evaluate overall model fit [22]; and the heterotrait-to-monotrait (HTMT) ratio of correlations as a criterion to assess discriminant validity [35]. As a result, PLS-PM has become a full-fledged estimator for SEM that can deal with reflective and causal–formative measurement models as well as composite models. Moreover, it can be applied to confirmatory, explanatory, exploratory, descriptive, and predictive research [24].

For the field of IS to benefit from these methodological and conceptual achievements in PLS-PM, IS scholars need guidelines for their empirical studies that incorporate all these new developments and recently obtained insights. Some of the guidelines papers on PLS-PM in the IS literature were published before 2013, i.e., before the debate and the resulting enhancements (e.g., [12,43–45]). Although several recently published textbooks and articles (e.g., [46–48]) have provided guidelines for causal research that cover some of the latest enhancements in PLS-PM, none of these prior PLS-PM guidelines for causal research covered the full range of recent developments.
To fill this gap in the current IS literature, this study provides updated guidelines for using PLS-PM in causal research (confirmatory and explanatory research), employing all the most recently proposed standards. In so doing, the paper addresses why and how to perform and report PLS-PM estimation in confirmatory and explanatory IS research. In confirmatory IS research, the scholar aims to understand the causal relationships between theoretical concepts of interest for the IS community. In doing so, the scholar aims to confirm a postulated theory, i.e., to obtain empirical evidence for his/her description of the working mechanism of the world. This is typically achieved by imposing testable restrictions on the indicator variance–covariance matrix, e.g., by fixing path coefficients to a certain value, by assuming that the correlation between two indicators is the result of an underlying latent variable (as in the classical reflective measurement model), or by assuming, as in the composite model, that the correlations of the indicators forming an emergent variable with a variable not forming that emergent variable are proportional. The dominant statistical tool in the context of confirmatory research is the test for overall model fit. Testing model fit only makes sense if the number of correlations among the observable variables exceeds the number of model parameters to be estimated, i.e., it is indispensable to have a certain amount of parsimony (a positive number of degrees of freedom in the sense of SEM).
As in confirmatory research, in explanatory IS research, the analyst aims to understand the causal relationships among the theoretical concepts. However, this type of research primarily focuses on the explanation of a specific phenomenon, which is treated as a dependent variable in the model. In doing so, the primary focus is on the coefficient of determination (R²) and the significance of the path coefficient estimates. Such models can be saturated (i.e., have zero degrees of freedom in the sense of SEM). Although the two types of research can be theoretically distinguished, in empirical IS research, scholars very often combine confirmatory and explanatory IS research, e.g., testing the measurement model (confirmatory research) and focusing on the explanation of a specific construct in the structural model (explanatory research). This paper explains why and how to perform and report empirical analyses using PLS-PM in causal IS research following the latest enhancements, and illustrates this analysis with a fictive example on the business value of social media.

Table 1
Substantial changes in the understanding of PLS-PM.

1. Traditional view: PLS-PM should be used primarily for exploratory and early-stage research.
   Up-to-date view: PLS-PM can be used for various types of research, e.g., confirmatory and explanatory or predictive [22–24].
2. Traditional view: PLS-PM has advantages over covariance-based estimators when the sample size is small.
   Up-to-date view: PLS-PM can produce estimates even for very small sample sizes. However, as for other estimators, these estimates are generally less accurate than those obtained from a larger sample [25,26]. Hence, the justification of using PLS-PM with small sample sizes should be considered cautiously.
3. Traditional view: PLS-PM can only estimate recursive structural models.
   Up-to-date view: PLS-PM can also consistently estimate non-recursive structural models by using, e.g., 2SLS or 3SLS instead of OLS (Dijkstra and Henseler 2015b, [27]).
4. Traditional view: Model identification plays no role when employing PLS-PM.
   Up-to-date view: PLS-PM always estimates an underlying composite model, regardless of whether the model consists of latent variables; identification rules of composite models must thus be taken into account [28,134]. Consequently, model identification is also important in the case of PLS-PM.
5. Traditional view: PLS-PM has greater statistical power than the maximum-likelihood (ML) estimator.
   Up-to-date view: This statement is based on inconsistent parameter estimates and has been shown to be invalid [30]. Furthermore, an estimator has no statistical power (one refers instead to its efficiency, or accuracy, in estimating parameters, usually expressed by the standard error); only a statistical test can be assessed in terms of its statistical power.
6. Traditional view: Mode A can be used to consistently estimate reflective measurement models.
   Up-to-date view: Regardless of the mode used, PLS-PM creates linear combinations of observed indicators (composites) as proxies for the theoretical concepts [31,32]. Therefore, to consistently estimate models containing latent variables, one must correct the construct score correlations for attenuation. In the context of PLS-PM, this procedure is known as PLSc (Dijkstra and Henseler 2015a, 2015b).
7. Traditional view: Mode B can be used to estimate causal–formative measurement models consistently.
   Up-to-date view: Mode B is just another way to obtain weights to build composites; hence, it does not consistently estimate causal–formative measurement models. However, causal–formative measurement models can be estimated by means of a MIMIC model [3,18].
8. Traditional view: The overall fit of models estimated by PLS-PM cannot be assessed.
   Up-to-date view: The overall model fit can be assessed in two non-exclusive ways ([22], Dijkstra and Henseler 2015b): (1) bootstrap-based tests for overall model fit, and (2) measures of overall model fit. Both assess the discrepancy between the empirical and the model-implied indicator variance–covariance matrix. While the latter is based on heuristic rules, the former is based on statistical inference.
9. Traditional view: Reliability of construct scores obtained by PLS-PM should be assessed using Cronbach's α and Dillon–Goldstein's ρ (also called Jöreskog's ρ or composite reliability).
   Up-to-date view: Currently, Dijkstra–Henseler's ρA is the only consistent reliability coefficient for PLS-PM construct scores [16]. Dillon–Goldstein's ρ and Cronbach's α indicate the reliability of sum scores. While Cronbach's α is based on the indicator variance–covariance matrix, Dillon–Goldstein's ρ is based on the factor loadings. Therefore, for the calculation of Dillon–Goldstein's ρ, consistent factor loading estimates should be used. Moreover, Cronbach's α assumes equal population covariances among the indicators of one block, an assumption that is likely not met in empirical research. However, it can be used as a lower bound to reliability [33,34].
10. Traditional view: Discriminant validity should be examined using the Fornell–Larcker criterion.
    Up-to-date view: The HTMT [35] should be considered to assess discriminant validity [36,37].
2. Foundations of PLS-PM
In its current form, PLS-PM is a full-fledged variance-based estimator for SEM that can estimate linear, non-linear, recursive, and non-recursive structural models [40,42]. Moreover, it is capable of dealing with models that contain emergent and latent variables [41], second-order emergent variables built by latent variables [49], and ordinal categorical indicators [29]. It can incorporate sampling weights, a variant known as weighted partial least squares (WPLS; see [50]); deal with correlated measurement errors within a block of indicators [51]; and address multicollinearity among the constructs in the structural model [52]. It can also be used for multiple group comparisons [53,54], and potential sources of endogeneity can be addressed [55]. Finally, importance-performance map analysis can be used to illustrate the results of the structural model [56]. For a recent overview of the methodological research on PLS-PM, we refer to [57].
2.1. Model specification
To employ PLS-PM, scholars must transfer their proposed theory into a statistical model [58]. In the context of SEM, this means that the theoretical concepts and their hypothesized relationships must be transferred into a structural model. "Theoretical concepts refer to ideas that have some unity or something in common. The meaning of a theoretical concept is spelled out in a theoretical definition" [59]. We distinguish between two types of theoretical concepts: behavioral concepts and design concepts, so-called artifacts. Typically, theoretical concepts are represented by constructs in the structural model [32]. Although constructs and latent variables are often equated [60], we deliberately distinguish between a latent variable, i.e., a construct that represents a behavioral concept, and an emergent variable, i.e., a construct that represents an artifact. The operationalization of theoretical concepts, i.e., the specification of the theoretical concepts in the structural model, requires special attention because estimates are likely to be inconsistent if a concept's operationalization is not in accordance with the concept's nature [61].
PLS-PM can deal with two kinds of constructs: emergent variables and latent variables. Latent variables refer to variables that are not directly observed but instead inferred through a measurement model from other, directly measured variables [46,62]. They usually represent theoretical concepts of behavioral research such as personality traits, individual behavior, and individual attitude [63]. This theoretical reasoning rests on the assumption that behavioral concepts of interest exist in nature, irrespective of scholarly investigation [64]. The existing literature proposes two ways to measure behavioral concepts [59]: the reflective and the causal–formative measurement model.

The reflective measurement model – also known as the common factor model – is grounded in true score theory [65]. It assumes that a set of indicators is a measurement error-prone manifestation of an underlying latent variable [66]. Some indicators can thus be interchanged without altering the meaning of the latent variable. As the measurement errors of a block of indicators are usually assumed to be uncorrelated and independent of the latent variable, the reflective measurement model imposes restrictions on the variance–covariance matrix of indicators belonging to one latent variable. In its classical form, the correlations among the indicators of one block are zero when controlled for the latent variable, which is also known as the axiom of local independence [67]. This fact is typically exploited to draw conclusions about the existence of the latent variable.
Besides the reflective measurement model, the literature proposes the causal–formative measurement of behavioral concepts [68,69]. In contrast to the reflective measurement model, the causal–formative measurement model reverses the direction of causality between the indicators and the construct and assumes that the observed indicators cause the latent variable. This model thus does not restrict the covariances of the indicators belonging to one block. The remaining causes not represented by the indicators are captured in an error term, which is by assumption uncorrelated with the causal indicators. Although a violation of this assumption, i.e., omission of causal indicators, leads to biased parameter estimates of the causal indicators, recent literature shows that the meaning of the latent variable is not affected by omitting causal indicators and the remaining model parameters can be consistently estimated [70]. However, the causal–formative measurement model is not identified on its own, i.e., the model parameters cannot be uniquely retrieved from the population indicator variance–covariance matrix [71,72]. To obtain an identified causal–formative measurement model, the latent variable must be connected to at least two other variables not affecting the latent variable [18], for example, using a multiple-indicators, multiple-causes (MIMIC) model.
Typical examples of behavioral concepts in IS that have been operationalized by a measurement model are behavioral intention to use information technology (IT) and IT interaction behavior. Behavioral intention to use IT indicates the degree to which a person has formulated conscious plans to perform or not to perform a specified future behavior involving IT use. This concept has been operationalized by a reflective measurement model in past IS research using the following items: user's intention, prediction, and plan to use IT in future months (e.g., [73,74]). IT interaction behavior refers to the user's interaction with IT to accomplish an individual or organizational task. This concept has been operationalized by a causal–formative measurement model in past IS research. For example, Barki et al. [75] employed a MIMIC model to operationalize IT interaction behavior using six tasks (causes) that motivated users to interact with IT (problem solving, justifying decisions, exchanging information with people, planning or following up, coordinating activities, and serving customers) and two measurements of this behavior using two reflective indicators (importance of IT and time invested using IT).
Emergent variables are an alternative representation of theoretical concepts [20,21]. They have recently been referred to in empirical IS research as "composite constructs" (e.g., [76]). Although these labels could be used interchangeably, we recommend using the term "emergent variable" to highlight that the construct emerges from the indicators. Emergent variables can help model artifacts [3,16]. An artifact is a human- or firm-made object composed of its ingredients. Thus, in contrast to behavioral concepts, artifacts are not assumed to exist in nature but are products of theoretical thinking and/or theoretically justified constructions usually made to fulfill a certain purpose. To operationalize these human- or firm-made concepts, the composite model can be employed [16]. Examples from IS research are IT capability and IT ambidexterity [77,78].

The composite model can be understood as a recipe for how ingredients (the components) should be mixed and matched to build the artifact. The composite model assumes a definitorial rather than a causal relationship between the indicators and the emergent variable ([63], 2017). In the classical composite model, the indicators forming an emergent variable are assumed to be free of measurement errors. In contrast to the reflective measurement model, the composite model imposes no restrictions on the covariance structure of indicators belonging to the same construct. The reflective measurement model is
thus nested within the composite model, as the composite model relaxes the assumption that all covariation among a block of indicators is explained by one latent variable [22]. Yet, the composite model constrains the correlations between the indicators forming an emergent variable and variables not forming the emergent variable, i.e., these correlations are proportional [29]. Similar to the causal–formative measurement model, the composite model is not identified when isolated in the structural model. To ensure identification, a necessary condition is that each emergent variable be linked to at least one variable not forming the emergent variable [28,134].
Because the artifact as a type of theoretical concept was introduced only recently, it is helpful to illustrate it with an example. Based on theory, bread is made from wheat, water, salt, and yeast. Although the correlations between the amounts of wheat, water, salt, and yeast in a sample of loaves of bread are likely to be high, one would not conclude that bread is something that should be measured, i.e., that bread causes (or is caused by) wheat, water, salt, and yeast. Rather, wheat, water, salt, and yeast are the simple entities (ingredients) combined to form the emergent variable representing the artifact we call bread. Clearly, the temporal precedence of the ingredients also suggests that bread cannot be the common cause of its ingredients.
Because the IS discipline analyzes and aims at explaining how IT affects organizations, individuals, and society, artifacts play a pivotal role in IS research. For example, the theoretical concept IT infrastructure capability refers to a firm's ability to use and leverage its IT resource infrastructure for business activities [79–82]. IT infrastructure capability is a "human-made/firm-made" concept that can be operationalized by the composite model [76,81,83]. Of course, no single "true" recipe exists for creating this artifact. Just as different bakeries can produce different types of bread or different breweries produce different types of beer, different scholars can produce different recipes for the same concept. The beer analogy can be extraordinarily instructive. Different recipes exist worldwide to design and manufacture beer. For example, Spanish breweries use one recipe, German breweries another. Recipes can even vary by region within a country. Such diversity makes each recipe an idiosyncratic way to understand and design beer, but all of these recipes ultimately produce beer.
For example, based on Melville et al.'s [84] study, Ajamieh et al. [81] define the artifact IT infrastructure capability as composed of IT technological infrastructure capability, IT managerial infrastructure capability, and IT technical infrastructure capability. Further, some prior IS research [85,86] considers IT capability – a concept similar to IT infrastructure capability – as composed of IT technical infrastructure, human IT resources, and IT-enabled intangibles. IT infrastructure flexibility and post-merger and acquisition (M&A) IT integration capability are two further examples of artifacts recently considered in IS research [83]. IT infrastructure flexibility refers to the capability of the infrastructure to adapt to environmental changes. A flexible firm IT infrastructure has the following characteristics: IT compatibility, IT connectivity, modularity, and IT personnel skills flexibility [83]. Similarly, post-M&A IT integration capability is the firm's ability to integrate the IT technical infrastructure, IT personnel, and IT and business processes of the target/acquired firm with those of the acquirer after an M&A [83]. Thus, post-M&A IT integration capability can be understood as an artifact built by integrating IT technical infrastructure, IT personnel, and IT and business processes.
While this study argues for the use of the composite model to operationalize artifacts, it has recently been suggested to employ the composite model to operationalize behavioral concepts as well [58]. This notion assumes that both latent and emergent variables serve as proxies for behavioral concepts [61]. Following this reasoning, the validity gap occurs between the concept and its construct, not between the construct and the observable variables [32].

Once the theoretical concepts are operationalized, the constructs representing the theoretical concepts can be related via the structural model. The structural model typically represents the core of the proposed theory. It generally consists of a set of regression equations illustrating the hypothesized relationships between the theoretical concepts. In each equation, a dependent construct is explained by one or more independent constructs. Because a dependent construct is typically not fully explained by its independent constructs, an error term accounts for the remaining variance in the dependent construct. By assumption, the error term is independent of the explanatory constructs of its equation. To avoid violating this assumption, in causal IS research, the scholar should make every effort to include all relevant constructs (those that affect the dependent construct and correlate with at least one explanatory construct in the corresponding equation). Otherwise, the path coefficient estimates obtained by ordinary least squares (OLS) suffer from omitted variable bias [87]. One potential way to address this problem of endogeneity is to use the two-stage least squares (2SLS) estimator for the structural model [27,42,55]. In the following, we consider only recursive structural models, i.e., structural models without feedback loops and without correlated error terms.
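As a schematic illustration of such a system of regression equations (the notation is ours and not tied to the later example), a minimal recursive structural model with three constructs could read:

$$\eta_2 = \beta_{21}\,\eta_1 + \zeta_2, \qquad \eta_3 = \beta_{31}\,\eta_1 + \beta_{32}\,\eta_2 + \zeta_3,$$

where each error term $\zeta_i$ is assumed to be independent of the explanatory constructs of its equation.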
2.2. Parameter estimation
In its current form, PLS-PM estimates model parameters in three steps. In the first step, the iterative PLS-PM algorithm determines the weights to create scores for each construct (latent and emergent variables) [88]. As construct scores of latent variables contain measurement errors, the second step corrects the correlations between latent variables for attenuation. In doing so, PLSc divides the construct score correlations by the geometric mean of the constructs' reliabilities [41], making the main outcome of the second step a consistent construct correlation matrix. Finally, the third step estimates the model parameters (weights, loadings, and path coefficients). Based on the consistent construct correlation matrix, OLS can be used to estimate the path coefficients of recursive structural models. In the case of non-recursive structural models, the 2SLS or three-stage least squares (3SLS) estimator can be used instead of the OLS estimator to obtain consistent path coefficient estimates [27,42,83].
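To make the second step concrete, the following minimal Python sketch performs the correction for attenuation described above; the function and variable names are our own illustration and not part of any PLS-PM software.

```python
import numpy as np

def disattenuate(score_corr: np.ndarray, reliabilities: np.ndarray) -> np.ndarray:
    """Correct construct score correlations for attenuation (PLSc, step 2).

    score_corr:    correlation matrix of the PLS construct scores
    reliabilities: one reliability estimate (e.g., rho_A) per construct
    """
    root_rel = np.sqrt(reliabilities)
    # Divide each correlation by the geometric mean of the two reliabilities.
    corrected = score_corr / np.outer(root_rel, root_rel)
    np.fill_diagonal(corrected, 1.0)  # a correlation matrix has a unit diagonal
    return corrected
```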
2.3. Substantial changes in the understanding of PLS-PM
In recent years, PLS-PM practices have been examined, debated, and improved. The literature on PLS-PM has thus changed substantially, which requires identifying the changes in the understanding and practice of PLS-PM. Table 1 summarizes these changes in the understanding of PLS-PM in the context of confirmatory and explanatory research.
Traditional view 1: PLS-PM should be used primarily for exploratory and early-stage research. Although PLS-PM was originally developed for exploratory research [2], enhancements such as PLSc and the bootstrap-based test for overall model fit make PLS-PM suitable for causal research, i.e., confirmatory and explanatory research. However, as originally developed, PLS-PM can also be applied in descriptive and predictive research [23,24].
Traditional view 2: PLS-PM has advantages over covariance-based estimators in the case of small sample sizes. The application of PLS-PM has often been justified by the size of the investigated sample [26]. It is true that PLS-PM is capable of estimating models with more parameters than observations because it only estimates partial model structures, but as with every other statistical method, the standard errors of the estimates increase as the sample size decreases. Therefore, justifying the use of PLS-PM by small sample sizes should be considered cautiously. In this sense, claiming that PLS-PM is particularly suitable for small sample sizes can be regarded as problematic [26]. However, for pure emergent variable models and small sample sizes, PLS-PM is superior to other variance-based estimators, i.e., generalized structured component analysis (GSCA [90]) and regression with sum scores, in terms of the accuracy of the path coefficient estimates [89].
Traditional view 3: PLS-PM cannot be used for non-recursive models. Although current user-friendly software packages do not yet implement approaches to analyzing non-recursive models, the assumption of recursivity can be relaxed by estimating the structural model parameters using 2SLS or 3SLS instead of OLS [27,42]. Another approach to estimating non-recursive structural models involves using the construct scores (in the case of emergent variables) or the disattenuated construct correlation matrix (in the case of latent variables) obtained by PLS-PM as input for the full-information maximum-likelihood (FIML) estimator (e.g., [83]).
Traditional view 4: Model identification plays no role when employing PLS-PM. PLS-PM always employs composites to estimate the model, regardless of whether the theoretical concepts are operationalized by a measurement model or a composite model. Therefore, the identification rules for composite models must be applied [28,134]. In addition to a normalization of the weight vector, such as fixing the variance of each composite to one, it must be ensured that each construct is connected (by means of a non-zero path) to at least one other construct in the model so that the weights can be uniquely retrieved from the population indicator variance–covariance matrix. Even if all weight vectors are scaled and no construct is isolated in the structural model, the signs of the weights of a block of indicators are still ambiguous. The dominant indicator approach is thus recommended to fix the construct scores' orientation and thereby uniquely determine the weights [3]. Additionally, the structural model must be identified. As long as one considers only recursive structural models with uncorrelated error terms, identification is straightforward, as such models are always identified [1].
Traditional view 5: PLS-PM has greater statistical power than the ML estimator. In estimating latent variable models, Reinartz et al. [91] claim that the power of statistical testing is higher when PLS-PM estimates are employed than when ML estimates are used. However, these findings are highly questionable, as they are based on traditional PLS-PM, which is known to produce inconsistent parameter estimates for latent variable models. In line with Goodhue et al. [30], who show that this alleged higher power goes along with an inflated type I error, we conclude that preferring PLS-PM over the ML estimator on efficiency grounds is not a valid argument for latent variable models. Similar findings observed for GSCA are applicable to PLS-PM [92]. For emergent variable models, however, traditional PLS-PM has shown favorable properties among variance-based estimators, i.e., GSCA, sum scores, and PLS-PM [89].
Traditional view 6: Mode A can be used to consistently estimate reflective measurement models. In its most modern appearance, PLS-PM can deal with models containing both emergent and latent variables. Because PLS-PM inherently estimates composite models [31,32], it is the estimator of choice for models containing only emergent variables [61]. In PLS-PM, composite models can be consistently estimated by Mode B [28]. To obtain consistent parameter estimates for reflective measurement models, PLSc should be used. Estimates obtained by traditional Mode A, as well as by Modes B and C, suffer from attenuation bias [42]. In contrast, PLSc produces consistent and asymptotically normal estimates for reflective measurement models by combining Mode A estimates with a correction for attenuation [41]. Consequently, the development of PLSc [41] enables PLS-PM to analyze models containing both emergent and latent variables. However, scholars should use Mode B in PLS-PM (instead of PLSc) when they estimate pure emergent variable models, as PLSc has been shown to produce biased estimates in this situation [61].

That being said, in the case of pure latent variable models, covariance-based estimators are preferred, as they are consistent and asymptotically efficient. However, the availability of asymptotically efficient estimators does not mean that scholars cannot use PLS-PM to estimate models of this kind. Initial simulation studies have compared the performance of PLSc to that of other estimators in this situation; its use for pure latent variable models is considered acceptable, and its finite-sample bias has been shown to be of little practical relevance, provided that the model is correctly specified [42,61,93].
Traditional view 7: Mode B can be used to estimate causal–formative measurement models consistently. Mode B, i.e., so-called regression weights, cannot be used to consistently estimate causal–formative measurement models, as this kind of measurement model is not identified on its own [94]. However, with the development of PLSc, PLS-PM can consistently estimate causal–formative measurement models by means of the MIMIC model [16].
Traditional view 8: Overall fit of models estimated by PLS-PM cannot be assessed. Due to recent developments in the context of PLS-PM, the overall fit of models estimated by PLS-PM can be assessed in two non-exclusive ways: (1) by a bootstrap-based test for overall model fit [42], and (2) by measures of overall model fit such as the SRMR [22]. Both ways assess the difference between the empirical indicator variance–covariance matrix and the estimated model-implied counterpart. While the empirical indicator variance–covariance matrix contains the variances and covariances of the indicators based on the sample, the estimated model-implied counterpart contains the variances and covariances of the indicators implied by the model structure based on the estimated model parameters. Typically, the discrepancy between the two matrices is measured by the squared Euclidean distance (dULS), the geodesic distance (dG), and the SRMR. The bootstrap-based test for overall model fit relies on a bootstrap procedure to obtain the reference distribution of the distance measures under the null hypothesis that the population indicator variance–covariance matrix equals the model-implied counterpart [95]. Assuming a 5% level of significance, a discrepancy value larger than the 95% quantile of the corresponding reference distribution leads to rejection of the null hypothesis. In addition to the bootstrap-based test for overall model fit, the values of the distance measures can be compared to threshold values recommended by the literature to assess overall model fit. Measures of fit are thus based on heuristic rules rather than on statistical inference. Moreover, the suggested thresholds for the measures of overall model fit in the context of PLS-PM, e.g., 0.080 for the SRMR, are preliminary and need to be examined in more detail in future research.
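For readers who want to see the mechanics, the following Python sketch outlines the bootstrap-based test for overall model fit under the stated null hypothesis. It assumes a hypothetical helper `fit_model` that estimates the model on a dataset and returns the model-implied indicator correlation matrix; transforming the data so that the bootstrap samples comply with the null hypothesis follows the general logic described above.

```python
import numpy as np

def srmr(S, Sigma):
    # Root mean squared difference between the empirical and the
    # model-implied correlation matrix (lower triangle incl. diagonal).
    idx = np.tril_indices_from(S)
    return np.sqrt(np.mean((S[idx] - Sigma[idx]) ** 2))

def bootstrap_fit_test(X, fit_model, n_boot=4999, alpha=0.05, seed=None):
    rng = np.random.default_rng(seed)
    X = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize indicators
    S = np.corrcoef(X, rowvar=False)               # empirical correlations
    Sigma = fit_model(X)                           # model-implied correlations
    d_emp = srmr(S, Sigma)
    # Transform the data so that their population matches the model-implied
    # matrix, i.e., so that the bootstrap samples obey the null hypothesis.
    A = np.linalg.cholesky(Sigma) @ np.linalg.inv(np.linalg.cholesky(S))
    X0 = X @ A.T
    d_boot = np.empty(n_boot)
    for b in range(n_boot):
        Xb = X0[rng.integers(0, len(X0), size=len(X0))]  # resample rows
        d_boot[b] = srmr(np.corrcoef(Xb, rowvar=False), fit_model(Xb))
    hi95 = np.quantile(d_boot, 1 - alpha)          # HI95 of the reference distribution
    return d_emp, hi95, d_emp > hi95               # True = reject the model
```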
Traditional view 9: Reliability of the construct scores obtained by PLS-PM should be assessed using Cronbach's α and Dillon–Goldstein's ρ (also called Jöreskog's ρ or composite reliability). Traditionally, the literature recommended determining the reliability of PLS-PM construct scores through Cronbach's α and Dillon–Goldstein's ρ [12]. However, in making this recommendation, several aspects of these two measures have been widely neglected. First, Cronbach's α and Dillon–Goldstein's ρ both assess the reliability of sum scores (construct scores obtained by equally weighted indicators) created for the latent variable. However, PLS-PM allows the indicator weights used for the calculation of the construct scores to vary, such that indicators with a smaller amount of random measurement error take on greater weight than indicators containing a larger amount of random measurement error. Consequently, the PLS-PM construct scores contain less measurement error and are generally more reliable than sum scores [22]. Second, Cronbach's α assumes tau-equivalence, i.e., equal population covariances among the indicators belonging to one latent variable, an assumption that is rarely met in empirical research [34]. While Cronbach's α can be calculated based on the sample variance–covariance matrix, Dillon–Goldstein's ρ is based on factor loadings. Because traditional PLS-PM is known to produce inconsistent factor loading estimates, Dillon–Goldstein's ρ should be based on consistent factor loading estimates obtained by PLSc. Furthermore, as the assumptions of Cronbach's α and Dillon–Goldstein's ρ are likely to be violated in empirical research, their use cannot be recommended. However, the reliability obtained by Cronbach's α can be regarded as a lower bound [33]. To consistently estimate the reliability of latent variable scores obtained by PLS-PM's Mode A, Dijkstra–Henseler's ρA should be used [41].
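As a hedged illustration of the difference between these coefficients, the following Python sketch computes Cronbach's α from the variance–covariance matrix of one indicator block and ρA from the block's weight vector, following our reading of the formula in Dijkstra and Henseler (2015); names and inputs are ours.

```python
import numpy as np

def cronbachs_alpha(S):
    # S: variance-covariance matrix of one indicator block.
    k = S.shape[0]
    return (k / (k - 1)) * (1 - np.trace(S) / S.sum())

def rho_a(S, w):
    # S: empirical correlation matrix of the block; w: indicator weights.
    # rho_A as we read it from Dijkstra & Henseler (2015):
    # (w'w)^2 * w'(S - diag(S))w / w'(ww' - diag(ww'))w.
    ww = np.outer(w, w)
    num = w @ (S - np.diag(np.diag(S))) @ w
    den = w @ (ww - np.diag(np.diag(ww))) @ w
    return (w @ w) ** 2 * num / den
```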
Traditional view 10: Discriminant validity should be examined by the Fornell–Larcker criterion. Although the Fornell–Larcker criterion [96] had long been recommended to assess the discriminant validity of latent variables [12], it is ineffective in combination with traditional PLS-PM because it relies on consistent factor loading estimates [22]. To overcome this drawback, the HTMT was developed to assess discriminant validity in the case of variance-based estimators [35]. The HTMT can be assessed in two ways: (1) by comparing it to a threshold value, and (2) by constructing a confidence interval to examine whether the HTMT is significantly smaller than a certain threshold value [35,37]. For the first approach, simulation studies suggest a threshold value of 0.90 if the constructs are conceptually very similar or 0.85 if the constructs are conceptually more distinct [35–37]. For the second approach, prior methodological research has suggested examining whether the HTMT is significantly smaller than 1 [35] or than other, smaller values, e.g., 0.85 or 0.90 [37]. The authors of [37] conclude that the HTMT is a reliable tool for assessing discriminant validity, whereas the Fornell–Larcker criterion has limitations that do not justify its reputation for rigor and its widespread use in empirical research.
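To illustrate the criterion, the following Python sketch computes the HTMT for two indicator blocks directly from the empirical indicator correlation matrix; the block-index arguments are our own illustrative device.

```python
import numpy as np

def htmt(R, block_i, block_j):
    # R: correlation matrix of all indicators; block_i/block_j: column
    # indices of the two constructs' indicator blocks.
    hetero = R[np.ix_(block_i, block_j)].mean()  # mean between-block correlation
    def mean_monotrait(block):
        sub = R[np.ix_(block, block)]
        off = sub[~np.eye(len(block), dtype=bool)]
        return off.mean()                        # mean within-block correlation
    return hetero / np.sqrt(mean_monotrait(block_i) * mean_monotrait(block_j))
```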
3. An illustrative example
3.1. Description of the example
We provide an illustrative IS example to present the latest enhancements of PLS-PM. Fig. 1 displays the proposed research model to be estimated and tested. For this purpose, we use a simulated dataset of 300 observations, where each observation represents a firm – the unit of analysis in the example. Because we use a simulated dataset, the obtained results are not scientifically relevant, and any comparison of our results to the results of other empirical studies is made for purely illustrative purposes.

Social executive behavior is the positive/negative behavior of the firm's top managers towards the firm's use of social media for business activities. Social employee behavior is the positive/negative behavior of the firm's employees towards the firm's use of social media for business activities. Social media capability refers to the firm's ability to use and leverage external social media platforms purposefully to execute business activities [77,97]. Business process performance is the firm's relative performance in key business processes as compared with its key competitors [98]. Fig. 1 presents the research model of the example. Based on prior IS research on social media in organizations [77,99], it is assumed that social executive behavior and social employee behavior positively affect the development of a firm's social media capability, which, in turn, may positively influence the firm's business process performance.
The research model represents the theory proposed by an author/team to be tested empirically. It illustrates how the theoretical concepts are operationalized, i.e., how the indicators are related to the constructs representing the theoretical concepts, and how these constructs are connected. It usually includes several hypotheses to be tested. Based on prior literature and anecdotal evidence from the real world, authors should explain one by one why the hypothesized relationships are included and state expectations about their signs. These explanations are omitted from this article because a theoretical explanation of the relationships included in the example is beyond the paper's scope. In our example, the following three hypotheses are tested:

Hypothesis 1 (H1). Social executive behavior has a positive impact on the development of social media capability.

Hypothesis 2 (H2). Social employee behavior has a positive impact on the development of social media capability.

Hypothesis 3 (H3). Social media capability has a positive impact on business process performance.
Although prior IS studies using PLS-PM have investigated more complex models (e.g., including a greater number of constructs, second-order constructs, or moderation effects), the presented research model seems reasonable for our purposes for three reasons: (1) the goal of our study is to provide guidelines for using PLS-PM in causal IS research (confirmatory and explanatory), employing the most recently proposed standards. For the sake of brevity, parsimony, and pedagogical illustration for IS scholars, we think, in line with Occam's razor, the simpler the research model is, the better. "Parsimonious yet well-fitting models are more likely to be scientifically replicable, explainable" [100]. "Parsimony is also regarded by many social scientists as an important ingredient in theory development (e.g., [101,102]), precisely because it 'explains much by little' ([103], p. 153)" [100]; (2) the considered model contains both latent variables (ovals) and emergent variables (hexagons) and therefore presents a situation in which PLS-PM can leverage its full capacities; and (3) the research model is theoretically positioned in the IS literature on the business value of IT, where research models are usually parsimonious (e.g., [104,105]).

Fig. 1. Research model (CV = Control variables).
Theoretical concepts of behavioral research such as personality traits, individual behavior, and individual attitude are usually represented as latent variables [106]. Because social executive behavior and social employee behavior indicate types of individual behavior and attitude, the two theoretical concepts were operationalized by reflective measurement models. The ovals represent the latent variables and the connected rectangles their indicators. Social executive behavior and social employee behavior were each measured by four indicators, SEXB1–SEXB4 and SEMB1–SEMB4, respectively. To obtain consistent estimates, the reflective measurement models were estimated by PLSc [41].
In contrast, the theoretical concepts social media capability and business process performance were considered artifacts designed by firms, executives, and/or employees. To operationalize these theoretical concepts, the composite model was employed. In doing so, social media capability is assumed to be composed of the following ingredients: Facebook, Twitter, corporate blog(s), and LinkedIn capabilities [77], the lower-order capabilities that shape social media capability. IS scholars and analysts from other contexts (e.g., China) might consider the social media WeChat (capability) as a key ingredient of social media capability and might remove other social media capabilities that are less relevant for Chinese firms. This illustrates the potential of including/studying different artifacts to investigate the same phenomenon of interest for firms and society. The artifact business process performance was also operationalized by a composite model. It comprises supplier relations, product and service enhancement, production and operations, marketing and sales, and customer relations (Tallon and Pinsonneault 2011). Fig. 2 illustrates how the artifact social media capability was operationalized. The hexagon represents the construct, i.e., the emergent variable, while the rectangles represent the ingredients forming the construct.
Besides the variables of main interest, firm size and industry were included as control variables in the structural model to control for the effects of extraneous variables [80,81]. Firm size was modeled as a single-indicator composite, measured by the natural logarithm of the number of employees, to account for the role of different firm sizes in explaining business process performance [76]. Due to the skewed distribution, it is advisable to also apply the logarithm when firm size is measured through sales or total assets. Industry was incorporated as a composite to control for an overall industry effect on business process performance and was shaped by three indicators, i.e., industry group dummies 1–3. The three industry group dummies indicate whether an observation belongs to industry 1, 2, or 3. Each dummy takes the value 0 if the observation does not belong to the respective industry and 1 if it does. For example, the variable industry group 1 will have a value of 0 for firms that do not belong to industry group 1 and a value of 1 for firms that do. Although the dataset consists of four different industries, industry group 4 was not included, to avoid perfect multicollinearity. Group 4 thus became the reference category. The weights of the industry composite can be interpreted as a simple contrast, i.e., the difference in contribution to the total industry effect between the industry considered and the reference industry. Fig. 3 presents how industry, a nominal control variable, was included in the structural model. IS scholars can use the dominant or most important industry as the reference group.
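A minimal Python sketch of this dummy coding (the column names and toy data are ours, chosen for illustration):

```python
import pandas as pd

firms = pd.DataFrame({"industry": [1, 2, 3, 4, 2, 4]})    # toy data
dummies = pd.get_dummies(firms["industry"], prefix="ind")  # ind_1 ... ind_4
dummies = dummies.drop(columns="ind_4")  # industry group 4 is the reference
                                         # category; dropping it avoids perfect
                                         # multicollinearity
```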
3.2. Statistical power analysis
A power analysis should typically be conducted before data collection. It gives insight into the minimum sample size required to obtain sufficient statistical accuracy to detect effects of interest that exist in the population. The power of a statistical test is the probability of correctly rejecting a false null hypothesis, that is, of finding an effect in the sample if it indeed exists in the population [107]. Power analysis can be conducted in two ways: (1) using heuristic rules such as Cohen's power tables and the inverse square root method [108,109], and (2) conducting a Monte Carlo simulation study [110]. The 10-times rule [111] and the minimum R² rule are no longer recommended for estimating the minimum sample size [26,46,109].
To apply Cohen's power tables for multiple regression analysis, four parameters must be considered: the effect size (the extent to which the path coefficient/weight exists in the population), the power (probability of correctly rejecting a false null hypothesis), the significance level (probability of incorrectly rejecting a true null hypothesis), and the number of independent variables of the equation containing the considered path coefficient/weight. Once these values are determined, Cohen's power tables can be used to approximate the minimum sample size required to achieve a certain power level. To determine the number of required observations, analysts can assume a small effect size (0.020 ≤ f² < 0.150) for a more conservative approximation or a medium to large effect size (0.150 ≤ f² < 0.350 or f² ≥ 0.350) for a more optimistic approximation of the required sample size. The statistical power is usually set to 0.8, and a significance level of 0.05 is assumed [107]. Often the equation with the highest number of independent variables is considered to determine the minimum number of observations needed to reliably detect an effect. In our example, the composite model for business process performance has the highest number of independent variables in an equation (supplier relations, product and service enhancement, production and operations, marketing and sales, and customer relations). Cohen's power tables suggest a minimum sample size of 91 observations assuming a medium effect size (f² = 0.150), statistical power of 0.8, and a significance level of 0.05 [108]. Considering the outcomes of the power analyses for our example, a sample size of 300 seems adequate to detect the effects of interest. The inverse square root method assumes that the estimates are standard normally distributed and approximates the standard error by $1/\sqrt{N}$. Assuming a 5% significance level, the sample size required to obtain a statistically significant effect, if it exists in the population, can be approximated by $\hat{N} > \left(2.486/|\beta|_{\min}\right)^2$, where $|\beta|_{\min}$ represents the minimum magnitude of the coefficient considered.
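A minimal Python sketch of the inverse square root method, assuming the 5% significance level described above:

```python
import math

def min_sample_size(beta_min: float) -> int:
    # Inverse square root method: smallest N expected to render a path
    # coefficient of magnitude |beta_min| statistically significant.
    return math.ceil((2.486 / abs(beta_min)) ** 2)

print(min_sample_size(0.2))  # -> 155 observations for |beta|min = 0.20
```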
Fig. 2. Operationalization of the artifact social media capability.
Fig. 3. Modeling a nominal control variable. Note: Industry group 4 is the reference group.

In addition to considering heuristic rules, IS scholars can conduct a
Monte Carlo simulation to examine the sample size required to reliably detect effects that exist in the population. A population model with the same structure as the estimated model must be specified, and all population parameter values need to be determined. In the second step, the model is estimated several times, and the rejection rates of the null hypothesis significance test for the coefficients under examination are considered, i.e., the statistical power. The appealing property of this approach is that it can take into account various aspects of the model and the indicators incorporated, such as sample size, number of indicators, their distribution, and the magnitude of the effect. Moreover, sensitivity analyses can be conducted by changing the assumed population model to see how these changes affect the statistical power. While guidelines have been proposed for pure latent variable models in the context of PLS-PM [110], the development of guidelines for models containing emergent variables is still an open issue.
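The following Python sketch conveys the principle for a single structural path under an assumed population model of our own choosing; a full PLS-PM power simulation would generate indicator data and re-estimate the complete model.

```python
import numpy as np

def power(beta=0.3, n=100, n_reps=1000, seed=42):
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        x = rng.standard_normal(n)                        # exogenous construct
        y = beta * x + np.sqrt(1 - beta**2) * rng.standard_normal(n)
        r = np.corrcoef(x, y)[0, 1]
        t = r * np.sqrt((n - 2) / (1 - r**2))             # t-test of the path
        if abs(t) > 1.96:                                 # approx. 5% two-sided
            rejections += 1
    return rejections / n_reps                            # estimated power

print(power(beta=0.3, n=100))  # share of replications detecting the effect
```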
3.3. Estimation
Various software packages – such as PLS-Graph [112], SmartPLS [113], WarpPLS [114], XLSTAT-PLS [115], and ADANCO [116] – can be used to estimate the model with PLS-PM. We used ADANCO 2.0.1 Professional for Windows (http://www.composite-modeling.com/) [116] to estimate the empirical example. In the following, we used Mode B to estimate composite models and PLSc to estimate reflective measurement models. Moreover, we used the factor weighting scheme for inner weighting, and statistical inferences were based on the bootstrap procedure, relying on 4999 bootstrap runs.
Prior to model estimation, analysts should set a dominant indicator in each composite and reflective measurement model. As the signs of the weight and factor loading estimates of a block of indicators are ambiguous, the dominant indicator is used to dictate the orientation of a construct. A dominant indicator that is expected to positively correlate with the construct is preferable. Face validity can be used to select the dominant indicator, i.e., the indicator that is theoretically most relevant and thus expected to positively correlate with the construct. In our example, we chose SEXB2, SEMB2, SMC1, and BPP4 as dominant indicators.
Before the model assessment, the researcher has to ensure that the estimation is technically valid, i.e., that the estimation is admissible and no Heywood case has occurred [117]. In doing so, he/she needs to investigate whether the PLS-PM algorithm has properly converged. Additionally, particularly in the context of PLSc, he/she needs to ensure that the construct correlation matrix and the model-implied indicator correlation matrix are valid, i.e., positive semi-definite. To assess the definiteness of a matrix, user-written Excel plugins for the calculation of eigenvalues can be used. A symmetric matrix is positive semi-definite if all its eigenvalues are larger than or equal to 0. Finally, all absolute factor loading estimates and reliability estimates must be smaller than or equal to 1. For our example, the solution was technically valid.
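Instead of spreadsheet plugins, the definiteness check can also be scripted; a small Python sketch:

```python
import numpy as np

def is_positive_semidefinite(M, tol=1e-10):
    # A symmetric matrix is positive semi-definite iff all eigenvalues >= 0;
    # eigvalsh is numpy's eigenvalue routine for symmetric matrices.
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))
```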
3.4. Assessment of reflective measurement and composite models
3.4.1. Evaluation of overall fit of the saturated model
Table 4 summarizes the steps to assess reflective measurement and composite models. Joint assessment should begin with the evaluation of the overall fit of a model with a saturated structural model [16,118], that is, with a confirmatory factor/composite analysis. In the saturated model, all constructs are allowed to correlate freely, whereas the concepts' operationalization remains exactly as specified by the analyst [118]. The evaluation of the overall fit of the saturated model is useful to assess the validity of the measurement and composite models, because potential model misfit can then be entirely attributed to misspecifications in the composite and/or measurement models. Therefore, empirical support can be obtained for the constructs, i.e., "Does a latent variable exist?" or "Do the indicators form an emergent variable?" Table 2 contains the values of the discrepancy measures and the 95% quantiles of their corresponding reference distributions for our example. The value of the SRMR was below the recommended threshold value of 0.080 [22,119]. However, thresholds for overall model fit in the context of PLS-PM should be considered cautiously, as they are preliminary and need to be examined in more detail in future methodological research. Moreover, all discrepancy measures were below the 95% quantile of their reference distribution (HI95). Empirical evidence was thus obtained for the latent variables (social executive behavior and social employee behavior) as well as the emergent variables (social media capability and business process performance) incorporated in the model. In case of contradictory results between the measure of approximate fit (SRMR) and the test of overall model fit (d_ULS and d_G), the test of overall model fit is preferred, as it is based on statistical inference rather than heuristic rules. Moreover, if none of the discrepancies is below the 95% quantile of the corresponding reference distribution (HI95), analysts can evaluate whether the discrepancies are at least below the 99% quantile (HI99) before finally rejecting the model. In the next step, each measurement and composite model must be examined separately. Authors of future studies in IS research are encouraged to report a table like Table 2.
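For intuition about what these discrepancy measures capture, the following Python sketch implements one common formulation of the SRMR for standardized indicators: the root mean squared residual between the empirical and the model-implied indicator correlations. This is an illustrative reimplementation under simplifying assumptions, not the exact routine of any PLS-PM package; d_ULS and d_G are alternative distance measures on the same pair of matrices, and HI95 stems from a bootstrap reference distribution.

import numpy as np

def srmr(empirical_corr, implied_corr):
    """One common formulation of the SRMR for standardized indicators:
    the root mean squared residual between the empirical and the
    model-implied indicator correlations (lower triangle incl. diagonal)."""
    idx = np.tril_indices(empirical_corr.shape[0])
    residuals = empirical_corr[idx] - implied_corr[idx]
    return float(np.sqrt(np.mean(residuals ** 2)))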
Scholars should assess content validity for both kinds of constructs, i.e., latent variables and emergent variables, by carefully considering each type of construct and how the corresponding concept has been operationalized in prior research. In the case of emergent variables, however, it might be desirable to modify the weighting scheme, the number of indicators, and the content of the indicators, as illustrated in the bread and beer example. Finally, construct validity should be assessed. Depending on the concept's operationalization, this can be done in several non-exclusive ways.
3.4.2. Assessment of the reflective measurement model
For reflective measurement models in which latent variables represent behavioral concepts such as social executive behavior and social employee behavior, composite reliability, convergent validity, indicator reliability, and discriminant validity should be evaluated. Dijkstra–Henseler's ρA should be considered in assessing composite reliability (the correlation between the latent variable and the construct scores). A value of Dijkstra–Henseler's ρA larger than 0.707 can be regarded as reasonable, as more than 50% of the variance in the construct scores can then be explained by the latent variable [120]. Table 3 shows the values of Dijkstra–Henseler's ρA for social executive behavior and social employee behavior: 0.938 and 0.913, respectively. Both are above the suggested threshold of 0.707, indicating reliable construct scores.
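For readers who wish to verify reported reliabilities, the following Python sketch implements the ρA formula as we understand it from the literature on consistent PLS (cf. [120]); the weight vector and the indicator correlation matrix are outputs any PLS-PM software provides. It is a sketch for illustration, and dedicated PLS-PM software should be preferred in practice.

import numpy as np

def rho_a(w, S):
    """Sketch of Dijkstra-Henseler's rho_A for one indicator block:
    w = estimated weight vector of the block,
    S = empirical correlation matrix of the block's indicators."""
    ww = np.outer(w, w)
    numerator = w @ (S - np.diag(np.diag(S))) @ w
    denominator = w @ (ww - np.diag(np.diag(ww))) @ w
    return float((w @ w) ** 2 * numerator / denominator)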
Convergent validity is the extent to which the indicators belonging to one latent variable actually measure the same construct. The average variance extracted (AVE), typically used to assess convergent validity [121], indicates how much of the indicators' variance can be explained by the latent variable. An AVE larger than 0.5 has been suggested to provide empirical evidence for convergent validity, as the corresponding latent variable then explains more than half of the variance in its indicators, and consequently, all other latent variables explain less than half [96]. In our example, all AVE values are above 0.5 (0.788 and 0.716), indicating convergent validity (see Table 3).
Table 2
Results of the confirmatory factor/composite analysis.

Discrepancy   Overall saturated model fit evaluation
              Value     HI95      Conclusion
SRMR          0.030     0.049     Supported
d_ULS         0.210     0.546     Supported
d_G           0.049     0.221     Supported

Indicator reliability can be assessed through the factor loading estimates. As factor loading estimates are standardized in PLS-PM, the squared factor loading estimate equals the estimated indicator reliability. It is generally advisable for factor loadings to be greater than 0.707, indicating that more than 50% of the variance in a single indicator can be explained by the corresponding latent variable. In this context, the significance of the factor loading estimates should also be investigated. Somewhat lower values are not necessarily problematic as long as the construct validity and reliability criteria are met. Table 3 presents the factor loading estimates from our example. They range from 0.769 to 0.912 and are all significant at the 1‰ level, suggesting that the measures are reliable.
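As a concrete illustration of the link between loadings, indicator reliability, and the AVE, the following Python sketch (our own illustration) uses the standardized loadings of the social executive behavior block reported in Table 3; the AVE is simply the mean squared standardized loading.

import numpy as np

# Standardized factor loadings of the social executive behavior block (Table 3)
loadings = np.array([0.905, 0.877, 0.856, 0.912])

indicator_reliability = loadings ** 2       # each value should exceed 0.5
ave = indicator_reliability.mean()          # average variance extracted

print(indicator_reliability.round(3))       # [0.819 0.769 0.733 0.832]
print(round(ave, 3))                        # 0.788 (> 0.5, as reported in Table 3)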
Discriminant validity entails that two latent variables that are meant to represent two different theoretical concepts are statistically sufficiently different. To obtain empirical evidence for discriminant validity, IS scholars should consider the HTMT [35]. The HTMT should be lower than 0.85 (stricter threshold) or 0.90 (more lenient threshold), or significantly smaller than 1 [36,37]. In our example, the HTMT between social executive behavior and social employee behavior is 0.322, and thus below the recommended threshold of 0.85 (and of 0.90). Moreover, the one-sided 95% percentile confidence interval of the HTMT does not cover 1; that is, the HTMT is significantly smaller than 1. Scholars can also follow the suggestion of [37] to test whether the HTMT is significantly smaller than 0.85 or 0.90.
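To make the computation transparent, here is a minimal Python sketch of the HTMT for two indicator blocks, following the usual definition: the mean absolute between-block correlation divided by the geometric mean of the mean absolute within-block correlations. The function name, index arguments, and column layout are our own illustration.

import numpy as np

def htmt(R, block_i, block_j):
    """Heterotrait-monotrait ratio of correlations for two indicator blocks.
    R          : empirical indicator correlation matrix
    block_i/_j : index lists of the indicators belonging to each latent variable"""
    # mean heterotrait-heteromethod correlation (across the two blocks)
    hetero = np.mean(np.abs(R[np.ix_(block_i, block_j)]))
    # mean monotrait-heteromethod correlation (within a block, off-diagonal)
    def mean_within(block):
        sub = np.abs(R[np.ix_(block, block)])
        iu = np.triu_indices(len(block), k=1)
        return np.mean(sub[iu])
    return float(hetero / np.sqrt(mean_within(block_i) * mean_within(block_j)))

# Hypothetical layout: columns 0-3 hold SEXB1-SEXB4 and columns 4-7 SEMB1-SEMB4;
# htmt(R, [0, 1, 2, 3], [4, 5, 6, 7]) then yields the HTMT of the two constructs.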
3.4.3. Assessment of the composite model
The composite model requires an evaluation sui generis: an examination of the composite model with respect to multicollinearity, weights, composite loadings, and their significance [43,76,77,83,122]. As composite models are typically estimated by Mode B (regression weights) in PLS-PM, collinearity among the indicators forming an emergent variable should be investigated by means of the variance inflation factor (VIF), as high multicollinearity can lead to insignificant estimates and unexpected signs of the weights. Traditionally, VIF values above 5 are regarded as indicating problematic multicollinearity [38,46]. Yet, typical symptoms of multicollinearity can also occur at VIF values far below 5. For weights estimated by Mode A, an assessment of multicollinearity is not necessary, as these weights equal scaled covariances between the indicators and the composite and are therefore unaffected by multicollinearity [32].
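The VIF of an indicator can be computed directly from the indicator data. The sketch below (illustrative Python; the helper is our own) regresses each indicator of a block on the remaining indicators of that block and applies VIF_j = 1 / (1 - R²_j).

import numpy as np

def vif(X):
    """Variance inflation factors for the columns of an indicator matrix X.
    VIF_j = 1 / (1 - R2_j), where R2_j stems from regressing indicator j
    on all other indicators of the same block."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the indicators
    out = np.empty(p)
    for j in range(p):
        y = Xs[:, j]
        Z = np.delete(Xs, j, axis=1)
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - (resid @ resid) / (y @ y)
        out[j] = 1.0 / (1.0 - r2)
    return out

# e.g., vif(X_smc) for a hypothetical n x 4 matrix of the SMC1-SMC4 responses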
While weights show the relative contribution of an indicator to its construct, composite loadings represent the correlation between the indicator and the corresponding emergent variable; a loading shows the absolute contribution of an indicator to its construct [122]. As weights show the degree of importance of each indicator (ingredient) to the construct, analysts should examine whether all indicator weight estimates are significant. For indicators with non-significant weight estimates, one must investigate whether the composite loading estimates are statistically significant and consider dropping indicators with non-significant weight and loading estimates. However, content validity must be considered as well, because dropping an indicator may alter the meaning of the emergent variable. IS scholars can thus decide to keep an indicator with a non-significant weight and loading to preserve the construct's content validity [46].
Table 3 shows that the VIF values for the indicators of the composite models range from 1.020 to 1.134, suggesting that multicollinearity is not a problem in our data. Moreover, all weight and composite loading estimates show the expected sign and are significant at the 5% significance level, except one: the estimated weight of the indicator production and operations of the construct business process performance. The weight estimate of this indicator is 0.108, and its composite loading estimate is 0.203† (significant only at the 10% level). Considering content validity, the indicator production and operations may capture some of the firm's key business processes. Therefore, we decided to keep the indicator in the empirical analysis to preserve content validity and avoid altering the meaning of the emergent variable business process performance. In this type of situation, analysts can also repeat the analysis without the questionable indicators to explore whether the decision to keep or drop these indicators affects the results. We dropped BPP3 and repeated the empirical analysis. The results obtained were qualitatively identical, suggesting that this decision does not affect the research findings. One might ask why BPP3 should be included when the results do not change. This is theoretically justified, as it is difficult to imagine a company's business process performance without production and operations processes, which often are the heart of a company. Because all reflective measurement and composite models in our example show desirable properties, we proceed to evaluate the structural model.
Table 3
Measurement model evaluation.

Code    Construct/indicator                                                              ρA      AVE     VIF     Weight     Loading

Social executive behavior (1: Strongly disagree, 5: Strongly agree)
(reflective measurement model, Mode A consistent (PLSc), SEXB2 as dominant indicator)    0.938   0.788
SEXB1   Behavior of top business executives towards adoption of social media is positive                         0.278***   0.905***
SEXB2   Top business executives are positive in adopting social media for business activities                    0.269***   0.877***
SEXB3   Top business executives support adoption of social media for business activities                         0.263***   0.856***
SEXB4   Top business executives are willing to support adoption of social media in the firm                      0.280***   0.912***

Social employee behavior (1: Strongly disagree, 5: Strongly agree)
(reflective measurement model, Mode A consistent, SEMB2 as dominant indicator)           0.913   0.716
SEMB1   Employee behavior towards adoption of social media is positive                                           0.301***   0.901***
SEMB2   Employees are positive to adopt social media in the firm                                                 0.274***   0.820***
SEMB3   Employees support adoption of social media in the firm                                                   0.257***   0.769***
SEMB4   Employees are willing to support adoption of social media in the firm                                    0.296***   0.888***

Social media capability: My firm has deliberately used and leveraged … for business activities
(1: Strongly disagree, 5: Strongly agree) (composite model, Mode B, SMC1 as dominant indicator)
SMC1    Facebook                                                                                         1.037   0.229***   0.397***
SMC2    Twitter                                                                                          1.032   0.489***   0.627***
SMC3    Corporate blog(s)                                                                                1.059   0.601***   0.751***
SMC4    LinkedIn                                                                                         1.020   0.333***   0.455***

Business process performance: Relative to your key competitors, what is your performance in the last three years
in the following business processes (1: Significantly worse, 5: Significantly better than my key competitors)
(composite model, Mode B, BPP4 as dominant indicator)
BPP1    Supplier relations                                                                               1.022   0.285**    0.397**
BPP2    Product and service enhancement                                                                  1.134   0.553***   0.307**
BPP3    Production and operations                                                                        1.105   0.108      0.203†
BPP4    Marketing and sales                                                                              1.064   0.609***   0.531***
BPP5    Customer relations                                                                               1.063   0.629***   0.591***

Note: † p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001, one-tailed test.
Table 4
Steps to assess common factor and composite models.

Testing the adequacy of reflective measurement and composite models (latent and emergent variables).
Description: Evaluate the overall fit of the model with a saturated structural model by investigating the discrepancy between the empirical and the model-implied indicator variance–covariance matrix.
Criteria: SRMR < 0.080 and SRMR < HI95; d_ULS < HI95; d_G < HI95.
Interpretation: An SRMR value smaller than 0.080 indicates an acceptable model fit [22]; however, this threshold is preliminary and needs to be investigated in more detail. If the value of a discrepancy measure is below the 95% quantile of its corresponding reference distribution, the null hypothesis that the population indicator variance–covariance matrix equals its model-implied counterpart is not rejected; hence, empirical evidence for the model is given.

Evaluating content validity (latent and emergent variables).
Description: Consider how the corresponding theoretical concepts have been operationalized (measured or built) in prior research.
Interpretation: Flexibility exists in the case of artifacts represented by an emergent variable (bread and beer analogy).

Evaluating the reliability of construct scores (latent variables).
Description: Evaluate whether the construct scores reliably represent the underlying construct.
Criterion: ρA > 0.707.
Interpretation: More than 50% of the variance in the construct scores can be explained by the underlying latent variable.

Evaluating indicator reliability (latent variables).
Description: Evaluate whether the indicators are reliable.
Criteria: Factor loading estimates > 0.707 and significant at the 5% significance level.
Interpretation: More than 50% of the indicator's variance is explained by the latent variable.

Evaluating convergent validity (latent variables).
Description: Evaluate the share of variance in the indicators that is explained by the underlying latent variable.
Criterion: AVE > 0.5.
Interpretation: More than 50% of the indicators' variance is explained by the underlying latent variable.

Evaluating discriminant validity (latent variables).
Description: Evaluate whether two latent variables are statistically different.
Criterion: HTMT < 0.85 (or whether the HTMT is significantly smaller than 1).
Interpretation: The factors are statistically different and thus show discriminant validity.

Assessing multicollinearity (emergent variables estimated by Mode B).
Description: Evaluate how the standard errors of the weight estimates are affected by the correlations among the indicators.
Criterion: VIF < 5.
Interpretation: If the estimates suffer from multicollinearity, weights obtained by Mode A or predetermined weights can be used.

Assessing weights (emergent variables).
Description: Evaluate the relative contribution of an indicator to its construct.
Criterion: Weight values significant at the 5% significance level.
Interpretation: Each indicator contributes significantly to the emergent variable.

Assessing loadings (emergent variables).
Description: Evaluate the absolute contribution of an indicator to its construct.
Criterion: Loadings significant at the 5% significance level.
Interpretation: Each indicator contributes to the emergent variable in a statistically significant way.
3.5. Assessment of the structural model
In evaluating the structural model, the analyst should examine the overall fit of the estimated model, the path coefficient estimates, their significance, the effect sizes (f²), and the coefficient of determination (R²) [3,123]. Analysts should focus primarily on overall model fit in confirmatory research, and primarily on R², the path coefficient estimates, and the effect sizes in explanatory research [24]. Table 7 summarizes the steps to follow in evaluating the structural model.
3.5.1. Evaluation of the overall fit of the estimated model
First, analysts should evaluate the overall fit of the estimated model through the bootstrap-based test of overall model fit and the SRMR as a measure of approximate fit to obtain empirical evidence for the proposed theory. An analysis in confirmatory research without an assessment of overall model fit would be incomplete, as it would ignore empirical evidence for, and also against, the proposed model and the postulated theory [124]. Without assessing the model fit, a researcher would not obtain any signal if he or she had incorrectly omitted an important effect from the model. Because the test of overall model fit was introduced only recently in the context of PLS-PM, the vast majority of models estimated by PLS-PM in past IS research has not been evaluated in this respect. However, because overall model fit can now be tested in the context of PLS-PM, we encourage IS scholars to take this evaluation very seriously in causal research. In our example, all values of the discrepancy measures were below the 95% quantile of their corresponding reference distribution (HI95), indicating that the estimated model was not rejected at the 5% significance level (see Table 5). Moreover, the SRMR was below the preliminarily suggested threshold of 0.080, indicating an acceptable model fit. This result suggests that the proposed model is well suited to confirm and explain the development of social media capability and business process performance among firms. While the model fit suggests that it is possible that the world functions according to the specified model, the model can still be misspecified in the sense of over-parameterization, i.e., the model may contain superfluous zero-paths [22]. Neither the bootstrap-based test of model fit nor the SRMR penalizes unnecessary paths, i.e., neither of them rewards parsimony. Regardless of whether one conducts confirmatory or explanatory research, it therefore remains indispensable to assess all path coefficients and their significance. Table 6 presents the construct correlation matrix.
3.5.2. Evaluation of path coefficients and their significance levels
The path coefficient estimates are essentially standardized regression coefficients, whose sign and absolute size can be assessed. A coefficient is interpreted as the change in the dependent construct, measured in standard deviations, if an independent construct is increased by one standard deviation while all other explanatory constructs are kept constant (ceteris paribus). For example, increasing social media capability by one standard deviation will increase business process performance by 0.515 standard deviations if all other variables are kept constant. Statistical tests and confidence intervals can be used to draw conclusions about the population parameters. Among confidence intervals, the percentile bootstrap confidence interval is recommended [125]. As shown in Fig. 4, the path coefficient estimates for the hypothesized relationships in the example range from 0.396 to 0.515 and are all significant at the 5% significance level, except the effects of the two control variables, firm size and industry. A path coefficient estimate is considered statistically significantly different from zero at the 5% significance level when its p-value is below 0.05 or when the 95% bootstrap percentile confidence interval constructed around the estimate does not cover zero.
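The percentile interval itself is straightforward to construct once the bootstrap estimates are available. The Python sketch below (our own illustration) takes the empirical quantiles of the bootstrap distribution of a path coefficient; the normal draws merely act as a hypothetical stand-in for the 4999 bootstrap estimates produced by the resampling procedure.

import numpy as np

def percentile_ci(boot_estimates, alpha=0.05):
    """Two-sided (1 - alpha) percentile bootstrap confidence interval
    for a parameter, e.g., a path coefficient."""
    lower = np.percentile(boot_estimates, 100 * (alpha / 2))
    upper = np.percentile(boot_estimates, 100 * (1 - alpha / 2))
    return lower, upper

# Hypothetical stand-in for the 4999 bootstrap estimates of a path coefficient
rng = np.random.default_rng(2019)
boot = rng.normal(loc=0.515, scale=0.05, size=4999)
lo, hi = percentile_ci(boot)
# If zero lies outside [lo, hi], the path coefficient is significantly
# different from zero at the 5% significance level.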
3.5.3. Evaluation of effect sizes
The practical relevance of significant effects should be investigated by considering the effect sizes of the relationships between the constructs. The effect size is a measure of the magnitude of an effect that is independent of sample size. f² values from 0.020 to 0.150, from 0.150 to 0.350, and of 0.350 or above indicate weak, medium, and large effect sizes, respectively [108]. Just as not all actors in a movie can play a leading role, it is unusual and unlikely that most constructs will have a large effect size in the model. We provide this clarification because scholars often expect, or demand of themselves, that all or most of their effect magnitudes be large, which is an unrealistic expectation. This cautionary note extends to supervisors' expectations for their Ph.D. students (as illustrated by [126]). In our sample, the f² values for the hypothesized relationships range from 0.252 to 0.362 (medium to large).
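f² is commonly computed from the coefficients of determination of the model with and without the predictor of interest. The Python sketch below illustrates this; the reduced-model R² of 0.001 is our own hypothetical assumption (roughly what the two controls alone contribute), chosen so that the result is in line with the f² of 0.362 reported in Table 5.

def effect_size_f2(r2_full, r2_reduced):
    """Cohen's f2 of a predictor: the drop in R2 when the predictor is
    removed, relative to the unexplained variance of the full model."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)

# R2 of business process performance is 0.267; assume (hypothetically) the
# model without social media capability explains only 0.001.
print(round(effect_size_f2(0.267, 0.001), 3))  # ~0.363, a large effect (>= 0.350)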
Table 5
Structural model evaluation.

Relationship                                                              Path coefficient
Social executive behavior → Social media capability (H1)                 0.422*** (8.830) [0.327, 0.512]
Social employee behavior → Social media capability (H2)                  0.396*** (8.052) [0.300, 0.490]
Social media capability → Business process performance (H3)              0.515*** (10.232) [0.426, 0.609]
Firm size → Business process performance (control variable)              0.022 (0.305) [-0.128, 0.160]
Industry → Business process performance (control variable)               0.030 (0.312) [-0.161, 0.174]

Endogenous variable                                                       R²
Social media capability                                                   0.443
Business process performance                                              0.267

Overall fit of the estimated model                                        Value     HI95
SRMR                                                                      0.032     0.049
d_ULS                                                                     0.232     0.558
d_G                                                                       0.052     0.222

Relationship                                                              Effect size f²
Social executive behavior → Social media capability (H1)                 0.286
Social employee behavior → Social media capability (H2)                  0.252
Social media capability → Business process performance (H3)              0.362
Firm size → Business process performance (control variable)              0.001
Industry → Business process performance (control variable)               0.001

Note: t-values (one-tailed test) are presented in parentheses. Percentile bootstrap confidence intervals are presented in brackets.
Table 6
Construct correlation matrix.

                                    1        2        3        4        5        6
1. Social executive behavior        1.000
2. Social employee behavior         0.322    1.000
3. Social media capability          0.550    0.532    1.000
4. Business process performance     0.216    0.309    0.515    1.000
5. Firm size                       −0.025   −0.048   −0.014    0.016    1.000
6. Industry                         0.069    0.073    0.010    0.036    0.038    1.000
3.5.4. Evaluation of R²
R² is used to assess goodness of fit in regression analysis [87]. In the case of models estimated by OLS, the R² value gives the share of variance explained in a dependent construct. Thus, it provides insights into a model's in-sample predictive power [127]. Moreover, R² forms the basis for several innovative model selection criteria [37,100]. Reporting R² makes PLS-PM research future-proof in this regard, because the new model selection criteria can still be calculated ex post as long as the R² values are given.
The expected magnitude of R² depends on the phenomenon investigated. As some phenomena are already quite well understood, one would expect a relatively high R² for them. For phenomena that are less well understood, a lower R² is acceptable. The R² values should be judged relative to studies that investigate the same dependent variable. In our example, the R² values for social media capability and business process performance are 0.443 and 0.267, respectively. The study of social media in organizations is in its initial stages [77]. Braojos et al. [128] report an R² value of 0.541 for social media capability. In our example, social executive behavior and social employee behavior, two previously unexplored exogenous variables for social media capability, explain 44.3% of the variance in the development of social media capability. Considering the explained variance in prior IS research and the originality of our two exogenous variables in influencing social media capability, an R² of 0.443 seems to be an excellent value.
The models of [129,130] explain 49% and 43.9% of the variance in business process outcomes. In our example, social media capability, firm size, and industry explain 26.7% of the variance in business process performance. Although this R² value is somewhat smaller than those obtained by [129,130], it can be considered satisfactory because our model is the first to use social media capability on its own to explain business process performance. The independent variables explaining business process outcomes in the work of [129,130] refer to other IT resources (e.g., IT assets, enterprise resource planning capabilities) different from social media capability. This subsection illustrates with our fictive example how analysts can report and compare their R² values.
4. Discussion and conclusions
IS research often tackles complex research problems and questions that require the conceptualization and operationalization of different types of theoretical concepts, i.e., behavioral concepts and artifacts, as well as the estimation of their relationships. PLS-PM is a suitable estimator for this purpose. How can one perform and report an impactful analysis using PLS-PM in IS research that follows the recent improvements in PLS-PM? This study provides thorough guidelines for PLS-PM in the framework of causal (confirmatory and explanatory) research, employing the latest recommended standards. In doing so, it addresses why and how to perform and report a PLS-PM estimation in confirmatory and explanatory IS research, illustrated by a fictive example on the business value of social media. This is the key contribution of this paper to the methodological literature in empirical IS research.
In the last five years, methodologists have overcome major weaknesses of traditional PLS-PM, such as its inconsistency for latent variable models and the lack of a test for overall model fit. To benefit from these enhancements, IS scholars need new guidelines for empirical studies that incorporate the recent developments and insights, as most of the guidelines papers on PLS-PM in IS research were published before 2013 (e.g., [12,43–45]). Although several recent scholarly textbooks and articles (e.g., [46–48]) have provided guidelines for causal research that cover some of the latest enhancements to PLS-PM, none of these PLS-PM guidelines for causal research covered the full range of recent developments, nor did they introduce a new framework for applying PLS-PM and reporting its outcomes. To address this shortcoming in the existing IS literature, this paper provides updated guidelines on the use of PLS-PM in the assessment of reflective measurement models, composite models, and structural models. To the best of our knowledge, the proposed guidelines take into account all recent enhancements. An application of the guidelines is illustrated using a parsimonious IS research example on the business value of social media.
Table 7
Steps to follow in performing structural model evaluation.

Overall fit of the estimated model.
Description: Evaluate the overall fit of the estimated model by assessing the discrepancy between the empirical indicator variance–covariance matrix and its model-implied counterpart.
Criteria: SRMR < 0.080 and SRMR < HI95; d_ULS < HI95; d_G < HI95.
Interpretation: A value of a discrepancy measure below the 95% quantile of the corresponding reference distribution provides empirical evidence for the postulated model. In other words, it is possible that the empirical data stem from a world that functions as theorized by the model.

Path coefficient estimates and their significance levels.
Description: Standardized regression coefficients are interpreted as the change, in standard deviations, of the dependent variable if an independent variable is increased by one standard deviation while all other independent variables in the equation remain constant.
Criterion: Significant at the 5% significance level, i.e., p-value < 5%.
Interpretation: The effect of the independent variables on the dependent variables is statistically significant.

Effect sizes (f²).
Description: A measure of the magnitude of an effect that is independent of sample size; it gives an indication of the practical relevance of an effect.
Criteria: f² < 0.020: no substantial effect; 0.020 ≤ f² < 0.150: weak effect size; 0.150 ≤ f² < 0.350: medium effect size; f² ≥ 0.350: large effect size.
Interpretation: Degree of strength of an effect.

Evaluate R².
Description: Explained variance of a dependent construct.
Criterion: When a phenomenon is already quite well understood, one would expect a high R²; when a phenomenon is not yet well understood, a lower R² is acceptable.
Interpretation: Degree of variance explained for the phenomenon under investigation.
In contrast to prior guidelines [11,12], our article introduces the artifact, a human-made/firm-made object, as a new kind of theoretical concept and shows how this type of theoretical concept can be operationalized by means of the composite model. Because a significant proportion of the theoretical concepts in IS research are human-made/firm-made, one can expect the composite model to become the dominant conceptualization in IS research in the coming years. Against this background, we highlight the usefulness of model testing in confirmatory and explanatory research using PLS-PM. Without considering its results, it is hardly possible to obtain empirical evidence for or against a scholar's proposed theory. Finally, we strongly recommend that scholars employ consistent estimators, using PLSc when the theoretical concept is operationalized by a measurement model.
As our article on the use of PLS-PM for causal research is limited to linear, recursive models containing only first-order constructs, future IS research should develop additional updated guidelines that incorporate recent developments for more complex models, such as models containing moderation effects, second-order emergent variables composed of emergent variables, and composite models that account for more complex relationships between the indicators and the emergent variable.
Although some steps have been made using PLS-PM to deal