Leadership Training Design, Delivery, and Implementation:
A Meta-Analysis
Christina N. Lacerenza, Denise L. Reyes,
and Shannon L. Marlow
Rice University
Dana L. Joseph
University of Central Florida
Eduardo Salas
Rice University
Recent estimates suggest that although a majority of funds in organizational training budgets tend to be
allocated to leadership training (Ho, 2016; O'Leonard, 2014), only a small minority of organizations
believe their leadership training programs are highly effective (Schwartz, Bersin, & Pelster, 2014),
calling into question the effectiveness of current leadership development initiatives. To help address this
issue, this meta-analysis estimates the extent to which leadership training is effective and identifies the
conditions under which these programs are most effective. In doing so, we estimate the effectiveness of
leadership training across four criteria (reactions, learning, transfer, and results; Kirkpatrick, 1959) using
only employee data and we examine 15 moderators of training design and delivery to determine which
elements are associated with the most effective leadership training interventions. Data from 335
independent samples suggest that leadership training is substantially more effective than previously
thought, leading to improvements in reactions (δ = .63), learning (δ = .73), transfer (δ = .82), and results
(δ = .72); however, the strength of these effects differs based on various design, delivery, and implementation
characteristics. Moderator analyses support the use of needs analysis, feedback, multiple delivery
methods (especially practice), spaced training sessions, a location that is on-site, and face-to-face delivery
that is not self-administered. Results also suggest that the content of training, attendance policy, and
duration influence the effectiveness of the training program. Practical implications for training devel-
opment and theoretical implications for leadership and training literatures are discussed.
Keywords: leadership training, leadership development, management, development, meta-analysis
Supplemental materials:
Leadership and learning are indispensable to each other
–(John F. Kennedy, 35th President of the United States).
In 2015, organizations in the United States spent an average of
$1,252 per employee on training, and the largest share of this
training budget was allocated to leadership training, making lead-
ership the greatest training focus for today’s organizations (Ho,
2016). As such, leadership development is an essential strategic
priority for organizations. Despite the number of organizations
devoted to leadership training (e.g., Harvard Business Publishing
Corporate Learning, Dale Carnegie Training, Wilson Learning)
and evidence suggesting that organizational funds spent on lead-
ership training are increasing over time (Gibler, Carter, & Gold-
smith, 2000), organizations continue to report a lack of leadership
skills among their employees; only 13% of organizations believe
they have done a quality job training their leaders (Schwartz et al.,
2014). Similarly, some have pointed out a substantial leadership
deficit (Leslie, 2009) and have noted that organizations are “. . .
not developing enough leaders” and “. . . not equipping the leaders
they are building with the critical capabilities and skills they need
to succeed” (Schwartz et al., 2014, p. 26). This calls into question
the general utility of current leadership development initiatives.
As Wakefield, Abbatiello, Agarwal, Pastakia, and van Berkel
(2016) note, "simply spending more money on leadership programs
is unlikely to be enough. To deliver a superior return on investment
(ROI), leadership spending must be far more focused on and
targeted at what works...with a focus on evidence and results”
(p. 32). In response to this call for a focused investigation of
leadership training, the purpose of the current study is to provide
scientists and practitioners with data-driven recommendations for
effective leadership training programs that are based on a meta-
analytic investigation of 335 leadership training evaluation studies.
This article was published Online First July 27, 2017.
Christina N. Lacerenza, Denise L. Reyes, and Shannon L. Marlow, Department of Psychology, Rice University; Dana L. Joseph, Department of Management, University of Central Florida; Eduardo Salas, Department of Psychology, Rice University.
This work was supported, in part, by research grants from the Ann and John Doerr Institute for New Leaders at Rice University. We also thank Fred Oswald for his helpful comments on an earlier version of this article.
Correspondence concerning this article should be addressed to Christina N. Lacerenza, who is now at the Leeds School of Business, University of Colorado Boulder, 995 Regent Dr, Boulder, CO 80309.
This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Journal of Applied Psychology, © 2017 American Psychological Association, 2017, Vol. 102, No. 12, 1686–1718, 0021-9010/17/$12.00
In doing so, we attempt to unpack the black box of what works in
the design, delivery, and implementation of leadership training.
That is, we attempt to answer the questions (a) How effective are
leadership training programs? and (b) How should one design,
deliver, and implement a leadership training program to maximize
its effectiveness?
The current study addresses these two questions by meta-
analytically summarizing leadership training research. In our ex-
amination of the factors that contribute to leadership training
program effectiveness, we offer several contributions to the sci-
ence of leadership development and training. First, the current
study provides a meta-analytic estimate of the effectiveness of
leadership training across a wide span of years (1951–2014) and
organizations. We note that this literature has been previously
meta-analyzed (i.e., Avolio, Reichard, Hannah, Walumbwa, &
Chan, 2009; Burke & Day, 1986; Collins & Holton, 2004; Powell
& Yalcin, 2010; Taylor, Russ-Eft, & Taylor, 2009b); however, the
Burke and Day (1986) meta-analysis, which is arguably the most
comprehensive meta-analytic investigation of leadership training
to date, only included studies published through 1982, which
excludes the majority of available leadership training studies.
Relatedly, Collins and Holton (2004) only added 13 additional
studies to Burke and Day’s (1986) meta-analytic database after
conducting their own literature search. Powell and Yalcin’s (2010)
meta-analytic investigation only included private sector employ-
ees, thereby limiting the ability to generalize findings to other
populations. Moreover, Avolio, Reichard, et al.'s (2009) meta-
analysis was limited to 37 primary studies, and Taylor, Russ-Eft,
and Taylor’s (2009b) meta-analysis only included studies that
assessed training transfer.
It is clear that a plethora of research
within this area has yet to be meta-analyzed; thus, we have
included more recent publications to obtain an accurate account of
the current state of the field (i.e., our meta-analysis includes more
than three times the number of primary studies included in the
largest previously published meta-analysis).
Second, we empirically test moderators of leadership training
that have yet to be investigated in order to identify characteristics
of the most effective leadership training programs. Although ex-
isting work has investigated as many as eight moderators of
leadership training program effectiveness, the current study exam-
ines 15 moderators of leadership training program effectiveness
that will provide those who develop leadership training programs
a comprehensive understanding of how to design, deliver, and
implement effective programs.
Lastly, the current study makes use of updated meta-analytic
techniques that accommodate different types of primary study
designs when combining effect sizes (Morris & DeShon, 2002).
Such methods were not used in previous meta-analyses; past
authors either excluded certain design types (Burke & Day, 1986)
or conducted separate meta-analytic investigations for each design
type (Avolio, Reichard, et al., 2009; Collins & Holton, 2001;
Powell & Yalcin, 2010). One
exception to this is the meta-analytic investigation by Taylor et
al. (2009b); however, this study only evaluated training transfer
and did not include trainee reactions, learning, or results as
outcomes. Using these updated techniques results in a more
accurate estimate of the effect of leadership training, and also
allows for stronger causal inferences than typical, cross-
sectional meta-analyses (because all the studies included in the
meta-analysis use either repeated-measures or experimental designs).
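To give a concrete sense of the Morris and DeShon (2002) approach, a repeated-measures effect size can be converted to the independent-groups metric before studies of both design types are pooled, using the standard conversion d_IG = d_RM × √(2(1 − ρ)), where ρ is the pretest–posttest correlation. The sketch below is illustrative only; the study values and sample sizes are hypothetical, and the unweighted pooling shown is a simplification of full meta-analytic weighting.

```python
import math

def rm_to_ig(d_rm: float, rho: float) -> float:
    """Convert a repeated-measures effect size (change-score metric) to the
    independent-groups metric (Morris & DeShon, 2002):
    d_IG = d_RM * sqrt(2 * (1 - rho)), where rho is the pre-post correlation."""
    return d_rm * math.sqrt(2.0 * (1.0 - rho))

def pooled_mean(effects, ns):
    """Sample-size-weighted mean effect size across studies (illustrative)."""
    return sum(d * n for d, n in zip(effects, ns)) / sum(ns)

# Hypothetical studies: two repeated-measures designs are converted to the
# independent-groups metric, then pooled with an independent-groups study.
d1 = rm_to_ig(0.90, rho=0.5)                    # 0.90 * sqrt(1.0) = 0.90
d2 = rm_to_ig(0.60, rho=0.5)                    # 0.60 * sqrt(1.0) = 0.60
print(pooled_mean([d1, d2, 0.75], [40, 60, 100]))  # 0.735
```

Because the conversion places all designs on a common metric, a single pooled estimate can be computed rather than running separate meta-analyses per design type, which is the advantage the text describes.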
Leadership Training Defined
To begin, we define leadership training programs as programs
that have been systematically designed to enhance leader knowl-
edge, skills, abilities, and other components (Day, 2000). Parallel
to previous investigations, we include all forms of leader, mana-
gerial, and supervisory training/development programs and/or
workshops in our definition of leadership training programs (e.g.,
Burke & Day, 1986; Collins & Holton, 2004). Although we use the
term “leadership training” as an umbrella term to refer to many
forms of leader development/training, we discuss potential differ-
ences among various forms of leadership training below.
Leadership training is traditionally focused on developing “. . .
the collective capacity of organizational members to engage effec-
tively in leadership roles and processes” (Day, 2000, p. 582). Roles
refer to both formal and informal authority positions, and pro-
cesses represent those that facilitate successful group and organi-
zational performance (Day, 2000). Beyond this approach, recent
research has begun to distinguish between leadership development
and leader development (Day, 2000). Leader development repre-
sents training initiatives aimed at individual-level concepts,
whereas leadership development takes a more integrated approach
that involves the interplay between leaders and followers and
socially based concepts (Iles & Preece, 2006; Riggio, 2008).
Although this difference is recognized, it is often the case that the
terms are used interchangeably, and because of this, the current
study incorporates leader and leadership training/development
evaluation studies.
It is also important to address the distinction between manage-
rial training and leadership development. Managerial training and
development has been described as “the process by which people
acquire various skills and knowledge that increases their effective-
ness in a number of ways, which include leading and leadership,
guiding, organizing, and influencing others to name a few” (Klein
& Ziegert, 2004, p. 228). The objective of a managerial training or
development program involves teaching or enhancing managerial
skills with the purpose of improving job performance (Goldstein,
1980). Although, theoretically, there may be a distinction between
managerial training and leadership training, the terms are often
used interchangeably and in the current investigation, managerial
training programs are included within our examination of leader-
ship training programs. Executive coaching programs are also
included in the current meta-analysis because these programs aid
executives (who are leaders) in learning specific skills or behaviors
(Witherspoon & White, 1996), which is consistent with our defi-
nition of leadership training. In summary, the current study takes
a similar approach to prior work by including managerial, execu-
tive, leader, and leadership training/development programs in our
meta-analytic summary.
See Table A in the online supplementary material for more information
regarding previous meta-analyses.
The Design, Delivery, and Implementation of
Leadership Training
According to Kirkpatrick (1959), when evaluating training ef-
fectiveness, outcomes of training can be categorized into one of
four criteria: reactions, learning, transfer, and results. This frame-
work has been adopted in previous leadership training meta-
analyses (e.g., Burke & Day, 1986) and other training meta-
analyses (e.g., Arthur, Bennett, Edens, & Bell, 2003), and is used
in the current study to evaluate training effectiveness. Below, we
develop hypotheses based on the extant training, learning, and
leadership literature.
Reactions reflect the attitudinal component of effectiveness and
consist of trainee attitudes toward the training (e.g., training utility
and satisfaction with the training/instructors). As an example,
Kohn and Parker (1972) evaluated the reactions of trainees to a
management development meeting by asking trainees to rate the
extent to which they felt the meeting was of value. According to
Patel (2010), 91% of organizational training evaluations collect
reaction data, although this is not necessarily reported in published
literature nearly as often as it is used in practice. In addition to the
popularity of reaction data, this evaluation technique is important
to consider when evaluating training effectiveness because it can
be a precursor to other desired training outcomes (Hughes et al.,
2016; Sitzmann, Brown, Casper, Ely, & Zimmerman, 2008). Ac-
cording to social learning theory (Bandura, 1977; Bandura &
Wood, 1989), an individual must be motivated to learn for actual
learning to occur, and trainee reactions may serve as an indicator
of motivation (i.e., if a trainee does not find a program to be useful,
s/he may not be motivated to learn). Similarly, Hughes et al.
(2016) tested a sequential model of general training outcomes
using meta-analytic data and found support that reactions set the
stage for more distal outcomes (i.e., transfer, results). Therefore,
reactions may be an important component of a training evaluation
because they signal trainee satisfaction, serve as indicators of
trainee motivation to learn, and can lead to additional outcomes.
Given the popularity and importance of trainee reactions, it is
critical to evaluate whether leadership training elicits positive
changes in employee reactions (i.e., Does leadership training im-
prove trainees’ perceptions of satisfaction and utility?). Although
the idea that employees typically dislike training has been preva-
lent in popular media (e.g., Kelly, 2012), training literature sug-
gests training generally produces positive reactions (e.g., Brown,
2005) that may stem from employees perceiving training as a form
of organizational support. Sitzmann, Brown, Casper, Ely, and
Zimmerman (2008) suggest that this may translate into motivation
and interest such that employees who perceive a high degree of
organizational support will exhibit increased motivation and inter-
est in training, as they believe the organization will subsequently
provide the support they need to apply training to the job. There-
fore, we argue that although employees’ pretraining perceptions of
training utility and satisfaction may be lower given that employees
tend to think they will dislike training (Kelly, 2012), these per-
ceptions will increase during training because of the organizational
support that is reflected in training, resulting in positive pre-post
change in training reactions. Although primary studies have begun
to investigate the extent to which leadership training results in
positive trainee reactions, meta-analytic work has yet to provide an
estimate of this effect. As such, we examine this in the current
effort and hypothesize:
Hypothesis 1a: Leadership training programs have a positive
effect on trainee reactions.
Learning is “a relatively permanent change in knowledge or
skill produced by experience” (Weiss, 1990, p. 172) and it repre-
sents what trainees can do following training. According to
Kraiger, Ford, and Salas (1993), learning outcomes can be cate-
gorized as affective-, cognitive-, or skill-based. Affective learning
reflects the acquisition or change in internally based states. Cog-
nitive learning reflects a developmental change in intellectual or
mental-based skills. Skill-based, or psychomotor learning, refers to
the acquisition of technical or motor-skills.
By definition, leadership development programs are designed to
produce changes in the ability of trainees to engage in leadership
roles and processes by presenting new information (Day, 2000).
According to adult learning theory, knowledge acquisition and
learning during training may occur because training transforms
preexisting schemas, or mental constructions of the world, and
challenges assumptions (Mezirow & Taylor, 2009; see also Chen,
2014). For example, the leadership training program described in
a study conducted by Unsworth and Mason (2012) involved iden-
tifying dysfunctional thinking biases; this strategy encouraged
leaders to use more constructive thinking patterns to contend with
management challenges and was ultimately found to enhance
learning. Given the prior meta-analytic evidence supporting learn-
ing as an outcome of leadership training (i.e., Burke & Day, 1986;
Collins & Holton, 2001; Powell & Yalcin, 2010), we hypothesize:
Hypothesis 1b: Leadership training programs have a positive
effect on affective-, cognitive-, and skill-based learning.
Transfer (behavior) outcomes represent what the trainee will do,
and can be conceptualized as the extent to which trainees utilize
the skills and abilities taught during training on-the-job, including
job performance (Alliger et al., 1997; Baldwin & Ford, 1988;
Kirkpatrick, 1959). For instance, Russell, Wexley, and Hunter
(1984) evaluated the transfer of trained behaviors in supervisors by
collecting ratings on a behaviorally anchored rating scale that
involved the quality of the supervisor’s work and how organized
s/he was. An obvious primary goal of leadership training is to
create a positive behavioral change in leaders on-the-job (Day,
2000). As such, transfer evaluation is critical for the assessment of
leadership training effectiveness. Interestingly, some scholars have
identified a “transfer problem” (Baldwin & Ford, 1988, p. 63),
which refers to the tendency for targeted behaviors to fail to
transfer to the work environment (Goldstein, 1986). Indeed, some
studies have found that training in and of itself does not inherently
lead to transfer (e.g., May & Kahnweiler, 2000). However, some
degree of transfer is generally expected to occur as a function of
training, and the extent to which trained behavior fully transfers to
on-the-job behaviors is argued to be contingent upon various
training factors (e.g., the moderators discussed below; Baldwin &
Ford, 1988). Moreover, empirical and meta-analytic evidence in-
dicates that leadership training does, to some extent, generally
evoke transfer of training (Avolio, Rotundo, & Walumbwa, 2009;
Burke & Day, 1986). In line with this research, we hypothesize:
Hypothesis 1c: Leadership training programs lead to the trans-
fer of trained affective-, cognitive-, and skill-based concepts.
According to Kirkpatrick (1959), results are evaluative methods
that reflect the training program’s effect on achieving organiza-
tional objectives, including costs, company profits, turnover, and
absenteeism. Results are also often defined in terms of the benefit
of the training compared to the program cost (e.g., ROI; Arthur et
al., 2003). For the current investigation, we categorize results as
either organizational outcomes or subordinate outcomes. For ex-
ample, DiPietro (2006) analyzed the ROI of a leadership training
program, which is an organizational result, whereas Kawakami,
Takao, Kobayashi, and Tsutsumi (2006) evaluated the degree to
which subordinates perceived the work environment as supportive
following the implementation of leadership training, which is a
subordinate result.
It is important to note that results represent the most distal of the
Kirkpatrick (1959, 1994) criteria and it has been suggested that
“most training efforts are incapable of directly affecting results
level criteria” (Alliger et al., 1997, p. 6). Some studies have, in
accordance with this suggestion, found no improvement in results
criteria following leadership training. For instance, Lee and col-
leagues (2010) found that the self-reported emotional exhaustion
of subordinates did not change following leadership training. Yet,
this study is in the minority, as meta-analytic evidence indicates
that leadership training has a positive effect on results (Burke &
Day, 1986). Theoretically, scholars have long suggested that re-
sults are a by-product of improvements in learning and transfer
(Kirkpatrick, 1959; Tharenou, Saks, & Moore, 2007; Wright,
McCormick, Sherman, & McMahan, 1999) because positive
changes in employee knowledge and behavior may trickle-down to
affect subordinate performance and/or trickle-up to change orga-
nizational norms (e.g., a sales leader who undergoes training may
increase his or her subordinate’s performance and may provide
other leaders with normative examples of effective performance
behaviors to cause revenue increases in other sales leaders as well).
Therefore, given our expectation that learning and transfer occur as
a result of leadership training, we expect results to improve as well.
Hypothesis 1d: Leadership training programs positively influ-
ence organizational and subordinate outcomes.
Training Design, Delivery, and Implementation:
Moderator Analyses
Barling, Christie, and Hoption (2010) called for a
rapprochement between the training and leadership development
sciences. They noted that a lost opportunity for leadership devel-
opment practitioners and scientists involves advancements within
the training domain that are not necessarily implemented within
leadership literature and practice. The current study responds to
this call by drawing on the sciences of learning and training to aid
in the explanation of leadership training effectiveness. Below, we
identify several design, delivery, and implementation features that
have strong theoretical and empirical support as moderators of
training effectiveness.
Training Design Characteristics
Needs analysis. A needs analysis is the process of identifying
organizational, group, or individual training needs and aligning a
program with these needs (Arthur et al., 2003). By conducting a
thorough needs analysis, developers are better able to provide
trainees with a program that parallels their training needs, thereby
increasing the appeal of the training to the trainee and subse-
quently enhancing results. However, training developers may ne-
glect to conduct a needs analysis because they feel as if it is a
waste of time or that it will not reveal any new information
(Goldstein & Ford, 2002). For example, the U.S. Merit Systems
Protection Board (2015) reported that only 16 out of 23 surveyed
federal government agencies conducted a needs analysis for the
training and development of their senior executives.
Despite the infrequent use of needs analyses, the benefits of
conducting a needs analysis have long been discussed within the
training literature (e.g., Goldstein & Ford, 2002). Collins and
Holton (2004) mention that a lack of a needs analysis can lead to
a generic training program that may not be suitable for the orga-
nization. For example, a training program might emphasize trans-
actional leadership style behaviors, which may not fit an organi-
zational culture that values a transformational leadership style. In
such a situation, trainees may feel the training program is not
relevant to their job (i.e., reduced reactions), and subsequently,
they may be less motivated to learn and transfer the training to the
job. Although no previous meta-analysis has examined the use of
a needs analysis as a moderator of leadership training effective-
ness, given the aforementioned theoretical support for needs anal-
yses, we hypothesize that:
Hypothesis 2: Leadership training programs that are based on
a needs analysis exhibit greater improvements in trainee re-
actions (H2a), learning (H2b), transfer (H2c), and results
(H2d) than programs that are not based on a needs analysis.
Training attendance policy. Generic training literature indi-
cates that trainees who exhibit high motivation and perceive value
in the training program are more likely to implement trained
concepts on-the-job (Blume, Ford, Baldwin, & Huang, 2010;
Chiaburu & Marinova, 2005; Tziner, Fisher, Senior, & Weisberg,
2007), thereby increasing training utility and effectiveness. To
increase trainees’ motivation to transfer, some researchers have
suggested creating voluntary training programs (Curado, Hen-
riques, & Ribeiro, 2015). For example, in a cross-sectional study
conducted on employees within an insurance company, results
suggested that voluntary training programs enhanced transfer mo-
tivation to a greater degree than mandatory programs (Curado et
al., 2015). These results may partially be explained by self-
determination theory (Ryan & Deci, 2000), which suggests that
autonomy fosters motivation; by providing trainees with the choice
to participate in training, the need for autonomy is satisfied,
thereby increasing trainee motivation to learn and transfer trained
concepts (Cohen, 1990). Despite the empirically validated benefits
of voluntary training programs, some researchers argue that train-
ing programs should be mandatory, thereby signaling to trainees
that the training is valued by their organization (Salas, Tannen-
baum, Kraiger, & Smith-Jentsch, 2012). Although attendance pol-
icy has not been meta-analyzed in the leadership training literature,
Blume, Ford, Baldwin, and Huang (2010) shed some light on this
issue in the general training literature and found a positive meta-
analytic correlation between transfer and voluntary attendance.
The current study assesses these effects within the realm of lead-
ership training, and we hypothesize:
Hypothesis 3: Voluntary leadership training programs en-
hance trainee reactions (H3a), learning (H3b), transfer (H3c),
and results (H3d) to a greater degree than involuntary programs.
Spacing effect. Cognitive load theory (CLT) is a learning
efficiency theory (e.g., Paas, Renkl, & Sweller, 2004; Sweller, van
Merrienboer, & Paas, 1998; van Merrienboer & Sweller, 2005)
positing that learners have a finite working memory capacity, and
once this is met, processing and learning abilities are hindered or
lost entirely. If an excessive amount of information is presented to
a learner, although the information may enter working memory, it
may not be processed into long-term memory, thus inhibiting the
learners’ ability to access the information in the future (van Mer-
rienboer & Sweller, 2005). CLT highlights the need for training
programs that are designed to reduce extraneous cognitive load
while increasing learners’ ability to process salient information
and still presenting all of the relevant information. One way to do
so is to temporally space training sessions, a technique known as
spacing (Hintzman, 1974). For example, evidence suggests infor-
mation may be remembered at an increased rate (increasing learn-
ing and transfer) if the stimulus presentation sessions are tempo-
rally spaced rather than presented at once (Janiszewski, Noel, &
Sawyer, 2003). The consensus of research from the generic train-
ing literature shows that spaced training is superior to massed
training (Lee & Genovese, 1988). Further, meta-analytic evidence
also suggests that task performance is greater when individuals
practice in spaced intervals as compared to a single massed prac-
tice session (Donovan & Radosevich, 1999). The current meta-
analysis is the first to present a direct evaluation of how spaced
training sessions can affect leadership training program effective-
ness. We hypothesize:
Hypothesis 4: Leadership training programs spanning multiple
training sessions result in greater effects on reactions (H4a),
learning (H4b), transfer (H4c), and results (H4d) in compar-
ison with training programs with one massed training session.
Trainees’ level of leadership. Leadership training can be
administered to low-, middle-, or high-level leaders. It is possible
that the level of the leader can influence how receptive the indi-
vidual is to training. Middle- and high-level leaders may be more
resistant to change because they may feel that change is disruptive
(Hall, 1986). Similarly, because they have higher status, these
leaders might feel as if they do not require further development
because they have already succeeded as a leader (Guinn, 1999). It
has also been argued that leadership experience fosters leadership
skills (Arvey, Rotundo, Johnson, Zhang, & McGue, 2006; Arvey,
Zhang, Avolio, & Krueger, 2007; Riggio, 2008); as such, it could
be the case that low-level leaders who lack leadership experience
enter training with fewer leadership skills, allowing greater room
for improvement. Because of their reduced leadership skills, it
might be easier to garner desired outcomes in low-level leaders,
compared with high-level leaders who may experience a ceiling
effect during leadership training. In line with this theory, Avolio,
Reichard, et al.’s (2009) meta-analysis of leadership training con-
ducted a post hoc analysis on leader level and found that leadership
training had a greater effect on low-level leaders compared to
middle- and high-level leaders. We aim to replicate this finding
using Kirkpatrick’s (1959) evaluation criteria, and hypothesize:
Hypothesis 5: Leadership training programs administered to
low-level leaders will exhibit greater effects on reactions
(H5a), learning (H5b), transfer (H5c), and results (H5d) than
programs administered to middle- or high-level leaders.
Training instructor. According to Kalinoski et al. (2013), the
trainer’s background can influence trainee motivation such that a
program with a trainer from the trainee’s organization (i.e., internal
trainer) will result in increased levels of trainee motivation in
comparison to a program with a trainer outside of the organization
(i.e., external trainer), especially if the trainer is a direct manager
of the trainee. When participating in a leadership training program
that is facilitated by an internal trainer, trainees may perceive the
organization’s support for the training to be greater because they
have a dedicated person on staff who is responsible for the training
program. Conversely, trainees participating in a leadership training
program facilitated by an external trainer might also perceive the
organization as valuing training because they have paid to bring in
an expert (or paid to send the employee to a leadership center).
Empirical support for the effectiveness of both internal (May &
Dubois, 1963; McCormick, 2000; Teckchandani & Schultz, 2014)
and external (e.g., Alsamani, 1997; Culpin, Eichenberg, Hayward,
& Abraham, 2014; Jorgensen & Els, 2013) instructors exists; thus,
internal and external trainers may be equally effective.
In contrast, self-administered leadership training programs
might signify to trainees that the organization does not fully
support their training because they might perceive that fewer
resources are needed in comparison to training programs with an
instructor. Because trainees are required to complete the leadership
training on their own, they may be less motivated to exert effort as
they might believe the training is not valued by the organization,
leading to a reduction in positive training outcomes (Blume et al.,
2010). As such, we hypothesize that:
Hypothesis 6: Self-administered leadership training programs
exhibit weaker effects on reactions (H6a), learning (H6b),
transfer (H6c), and results (H6d) than programs facilitated by
an internal or external trainer.
Training Delivery and Implementation Characteristics
Delivery method. Training delivery methods can be catego-
rized into three broad categories based on their purpose: (a) to
deliver information (i.e., information-based); (b) to demonstrate
skills and abilities being trained (i.e., demonstration-based); or (c)
to offer practice opportunities (i.e., practice-based; Salas &
This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Cannon-Bowers, 2000; Weaver, Rosen, Salas, Baum, & King,
2010). For example, lectures, presentations, and most text-based
training materials are considered information-based methods,
whereas demonstration-based methods provide trainees with either
negative or positive examples of the trained competency via in-
person, audio, video, or simulated mediums. Finally, practice-
based methods include role-play, simulations, in-basket exercises,
guided practice, and others.
Of the three methods, practice-based methods are considered the
most critical in influencing training outcomes, as they enable
trainees to fully conceptualize the material and implement it
within a realistic environment (Weaver et al., 2010).
Not only does generic training literature support the use of prac-
tice, but traditional learning theory also supports this delivery
method. Specifically, constructivist learning theory (Piaget, 1952)
suggests learning is enhanced when the learner develops construc-
tions of the world through his or her experiences and reflects on
these experiences (i.e., constructivism reflects learning by doing).
For example, when leaders are provided with opportunities to
practice certain leadership competencies, they are able to actively
reflect on their leadership experience, encounter and solve prob-
lems within the training environment, and participate in the learn-
ing process, which accelerates the rate at which they learn from
their experience (McCauley & Van Velsor, 2004).
Although training scientists tend to agree that practice-based
methods are perhaps the most effective delivery method, it is
important to note that meta-analytic research indicates that
information-based methods (e.g., lectures) can also be effective
(Arthur et al., 2003). However,
we argue training programs that solely involve practice-based
methods are more effective than those that only utilize
information-based methods because practice-based methods are
more appropriate for leadership skills (e.g., given that leadership
training often involves skills related to interacting with others, a
training program that allows one to practice interacting with others
should be the most effective; Adair, 1983; Wexley & Latham,
2002; see also Arthur et al., 2003). Supporting this notion, a
meta-analysis conducted by Burke and Day (1986) found that
leadership training programs were more effective on objective
learning criteria when they incorporated lecture/discussion and
practice/role play than when they used lecture alone. In line with
research and theory, we hypothesize:
Hypothesis 7: Leadership training programs incorporating
only a practice-based method lead to greater effects on trainee
reactions (H7a), learning (H7b), transfer (H7c), and results
(H7d), as compared with programs incorporating only
information- or demonstration-based methods.
However, researchers have suggested that the most effective
training programs incorporate all three delivery methods (e.g.,
Salas et al., 2012). We posit that any disadvantages associated with
one particular training technique may be overcome by utilizing
multiple techniques. For example, demonstration does not generally
provide a conceptual understanding of why certain behaviors occur,
whereas information can provide a foundation from which to
understand demonstration (Salas & Cannon-Bowers, 2000), and
practice allows for active experimentation and reflection.
Meta-analytic research investigating training effectiveness in gen-
eral has shown that most training programs incorporate multiple
delivery methods and the effectiveness of training varies as a
function of the training delivery method specified (Arthur et al.,
2003). Thus, we hypothesize:
Hypothesis 8: Leadership training programs incorporating
information-, demonstration-, and practice-based methods
demonstrate greater effects on reactions (H8a), learning
(H8b), transfer (H8c), and results (H8d) in comparison with
programs implementing only one (e.g., information only) or
two methods (e.g., demonstration and information).
Feedback. According to feedback theory, feedback outlines
successes and failures and how to correct unsuccessful behavior
(Kluger & DeNisi, 1996; Powers, 1973). Feedback is valuable to
trainees because it allows them to gain insight into their current
ability (Maurer, 2002), thereby signaling whether there is a dis-
crepancy between actual and intended performance (Nadler,
1977). Additionally, feedback aids in learning and transfer because
it encourages trainees to participate in metacognitive activities
(i.e., planning, monitoring, and revising behavior; Brown, Brans-
ford, Ferrara, & Campione, 1983) during training. Trainees who
engage in such activities learn at an accelerated rate because they
adjust their learning and behavior after determining problem areas,
thereby leading to increased transfer (Ford, Smith, Weissbein,
Gully, & Salas, 1998). Feedback also engenders affective re-
sponses (Kluger & DeNisi, 1996), and the provision of feedback
may increase trainees’ perceptions of utility, thereby increasing
positive reactions toward the training program (Giangreco, Sebas-
tiano, & Peccei, 2009). As an example of how feedback can
enhance leadership training effectiveness, May and Kahnweiler
(2000) incorporated feedback in their interpersonal skills training
program for supervisors by asking participants to discuss their
performance with a coach using a behavioral checklist after par-
ticipating in a role-play exercise. Interestingly, although feedback
is a popular training tool in leadership training programs, it has yet
to be meta-analytically investigated within this area. The current
study offers this investigation, hypothesizing:
Hypothesis 9: Leadership training programs reporting the use
of feedback display a greater effect on trainee reactions (H9a),
learning (H9b), transfer (H9c), and results (H9d), in compar-
ison with programs that do not report the use of feedback.
Source of feedback. One approach to delivering feedback that
is often utilized within leadership training programs is 360-degree
feedback (Goldsmith, Lyons, & Freas, 2000). Compared with
single-source feedback, 360-degree feedback involves the collec-
tion of information about the focal individual from several sources
(e.g., supervisors, subordinates, customers; Wexley & Latham,
2002). Typically, all sources complete the same development
instrument in reference to the leader, and then the leader is pro-
vided with a report summarizing the ratings. This process allows
the leader to formulate comparisons among various rating sources,
and provides the leader with a more holistic depiction of his or her
areas for improvement because the results are not based on a
single-source (Wexley & Latham, 2002) and because the results
may be perceived as more reliable (Atkins & Wood, 2002;
Greguras & Robie, 1998).
Although some researchers have raised questions about the
validity of 360-degree feedback, believing that simply collecting
feedback from multiple sources may not provide any additional
information (Borman, 1998; DeNisi & Kluger, 2000), organiza-
tions utilizing 360-degree feedback tend to report increases in
productivity and favorable reactions (Hazucha, Hezlett, &
Schneider, 1993; Wexley & Latham, 2002). Because this specific type
of feedback is quite popular (around 90% of large organizations
report the use of 360-degree feedback; ETS, 2012), it is important
to investigate its effectiveness via meta-analysis (existing meta-
analyses have yet to compare 360-degree feedback to single-
source feedback). Thus, we hypothesize:
Hypothesis 10: Leadership training programs reporting the use
of 360-degree feedback, compared with single-source feedback,
display a greater effect on trainee reactions (H10a), learning
(H10b), transfer (H10c), and results (H10d).
Training location. According to Baldwin and Ford’s (1988)
theory of identical elements, training transfer is maximized when
training stimuli align with the actual work environment. This
alignment can be assessed through fidelity, or
the extent to which a training program accurately depicts reality by
mirroring the real-world system (Alessi, 2000; Meyer, Wong,
Timson, Perfect, & White, 2012). According to Rehmann, Mit-
man, and Reynolds (1995), fidelity is composed of three parts:
equipment (i.e., alignment between tools, technology, and other
system features utilized in training and those used on-the-job),
environment (i.e., replication of the actual task environment in
regard to sensory information, motion cues, and other features
within the training program), and psychological fidelity (i.e., the
degree to which task cues and consequences mirror those experi-
enced on-the-job), all of which are important for simulating a
trainee’s job environment during a training program.
Within an on-site leadership training program, trainees are typ-
ically immersed within an environment that is similar, if not
identical, to their work environment; as such, on-site training
programs display high equipment, environment, and psychological
fidelity. For example, during on-the-job leadership training (an
on-site leadership training method), leaders participate in training
and practice trained concepts during normal working conditions
and while working on required work tasks (House & Tosi, 1963;
Milicević, Bjegović-Mikanović, Terzić-Supić, & Vasić,
2011). In contrast, off-site leadership training programs are housed
within a facility other than the trainee’s organization; as such, the
level of equipment and environment fidelity is reduced, leading to
a potential reduction in training outcomes. In addition, because
on-site programs are conveniently located, we believe that stake-
holders are more likely to be involved in on-site programs, which
is posited to enhance motivation in both trainers and trainees
(Salas et al., 2015). Further, on-site leadership training methods
likely result in a greater ROI as compared with off-site training
programs (Arthur et al., 2003; Avolio, Avey, & Quisenberry,
2010) because resource requirements associated with off-site train-
ing programs may constrain implementation of the training (i.e.,
the expense of off-site programs can be substantial, forcing some
to choose shorter and/or less comprehensive off-site programs).
Although some researchers and practitioners might note that on-
site work can distract leaders attending the program (e.g., trainees
might be more tempted to leave training for work emergencies)
and trainees may not have the opportunity to learn from individ-
uals outside of their organization during on-site training, the ben-
efits previously mentioned likely outweigh these concerns.
Though it has been assumed that on-site training methods have
the “ability to minimize costs, facilitate and enhance training
transfer” (Arthur et al., 2003, p. 243), no existing meta-analytic
evidence has tested this assumption. As such, some have called for
research on the effectiveness of on-site training methods in com-
parison with off-site training programs (e.g., Wexley & Latham,
1991). The current study responds to these calls, and we hypothesize:
Hypothesis 11: Programs hosted on-site display a greater
effect on trainee reactions (H11a), learning (H11b), transfer
(H11c), and results (H11d) compared with off-site programs.
Training setting. The setting of the leadership training pro-
gram and whether it is face-to-face or virtually based might also
play a key role in contributing to training effectiveness. We define
virtually based training programs as interventions delivered via a
computer or similar device with no facilitators physically present.
To compare virtual and face-to-face training, we borrow theory
from education on the advantages and disadvantages of e-learning
(Garrison, 2011), which is quickly becoming a popular method of
replacing traditional face-to-face instruction. E-learning is pur-
ported to offer several advantages over face-to-face instruction
such as the capacity to provide self-paced and learner-centered
experiences and archival access to learning materials (Zhang,
Zhao, Zhou, & Nunamaker, 2004). In spite of these advantages,
e-learning is more likely to involve a lack of immediate feedback
unless it is embedded within the training program, discomfort
experienced by users with limited technology experience, and the
potential for increased frustration, anxiety, and confusion as a
result of poorly equipped e-learning systems (e.g., technology
issues; Zhang et al., 2004). We suggest these same problems may
also plague virtually based leadership training programs, making
such programs less effective than face-to-face programs. We fur-
ther posit that a lack of fidelity may also contribute to virtually
based programs being less effective than face-to-face interven-
tions. For example, the high fidelity of face-to-face programs can
incorporate dyadic interactions such as role-play to demonstrate
and practice interpersonal leadership skills, whereas virtual envi-
ronments make interpersonal interaction difficult.
Magerko, Wray, Holt, and Stensrud (2005) also note that face-
to-face training can allow trainers to oversee each trainee’s indi-
vidual progress and adjust the learning experience as necessary. In
contrast, virtual training programs are not typically designed in this
manner (virtual training is more likely to involve asynchronous
communication between the trainer and trainee), although recent
initiatives are seeking to address this problem (Magerko et al.,
2005). Because of this, we argue that trainees in a face-to-face
environment may be more likely to receive an optimal level of
training difficulty, as they have the benefit of interacting with a
trainer who can modify the content or provide specific guidance
that a virtual environment may lack. Although a meta-analysis has
yet to formally compare virtual and face-to-face training environ-
ments, in line with the above theory and evidence we hypothesize:
Hypothesis 12: Face-to-face leadership training programs in-
crease positive trainee reactions (H12a), learning (H12b),
transfer (H12c), and results (H12d) to a greater degree than
virtual programs.
Exploratory Moderators
There are several additional training design, delivery, and im-
plementation features that might impact training effectiveness,
despite a lack of theoretical support for how the moderator will
affect program effectiveness. To provide scientists and practitio-
ners with a robust meta-analytic investigation of leadership train-
ing programs, we incorporate the following moderators within our
exploratory analyses: (a) training content, (b) program evaluator’s
affiliation, (c) training duration, and (d) publication date.
Training content. There are multiple ways to succeed as a
leader, yet it is unclear which knowledge, skills, and abilities
should be included in the content of a training program to maxi-
mize leadership training effectiveness. In a meta-analysis of 70
leadership training programs conducted by Burke and Day (1986),
training that involved human relations content produced the largest
effect on learning, and self-awareness training content had the
greatest effect on transfer criteria; however, these analyses were
based on a small number of primary studies (k = 8 and 7,
respectively), and only begin to scratch the surface as to what
content should be incorporated within a leadership training pro-
gram to maximize effects. The current study aims to examine the
effect of training content in a greater pool of primary studies using
Hogan and Warrenfeltz’s (2003) domain model of managerial
education. This theoretical model organizes leadership competen-
cies into four competency domains: intrapersonal skills, interper-
sonal skills, leadership skills, and business skills. Ultimately, this
exploratory analysis allows us to examine whether each type of
training content is effective (e.g., Is interpersonal skills training
effective?) and whether certain training content is more successful
in producing improvements in reactions, learning, transfer, and
results than other training content, giving decision makers knowl-
edge of which type of training may be most valuable if a needs
analysis does not specify a content focus.
Training evaluator affiliation. Additionally, we sought to
identify whether evaluator affiliation (i.e., academic primary study
authors only, practitioner authors only, or a mix of academics and
practitioners) played a role in the effectiveness of the program.
Although the author of a primary study is not always involved in
the design, delivery, and evaluation of the training program, in
many cases, the authors are involved, and thus, we examine
whether a combination of academics and practitioners is more
likely to produce highly effective training programs because of
their diverse backgrounds that can rely on both science and prac-
tice. This has yet to be investigated in the training literature and
poses an interesting question, as well as a test of the revered
scientist-practitioner model (Jones & Mehr, 2007).
Training duration. Researchers have suggested that perma-
nent cognitive and schematic changes are needed in order for
leaders to fully understand how to transform and develop their
followers over time (Lord & Brown, 2004; Wofford, Goodwin,
& Whittington, 1998), and this process takes time. Thus, longer
training programs may be more effective. In contrast, longer
leadership training programs may result in cognitive overload
(Paas et al., 2004) and thus less effective training. Taylor et
al.’s (2009b) meta-analysis of leadership training also found no
support for length of training as a continuous moderator of
training effectiveness (i.e., transfer). As such, the current effort
will shed light on this issue by meta-analytically testing the
impact of training duration on leadership training effectiveness.
Publication date. The design, delivery, and implementation
of leadership training programs has most likely changed over the
years due to technological advancements and the dynamic nature
of organizations today. We investigate whether this has led to a
change in the effectiveness of training. Although Powell and
Yalcin (2010) concluded that leadership training programs evalu-
ated in the 1960s and 1970s produced greater effect sizes than
other decades, this finding has yet to be corroborated while exam-
ining publication date as a continuous moderator using an updated
meta-analytic database.
Literature Search and Inclusion Criteria
Relevant empirical studies were extracted in several ways. First,
literature searches in the databases PsycINFO (1887–December
2014), Business Source Premier (1886–December 2014), and
ProQuest Dissertations and Theses (1861–December 2014) were
conducted. Although the search goes as far back as the late 1800s,
the earliest usable study for the current meta-analysis was pub-
lished in 1951. Published and unpublished studies were included in
order to reduce the potential for publication bias. The following
keywords were included in this initial search: leadership, leader,
manage* (utilizing the asterisk within searches allows for extraction
of all keywords beginning with the root, e.g., management and
manager), executive, supervisory, training, and development.
Google Scholar was also searched using the same keywords, with
the addition of method (e.g., leadership training AND method).
Lastly, reference lists from previous meta-analyses on pertinent
topics (Arthur et al., 2003; Avolio, Reichard, et al., 2009; Burke &
Day, 1986; Collins & Holton, 2001; Keith & Frese, 2008; Powell
& Yalcin, 2010; Taylor et al., 2009b) were cross-checked. This
initial search returned 20,742 articles.
Each study was reviewed and included if the following con-
ditions were met: (a) it included an empirical evaluation of a
leadership, leader, managerial, supervisory, or executive train-
ing (development or coaching) program; (b) the study used a
repeated measures design, independent groups design, or an
independent groups design with repeated measures; (c) it in-
volved an adult sample (i.e., over 18 years old); (d) it was
written in English; (e) it provided the sample size and enough
information to calculate an effect size; and (f) the participants
were employees (e.g., not MBAs/undergraduate students). Ap-
plying the above criteria resulted in a final sample of 335
independent studies (N = 26,573).
Coding Procedures
Three of the authors coded the studies, and coding discrepancies
were resolved through discussion after each article was independently coded.
Overall rater agreement was 96.2%. Studies were
coded for sample size, evaluation measure utilized, reliability (i.e.,
Cronbach’s alpha), experimental design (i.e., repeated measures,
same subjects are assessed before and after training; independent
groups, an experimental and control group are measured; and
independent groups with repeated measures, an experimental and
control group are tested before and after training), and effect size.
A Cohen’s d effect size was coded for each study, or the statistical
information needed to compute one (e.g., means
and standard deviations, t value). If a primary study reported
multiple, nonindependent effect sizes (e.g., two separate measures
of transfer were used), they were combined using the intercorre-
lations among the measures to create a linear composite (see the
formula provided in Nunnally, 1978). If the information necessary
for calculating a linear composite was not reported (i.e., the inter-
correlations were not available), the effect sizes were averaged.
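To make the composite step concrete, the following Python sketch (illustrative only; the function name is ours) applies the standard formula for a linear composite of correlated measures, whose variance for m standardized measures with mean intercorrelation r̄ is m + m(m − 1)r̄:

```python
import math

def composite_d(ds, mean_r):
    """Combine nonindependent effect sizes (e.g., two transfer measures
    from the same study) into a single linear composite.

    ds: effect sizes from correlated measures of one criterion
    mean_r: average intercorrelation among the measures
    """
    m = len(ds)
    # Variance of the sum of m standardized measures with mean
    # intercorrelation mean_r is m + m * (m - 1) * mean_r.
    return sum(ds) / math.sqrt(m + m * (m - 1) * mean_r)

# Perfectly correlated measures: the composite equals the simple average.
print(composite_d([0.5, 0.5], 1.0))  # -> 0.5
# Uncorrelated measures: the composite exceeds the simple average.
print(round(composite_d([0.5, 0.5], 0.0), 3))  # -> 0.707
```

In this formulation, the averaging fallback used when intercorrelations were unavailable is equivalent to assuming the measures are perfectly correlated.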
The following definitions and descriptions were implemented
when coding for additional moderators.
Evaluation criteria. To categorize effect sizes by training
evaluation criteria, we followed the framework developed by Kirk-
patrick (1959). Reactions represent trainees’ perception of the
training program as related to its utility and general likability
(Alliger et al., 1997; Kirkpatrick, 1959, 1996). Learning measures
represent quantifiable evidence that trainees learned the knowl-
edge, skills, and/or abilities represented during training (Alliger et
al., 1997; Kirkpatrick, 1959, 1996). Transfer represents measures
evaluating on-the-job behavior, as assessed by trainees, supervi-
sors, peers, or subordinates (Alliger et al., 1997; Kirkpatrick, 1959,
1996). Both learning and transfer were further categorized as
cognitive learning/transfer (i.e., verbal knowledge, knowledge or-
ganization, and cognitive strategies), affective learning/transfer
(i.e., attitudinal and motivational elements), or skill-based learn-
ing/transfer (i.e., compilation elements and automaticity) based
upon the framework identified by Kraiger et al. (1993). In addition
to these three subcategories, transfer also included a job perfor-
mance category reflecting studies evaluating training with regard
to on-the-job performance. Results included changes in organiza-
tional objectives because of the training program (Alliger et al.,
1997; Kirkpatrick, 1959, 1996). Effect sizes were further catego-
rized as organizational results (e.g., turnover, absenteeism, ROI,
profit) or subordinate results (e.g., subordinate job satisfaction,
performance ratings of the leader’s subordinates).
Delivery method. Training delivery method was classified as
information-based (e.g., lectures, presentations, advanced organizers,
text-based training materials), demonstration-based (e.g., case studies,
in-person modeling, computer-generated avatars), practice-based
(e.g., role-play, simulations, in-basket exercises; Salas & Cannon-
Bowers, 2000; Weaver et al., 2010), or a combination of the above
listed methods if more than one delivery method was implemented
(i.e., information and demonstration, information and practice,
demonstration and practice, or information, demonstration, and practice).
Training content. In order to determine the content trained
within each leadership training program, we utilized the compe-
tency domain model developed by Hogan and Warrenfeltz (2003).
Each study was coded as either: intrapersonal (i.e., internal, indi-
vidual behaviors such as coping with stress, goal-setting, time
management), interpersonal (i.e., behaviors associated with build-
ing relationships such as active listening and communication),
leadership (i.e., behaviors concerned with team building, produc-
ing results, and influencing others), or business (i.e., behaviors
related to content or technical skills such as analytical skills,
financial savvy, decision-making, and strategic thinking).
Trainees’ level of leadership. We coded each primary study
according to the trainees’ level of leadership. Specifically, each
study was coded as either: low-level (i.e., those who interact
directly with first-level employees), middle-level (i.e., managers in
charge of multiple groups of subordinates and serve as a liaison
between subordinate groups and other levels of the organization),
or high-level (i.e., those who do not have any managers above them
except for executives). This categorization scheme was based on
that utilized by previous researchers (DeChurch, Hiller, Murase,
Doty, & Salas, 2010).
Additional codes. Each primary study was also coded with
regard to the presence of feedback, source of feedback (i.e.,
single-source feedback or 360-degree feedback), publication status
(i.e., published or unpublished), training length (in hours), pres-
ence of spacing effect (i.e., training sessions were temporally
spaced daily/weekly/yearly/monthly, or massed), presence of a
formal needs analysis, training setting (i.e., face-to-face or virtual),
location (i.e., on-site or off-site; self-administered programs were
not included in this moderator analysis), trainer’s background (i.e.,
internal,external, or self-administered), attendance policy (volun-
tary or involuntary), label of training program (i.e., leadership
training, leadership development, management training, manage-
ment development, supervisor training, supervisor development,
executive training, or executive development), evaluator affiliation
(i.e., academic, practitioner, or a mix of academics and practitioners),
and publication year.
The current meta-analysis incorporated studies that had three
experimental design types (i.e., repeated measures, independent
groups, independent groups with repeated measures). Because
estimated population parameters are dependent upon the design
type of each study (Ray & Shadish, 1996), appropriate statistical
adjustments were made when aggregating across effect sizes
(Glass, McGaw, & Smith, 1981; Morris & DeShon, 1997; Morris
& DeShon, 2002) using procedures outlined in Morris and DeShon
(2002). This is warranted, in part, because the overall corrected
meta-analytic effect sizes were not practically or significantly
different across study designs (t = 0.33, p > .05; see Table 1).
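The design-type adjustment referenced above can be illustrated with a minimal Python sketch of the change-score to raw-score conversion described by Morris and DeShon (2002). This shows only one of their conversions, not the full procedure; the function name is ours:

```python
import math

def rm_to_raw_metric(d_rm, rho):
    """Rescale a repeated-measures (change-score) d to the raw-score
    metric of independent-groups designs (Morris & DeShon, 2002).

    Because SD_change = SD_raw * sqrt(2 * (1 - rho)), where rho is the
    pre-post correlation, a d computed on change scores is multiplied
    by sqrt(2 * (1 - rho)) to be comparable with independent-groups
    effect sizes.
    """
    return d_rm * math.sqrt(2 * (1 - rho))

# With the mean pre-post correlation of .52 used in this meta-analysis,
# the rescaling factor happens to be close to 1.
print(round(rm_to_raw_metric(1.0, 0.52), 2))  # -> 0.98
```

Note that when the pre-post correlation exceeds .50, as it does here, the conversion shrinks the change-score d slightly rather than inflating it.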
These procedures required the calculation of ρ, which is the
correlation between pre- and posttraining scores. The majority of
studies did not report this correlation; therefore, the inverse sampling
error variance-weighted average ρ across repeated measures
studies was used, per recommendations by Morris and
DeShon (2002). For the current meta-analytic investigation,
ρ̄ = .52. After conversions, a repeated-measures d effect size
was estimated by conducting a random effects meta-analysis that
accounted for sampling error variance (Hedges & Olkin, 1985) and
corrected for criterion-related unreliability (Hunter & Schmidt,
2004).
T tests of the mean effect sizes were used to compare moderator
conditions, following Hunter and Schmidt (2004). (Detailed coding
information for each primary study is located in Table B of the
online supplementary material.) The subgroup
meta-analytic technique was also used to test for moderators
(Arthur, Bennett, & Huffcutt, 2001), and a separate artifact distri-
bution was applied for each criterion (i.e., reactions, learning,
results, transfer, and overall). In line with recommendations from
previous researchers, the meta-analytic effect sizes were corrected
for unreliability using artifact distributions (Hunter & Schmidt,
2004); the artifact distributions were created from averaging the
internal consistency estimates reported in the primary studies. The
mean reliability for the overall analysis was .97, and the mean
reliabilities for reaction, learning, transfer, and results were as
follows: .92, .95, .96, and .93, respectively.
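The artifact-distribution correction amounts to dividing the observed effect size by the square root of the mean criterion reliability. A minimal sketch (illustrative only; the full Hunter-Schmidt procedure also handles sampling error variance):

```python
import math

def correct_for_unreliability(d_obs: float, mean_ryy: float) -> float:
    """Disattenuate an observed effect size using the mean criterion
    reliability taken from an artifact distribution."""
    return d_obs / math.sqrt(mean_ryy)

# A criterion reliability of .81 attenuates effects by a factor of .9,
# so the correction restores the original magnitude (~1.0 here).
print(correct_for_unreliability(0.9, 0.81))
```

With the high mean reliabilities reported here (.92 to .97), the corrected δ values are only modestly larger than the observed d values, as the tables below show.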
Results are presented in Tables 1–7. Table 1 presents the overall meta-analytic d combined across evaluation criteria, and Tables 2, 3, 4, and 5 present specific outcome effect sizes (reactions, learning, transfer, and results). Effect sizes are presented as both the corrected average value δ (corrected for unreliability in the criterion and sampling error variance; Hunter & Schmidt, 2004) and the observed d value. To test for the statistical significance of the effect size, 95% confidence intervals were reported. Continuous moderator analyses (i.e., training duration and publication year) are reported in Tables 6 and 7.
Multiple techniques were implemented to test for publication
bias. First, we visually inspected a funnel plot for asymmetry
constructed on the overall effect (Sterne, Becker, & Egger, 2005).
Second, we conducted a trim and fill analysis based on procedures
identified by Duval and Tweedie (2000); results from a fixed
effects model on the overall effect suggest that zero studies were
imputed to the left of the mean, suggesting that publication bias is
not present. Lastly, a moderator analysis was conducted by com-
paring published and unpublished studies (see Table 1), and results
indicated that there were no significant differences between meta-analytic effect sizes, t = 1.08, p > .05. Taken together, these results suggest that there is no evidence of publication bias.
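The trim-and-fill logic can be illustrated with the rank-based L0 estimator. This sketch is a simplified, unweighted, one-pass version (an assumption for illustration; Duval and Tweedie's full procedure iterates and incorporates study weights):

```python
def l0_estimate(effects):
    """One-pass L0 estimate of the number of suppressed studies.

    Ranks absolute deviations from the unweighted mean, sums the
    ranks belonging to positive deviations (a signed-rank statistic
    T_n), then applies L0 = (4 * T_n - n * (n + 1)) / (2 * n - 1).
    """
    n = len(effects)
    center = sum(effects) / n
    dev = [e - center for e in effects]
    abs_dev = [abs(d) for d in dev]
    # Assign average ranks (1-based), handling ties.
    order = sorted(range(n), key=lambda i: abs_dev[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs_dev[order[j + 1]] == abs_dev[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    t_n = sum(r for r, d in zip(ranks, dev) if d > 0)
    return max(0, round((4 * t_n - n * (n + 1)) / (2 * n - 1)))

# A symmetric funnel implies nothing to impute:
# l0_estimate([-2, -1, 0, 1, 2]) -> 0
```

An estimate of zero, as obtained for the overall effect here, means no studies need to be imputed to restore funnel-plot symmetry.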
In support of Hypothesis 1, results suggest leadership training programs are effective. Specifically, the overall effect size was significant and positive (δ = .76, 95% CI [.64, .89]). Comparing across criteria, the strongest effect was found for transfer (δ = .82, 95% CI [.58, 1.06]), followed by learning (δ = .73, 95% CI [.62, .85]), results (δ = .72, 95% CI [.60, .84]), and reactions (δ = .63, 95% CI [.12, 1.15]). However, these values were not significantly
Table 1
Meta-Analytic Results: Overall

Meta-analysis                 k       N      d     δ    SDδ   %Var   95% CI         80% CR
Overall                     335  26,573    .73   .76   1.17    .20   [.64, .89]     [−.74, 2.26]
Published                   214  17,219    .78   .81   1.40    .10   [.63, 1.00]    [−.98, 2.60]
Unpublished                 121   9,354    .66   .69    .66    .96   [.57, .80]     [−.16, 1.53]
Study design
  Repeated measures         208  18,182    .75   .78   1.19    .17   [.62, .94]     [−.74, 2.30]
  Independent groups         62   3,901    .74   .76   1.13    .26   [.48, 1.05]    [−.69, 2.22]
  Independent groups and
    repeated measures        58   4,490    .46   .48    .90    .99   [.25, .71]     [−.67, 1.63]

Note. k = number of independent studies; N = sample size; d = repeated measures Cohen's d; δ = Cohen's d corrected for unreliability in the criterion; SDδ = corrected standard deviation; %Var = percent of variance accounted for by sampling error; CI = confidence interval; CR = credibility interval.
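The 80% credibility intervals in these tables follow the usual Hunter-Schmidt form, δ ± z·SDδ. A small sketch (illustrative; values rounded as in Table 1):

```python
from statistics import NormalDist

def credibility_interval(delta: float, sd_delta: float, level: float = 0.80):
    """Hunter-Schmidt credibility interval: delta +/- z * SD_delta,
    where z cuts off the central `level` mass of the standard normal."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return delta - z * sd_delta, delta + z * sd_delta

# Overall row of Table 1: delta = .76, SD_delta = 1.17
lo, hi = credibility_interval(0.76, 1.17)
# Reproduces the reported 80% CR within rounding (about -.74 to 2.26).
```

Because the credibility interval describes the spread of true effects rather than sampling uncertainty, a wide interval like this one signals substantial moderation, motivating the moderator analyses that follow.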
Table 2
Meta-Analytic Results: Reactions Criteria

Meta-analysis                                  k     N      d      δ    SDδ    %Var   95% CI         80% CR
Reactions                                      7   620    .58    .63    .73    1.05   [.12, 1.15]    [−.30, 1.57]
Training method
  Information and practice                     2   115   1.31   1.47   1.57     .40   [−.58, 3.52]   [−.54, 3.48]
  Information, demonstration, and practice     2   257    .11    .12    .28   10.03   [−.27, .50]    [−.24, .48]
Feedback                                       3   145   1.02   1.13   1.28     .00   [−.24, 2.50]   [−.51, 2.77]
No feedback                                    4   475    .52    .57    .57    1.56   [.04, 1.09]    [−.17, 1.30]
Evaluator's affiliation
  Academic                                     3   363    .16    .17    .32    8.21   [−.18, .53]    [−.23, .58]
  Practitioner                                 2   118    .59    .65    .00  100.00   [.64, .66]     [.65, .65]
  Mix                                          2   139   1.42   1.61    .73    3.08   [.64, 2.59]    [.68, 2.54]

Note. k = number of independent studies; N = sample size; d = repeated measures Cohen's d; δ = Cohen's d corrected for unreliability in the criterion; SDδ = corrected standard deviation; %Var = percent of variance accounted for by sampling error; CI = confidence interval; CR = credibility interval.
different from one another, indicating that leadership training programs tend to be equally effective across criteria. Positive effect sizes were also found across more specific evaluation criteria (i.e., cognitive learning/transfer, affective learning/transfer, skill-based learning/transfer, job performance, organizational results, and subordinate results; see Tables 2–5). Upon comparing these more specific criteria, we found that training programs resulted in significantly greater cognitive learning gains (δ = 1.05) than affective learning gains (δ = .54, t = 3.91, p < .05) and skill-based learning outcomes (δ = .51, t = 3.84, p < .05; other comparisons among specific learning outcomes were not significant). For transfer of training, training programs resulted in significantly greater skill-based transfer outcomes (δ = .89) than affective transfer gains (δ = .24, t = 4.49, p < .05) and job performance gains (δ = .56, t = 2.02, p < .05). Further, affective transfer (δ = .24) was also significantly lower than cognitive transfer (δ = .68, t = −2.70, p < .05) and job performance gains (δ = .56, t = −3.26, p < .05). Training programs also displayed significantly greater improvements in organizational outcomes (δ = .75) than subordinate outcomes (δ = .22, t = 6.08, p < .05).
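The pairwise criterion and moderator contrasts above compare two independent meta-analytic means. One common form of that test statistic, shown as a sketch (the exact degrees of freedom and standard-error computations used in the paper are not reproduced here):

```python
import math

def subgroup_t(d1: float, se1: float, d2: float, se2: float) -> float:
    """t-type statistic for the difference between two independent
    meta-analytic mean effect sizes, using their standard errors."""
    return (d1 - d2) / math.sqrt(se1 ** 2 + se2 ** 2)

# Example with hypothetical subgroup means and standard errors:
# subgroup_t(1.0, 0.3, 0.5, 0.4) is approximately 1.0.
```

Larger subgroup standard errors (small k, large SDδ) shrink the statistic, which is why some sizable descriptive differences below do not reach significance.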
Table 3
Meta-Analytic Results: Learning Criteria

Meta-analysis                                  k       N      d      δ    SDδ    %Var   95% CI         80% CR
Learning                                     153   9,716    .69    .73    .74     .83   [.62, .85]     [−.22, 1.68]
Affective learning                            55   4,630    .51    .54    .39    4.16   [.44, .65]     [.04, 1.04]
Cognitive learning                            48   3,206    .99   1.05    .93     .00   [.79, 1.30]    [−.14, 2.24]
Skill-based learning                         103   5,437    .49    .51    .72    2.25   [.37, .65]     [−.41, 1.43]
Training method
  Information                                 21   1,568    .57    .60    .66    1.37   [.31, .89]     [−.24, 1.44]
  Practice                                    10     302    .27    .28    .13   47.21   [.13, .44]     [.11, .45]
  Information and demonstration                5     177   1.07   1.14    .78     .10   [.44, 1.85]    [.14, 2.15]
  Information and practice                    46   2,164    .57    .60    .68    2.18   [.40, .79]     [−.28, 1.47]
  Information, demonstration, and practice    29   1,913   1.16   1.24    .74     .35   [.97, 1.51]    [.29, 2.19]
Feedback                                      44   2,437    .75    .79    .69     .78   [.59, .99]     [−.09, 1.67]
No feedback                                  108   7,279    .68    .71    .75     .83   [.57, .85]     [−.25, 1.68]
Single-source feedback                         6     363    .63    .66    .28    6.53   [.42, .91]     [.31, 1.02]
360-feedback                                   8     430    .54    .57    .49    3.48   [.20, .93]     [−.06, 1.20]
Needs analysis
  Needs analysis                              30   1,218   1.05   1.12    .98     .03   [.76, 1.47]    [−.14, 2.38]
  No needs analysis                          123   8,498    .64    .68    .68    1.15   [.56, .80]     [−.19, 1.54]
Spacing effect
  Spaced                                     116   7,526    .70    .74    .76     .73   [.61, .88]     [−.23, 1.72]
  Massed                                      17   1,344    .86    .92    .74    1.16   [.57, 1.26]    [−.03, 1.86]
Virtual                                       10     620    .52    .55    .31    7.45   [.34, .76]     [.16, .94]
Face-to-face                                 116   6,916    .74    .78    .82     .55   [.63, .93]     [−.27, 1.83]
On-site                                       27   2,001    .95   1.01   1.09     .01   [.62, 1.41]    [−.38, 2.40]
Off-site                                      31   2,493    .80    .84    .39    1.07   [.70, .98]     [.35, 1.34]
Voluntary                                     47   3,038    .65    .69    .64    1.29   [.50, .87]     [−.13, 1.50]
Involuntary                                   25   2,323    .88    .94    .76     .10   [.64, 1.23]    [−.04, 1.91]
Leader level
  High-level                                  16     627    .49    .52    .72    2.91   [.17, .87]     [−.40, 1.45]
  Middle-level                                32   1,784    .63    .67    .82    1.02   [.39, .95]     [−.38, 1.71]
  Low-level                                   25   1,692    .93    .99    .83     .04   [.66, 1.31]    [−.08, 2.05]
Evaluator's affiliation
  Academic                                   106   6,692    .65    .68    .65    1.33   [.56, .81]     [−.15, 1.51]
  Practitioner                                19     785    .25    .26    .62    5.84   [−.02, .53]    [−.53, 1.05]
  Mix                                         24   1,774   1.14   1.21    .90     .15   [.86, 1.56]    [.06, 2.36]
Intrapersonal                                  4     265    .69    .73    .18    8.21   [.50, .95]     [.50, .95]
Interpersonal                                  9     263    .36    .38    .29   17.92   [.13, .64]     [.02, .75]
Leadership                                    21   1,232    .62    .66    .57    2.03   [.42, .90]     [−.06, 1.38]
Business                                      16   1,370    .91    .97    .34     .27   [.79, 1.14]    [.53, 1.41]
External                                      51   2,744    .70    .74    .55    1.57   [.58, .89]     [.03, 1.45]
Internal                                      23   2,396    .92    .98    .98     .02   [.60, 1.36]    [−.27, 2.23]
Self-administered                              9     604    .50    .53    .30    7.91   [.31, .74]     [.15, .91]

Note. k = number of independent studies; N = sample size; d = repeated measures Cohen's d; δ = Cohen's d corrected for unreliability in the criterion; SDδ = corrected standard deviation; %Var = percent of variance accounted for by sampling error; CI = confidence interval; CR = credibility interval.
Concerning Hypothesis 2, programs that included a needs analysis displayed significantly stronger effects for both learning and transfer than those reporting no needs analyses (t = 2.57, p < .05; t = 16.14, p < .05, respectively). However, this pattern was not found for results, as there was no significant difference between programs that did and did not conduct a needs analysis, t = −1.59, p > .05, and this could not be tested for reactions due to the small number of available primary studies. Thus, Hypothesis 2 was supported for two of the three outcomes investigated.
Because there were so few studies involving reactions as a criterion, many moderators could not be tested on this criterion. In the analyses that follow, we only discuss moderators of reactions when there were enough primary studies available to conduct the moderator analysis.
Hypothesis 3c was supported such that voluntary training programs displayed a significantly stronger effect for transfer than involuntary programs, t = 6.26, p < .05. However, contrary to
Table 4
Meta-Analytic Results: Transfer Criteria

Meta-analysis                                  k       N      d      δ    SDδ    %Var   95% CI         80% CR
Transfer                                     190  12,124    .78    .82   1.75     .09   [.58, 1.06]    [−1.42, 3.05]
Affective transfer                            34   1,463    .23    .24    .34   14.88   [.11, .37]     [−.19, .67]
Cognitive transfer                            21     844    .65    .68    .75     .75   [.36, 1.00]    [−.27, 1.64]
Skill-based transfer                         155  10,233    .85    .89   1.87     .04   [.61, 1.18]    [−1.50, 3.28]
Job performance                               35   1,686    .53    .56    .48    4.09   [.39, .73]     [−.06, 1.17]
Training method
  Information                                 23   1,648    .43    .45   1.06     .88   [.03, .87]     [−.91, 1.81]
  Practice                                    28   1,613    .37    .39    .41    7.11   [.23, .55]     [−.13, .91]
  Information and practice                    43   2,548    .41    .43    .39    6.61   [.31, .56]     [−.07, .94]
  Demonstration and practice                   4     270    .68    .71    .00   11.58   [.52, .90]     [.71, .71]
  Information, demonstration, and practice    45   2,600   2.02   2.20   3.02    1.97   [1.35, 3.05]   [−1.66, 6.07]
Feedback                                      68   4,250   1.32   1.40   2.64     .13   [.79, 2.00]    [−1.98, 4.78]
No feedback                                  122   7,648    .48    .50    .51    1.90   [.37, .63]     [−.43, 1.41]
Single-source feedback                         7     265    .50    .52    .79    2.51   [−.06, 1.10]   [−.50, 1.54]
360-feedback                                  16   1,015    .24    .25    .27   15.84   [.11, .40]     [−.09, .60]
Needs analysis
  Needs analysis                              34   1,628   3.02   3.51   3.60   11.63   [2.34, 4.68]   [−1.09, 8.12]
  No needs analysis                          156  10,496    .40    .42    .45    5.09   [.34, .49]     [−.16, .99]
Spacing effect
  Spaced                                     133   9,113    .87    .92   1.89     .03   [.61, 1.23]    [−1.50, 3.33]
  Massed                                      24   1,149    .43    .45    .65    3.25   [.18, .71]     [−.39, 1.28]
Virtual                                       11     562    .21    .22    .12   27.79   [.06, .37]     [.06, .37]
Face-to-face                                 139   8,378   1.04   1.10   2.11     .00   [.76, 1.43]    [−1.59, 3.80]
On-site                                       38   2,490    .35    .37    .30   10.98   [.26, .47]     [−.02, .76]
Off-site                                      32   3,476    .51    .54    .57    1.61   [.34, .73]     [−.19, 1.26]
Voluntary                                     43   2,688   1.99   2.17   3.14    1.57   [1.27, 3.08]   [−1.85, 6.20]
Involuntary                                   46   3,653    .36    .38    .66    2.22   [.18, .57]     [−.48, 1.21]
Leader level
  High-level                                  19     995    .36    .37    .32   10.92   [.21, .54]     [−.04, .78]
  Middle-level                                44   2,413    .52    .54   1.20     .72   [.20, .89]     [−1.00, 2.08]
  Low-level                                   58   3,140   1.84   1.99   3.05    1.23   [1.23, 2.74]   [−1.91, 5.89]
Evaluator's affiliation
  Academic                                   150   8,767    .45    .47    .59    3.21   [.38, .56]     [−.28, 1.22]
  Practitioner                                11     697   1.28   1.35   2.07     .16   [.17, 2.54]    [−1.29, 4.00]
  Mix                                         29   2,660   1.88   2.04   3.26     .72   [.90, 3.18]    [−2.14, 6.21]
Intrapersonal                                  6     231    .21    .21    .00  100.00   [.09, .33]     [.21, .21]
Interpersonal                                 10     279    .26    .28    .28   23.14   [.04, .51]     [−.09, .64]
Leadership                                    35   1,289    .53    .55    .65    3.22   [.33, .77]     [−.28, 1.38]
Business                                      19   2,876   1.56   1.67   2.94     .17   [.39, 2.94]    [−2.10, 5.44]
External                                      70   4,678    .49    .52    .58    2.53   [.38, .65]     [−.23, 1.26]
Internal                                      18   1,172    .43    .45    .54    3.63   [.20, .70]     [−.24, 1.13]
Self-administered                             10     534    .21    .22    .14   25.95   [.05, .38]     [.03, .40]

Note. k = number of independent studies; N = sample size; d = repeated measures Cohen's d; δ = Cohen's d corrected for unreliability in the criterion; SDδ = corrected standard deviation; %Var = percent of variance accounted for by sampling error; CI = confidence interval; CR = credibility interval.
Hypothesis 3d, involuntary training programs exhibited a significantly stronger effect for results than voluntary programs, t = 5.30, p < .05. There was no significant difference between involuntary and voluntary training programs for learning, t = −1.44, p > .05.
Regarding Hypothesis 4, for results, training programs that spanned multiple sessions displayed a significantly larger effect size than training programs including one massed training session, t = 5.26, p < .05. This same pattern was also found for transfer, t = 2.28, p < .05, but there was no significant difference for learning, t = −0.97, p > .05, providing support for Hypotheses 4c and 4d, but no support for Hypothesis 4b (Hypothesis 4a could not be investigated).
Our analysis for Hypothesis 5 indicates that training programs provided to a sample of low-level leaders were associated with significantly stronger transfer effect sizes than those provided to high-level leaders, t = 6.59, p < .05, or middle-level leaders, t = 4.11, p < .05. There was no significant difference in transfer between high-level leaders and middle-level leaders, t = −0.89, p > .05. Moreover, for learning, there were no significant differences between high-level and middle-level leaders, t = −0.65, p > .05, high-level and low-level leaders, t = −1.90, p > .05, and middle-level and low-level leaders, t = −1.47, p > .05. This same pattern was found for results, as there were also no significant differences between high-level and middle-level leaders, t = 0.00,
Table 5
Meta-Analytic Results: Results Criteria

Meta-analysis                                  k       N      d      δ    SDδ    %Var   95% CI         80% CR
Results                                       78  11,640    .66    .72    .56     .76   [.60, .84]     [.01, 1.43]
Organizational outcomes                       53  10,466    .69    .75    .56     .51   [.61, .89]     [.03, 1.46]
Subordinate outcomes                          30   2,507    .21    .22    .27   11.19   [.11, .34]     [−.13, .58]
Training method
  Information                                  9   1,136    .10    .11    .00   67.39   [.04, .18]     [.11, .11]
  Practice                                    11     688    .35    .38    .00   24.40   [.25, .51]     [.38, .38]
  Information and practice                    22   1,208    .55    .60    .63    2.20   [.33, .86]     [−.21, 1.40]
  Information, demonstration, and practice    17   1,017    .42    .45    .58    3.71   [.19, .72]     [−.28, 1.19]
Feedback                                      34   2,027    .76    .84    .73     .61   [.60, 1.07]    [−.09, 1.77]
No feedback                                   44   9,653    .64    .70    .52     .64   [.55, .84]     [.04, 1.36]
Single-source feedback                        10     364    .51    .56    .80    2.60   [.08, 1.03]    [−.47, 1.58]
360-feedback                                   7     276    .72    .79   1.06     .58   [.04, 1.54]    [−.57, 2.15]
Needs analysis
  Needs analysis                              20     689    .40    .43    .75    3.81   [.11, .76]     [−.52, 1.39]
  No needs analysis                           58  10,991    .67    .73    .54     .60   [.60, .87]     [.04, 1.43]
Spacing effect
  Spaced                                      50   8,173    .45    .48    .42    2.46   [.37, .60]     [−.05, 1.02]
  Massed                                      10     335    .17    .18    .00   59.42   [.05, .32]     [.18, .18]
Face-to-face                                  62   8,308    .43    .47    .41    3.23   [.37, .56]     [−.05, .99]
On-site                                       15   1,075   1.12   1.25    .76     .18   [.88, 1.61]    [.28, 2.21]
Off-site                                      13   1,028    .37    .40    .29    7.87   [.21, .59]     [.03, .77]
Voluntary                                     24   1,523    .48    .52    .54    3.22   [.30, .74]     [−.17, 1.21]
Involuntary                                   12   2,846   1.24   1.39    .42     .78   [1.16, 1.62]   [.85, 1.93]
Leader level
  High-level                                   6     383    .34    .36    .31    8.75   [.06, .67]     [−.03, .76]
  Middle-level                                24   1,217    .33    .36    .62    4.15   [.11, .61]     [−.43, 1.15]
  Low-level                                   18   1,303    .27    .29    .00   53.60   [.22, .36]     [.29, .29]
Evaluator's affiliation
  Academic                                    57   6,364    .94   1.04    .61     .03   [.89, 1.19]    [.26, 1.83]
  Practitioner                                 6   4,211    .36    .39    .21    2.67   [.23, .55]     [.11, .66]
  Mix                                         14   1,098    .53    .58    .36    4.76   [.39, .78]     [.13, 1.04]
Intrapersonal                                  4     244    .62    .67    .00  100.00   [.61, .74]     [.67, .67]
Interpersonal                                  4     410    .37    .40    .09   31.85   [.25, .55]     [.29, .50]
Leadership                                    17   5,115    .45    .49    .40    1.51   [.31, .66]     [−.03, 1.00]
Business                                       8     953    .16    .18    .10   23.75   [.05, .30]     [.05, .30]
External                                      37   3,186    .64    .69    .65    1.06   [.49, .90]     [−.14, 1.53]
Internal                                       6     173    .63    .69   1.37     .64   [−.45, 1.82]   [−1.07, 2.44]
Self-administered                              4     363    .48    .52    .21    8.74   [.25, .79]     [.25, .79]

Note. k = number of independent studies; N = sample size; d = repeated measures Cohen's d; δ = Cohen's d corrected for unreliability in the criterion; SDδ = corrected standard deviation; %Var = percent of variance accounted for by sampling error; CI = confidence interval; CR = credibility interval.
p > .05, high-level and low-level leaders, t = 0.56, p > .05, and middle-level and low-level leaders, t = 0.56, p > .05. Thus, support for Hypothesis 5 was found for one of the three outcomes investigated.
Our analysis for Hypothesis 6 indicates that programs delivered via internal instructors displayed significantly stronger effects for learning than those that were self-administered, t = 2.12, p < .05. However, there were no significant differences between programs delivered by external and internal trainers, t = −1.16, p > .05, or self-administered and externally delivered programs, t = −1.65, p > .05, for learning. Moreover, programs delivered by external instructors were associated with significantly stronger effects for transfer than self-administered programs, t = 3.72, p < .05, but there was no significant difference between programs which were self-administered and provided by internal trainers, t = −1.75, p > .05. There was also no significant difference in transfer between programs provided by external trainers and those provided by internal trainers, t = 0.48, p > .05. Finally, regarding result outcomes, there were no significant differences between self-administered training programs and those administered by an internal, t = −0.31, p > .05, or external, t = −1.14, p > .05, trainer. There was also no difference for results between programs delivered by external and internal trainers, t = 0.00, p > .05.
When testing Hypothesis 7, we found that programs utilizing information-based methods exhibited significantly stronger effects for learning than those using practice-based methods, t = 2.24, p < .05. The opposite pattern occurred for results, and significant differences were not found for transfer, t = 0.26, p > .05.
Results for Hypothesis 8 show that training programs incorporating information-, demonstration-, and practice-based methods displayed a significantly larger effect size for learning than programs incorporating only information-based, t = 3.15, p < .05, or practice-based methods, t = 8.32, p < .05. Similarly, programs utilizing all three delivery methods displayed stronger effects for learning when compared with programs incorporating both information- and practice-based methods, t = 3.93, p < .05. However, no significant differences in learning were found when programs incorporating information, demonstration, and practice were compared with those utilizing information and demonstration, t = 0.26, p > .05. For transfer, programs including information-, demonstration-, and practice-based methods displayed significantly stronger effects when compared with programs incorporating only practice-based, t = 7.06, p < .05, and information-based methods, t = 4.22, p < .05. This same pattern was also found for transfer when programs using all three delivery methods were compared to programs incorporating combinations of information- and practice-based methods, t = 7.16, p < .05, and demonstration- and practice-based methods, t = 5.92, p < .05. For the results criterion, programs incorporating all three delivery methods displayed stronger effect sizes when compared to programs using only information-based methods, t = 2.52, p < .05. However, there was no significant difference between programs utilizing information, demonstration, and practice compared with interventions using information and practice for this outcome, t = −0.77, p > .05, as well as those using practice only, t = 0.50, p > .05.
When testing Hypothesis 9, we found programs that included feedback had a significantly stronger effect size for transfer than those with no feedback, t = 5.69, p < .05. However, there was no significant difference between interventions using feedback and those not incorporating feedback for reactions, t = 0.79, p > .05, learning, t = 0.56, p > .05, and results, t = 0.86, p > .05. Thus, Hypothesis 9 was only supported for one of the four outcomes. No support was found for Hypothesis 10, as there were no significant differences between programs using 360-degree feedback compared to programs using single-source feedback for learning, t = 0.43, p > .05, transfer, t = 0.92, p > .05, or results, t = 0.50, p > .05.
Regarding Hypothesis 11, training programs conducted on-site displayed significantly stronger effects for results than programs completed off-site, t = 4.71, p < .05. However, this effect was not significant for learning, t = 0.80, p > .05, or transfer, t = −1.55, p > .05. Our analysis for Hypothesis 12 further indicated that face-to-face programs exhibited a significantly stronger effect size for transfer than virtual programs, t = −5.94, p < .05. However, there were no significant differences between face-to-face and virtual programs for learning, t = −1.83, p > .05, and this hypothesis could not be assessed for results or reactions due to a low number of primary studies. Therefore, Hypotheses 11 and 12 received support for only one criterion.
Table 7
Meta-Analytic Regression Results: Publication Year

Variable     k      β     R²
Reaction     7    .31    .10
Learning   150    .01    .00
Transfer   190    .00    .01
Results     76    .34

Note. k = number of independent studies; β = standardized estimate; R² = variance explained.

Table 6
Meta-Analytic Regression Results: Duration

Variable                               k        β     R²    ΔR²
Reactions
  Model 1: Duration                    6      .79    .62
  Model 2: Duration                    6    34.95
           Duration²                   6   534.16    .68    .06
Learning
  Model 1: Duration                  113      .07    .01
  Model 2: Duration                  113      .12
           Duration²                 113      .04    .01    .00
Transfer
  Model 1: Duration                  145      .04    .00
  Model 2: Duration                  145      .20
           Duration²                 145      .17    .00    .00
Results
  Model 1: Duration                   54      .32    .10
  Model 2: Duration                   54     2.58
           Duration²                  54    −2.31    .33    .23
  Model 3 (no outliers): Duration     52      .57
           Duration²                  52     −.02    .30    .20

Note. k = number of independent studies; β = standardized estimate; R² = variance explained; ΔR² = change in R² above and beyond Model 1; Duration² is the squared duration term.
Exploratory analyses were conducted to provide further insight into additional factors influencing leadership training effectiveness. When investigating training content, findings indicate that programs including certain competency domains produced significantly greater effects than others; however, all trained competencies were effective (i.e., exhibited significant, nonzero effect sizes). For learning, programs incorporating business competencies displayed greater effects than interpersonal competencies, t = 3.31, p < .05, and intrapersonal competencies, t = 2.11, p < .05. Those with intrapersonal competencies were also greater than interpersonal competencies, t = 2.36, p < .05. All other content comparisons, in regard to learning, were not significant. Similar findings were found for transfer. Programs that trained business competencies produced the greatest effects compared with intrapersonal, interpersonal, and leadership competencies (t = 3.52, 3.17, and 2.34, respectively, p < .05), while those incorporating leadership competencies produced significantly greater effects than intrapersonal competencies, t = 3.26, p < .05. For results, programs with leadership content produced significantly greater effects than business content, t = 3.11, p < .05. Those including intrapersonal content were significantly more effective than programs including interpersonal and business content, t = 5.73 and 13.06, p < .05, and programs including interpersonal content were more effective than business content, t = 3.87, p < .05.
Interestingly, results suggested that studies with both academic and practitioner evaluators produced the greatest learning improvements when compared to studies evaluated only by academics, t = 3.08, p < .05, or practitioners, t = 4.31, p < .05 (see Table 3). Larger learning effect sizes were also associated with solely academic evaluators as compared to practitioners, t = 2.60, p < .05. The same was found for transfer, as a mixed set of evaluators was associated with a significantly larger effect size than solely academic evaluators, t = 4.57, p < .05. However, for transfer, the difference between a mixed set of evaluators and practitioner evaluators was not significant, t = −0.75, p > .05, and neither was the difference between academic and practitioner evaluators, t = −1.85, p > .05. Studies with only academic evaluators displayed the largest effects for result outcomes when compared to practitioner evaluators, t = 5.47, p < .05, and a mix of evaluators, t = 3.58, p < .05 (see Table 5). The difference between a mix of evaluators and practitioner evaluators for results outcomes was not significant, t = −1.48, p > .05.
We also ran exploratory analyses for duration of training and publication year as continuous moderators, which are presented in Tables 6 and 7. Training duration exhibited a positive, significant relationship with results (β = 0.32, SE = 0.00, t = 2.43, p = .02) and nonsignificant relationships with the other criteria. To examine whether the relationship between training program length and each outcome is better explained as a curvilinear effect, we tested the curvilinear relationship between duration and each criterion. Results indicated a significant contribution of the quadratic term over and above the linear term (β = −2.31, R² = .33, ΔR² = .23, p < .01) for the results criterion; however, after two influential outliers were removed, the curvilinear effect disappeared (β = −0.02, R² = .30, ΔR² = .20, p > .05), indicating that longer programs are more effective. The curvilinear relationship was also nonsignificant for reactions, learning, and transfer. Publication year displayed a positive, significant relationship with results (β = 0.34, SE = 0.01, t = 3.14, p = .002), but there was no significant effect of publication year on any other criteria.
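The hierarchical linear-versus-quadratic comparison used for duration can be sketched in plain Python; the fitting routine below is generic ordinary least squares, and the data are synthetic stand-ins, not the study's effect sizes:

```python
def fit_poly(x, y, degree):
    """Ordinary least squares polynomial fit via the normal equations
    (Vandermonde design matrix), solved by Gaussian elimination."""
    n = degree + 1
    xtx = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    xty = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[piv] = xtx[piv], xtx[col]
        xty[col], xty[piv] = xty[piv], xty[col]
        for r in range(col + 1, n):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, n):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    coefs = [0.0] * n
    for r in reversed(range(n)):
        coefs[r] = (xty[r] - sum(xtx[r][c] * coefs[c]
                                 for c in range(r + 1, n))) / xtx[r][r]
    return coefs  # ascending powers: [b0, b1, ...]

def r_squared(x, y, coefs):
    """Proportion of variance in y explained by the fitted polynomial."""
    yhat = [sum(c * xi ** p for p, c in enumerate(coefs)) for xi in x]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Synthetic demo: y is exactly quadratic in x, so adding the squared
# term (Model 2) raises R-squared over the linear-only Model 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 4.0, 9.0, 16.0]
delta_r2 = r_squared(x, y, fit_poly(x, y, 2)) - r_squared(x, y, fit_poly(x, y, 1))
```

A positive ΔR² for the quadratic model mirrors the Model 1 versus Model 2 comparison in Table 6; whether that increment is significant is then tested against its standard error, which this sketch omits.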
Discussion

The guiding purpose of the current study was to answer the following questions: (a) How effective are leadership training programs? and (b) How should one design, deliver, and implement a leadership training program to maximize effectiveness? To accomplish this objective, we conducted a meta-analytic review of 335 leadership training evaluation studies and investigated whether the effectiveness of training (as measured by positive changes in reactions, learning, transfer, and results; Kirkpatrick, 1959) was significantly affected by the inclusion of several training design, delivery, and implementation features. We elaborate on our findings by answering both questions below.
Question 1: How Effective Are Leadership Training Programs?
Overall, our findings are more optimistic about the effectiveness
of leadership training than popular press articles that suggest most
leadership training programs are minimally effective (Nelson,
2016) and that leadership cannot be taught because “managers are
created, leaders are born” (Morgan, 2015). Specifically, the current
results indicate that leadership training is effective in improving
reactions (␦⫽.63), learning (␦⫽.73), transfer (␦⫽.82), and
results (␦⫽.72). Moreover, this study is the first to demonstrate
that leadership training improves trainee reactions, in contrast to
the idea often perpetuated in the popular press that individuals
generally dislike training (Kelly, 2012).
Interestingly, not only do the current results suggest inaccuracies in some popular press conclusions about the effectiveness of leadership training, but our results also suggest that previous meta-analyses may have underestimated the effectiveness of leadership training. For example, the current study's effect sizes are substantially larger than what was previously reported in Burke and Day (1986; δ = .67), and although the current results are difficult to compare with meta-analytic effect sizes for learning and transfer provided by Collins and Holton (2004; δ = .38–1.01), Powell and Yalcin (2010; δ = .37–1.32 and .30–.63), and Taylor et al. (2009b; δ = .13–.64), because each
An additional exploratory analysis was conducted for program label to investigate whether the effectiveness of a training program is related to its label. Leadership researchers and practitioners tend to use several terms interchangeably when referring to leadership training programs (i.e., training, development, leader, leadership, and managerial). Our exploratory analyses showed that the label given to the training program did not impact effectiveness in regard to learning. For transfer, programs labeled "training" resulted in greater outcomes compared with those labeled "development" (t = 4.72, p < .05), while the opposite was found for results (t = −9.82, p < .05). To investigate this further, we conducted t-test comparisons on programs including "leadership," "supervisory," or "management" in their program label. Most labels were not significantly different from one another except for the following: programs labeled "leadership development" produced significantly weaker transfer than those named "leadership training" (t = −4.88, p < .05), and "management development" labels were associated with significantly weaker effect sizes than those entitled "leadership training," "supervisory training," and "management training" (t = −5.80, −4.96, and −2.33, respectively; p < .05). Regarding results as a criterion, all labels displayed similar effect sizes except programs labeled "leadership development," as programs with this label were significantly more effective than "leadership training" (t = 5.68, p < .05) and "management training" (t = 10.03, p < .05).
of these presented a wide range of effect sizes depending on the study design or the rater source, our findings do suggest substantially larger effect sizes for the results criterion than previous meta-analyses (Collins & Holton, 2004: δ = .39; Powell & Yalcin, 2010: δ = .35). Moreover, the current results are generated from more than triple the number of primary studies included in previous meta-analyses. In answer to the question posed above, our results not only suggest that leadership training is effective, but they also suggest that leadership training is more effective than popular press and previous meta-analyses suggest.
The present effort also indicates that leadership training is
effective at improving affective, cognitive, and skill-based out-
comes, corroborating prior meta-analytic findings (Avolio, Reich-
ard, et al., 2009). Similarly, results echoed prior work by indicating
that affective outcomes are typically lower than cognitive and
skill-based outcomes (i.e., affective learning; Avolio, Reichard, et al., 2009; Kalinoski et al., 2013). In most leadership training
evaluation studies, the affective outcomes involve emotional states
like anxiety (e.g., Eary, 1997). In this context, criteria such as
anxiety might be difficult to change because the training content
provided to trainees does not necessarily train the leader on how to
reduce his or her anxiety, but instead, is designed to improve
leadership behaviors (which ideally reduces one’s anxiety). For
example, Eary (1997) exposed trainees to material related to mil-
itary leadership, such as communication and team building, but
included anxiety levels as an affective outcome. Perhaps if leadership training programs involved more content devoted to the improvement of affective criteria, these programs would be equally effective across affective, cognitive, and skill-based criteria.
Future research should examine whether affective content can
improve affective outcomes to a greater extent.
Results also indicate that both organizational and subordinate
outcomes increase as a result of leadership training. Interestingly,
however, our findings show that leadership training improved
organizational outcomes to a greater degree than subordinate out-
comes, suggesting that the trickle-up effect (i.e., leadership im-
provements aggregate up to the organizational-level) is stronger
than the trickle-down effect (i.e., leadership improvements flow
down to subordinates). We find this surprising given that leader-
ship is a dyadic interaction between leader and follower; thus, the
direct network tie between a leader and a follower should naturally
allow improvements in leadership to be relayed to followers,
whereas the process by which leadership improvements influence
organizational outcomes is not as direct (e.g., a leader who has
taken transformational leadership training can use the training in
daily, direct interactions with her subordinates, but the process via
which this training influences ROI is not as direct). Therefore,
future research could benefit from a stronger understanding of how
to maximize subordinate outcomes of leadership training.
Question 2: How Should One Design, Deliver, and
Implement a Leadership Training Program to
Maximize Effectiveness?
To address this question, we evaluated 15 training design,
delivery, and implementation features through a series of moder-
ator tests. These findings are discussed below along with theoret-
ical implications and areas of focus for future research.
Does a needs analysis enhance the effectiveness of leadership
training? Arthur, Bennett, Edens, and Bell’s (2003) meta-
analysis of the general training literature did not find a clear
pattern of results for studies incorporating a needs analysis; how-
ever, the current study found support for known training recom-
mendations (Salas et al., 2012; Tannenbaum, 2002) by showing
that programs developed from a needs analysis result in greater
transfer and learning. Not only does this support longstanding
generic training theory and recommendations for training, but it
also supports popularly held beliefs in organizations that leader-
ship training is not one size fits all (Gurdjian, Halbeisen, & Lane,
2014). Thus, we recommend leadership training developers con-
duct a needs analysis before designing a program as the current
meta-analytic data suggest this benefits learning and transfer (and indicate that, for the results criterion, programs developed with and without a needs analysis are equally effective).
Are voluntary leadership training programs more effective
than mandatory programs? Our findings suggest voluntary
attendance may be a double-edged sword, whereby it increases
transfer, yet decreases organizational results (there was no effect
on learning). We offer one possible interpretation for these results:
Whereas voluntary attendance may increase transfer because it
increases trainee motivation, it may also result in lower attendance,
and thus, a reduced number of trainees whose leadership improve-
ments can influence higher-level, organizational results. When
looking at the average sample size within our meta-analytic data-
base, categorized by outcome type and attendance policy, we
found some support for this explanation. The average sample size
for mandatory programs evaluating organizational outcomes was
263 trainees but only 41 trainees for voluntary programs. There-
fore, perhaps mandatory programs foster organizational outcomes
to a greater degree because the number of participants is larger.
Are temporally spaced training sessions more effective than
massed training sessions? Interestingly, the current results sug-
gest that distributed leadership training sessions do not result in an
increase in learning outcomes, despite the noted importance of this
method across learning theories (Hintzman, 1974; Paas et al., 2004; Sweller et al., 1998). Learning seems to occur regardless of
the timing of training, suggesting that leadership content can be
cognitively retained following a massed session. However, transfer
and results criteria did produce findings that varied across tempo-
ral spacing; programs with multiple sessions spaced temporally
resulted in greater transfer and results than massed training ses-
sions. Therefore, although spacing did not directly impact learning
in this context, it appears that spacing is nevertheless beneficial for
the downstream outcomes of transfer and results. A number of
leadership training programs hold spaced training sessions (e.g.,
the Rockwood Leadership Institute, National Seminars Training’s
Management & Leadership Skills for First-Time Supervisors &
Managers, Academy Leadership’s Leadership Boot Camp), and
due to their popularity, we sought to determine whether the time
interval between sessions affects outcomes (i.e., weekly vs. daily).
We conducted an exploratory, follow-up analysis and found no
significant differences regarding learning and transfer between
programs with sessions spaced daily or weekly; however, programs with sessions spaced weekly (δ = 1.14) produced larger effects in the results criterion compared with those spaced daily (δ = .44; t = 3.97, p < .05).
Is leadership training more effective for low-level leaders
than high-level leaders? Previous meta-analytic work found
that leadership training results in greater gains for low-level lead-
ers compared with middle- and high-level leaders (Avolio, Reich-
ard, et al., 2009); however, in this previous meta-analysis, outcome
criteria were combined into a single outcome variable, obscuring
an understanding of whether this apparent ceiling effect for higher-
level leaders exists for all training criteria. In the current study, our
results demonstrate that this ceiling effect is present only for
transfer. High-, middle-, and low-level leaders experienced similar
learning gains and similar improvements in the results criterion,
suggesting that higher-level leaders are equally motivated to
learn as lower-level leaders and equally likely to improve organi-
zational and subordinate outcomes. Thus, the current study sug-
gests that middle- and high-level leaders can benefit from training
(i.e., they can acquire new knowledge, skills, and abilities from
training and improve organizational and subordinate outcomes),
despite recent arguments that propose the expenditure of resources
on developing senior leaders is wasteful because they are set in
their ways (Lipman, 2016). In contrast, our findings showed stark
differences in the transfer criterion across leader level such that
transfer effects were approximately four times weaker for high-
level leaders, perhaps indicating that the greater experience of
high-level leaders has led to entrenched behaviors that are difficult
to change. Thus, our results also suggest that higher-level leaders
may require training that specifically focuses on enhancing trans-
fer. In summary, the current results appear to indicate that previous
findings were not due to a ceiling effect (i.e., wherein high-level
leaders have no room to improve) or a motivational deficit (i.e.,
wherein high-level leaders are not motivated to learn), but rather a
“transfer problem” (Baldwin & Ford, 1988, p. 63) that is perhaps
caused by well-established behaviors that are difficult to change
on the job, although this needs to be confirmed with future research.
Are self-administered leadership training programs less ef-
fective than those administered by someone from the trainee’s
organization (i.e., internal), or by someone outside of the or-
ganization (i.e., external)? The current results indicate that self-
administered leadership training programs are less effective than
training administered by an internal or external trainer. Although
self-administered training programs still exhibited nonzero improvements, their effectiveness was substantially lower than that of trainer-led programs; we therefore caution practitioners against using self-administered leadership training programs and urge
academics to examine why these programs are less effective (e.g.,
Do self-administered trainees perceive the training to be less
valuable? Are these trainees subsequently less motivated?).
Are practice-based training delivery methods more effective
than information-based and demonstration-based delivery
methods? Does the use of multiple training delivery methods
increase the effectiveness of leadership training programs?
Supporting previous meta-analytic research (Burke & Day, 1986;
Taylor et al., 2009b) and basic learning theories that include
practice as an essential component of learning (Kolb, 1984; Piaget,
1952), practice-based methods tend to be more effective than other
delivery methods (i.e., practice was more effective than other
methods for results, and equally effective for transfer). Also in line with existing recommendations (Salas et al., 2012), we found that programs incorporating multiple methods were significantly more effective than programs using either one or two delivery methods. In fact, by adding one or two methods to a
single delivery method, transfer can be increased by more than one
standard deviation. These results were consistent across outcome
criteria; therefore, we recommend training developers utilize prac-
tice if only one delivery method can be used, but employ multiple
delivery methods whenever possible.
Does feedback enhance the effectiveness of leadership train-
ing? Is 360-degree feedback more effective than feedback from
a single source? The current results suggest that feedback sig-
nificantly improves the onset of transfer following a leadership
training program, further bolstering the argument for the use of
feedback within training programs (e.g., Ford et al., 1998; Kluger & DeNisi, 1996; Smith-Jentsch, Campbell, Milanovich, & Reynolds, 2001). The same pattern was found for reactions, learning,
and results such that the meta-analytic effect sizes for programs reporting the use of feedback were greater than those for programs without feedback, although these differences were nonsignificant. These findings indicate that
there is utility in providing trainees with feedback during a lead-
ership training intervention, with an emphasis on effectively de-
signed feedback. Perhaps the nonsignificant difference in reac-
tions, learning, and results stems from not implementing effective
feedback designs. For instance, feedback provided within the
context of leadership training may be focused more on character-
istics of the self instead of task activities, as feedback intervention
theory suggests (Kluger & DeNisi, 1996, 1998),
leading to a decrease in feedback effectiveness. Our findings also illustrate that the use of 360-degree feedback in leadership training programs, as compared with not using 360-degree feedback, is related to higher results but lower levels of learning and transfer.
Because these differences were not statistically significant and the effect sizes were generated from a relatively small number of primary studies (due to a lack of available information), future research is needed on the use of 360-degree feedback in the leadership development domain. The current results are surprising given the widespread use of 360-degree
feedback (i.e., approximately 90% of large organizations report the
use of this feedback method; ETS, 2012), and may lend initial
support to DeNisi and Kluger’s (2000) argument that multisource
feedback might not be universally useful. For example, in the
context of leadership training, increasing the number of feedback
sources may not result in additional information received by the
leader beyond feedback from a single source (Borman, 1998; DeNisi & Kluger, 2000), and perhaps when negative feedback is
received from multiple sources, this information could thwart
improvement in one’s learning or behavioral transfer because it
threatens one’s self-view. As such, the current findings indicate
that the effect of 360-degree feedback on training outcomes might
be more complex than originally concluded (i.e., potentially inhib-
iting learning and transfer, while improving organizational and
subordinate outcomes), highlighting a need for more research on
this training method.
Are on-site training programs more effective than off-site
training programs? Arthur et al. (2003) noted there was an
extreme paucity of research on training location in their meta-
analysis of the generic training literature. Over a decade later, we
are able to provide meta-analytic evidence that supports the use of
on-site training as it improves results to a greater degree than
off-site training (on-site training is equally effective as off-site
training for learning and transfer criteria). Employees learn and
transfer material from off-site programs adequately; however,
these programs may be less applicable to a trainee’s work envi-
ronment, leading to a decrease in result outcomes. Specifically,
off-site programs may feature less involvement from organiza-
tional insiders, relying instead on personnel who lack familiarity
with the training needs of the employees and the organization.
Consequently, the program utilized may have more of a one-size-fits-all nature. This is problematic, as a one-size-fits-all approach
has been criticized for not addressing the specific needs of the
organization; specifically, within the context of assessment cen-
ters, Krause and Gebert (2003) suggested that custom-made pro-
grams may be “more sensitive and more precise in depicting a
company’s environment and dynamics, as well as individual jobs”
(Krause & Gebert, 2003, p. 396) than off-the-shelf interventions.
We argue that this idea can also be applied to leadership training,
as custom-made leadership training programs may better address
the organization’s leadership needs. However, we further note that
the decreased organizational results associated with off-site train-
ing, as compared with on-site training programs, may also stem
from additional sources, such as reduced fidelity or reduced con-
venience for attendees. Given this, we recommend that leadership
training programs use on-site training and we note that future
research would benefit from an examination of why on-site pro-
grams tend to be more effective than off-site programs.
Are leadership training programs that are virtually based
less effective than face-to-face programs? Despite the recent
surge of interest in virtually based leadership programs (e.g.,
Harvard Business Publishing, 2013), no meta-analytic compari-
sons among these programs and face-to-face leadership training
programs have been made, until now. In contrast to meta-analytic
research suggesting that web-based programs are more effective
than face-to-face programs in the generic training literature (Sitz-
mann, Kraiger, Stewart, & Wisher, 2006), our results suggest the
opposite to be true. Specifically, we found greater transfer in-
creases after face-to-face leadership training compared to virtually
based leadership training (and no difference between modalities
for learning criteria). It may be the case that virtually based courses
can be less effective for some criteria because this modality
involves fewer opportunities for demonstration and practice (i.e.,
the effect of modality may be spurious) and future work should
examine whether virtually based programs that use the same
delivery method as face-to-face programs are equally effective.
Does some leadership training program content result in
greater outcomes than other content? Previous meta-analyses
suggest that leadership training outcomes are affected by the
content trained (Burke & Day, 1986; Taylor et al., 2009b), and the
current results support this claim. Regarding both learning and
transfer, programs that trained business skills (e.g., problem-
solving, data analysis, monitoring budgets) were the most effec-
tive, paralleling Hogan and Warrenfeltz’s (2003) theory that busi-
ness skills are the easiest to train out of the four competency
domains. In contrast, we found that soft skills (i.e., leadership,
interpersonal, intrapersonal competencies) improved organiza-
tional and subordinate outcomes (i.e., results) more than hard skills
(i.e., business), supporting leadership theories that highlight the
importance of softer skills (e.g., leader–member exchange, Dan-
sereau, Graen, & Haga, 1975; transformational leadership, Burns,
1978). Thus, these results appear to suggest that although hard
skills are the easiest to learn and transfer, soft skills matter the
most for organizational and subordinate results. These results run
counter to the anticipated demand for hard skills among employers
(Schawbel, 2014), suggesting that softer skills are more important
in predicting desired organizational and subordinate outcomes.
Do leadership training programs evaluated by a team of
practitioners and academics result in greater outcomes com-
pared with those evaluated by either academics or
practitioners? Management literature has long acknowledged
the gap between science and practice and has noted that bridging
this gap would provide mutual enrichment in the field (Hodgkin-
son & Rousseau, 2009). In support of this, our findings indicate
that leadership training programs evaluated by a team of academ-
ics and practitioners exhibited significantly greater learning and
transfer outcomes compared to those evaluated by academics or
industry experts only. This finding supports the argument that the
scientist-practitioner model is necessary for producing superior
work (Jones & Mehr, 2007). Although it is unclear whether mixed
scientist and practitioner author teams worked together to design
and deliver the training, results suggest that academics and prac-
titioners who work together to evaluate (and possibly design/
deliver) leadership training programs are producing more effective
programs, suggesting academics and practitioners should collabo-
rate on leadership training ventures whenever possible.
Are longer leadership training programs more effective?
Although Taylor et al.’s (2009b) meta-analysis of leadership train-
ing did not find support for training length as a continuous mod-
erator of training effectiveness, we found some evidence that
length of training has an impact on effectiveness. Specifically, we
found a linear relationship between training duration and the
results criterion, indicating that longer training programs lead to
improved organizational and subordinate outcomes (a curvilinear
relationship was not supported after outliers were removed). These
results are counterintuitive as they do not fully support CLT (which suggests that information overload strains working memory capacity and thereby reduces learning; Paas et al., 2004), which
would predict a negative relationship between training duration
and effectiveness. Future research would benefit from an exami-
nation of whether the increased effectiveness associated with lon-
ger programs is due to increased knowledge transfer, more time for
multiple delivery methods, or perhaps increases in trainee percep-
tions of training program value.
Has the effectiveness of leadership training increased over
time? Current results suggest that over time, the effect of lead-
ership training on results has increased; however, this was not the
case for learning or transfer (i.e., improvements in learning and
transfer have been relatively steady over time). This bolsters the
common theme across the current data that leadership training is
more valuable than previously thought (cf. Zimmerman, 2015;
Myatt, 2012). In fact, the leadership training industry has improved
its impact on results over the years.
The current study provides several noteworthy contributions to
the literature; however, it also has several limitations. First, we note
that we were only able to include evaluations of leadership training
programs reported in available unpublished and published litera-
ture. It is probable that effective leadership training programs are
published more often than ineffective leadership training pro-
grams. Although our publication bias analyses indicated no bias,
we note that ineffective leadership training programs might also be
disproportionately excluded from unpublished work (i.e., organi-
zations with ineffective programs do not give researchers access to
data about the effectiveness of these programs). Thus, despite a
lack of publication bias evident in the current article, our results
may still be upwardly biased because our meta-analytic database
can only reflect existing studies, which are more likely to have
been conducted on effective rather than ineffective training pro-
grams. Moreover, our results may not represent trends evident in
practice that are not documented within available literature. For
example, many leadership training programs have focused on
emotional intelligence since Goleman’s (1998) widely read com-
mentary on emotional intelligence in leadership, but we did not
have enough data to specifically analyze emotional intelligence
training for leaders separately from other forms of training. Sec-
ond, despite the noted popularity of measuring training effective-
ness with reaction data (Patel, 2010), the current search effort
only identified seven experimental studies reporting trainee reac-
tion data. We found that studies reporting reactions tend to utilize
a post-only study design instead of a pre-post or independent-groups experimental design, excluding the vast majority of pri-
mary studies that report reactions from meta-analytic examination.
Although the current results indicate that leadership training im-
proves trainee reactions, because of the lack of reaction data, any
moderator analyses conducted within this criterion should be in-
terpreted with caution (we provide several moderated effect sizes
in Table 2 as initial evidence that should be confirmed with future
research). Given that most organizations include reaction data in
their evaluation efforts (Patel, 2010), we urge scholars to interpret
this lack of available primary data as a signpost for future research
in this area.
Practical Implications
The current meta-analytic evidence stands in direct opposition
to the recent argument posed by Myatt (2012), a popular press
contributor, that leadership training “is indeed the #1 reason lead-
ership development fails.” This study provides substantial evi-
dence from 335 leadership training evaluation studies that these
programs are effective and should be used across a variety of
domains. In fact, results suggest that leadership training programs
can lead to a 25% increase in learning, 28% increase in leadership
behaviors performed on-the-job (i.e., transfer), 20% increase in
overall job performance, 8% increase in subordinate outcomes,
and a 25% increase in organizational outcomes (percent increase is equal to Cohen's U3 − 50; Cohen, 1988). The results also suggest
that the extent to which a program is effective is related to various
design, delivery, and implementation elements. As such, this study
is meant to serve as a guide for practitioners when developing a
leadership training program, and in this vein, we provide a sum-
mary of our recommendations to practitioners in Table 8.
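The conversion from a standardized effect size (δ) to a percent increase can be sketched as follows. This is an illustrative computation only (the function name is ours); the printed values may differ by a point or two from the rounded figures reported above, which were derived from criterion-specific effect sizes.

```python
from statistics import NormalDist

def percent_increase(delta: float) -> float:
    """Percent of untrained-group scores the average trainee now exceeds,
    beyond the 50% baseline: Cohen's U3 minus 50 (Cohen, 1988)."""
    u3 = NormalDist().cdf(delta) * 100  # Cohen's U3, as a percentage
    return u3 - 50

# Overall meta-analytic deltas reported in the article
for label, delta in [("learning", 0.73), ("transfer", 0.82), ("results", 0.72)]:
    print(f"{label}: delta = {delta:.2f} -> ~{percent_increase(delta):.0f}% increase")
```

For example, δ = .82 for transfer means the average trained leader exhibits more on-the-job leadership behavior than roughly 79% of untrained leaders, a gain of about 29 percentage points over the 50% expected by chance.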
The current results suggest that practitioners should identify
intended outcomes before developing a leadership training pro-
gram because training design characteristics affect each outcome
differently. As such, questions for practitioners to ask at the early
stages of development include: Who are my stakeholders and what
outcome(s) are they trying to obtain? Am I interested in multiple
outcomes, and if so, are some outcomes more important than
others? If developers do not have a clear objective, the current
results suggest a needs analysis may not only guide the develop-
Table 8
Evidence-Based Best Practices for Designing a Leadership Training Program
1. Resist the temptation to think that leaders cannot be trained; evidence suggests leadership training programs are effective.
2. Conduct a needs analysis and identify the desired outcome(s) based on stakeholder goals before designing the program.
3. Use multiple delivery methods when possible (e.g., information, demonstration, and practice) and if limitations prevent this, choose practice instead
of other delivery methods.
4. Use caution when spending additional resources on 360-degree feedback (evidence indicates that it might not be more effective than single-source feedback).
5. Provide multiple training sessions that are separated by time rather than a single, massed training session.
6. Use caution when implementing self-administered training and instead, choose an internal or external trainer (evidence shows no differences in the
effectiveness of internal and external trainers but indicates that self-administered training is less effective).
7. Consult with others outside of your field to ensure the program is both evidence-based and practically relevant (e.g., if you are a practitioner,
collaborate with an academic).
8. Ensure the program is designed appropriately according to the desired outcome using the guidelines provided below.
Learning:
• Use multiple delivery methods
• Conduct a needs analysis
• Include hard skills (i.e., business skills)
Transfer:
• Use multiple delivery methods
• Conduct a needs analysis
• Provide feedback
• Use a face-to-face setting
• Make attendance voluntary
• Have multiple sessions
• Include hard (i.e., business skills) and soft skills (i.e., leadership skills)
Results:
• Use multiple delivery methods
• Hold on-site
• Require mandatory attendance
• Have multiple sessions
• Provide as much training as possible (longer programs are more effective)
• Include soft skills (i.e., intrapersonal, interpersonal, and leadership skills)
Table 9
Examples of Effective Training Programs
Moderator Citation Example Effect size
Needs analysis Porras et al. (1982) “Targeted supervisory skills were selected from a problem list created during the
needs-assessment portion of the organizational diagnosis” (p. 437).
Baron et al. (1984) The content was “developed by examining the responsibilities, activities, and
interactions of child care administrators and by adapting the teaching and
social interaction technology that is used in Teaching-Family programs to
managerial interactions” (p. 264).
Milicevic et al.
“The project began with manager competency needs assessment. Assessment
results were used to design training and supervision so that each management
team had a long term strategy, a business case plan and an implementation
plan for total quality management of the work process at the end of the
project” (p. 248).
Morin and Latham (2000) “Prior to conducting this study, all supervisors had attended a one-day training
programme on interpersonal communication skills. A needs analysis revealed
that the participants had thorough knowledge of the requisite communication
behaviours, but lacked confidence in applying them on the job. Hence, a one-
day refresher training programme was conducted” (p. 569).
Osborne (2003) “In addition to program participant needs assessments completed prior to the
commencement of each program module, I requested that executive committee
members also complete a general needs assessment. Such information enabled
me to focus program design on the needs of those sponsoring the program as
well as on the needs of individual participants. The anonymous responses to
this needs assessment were combined and analyzed. I highlighted frequent
comments and used the information gained to guide program design and
implementation. For example, several committee members expressed a
concern for measurement of program impact. Once I determined that this was
a need, I incorporated the pre and post program testing as well as interim and
final benchmark statement comparisons into the study design” (p. 71).
House (2001) “The program was designed and developed after an extensive six month needs
assessment was conducted to identify the skills and competencies new
managers needed to be successful in the company’s environment. The needs
assessment included interviewing and conducting focus groups with new and
experienced managers in the company, executives of the company and human
resources professionals. In addition, the needs assessment included
benchmarking with other high technology companies to assess their
management training practices” (p .14).
Attendance policy Fitzgerald and Schutte
“They participated voluntarily and were free to withdraw from the study at any
time” (p. 499).
Suwandee (2009) “The consent forms for program participation were provided to the middle
executives of Kasem Bundit University in order to select volunteers to attend
the program. Consequently, the participants were willing to enhance their
knowledge and insights toward their organizational leadership potential” (p.
Alexander (2014) The training was mandatory. 1.34
Spacing effect Green (2002) “The training program is seven days in length, but the training sessions do not
run consecutively. The training program design provides that participants
attend three days of training, return to the workplace for three weeks, and
then return to attend two days of training. There is another three-week break
before participants return for the final two days of training. During the periods
between the training sessions, participants are asked to apply the skills learned
and to return to the course with prepared assignments about their learning
experiences” (pp. 53–54).
Parry and Sinha (2005) There was a 2-day intervention, followed by 3 months of implementing the
leadership development program at work, and then another 2-day intervention.
Hassan, Fuwad, and Rauf
“Training was divided into four modules that were offered with a lag of 7 days”
(p. 126).
Trainees’ level of leadership
Allen (2010) Participants were from nursing unit teams that rotate team leadership on a yearly basis.
Latham and Saari (1979) Trained first-line supervisors. 1.18
Bostain (2000) Trained first-line supervisors. 4.17
May and Kahnweiler (2000) Trained first-line supervisors. 1.13
Internal vs. external trainer
Birkenbach et al. (1985) “The second author was asked to train personnel from the training department of
the company in the use of behaviour modelling...After the initial training
course, the author worked with company training staff in developing the in-
company programme. The major inputs were, however, made by the
company’s own trainers” (p. 13).
(table continues)
Table 9 (continued)
Moderator Citation Example Effect size
Donohoe et al. (1997) The Employee Assistance Programs (EAPs) coordinator trained the supervisors. 1.94
Nemeroff and Cosentino
“Two research associates from the Life Office Management Association served
as ‘trainers’ and administered the feedback and feedback plus goal setting
treatments” (p. 569).
Practice-based method Bozer, Sarros, and Santora
“The coaching process included 10 to 12 sessions with weekly interventions. All
coaching activities commenced with an assessment and identification of a
developmental issue, followed by a feedback session, goal setting, action
planning, and follow-up sessions, and concluded with an evaluation of
outcomes, consistent with established coaching procedures” (p. 282).
Nemeroff and Cosentino
Goal setting was used in the training. .61
Information-, demonstration-, and practice-based methods
Yirga (2011) The Basic Leadership Development Program involved reading material, class
lectures, a coaching experience, and a team project.
Hassan, Fuwad, and Rauf
“Importance of setting specific, difficult but attainable, goals was discussed in a
lecture setting. Afterwards, a day long interactive session was conducted to
identify appropriate goals and objectives for all participants according to their
work requirements. Role playing and in basket exercises were conducted. The
session concluded with a case study situation requiring transformational
leadership exhibition by participants. Participants were then asked to come up
with their goals and objectives in the next session” (p. 126). The other
sessions included an assessment on what the leaders knew about
transformational leadership, role playing, and self-reports on their behavioral changes.
Donohoe, Johnson, and
Stevens (1997)
The training program included a discussion, a training film, and an activity for
the supervisors to assess employee behavior and then re-assess the employee
using a handbook’s evaluation criteria.
House (2001) “Methods used to teach the program included a business simulation, lecture and
discussion sessions, interactive role playing, case studies and skill building
exercises. In addition, the program hosted four executive guest speakers
throughout the week long training program” (p. 14).
Feedback Engelbrecht and Fischer (1995)
“Feedback is given by the assessors to the senior of the participant, detailing
strengths and weaknesses and also focusing on developmental action plans for
the participant. A detailed and lengthy feedback is also given to the
participant in an interactive (facilitative) manner” (p. 395).
Nemeroff and Cosentino (1979)
“One-hour interviews were held with each manager, and specific feedback on
subordinate’s perceptions of 43 interview behaviors was reviewed and
discussed. Feedback was given at one point in time by an external source
other than the manager’s boss (trainer), in a face-to-face oral transmission
session. In terms of content, the feedback given was in the form of means and
standard deviations from the questionnaire responses of the manager’s
subordinates. Norms were also provided in the form of scale and item mean
and standard deviations for the total sample of managers in the company. The
feedback, together with the norms, permitted managers to determine their own
strengths and weaknesses on specific interview behaviors and to determine
areas in which improvements might be needed. The meaning of the feedback
data and sample norms was carefully explained to each manager by the
trainers. Moreover, during the feedback sessions, great care was taken to
insure a minimum of threat to the managers. They were asked to view the
feedback as diagnostic rather than evaluative. The trainers pointed out that the
feedback information would have developmental implications only to the
extent that the managers used it to increase their repertoire of effective
appraisal interview skills” (p. 569).
On-site Barling, Weber, and
Kelloway (1996)
“The study took place in one region of one of the five largest banks in Canada”
(p. 828).
Knox (2000) The training took place at a facility within the corporation and “established
trainers currently employed or contracted by the company were used to
facilitate the courses” (p. 34).
Neck and Manz (1996) “The setting for the training study was the Agency Accounting Department of
America West Airlines. America West is an international commercial airline
employing approximately 12,100 people. America West Airlines is a major
carrier based in Phoenix, Arizona with hubs in Phoenix, Las Vegas, and
Columbus, Ohio” (p. 448).
Face-to-face Allen (2010) Coaching, instruction, reflection, and team meetings were face-to-face. 1.25
ment of the program, but may also result in a more effective
program. Interestingly, our results also suggest that voluntary training
programs are a double-edged sword, which is of concern for practitioners
because most are hoping to increase both transfer and results. As previously
mentioned, one explanation for this finding is that mandatory programs
increase employee participation, thereby increasing the onset of higher-level,
organizational results. Because more employees are in attendance, the effects
are more likely to disseminate throughout the organization; greater attendance
equals greater results. From this, we argue that training developers and
researchers should identify ways to make voluntary training programs more
appealing to employees.
Within the marketing literature, many researchers have identified
effective promotional methods (e.g., Goldsmith & Amir, 2010;
Grönroos, 1982; Khan & Dhar, 2010) such as bundling, in which
two or more products are packaged together (Stremersch & Tellis,
2002), and research suggests this practice increases sales (e.g.,
Chakravarti, Krish, Paul, & Srivastava, 2002; Yadav & Monroe,
1993). Leadership training programs that are advertised as a “bun-
dled” program (e.g., programs teaching leadership skills and pro-
viding exposure to a new technology) may also have greater
“sales” (i.e., voluntary attendance) than training interventions ad-
vertised as targeting a single leadership competency.
In addition to the list of recommendations provided in Table 8,
we also provide specific examples of highly effective training
design, delivery, and implementation elements in Table 9, which
were gathered by examining the primary studies within each
hypothesized moderator that exhibited strong effect sizes (e.g., if a
practitioner wanted to develop a training program with feedback,
s/he may use Table 9 for examples of how feedback has been
successfully incorporated into leadership training in previous research).

The current meta-analysis offers several contributions to the
leadership and training literatures. First, our results suggest that
leadership training is substantially more effective than previously
thought, leading to improvements in perceptions of utility and
satisfaction, learning, transfer to the job, organizational outcomes,
and subordinate outcomes. Moreover, all but seven of the 120
effect sizes were positive and significantly different from zero,
indicating that leadership training likely improves outcomes, re-
gardless of its design, delivery, and implementation elements (i.e.,
leadership training is rarely a “failure”; cf. Myatt, 2012). Second,
the current results suggest that leadership training is most effective
when the training program is based on a needs analysis, incorpo-
rates feedback, uses multiple delivery methods (especially prac-
tice), uses spaced training sessions, is conducted at a location that
is on-site, and uses face-to-face delivery that is not self-
administered. Third, our results also have a variety of practical
implications for the development of training programs, which we
have summarized in Table 8 and provided examples of in Table 9
in order to guide scientists and practitioners in the development of
evidence-based leadership training programs. Finally, we note that
although the current meta-analysis suggests leadership training is
effective, it does not promote a one-size-fits-all approach; many of
the moderators of leadership training effectiveness investigated in
the current study were important for some criteria but not all,
indicating that training program developers should first choose
their desired criterion (or criteria) and then develop the training
program based on this criterion.
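To make the practical magnitude of effects such as δ = .82 for transfer concrete, the reported standardized mean differences can be translated into McGraw and Wong’s (1992) common-language effect size, the probability that a randomly selected trainee outperforms a randomly selected untrained comparison employee. The sketch below is an interpretation aid supplied here for illustration (it is not part of the original analysis) and assumes normally distributed criterion scores with equal variances across groups:

```python
from statistics import NormalDist

def probability_of_superiority(d: float) -> float:
    """Common-language effect size (McGraw & Wong, 1992):
    P(random treated score > random control score) for two
    independent normal distributions with equal variance."""
    return NormalDist().cdf(d / 2 ** 0.5)

# Overall deltas reported in the current meta-analysis
for criterion, delta in [("reactions", 0.63), ("learning", 0.73),
                         ("transfer", 0.82), ("results", 0.72)]:
    print(f"{criterion}: delta = {delta:.2f} -> "
          f"P(superiority) = {probability_of_superiority(delta):.2f}")
```

Under these assumptions, a transfer effect of δ = .82 implies that a trained leader would outperform an untrained counterpart roughly 72% of the time, a figure that may be easier to communicate to stakeholders than a standardized mean difference.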
References

References marked with an asterisk were included in the meta-analysis.
Abrell, C., Rowold, J., Weibler, J., & Moenninghoff, M. (2011). Evaluation of a long-term transformational leadership development program. Zeitschrift für Personalforschung, 25, 205–224.
Adair, J. (1983). Effective leadership. London, UK: Pan.
Adcock-Shantz, J. (2011). A study of the impact of a leadership development program on a community college’s front-line and middle managers (Doctoral dissertation). Retrieved from ProQuest Dissertations & Theses A&I; ProQuest Dissertations & Theses Global. (3490538)
Agboola Sogunro, O. (1997). Impact of training on leadership development: Lessons from a leadership training program. Evaluation Review, 21, 713–737.
Albanese, R. (1967). A case study of executive development. Training and Development Journal, 21, 28–34.
Alessi, S. (2000). Simulation design for training and assessment. In H. F. O’Neil & D. H. Andrews (Eds.), Aircrew training and assessment (pp. 199–224). Mahwah, NJ: Erlbaum.
Alexander, B. K. (2014). A case for change: Assessment of an evidenced-based leadership development program. Dissertation Abstracts International Section A, 74, 9.
Al-Kassabi, M. A. (1985). A management training model for engineers in Saudi Arabia (Doctoral dissertation). Retrieved from ProQuest Dissertations & Theses A&I; ProQuest Dissertations & Theses Global.
Allen, L. A. (2010). An evaluation of a shared leadership training program (Doctoral dissertation). Retrieved from ProQuest Dissertations & Theses A&I; ProQuest Dissertations & Theses Global. (3420897)
Alliger, G. M., Tannenbaum, S. I., Bennett, W., Traver, H., & Shotland, A. (1997). A meta-analysis of the relations among training criteria. Personnel Psychology, 50, 341–358.