This article was downloaded by: [University of Pennsylvania]
On: 20 August 2012, At: 04:39
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Journal of Social Service Research
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/wssr20
Nonprofit Governance Quality: Concept and
Measurement
Jurgen Willems a , Gert Huybrechts a , Marc Jegers a , Bert Weijters b , Tim Vantilborgh c ,
Jemima Bidee c & Roland Pepermans c
a Vrije Universiteit Brussel, Applied Economics, Brussels, Belgium
b Vlerick Leuven Gent Management School, Marketing, Gent, Belgium
c Vrije Universiteit Brussel, Psychology and Educational Sciences–Work & Organization,
Brussels, Belgium
Version of record first published: 12 Jul 2012
To cite this article: Jurgen Willems, Gert Huybrechts, Marc Jegers, Bert Weijters, Tim Vantilborgh, Jemima Bidee & Roland
Pepermans (2012): Nonprofit Governance Quality: Concept and Measurement, Journal of Social Service Research, 38:4,
561-578
To link to this article: http://dx.doi.org/10.1080/01488376.2012.703578
Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions
This article may be used for research, teaching, and private study purposes. Any substantial or systematic
reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to
anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation that the contents
will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should
be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims,
proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in
connection with or arising out of the use of this material.
Journal of Social Service Research, 38:561–578, 2012
Copyright © Taylor & Francis Group, LLC
ISSN: 0148-8376 print / 1540-7314 online
DOI: 10.1080/01488376.2012.703578
Nonprofit Governance Quality: Concept and Measurement
Jurgen Willems
Gert Huybrechts
Marc Jegers
Bert Weijters
Tim Vantilborgh
Jemima Bidee
Roland Pepermans
ABSTRACT. A nonprofit “governance quality index” was developed to enable verification and falsifi-
cation of contemporary theoretical insights on social service organizations. Indicators were generated
based on an extensive qualitative exploration. For the quantitative validation, a data set was composed of
526 respondents from 52 organizations. Five subdimensions of governance quality are introduced and
are recommended to be used as separate scales, rather than combined into a single score on governance
quality. Furthermore, the recommendation is made to rely on multiple raters per organization to assess
governance quality or related concepts, given the substantial within-organization variance found.
KEYWORDS. Nonprofit governance, formative construct, second-order construct, multilevel analysis
Jurgen Willems is a post-doctoral researcher at Vrije Universiteit Brussel, Applied Economics, Brussels, Belgium.
Gert Huybrechts is a doctoral researcher at Vrije Universiteit Brussel, Applied Economics, Brussels, Belgium.
Marc Jegers is a Professor at Vrije Universiteit Brussel, Applied Economics, Brussels, Belgium.
Bert Weijters is a Professor at Vlerick Leuven Gent Management School, Marketing, Gent, Belgium.
Tim Vantilborgh is a post-doctoral researcher at Vrije Universiteit Brussel, Psychology and Educational Sciences–Work & Organization, Brussels, Belgium.
Jemima Bidee is a doctoral researcher at Vrije Universiteit Brussel, Psychology and Educational Sciences–Work & Organization, Brussels, Belgium.
Roland Pepermans is a Professor at Vrije Universiteit Brussel, Psychology and Educational Sciences–Work & Organization, Brussels, Belgium.
Address correspondence to: Jurgen Willems, Vrije Universiteit Brussel, Pleinlaan 2, BE-1050 Brussels, Belgium (E-mail: jurgen.willems@vub.ac.be).
As nonprofit organizations are defined by an objective they do not pursue (i.e., not distributing profit [Jegers, 2008]), various organizational goals exist within the nonprofit and/or social service sectors (McClusky, 2002; Powell & Steinberg, 2006; D. H. Smith & Shen, 1996). Moreover, organizations frequently strive for multiple goals simultaneously (Anheier, 2005; Jegers, 2008). As a result, nonprofit and social service goals can be complementary, meaning that the achievement of one goal could reinforce the achievement of other goals. However, these goals could at the same time be in competition with each other, as scarce resources have to be allocated to different projects and actions, each focusing on the fulfillment of particular goals.
Furthermore, the achievement of organizational
goals by nonprofit and/or social service organi-
zations is often difficult to quantify or to objec-
tively compare between organizations (Bunger,
2010; Chaves & Tsitsos, 2001; McDonald &
Marston, 2002). As a result, individual opinions
on how to achieve organizational goals and how
they should be managed and negotiated are just
as diverse (Herman & Renz, 1999, 2004; D. H.
Smith & Shen). Therefore, assessing the per-
formance and effectiveness of nonprofit and/or
social service organizations remains an impor-
tant matter of debate (DiMaggio, 2001; Forbes,
1998; Herman & Renz, 1999, 2004).
Given the heterogeneous range of potential
organizational goals and the subjective nature
of their assessment, the practices applied to
ensure certain outcomes, rather than the out-
comes themselves, often form the basis of the
evaluation and comparison of nonprofit and so-
cial service effectiveness (Bradshaw, Murray,
& Wolpin, 1992; Campbell, 2002; Cornforth,
2001; Herman, Renz, & Heimovics, 1997; Si-
ciliano, 1997). Herman and Renz (1998) find
that the more nonprofit leaders focus and report
to their outside stakeholders on commonly ac-
cepted and rewarded management procedures,
the more positive and consistent stakeholder
judgments will be as to the organization’s effec-
tiveness. Similarly, D. H. Smith and Shen (1996)
revealed for volunteer-led organizations that the
application of formalized and broadly accepted
governance practices in nonprofit organizations
enhances their reputational effectiveness. For
the particular context of social services orga-
nizations, Green and Griesinger (1996) find that
board practices are better developed for organi-
zations that are perceived to be more effective.
As a consequence, a vast amount of practitioner
and academic contributions have been dealing
with describing and studying proper governance
practices assumed to ensure effective nonprofit
and/or social service performance (Cornforth,
2003; Cornforth & Edwards, 1999; Edwards &
Cornforth, 2003; Jegers, 2009; Parker, 2007).
This trend can be framed within a broader
evolution of the ongoing professionalization of
social services organizations and nonprofit or-
ganizations in general (Babiak, 2009; Beck,
Lengnick-Hall, & Lengnick-Hall, 2008; Hwang
& Powell, 2009; Johnston, 2010; Kong, 2007).
As social services organizations often depend
on outside funding, they have increasingly been
confronted with managerial and governance re-
quirements to maintain or increase this funding
(Cutler & Waine, 2003; Romzek & Johnston,
2005). As a result, governance quality has be-
come progressively important in nonprofit and
social service literature (Bunger, 2010; Mehro-
tra, 2006; Mi, 2007; Walker, 2002). In the
development of these insights, substantial inspi-
ration is gained from governance practices de-
scribed and studied for for-profit organizations.
However, caution is warranted as too frequently
the assumption is made that successful business
models can be adopted in a nonprofit context
with only minimal changes (Kong). The basic
suppositions of those models are traditionally
based on organizational features such as a single
goal strategy (financial profit), a dominant group
of stakeholders (owners), and the fact that the
one who is consuming the organization’s out-
put is also the one who is paying for it. Fur-
thermore, legal requirements applicable to non-
profit organizations are very different from the
requirements for for-profit organizations (Stark,
2010). In addition, too often, a one-size-fits-all
management approach is suggested (McClusky,
2002) but does not take into account the large
heterogeneity of nonprofit and/or social services
organizations (McClusky; Powell & Steinberg,
2006; D. H. Smith & Shen, 1996).
Several articles have dealt with nonprofit
governance while paying special attention to
the unique characteristics of nonprofit organiza-
tions, such as their various organizational goals
and their diverse set of stakeholders (Cornforth,
2001, 2003; Jegers, 2009; McCambridge, 2004;
Parker, 2007; Saidel & Harlan, 1998; S. R.
Smith, 2008; Stone & Ostrower, 2007). How-
ever, contemporary contributions are still mainly
theoretical and/or based on qualitative and anec-
dotal descriptions, which leaves ample oppor-
tunities for falsification and verification. There-
fore, the study reported in this article had the aim
to develop a construct that can be used for mea-
suring and quantifying the governance quality of
nonprofit and/or social services organizations in
general. As a result, the research questions are:
Which set of indicators is relevant to measure
nonprofit governance quality? And once a set of
indicators is proposed, to what extent are they
valid and reliable to be used in various nonprofit
and/or social services contexts?
In the next section, three important challenges
with respect to the proper development of a
governance quality index are summarized. Sub-
sequently, in the “Methods” section, the devel-
opment and validation of the index is described.
Based on an extensive qualitative exploration,
subdimensions of governance quality are de-
fined and indicators are formulated. A question-
naire has been developed and completed by 526
respondents holding leadership positions in 52
nonprofit organizations. Based on these data, va-
lidity and reliability are evaluated using various
structural equation models (SEMs). Finally,
overall usability and consequences of the gover-
nance quality index are discussed, conclusions
are made, and avenues for further research are
suggested.
QUANTIFYING GOVERNANCE
QUALITY—IMPORTANT
CHALLENGES
There are three important challenges related
to the exploration and quantification of nonprofit
governance quality. First, in contrast to previous
attempts to assess related concepts that apply
the reflective approach (Gill, Flynn, & Reissing,
2005; Jackson & Holland, 1998), the authors in-
dicate that the use of a formative approach for the
measurement of “governance quality” is most
appropriate. In short, in the reflective approach,
the assumption is made that the latent variable
under study is causing the different indicators
measured, while for a formative concept, such
as “governance quality,” the indicators measured
constitute, by definition, the latent variable. As
a consequence, the causal direction between la-
tent variable and indicators is opposite for the
two types of constructs (Bollen & Lennox, 1991;
Borsboom, Mellenbergh, & van Heerden, 2003).
In the context of governance quality, it is the de-
gree of compliance to a set of certain standards,
broadly agreed upon, that by definition deter-
mines in a normative way an organization’s gov-
ernance quality (Bradshaw, Hayday, Armstrong,
Leveque, & Rykert, 1998). In practice, this has
been operationalized through various codes of
conduct on nonprofit governance. These codes
list a set of concrete aspects that altogether con-
stitute the total concept of governance quality.
The degree to which an organization complies
with each of these aspects determines an orga-
nization’s overall governance quality (Dawson
& Dunn, 2006). Therefore, the aim is to in-
vestigate the unique contribution of each of the
measured aspects to overall governance quality.
As the reflective approach focuses on the com-
mon contribution of indicators (in practice, the
common variance among indicators measured;
DeVellis, 2003), it cannot be used for a proper
assessment and quantification of an organiza-
tion’s governance quality. Methodological con-
tributions have shown convincingly that choos-
ing the wrong approach for different types of
latent variables leads to substantial errors regard-
ing reliability and validity (Diamantopoulos &
Siguaw, 2006). As a result, a specific approach
for the development and validation of such an in-
dex has to be followed (Diamantopoulos & Win-
klhofer, 2001; Jarvis, Mackenzie, & Podsakoff,
2003; Sarstedt & Schloderer, 2010).
Second, governance quality, as described by
both academics and practitioners, is composed
of several conceptual levels. When governance
quality is discussed, one is often confronted with
several subdimensions that altogether define the
overall concept of governance quality. Exam-
ples of subdimensions of governance quality are
the degree to which external stakeholders can
participate in organizational decision-making or
the composition of the leadership team (LeRoux,
2009; Steane & Christie, 2001). However, these
subdimensions can hardly be captured by a sin-
gle indicator. Therefore, the subdimensions of
governance quality, which are also very promi-
nent in codes of conduct, should in turn be ap-
proached as formative constructs. As a result,
the suggestion is made by the authors to quantify
governance quality by use of a second-order for-
mative construct in which the unobserved latent
variable “governance quality” is by definition
composed by a series of other unobserved latent
variables, in turn defined by a set of concrete
indicators. Such a type of construct is referred
to as a “formative first-order, formative second-
order” construct (Jarvis et al., 2003, p. 205).
Third, because the assessment of governance
quality consists of organizational characteris-
tics rated by individuals, an important distinc-
tion has to be made between unique perceptions
of individual raters and shared perceptions with
other raters involved in the same organization.
As mentioned before, individual judgment of
nonprofit effectiveness, performance, and gov-
ernance practices can vary widely among dif-
ferent people. However, both in qualitative and
quantitative research on nonprofit and social ser-
vices practices, researchers still often rely on a
single rater’s opinion to quantify such organi-
zational characteristics. It would be more ro-
bust to resort to several raters for each organi-
zation studied. Lynn, Heinrich, and Hill (2000)
recognize the great value of multilevel analy-
sis methods for nonprofit research. A multilevel
approach can better reveal insights of manage-
rial interactions of different individuals within
organizations (Hitt, Beamish, Jackson, & Math-
ieu, 2007). By combining opinions of different
raters per organization, unique personal percep-
tions and shared perceptions can be sorted out
and scrutinized in further analyses. In addition,
it provides a reference regarding interrater re-
liability of indicators probed for among differ-
ent people in an organization (Boyer & Verma,
2000).
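As an illustration only, the split between shared and unique perceptions can be sketched with a one-way ANOVA estimate of the intraclass correlation, ICC(1). This is a simpler stand-in for the multilevel approach used in the study, not a reproduction of it; the function name `icc1` and the example ratings are hypothetical.

```python
def icc1(groups):
    """One-way ANOVA ICC(1) for ratings grouped by organization.

    groups: list of lists, one inner list of numeric ratings per organization.
    Group sizes may differ; an adjusted average group size (n0) is used.
    """
    n_groups = len(groups)
    sizes = [len(ratings) for ratings in groups]
    n_total = sum(sizes)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two groups and more ratings than groups")

    grand_mean = sum(sum(ratings) for ratings in groups) / n_total
    means = [sum(ratings) / len(ratings) for ratings in groups]

    # Between-organization and within-organization sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for ratings, m in zip(groups, means) for x in ratings)

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    # Adjusted group size for unbalanced designs.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (n_groups - 1)

    denominator = ms_between + (n0 - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC is undefined when all ratings are identical")
    return (ms_between - ms_within) / denominator


# Hypothetical 7-point ratings from three organizations.
example = [[5, 6, 5, 6], [2, 3, 2], [4, 4, 5, 3, 4]]
print(round(icc1(example), 3))
```

A value near 1 would indicate that raters within the same organization largely agree; a value near 0 (or below) would indicate that most variance lies between individuals rather than between organizations.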
METHODS AND RESULTS
Specific steps are necessary for testing and
ensuring the validity and reliability of forma-
tive constructs (Baxter, 2009; Bollen & Lennox,
1991; Diamantopoulos & Siguaw, 2006; Dia-
mantopoulos & Winklhofer, 2001; Jarvis et al.,
2003; Petter, Straub, & Rai, 2007). In general,
a distinction can be made between steps “prior
to data collection” and “after data collection”
(Petter et al.). In the remainder of this section, the
concrete steps taken in this study are explained
in three subsections: “Prior to Data Collection,”
“Data Collection,” and “After Data Collection.”
Within these sections, the respective results are
also presented. This makes the explanation of the
consecutive steps clearer as they often depend
on results of previous steps. After this section,
implications for researchers and practitioners of
the nonprofit and/or social services domains are
discussed.
Prior to Data Collection
Prior to the data collection, steps mainly fo-
cus on content and indicator validity. Formative
latent variables are determined by their indica-
tors. As a result, defining a formative concept
consists of formulating its indicators (Diaman-
topoulos & Winklhofer, 2001). However, the de-
sign of the SEM to test the construct after data
collection should be decided upon before data
collection (Petter et al., 2007). To be testable,
the SEM should be “identified,” meaning that
the degrees of freedom (DF) should at least be
0 but preferably higher. However, a formative
construct modeled as such, with only formative
indicators, is an unidentified model (Jarvis et al.,
2003; Petter et al.). Therefore, for the purpose of
being testable, the set of formative indicators and
the respective latent variable should be modeled
in relation to other latent and/or observed vari-
ables. As a result, assessing the unique validity of
the formative indicators in relation to the latent
variable as such is impossible (Bollen & Lennox,
1991). Nevertheless, next to the formative in-
dicators, additional reflective indicators can be
added and are assumed to relate as direct effects
to the formative latent variable. This is referred
to as a multiple indicators and multiple causes
model (MIMIC). By adding at least two reflective indicators (Jarvis et al.; Jöreskog & Goldberger, 1975), a standalone formative construct
can be tested—however, under the assumption
that the formative concept studied can be “prop-
erly summarized” by at least two reflective indi-
cators.
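The MIMIC structure described above can be written out as follows. The notation is an assumption for illustration and is not taken from the article: x_1, ..., x_q are the formative indicators, y_1 and y_2 the two reflective indicators, η the latent variable, and ζ, ε_1, ε_2 the error terms.

```latex
\begin{aligned}
\eta &= \gamma_1 x_1 + \gamma_2 x_2 + \dots + \gamma_q x_q + \zeta
  && \text{(formative part: indicators cause the latent variable)} \\
y_1 &= \lambda_1 \eta + \varepsilon_1, \qquad
y_2 = \lambda_2 \eta + \varepsilon_2
  && \text{(reflective part: the latent variable causes the indicators)}
\end{aligned}
```

Without the two reflective equations, the weights γ and the variance of ζ cannot be estimated separately, which is why the purely formative specification is unidentified.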
Based on qualitative exploration, indicators
were developed, and as a consequence, gover-
nance quality was defined. First, both academic
and practitioner contributions were reviewed to
obtain a detailed concept definition, extract sub-
dimensions, and formulate the formative and re-
flective indicators. Academic assessment tools
or checklists, such as the ones developed by Gill
et al. (2005), Jackson and Holland (1998), and
Herman and Renz (2004), were complemented
with several codes of conduct dealing with non-
profit governance quality in nonprofit organiza-
tions, which is in the practitioners’ literature
more commonly referred to as “good gover-
nance.” Second, an online open question survey
was sent to a snowball sample of different ex-
perts such as directors, executives, advisors, co-
ordinators, and consultants of nonprofit and so-
cial services organizations. This survey had the
purpose of grasping aspects of governance qual-
ity not contained in existing checklists, codes of
conduct, and/or academic or practitioner liter-
ature. Questions probed for “best” and “good”
practice examples and rules of thumb regarding
nonprofit governance. In total, 37 questionnaires
were completed. Additional face-to-face inter-
views were conducted by the authors with 20
of the respondents. On average, these interviews
took about 2 hours. All interviews were recorded
and transcribed and served to fine-tune the word-
ing of the indicators. Third, discussions were
launched by the authors in several online discus-
sion forums particularly addressed to nonprofit
experts and professionals as a way of “virtual
focus panels.” Rather than asking for extensive
lists of subdimensions and indicators of gover-
nance quality, short lists of key indicators of high
governance quality were solicited. This enabled
the authors to obtain insight into the priority
and relative importance of certain dimensions
and potential indicators. The authors continued
reviewing new data until no new aspects were
revealed (Maxwell, 2004; Yin, 1993).
Subsequently, the qualitative data and doc-
uments collected were reviewed and coded in
three steps, resulting in an extensive list of all
possible aspects of governance quality. These as-
pects were not mutually exclusive and differed
strongly in their level of detail. Firstly, in a basic
coding step, at least one connotation was given
to each aspect listed. Secondly, meta-coding of
these connotations aimed to distinguish separate
subdimensions of governance quality. This hap-
pened in a repetitive way to define those sub-
dimensions that made classification of the as-
pects identified as unambiguous and mutually
exclusive as possible. Five distinct subdimen-
sions emerged: a) external stakeholder involve-
ment, b) consistent planning, c) structures and
procedures, d) continuous improvement, and e)
leadership team dynamics (definitions are given
in the Appendix). Thirdly, for each of the sub-
dimensions, again repetitive coding searched for
distinct indicators, each dealing with a unique
formative feature of that dimension. Each of
them was worded as an indicator that could be
assessed using a 7-point Likert scale (strongly
disagree to strongly agree). Items were carefully
worded to avoid potential measurement errors as
much as possible (DeVellis, 2003; Foddy, 1993;
Steenkamp, De Jong, & Baumgartner, 2010).
The authors chose to formulate the indicators
in such a way that they can be applied to a broad
set of situations and in different types of non-
profit and social services organizations. Never-
theless, reference is made as much as possible
to concrete and practical aspects encountered by
nonprofit and social services leadership teams
(see the Appendix for a complete description).
Two additional tests were performed before
starting the data collection. First, to improve con-
tent and indicator validity, an additional expert
rating was organized. Two independent raters
were separately given the definitions of the dimensions and their indicators. They were asked to classify each indicator in one of the five dimensions and to indicate whether it positively or negatively affects governance quality. Fleiss’s Kappa was used to evaluate the agreement of the two raters with the original classification resulting from the definition and coding process (Fleiss, 1971; Gamer, Lemon, Fellows, & Singh, 2010). Fleiss’s Kappa has a maximum of 1, with values close to 1 showing high agreement and values at or below 0 showing agreement no better than chance. A first rating gave a Fleiss’s Kappa value of .607 (p < .001) for the classification of indicators in the five dimensions and .924 (p < .001) for the positive versus negative formulation. Several definitions and indicators were slightly reworded to make their particular focus and aim clearer. After these modifications, all definitions and indicators were rated again by the two raters, resulting in a Fleiss’s Kappa value of .879 (p < .001) for the dimension classification and a value of 1 (p < .001) for the positive versus negative formulation of indicators. In addition, for all of
the indicators, at least one rater agreed with the
final classification. As a result, it can be con-
cluded that good content and indicator validity
was achieved.
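For readers unfamiliar with the statistic, a minimal sketch of Fleiss's Kappa follows. It assumes that every indicator receives the same number of classifications; the function name `fleiss_kappa` and the example table are hypothetical and do not reproduce the study's ratings or software.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a table of category counts.

    counts[i][j] = number of raters who placed item i in category j.
    Every item must have the same number of ratings (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that fall in each category.
    p_category = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_category)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical table: 4 indicators, 3 classifications each, 2 dimensions.
ratings = [[3, 0], [2, 1], [0, 3], [0, 3]]
print(round(fleiss_kappa(ratings), 3))
```

The function returns negative values when observed agreement falls below chance agreement.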
Second, an online pilot questionnaire was as-
sessed by two waves of judges, meaning that
adaptations based on the comments of the first
wave of judges were retested in the second wave
(about 20 judges in total with either experience
in survey design or in nonprofit governance).
Judges were encouraged to assess the question-
naire on clarity and user friendliness. Changes
mainly concerned presentation and comprehen-
sibility of the indicators. Indicators per dimen-
sion were randomly presented to the respondents
and answers were mandatory. In the end, 26
indicators in five subdimensions constitute the
“Governance Quality Index” (GQI).
Data Collection
A questionnaire containing the GQI was com-
pleted by 526 respondents holding leadership
positions (directors on the board, executives,
committee members, etc.) in 52 different orga-
nizations. A multistage sampling procedure was
applied, in which organizations were selected
and subsequently respondents within these orga-
nizations were addressed. As the authors aimed
to test the GQI for a broad and heterogeneous
range of nonprofit organizations, they took a re-
sponsive design approach (Groves & Heeringa,
2006), in which desired heterogeneity and ac-
cessibility were traded off while composing the
sample. During a period of 4 months, from Jan-
uary 2010 through April 2010, efforts were made
on a daily basis to recruit new organizations
or to remind contact persons either by mail or
telephone to respond. Three different precontact
channels were deployed (de Leeuw, 2005). First,
umbrella organizations were asked to promote
participation in the study among their mem-
ber organizations. However, these umbrella or-
ganizations encompass organizations based on
similar activities, common goals, and/or com-
mon ideology. Therefore, to improve the het-
erogeneity of the sample, a substantial set of
organizations was directly addressed as well (the
second channel). These organizations were cho-
sen based on their communicated characteris-
tics often extensively documented by themselves
and/or by others on the Internet. Criteria taken
into account for their selection were size, goals,
industry or subsector, and types of funding. As
a third way of identifying and selecting organi-
zations, the authors applied a snowball sampling
approach through personal networks. On the one
hand, some key persons were addressed directly
by the authors, while on the other hand, respon-
dents could suggest other people and/or organi-
zations that could be invited to participate. In
the end, the sample included lobby groups, ser-
vice organizations, environmental organizations,
development organizations, community centers,
theaters, care centers, and sports federations.
Both large and grassroots organizations were in-
cluded, all of whom had various types of funding
sources (own revenues, private or public fund-
ing, or combinations). Once an organization de-
cided to participate, all people involved in the
governance and management processes of the
organization, denoted as the “leadership team,”
were invited to fill out the questionnaire. Prac-
tically, the leadership team included directors
of the board, executives, strategic collaborators,
and advisors in committees. Before sending e-
mail invitations to the respondents, a high-level
contact person within the organization—often
the chairman or executive director—introduced
the research and encouraged people to participate in the survey. In addition, respondents received an automatic reminder by e-mail after 1 week if they had not yet completed the questionnaire.
The group of invited respondents per organization varied between 5 and 40 (M = 15.27). Response rates within organizations ranged from 52.6% to 100% (M = 71.2%). Of the respondents, 59.5% were officially appointed as directors of the board of the participating organization; 29.5% of the respondents were female; and 94% had a higher education degree (after high school). The average age was 52.0 years (minimum = 22 years; maximum = 84 years; SD = 12.41).
After Data Collection
Several SEMs were specified, depending on
different assumptions underlying the testing
of formative constructs (Jarvis et al., 2003).
Based on the models with sufficient “model
fit”—meaning that the total set of assump-
tions made regarding observed and latent vari-
ables are in accordance with the actual data
observed—construct validity and reliability of
the indicators constituting the formative con-
struct can be assessed (Bollen & Lennox,
1991; Diamantopoulos & Siguaw, 2006; Dia-
mantopoulos & Winklhofer, 2001).
Construct validity is established when indica-
tors significantly contribute to the latent variable.
In other words, similar to traditional regression
analyses, each indicator is expected to have a
unique and substantial impact on the “dependent
variable” (i.e., the latent variable, controlled for
the other indicators). When certain indicators
turn out not to be significant, a trade-off has to
be made by the researcher: Either such indicators
can be left out, possibly deteriorating content va-
lidity, or they should be kept for content reasons
but with no substantial contribution to the latent
variable (Petter et al., 2007).
Furthermore, a formative construct is con-
sidered reliable in case its indicators show low
multicollinearity. In contrast to the required high
Cronbach’s alpha value indicating high common
variance for reflective indicators, formative indi-
cators should be linearly independent from each
other. Diamantopoulos and Siguaw (2006) pro-
pose a maximum cutoff of 3.3 for the variance
inflation factor (VIF) for each of the formative
indicators.
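The VIF check can be sketched as follows. The sketch relies on the standard result that the VIF of each indicator equals the corresponding diagonal element of the inverse correlation matrix; the helper names and the example data are hypothetical, and only the 3.3 cutoff comes from the text.

```python
VIF_CUTOFF = 3.3  # maximum proposed by Diamantopoulos and Siguaw (2006)


def correlation_matrix(columns):
    """Pearson correlations between indicator columns (equal-length lists)."""
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([value - mean for value in col])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    norms = [dot(c, c) ** 0.5 for c in centered]
    k = len(columns)
    return [
        [dot(centered[i], centered[j]) / (norms[i] * norms[j]) for j in range(k)]
        for i in range(k)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("indicators are (nearly) perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per indicator: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical 7-point responses on three indicators from six respondents.
indicators = [
    [5, 6, 4, 7, 3, 5],
    [4, 6, 5, 6, 2, 5],
    [2, 5, 6, 3, 4, 7],
]
for index, value in enumerate(vif(indicators), start=1):
    flag = "exceeds cutoff" if value > VIF_CUTOFF else "ok"
    print(f"indicator {index}: VIF = {value:.2f} ({flag})")
```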
Two types of SEMs were analyzed. A MIMIC
model was tested, assessing all indicators for
the five dimensions in relation to the overall
latent variable of governance quality (second-
order formative construct: Type IV, in classifi-
cation by Jarvis et al., 2003, p. 205; Figure 1
represents the measurement model). In paral-
lel, five distinct MIMIC models, one for each
subdimension, were tested to validate each as a
standalone formative construct. Table 1 gives a
summary of the SEMs (Models A through F).
All models were tested under the assumption that all formative indicators covary freely (Jarvis et al.), as sufficient model fits were found only under this assumption. However, multicollinearity among indicators was low, as none of them has a VIF value larger than the 3.3 cutoff threshold (Diamantopoulos & Siguaw, 2006; maximum is 1.736). This low multicollinearity establishes good indicator reliability and, as a consequence, allows free covariances between all formative indicators in the models specified. VIF values per indicator are given in the Appendix.
For the evaluation of goodness-of-fit (GOF),
indicators based on the discussion by Hu and
Bentler (1999) and Marsh, Hau, and Wen (2004)
were used: the root mean square error of approximation (RMSEA) with a cutoff < .06, the Comparative Fit Index (CFI) with a cutoff > .95, and the standardized root mean square residual (SRMR) with a cutoff < .08. For further information and better comparability, the 90% confidence interval of the RMSEA is reported: lower bound (CI-LB) and upper bound (CI-UB). The DFs in relation to the chi-squared (χ²) metric of the tested models are also given.
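The three cutoffs can be applied mechanically, as in the sketch below. The function name `check_fit` and the example values are hypothetical and are not taken from Table 1; only the cutoffs themselves come from the text.

```python
# Cutoffs as stated in the text (Hu & Bentler, 1999; Marsh, Hau, & Wen, 2004).
CUTOFFS = {
    "rmsea": ("<", 0.06),
    "cfi": (">", 0.95),
    "srmr": ("<", 0.08),
}


def check_fit(rmsea, cfi, srmr):
    """Return, per fit index, whether the value meets its cutoff."""
    values = {"rmsea": rmsea, "cfi": cfi, "srmr": srmr}
    result = {}
    for name, (direction, cutoff) in CUTOFFS.items():
        value = values[name]
        result[name] = value < cutoff if direction == "<" else value > cutoff
    return result


# Hypothetical model: RMSEA slightly above its cutoff, other indices acceptable.
print(check_fit(rmsea=0.07, cfi=0.97, srmr=0.05))
```

A per-index result, rather than a single pass/fail flag, makes it visible when a model misses only one criterion by a small margin.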
Good model fit for the overall MIMIC model was found, including all 26 formative indicators constituting the five latent dimensions, in turn constituting the latent variable “governance quality” (see Model A in Table 1 and Figure 1). This model includes 12 reflective indicators, 2 for each of the five dimensions and 2 for the overall latent variable “governance quality.” All formative indicators contribute significantly to their respective latent dimension variables (p < .05). Also for the separate MIMIC models for
each subdimension, all indicators have loadings
significantly different from 0. These loadings
are given in the Appendix. GOF statistics for
these five separate models are also given in Ta-
ble 1. For Models C (Consistent Planning) and
F (Leadership Team Dynamics), the RMSEA
statistics do not meet the required cutoff value
(<.06; respectively, .065 and .064). As these
deviations from the cutoff value are only minimal,
and as all other cutoff values for the other GOF
statistics are met, these models were used for
further interpretation. As a result, high construct
validity for the indicators of each of the five sub-
dimensions of governance quality has been ob-
tained. However, for the second-order loadings
(i.e., the contribution of the latent variables quan-
tifying the subdimensions on the overall latent
variable [i.e., governance quality]), the model
shows that only three out of the five significantly
differ from 0. These subdimensions with their
standardized loadings are: a) Consistent Plan-
ning (.308), b) Structures and Procedures (.169),
and c) Leadership Team Dynamics (.513). No
significant loadings on the overall latent vari-
able of governance quality could thus be found
for d) External Stakeholder Involvement and e)
Continuous Improvement.
FIGURE 1. Measurement Model of Second-Order Formative Construct: Governance Quality Index.
As indicators of governance quality probe for
individual perceptions on the situation of an or-
ganization, the authors also looked at the extent
to which respondents of the same organization
share similar perceptions. This relates to inter-
rater reliability, defined as the proportional con-
sistency of variance among raters of the same
organization (Boyer & Verma, 2000). To elab-
orate on this division of unique individual and
shared opinions, intraclass correlations (ICCs)
were calculated for the different subdimensions
of governance quality by use of a multilevel
variance analysis (Hitt et al., 2007; Jones & Sub-
ramanian, 2009; Maas & Hox, 2005; Rasbash,
Charlton, Browne, Healy, & Cameron, 2005). A
three-level model was specified, where the five
subdimension scores were nested within individ-
uals, in turn nested within organizations. As a re-
sult, it can be assessed whether variances of and
covariances between dimension scores are de-
termined at the individual or the organizational
level.

TABLE 1. Goodness-of-Fit Statistics for Tested Structural Equation Models.

                        Model A         Model B       Model C       Model D       Model E       Model F
Specifications
  Latent variable(s)    Governance      External      Consistent    Structures    Continuous    Leadership
                        Quality With    Stakeholder   Planning      and           Improvement   Team
                        Five Latent     Involvement                 Procedures                  Dynamics
                        Subdimensions∗
  Order of construct    Second order    First order   First order   First order   First order   First order
  Formative indicators  26              5             5             5             6             5
  Reflective indicators 12              2             2             2             2             2
  DF                    315             4             4             4             5             4
Goodness of Fit
  RMSEA                 .046            <.010         .065          <.010         .053          .064
  CI-LB; CI-UB          .041; .051      .00; .055     .028; .11     .00; .016     .015; .091    .026; .100
  CFI                   .99             1.00          .99           1.00          1.00          1.00
  SRMR                  .042            .008          .014          .005          .010          .012
  χ2                    694.17          2.75          13.10         3.61          12.40         12.69
  χ2 p-value            <.010           .600          .011          .460          .028          .013
∗Subdimensions: 1) External Stakeholder Involvement, 2) Consistent Planning, 3) Structures and Procedures, 4) Continuous Improvement,
and 5) Leadership Team Dynamics.
For each respondent, the five dimension
scores were calculated by weighting the respec-
tive indicators of those dimensions based on the
standardized loadings of the separate MIMIC
models (Models B through F). Multicollinearity
of these five scores is acceptable as none of the
VIF values exceeds the 3.3 cutoff value (maximum
is 2.661). The ICC (denoted as ICC-var) for each
of the dimension scores is defined as the variance
at the organizational level (between-group
variance) divided by the total variance of that
dimension score (i.e., the variance at the
organizational level plus the variance at the
individual level [within-group variance]). A similar
metric (ICC-cov) quantifies the proportion of the
covariance between two subdimension scores that
is attributable to the organizational level, relative
to the total covariance between those two
subdimensions. Significance tests indicate whether
variances or covariances at the group level are
significantly different from 0. Results of ICC-var
and ICC-cov are given in Table 2. Table 3 shows
the correlations at the individual and group levels
between the dimension scores of the overall
governance quality. All correlations, at both
levels, are significant (p < .05).
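Both ratios share the same form: the organizational-level component divided by the sum of the organizational-level and individual-level components. A minimal Python sketch follows, assuming that these components have already been estimated by a multilevel model. The function name `icc` and all numeric components are hypothetical and are not taken from the study's data.

```python
def icc(between, within):
    """Share of a variance or covariance located at the organizational level.

    between: component at the organizational level (between-group)
    within:  component at the individual level (within-group)
    """
    total = between + within
    if total == 0:
        raise ValueError("total (co)variance is zero; the ratio is undefined")
    return between / total


# Hypothetical components for one dimension score (ICC-var) ...
icc_var = icc(between=0.30, within=1.00)
# ... and for the covariance between two dimension scores (ICC-cov).
icc_cov = icc(between=0.20, within=0.45)
```

With these illustrative inputs, `icc_var` is 0.30 / 1.30 and `icc_cov` is 0.20 / 0.65; the same function serves both metrics because only the inputs differ.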
A substantial part of the variances and co-
variances of the five dimension scores can be
attributed to the organizational level: between
17.1% and 23.8% for the variances (ICC-var)
and between 24.0% and 38.8% for the covari-
ances (ICC-cov). All proportions of organiza-
tional level variances and covariances are signifi-
cant. Correlations between the dimension scores
range between .429 and .657 at the individual
level and between .703 and .875 at the organi-
zational level. As a result, it is concluded that
a substantial part of the individual perceptions
on governance quality, and how its dimensions
relate, is shared among leaders from the same
organization. Dimension scores correlate more
strongly at the organizational level than at the
individual level.

TABLE 2. Intraclass Correlations as Proportions of the Organizational-Level Variances of and
Covariances Between Dimension Scores.

                                  External      Consistent    Structures    Continuous    Leadership
                                  Stakeholder   Planning      and           Improvement   Team
                                  Involvement                 Procedures                  Dynamics
External Stakeholder Involvement  23.08%∗       —             —             —             —
Consistent Planning               38.82%∗       22.92%∗       —             —             —
Structures and Procedures         31.25%∗       28.57%∗       23.81%∗       —             —
Continuous Improvement            29.41%∗       28.00%∗       27.27%∗       20.00%∗       —
Leadership Team Dynamics          27.78%∗       24.00%∗       26.92%∗       24.00%∗       17.07%∗
∗Significance at p < .05.

DISCUSSION
In the process of developing a GQI, several
steps were taken and various tests were
performed to assess and ensure its validity and
reliability. Before data were collected, steps fo-
cused on content validity through an extensive
qualitative approach, while after data collection,
the construct validity and reliability were as-
sessed. Given the significant loadings and the
low multicollinearity of the indicators of the
five subdimensions of governance quality, strong
construct validity and reliability of the first-order
indicators were found. Consequently, the authors
acknowledge the strong usability of each of the
five separate formative constructs respectively
measuring a) External Stakeholder Involvement,
b) Consistent Planning, c) Structures and Pro-
cedures, d) Continuous Improvement, and e)
Leadership Team Dynamics. However, for the
second-order construct (i.e., the loadings of the
five subdimensions on the overall governance
quality latent variable), only partial construct
validity was found. Significant loadings were
found only for three out of the five subdimen-
sions. As a result, an important tradeoff between
content validity and construct validity has to be
made. Nonsignificant indicators either could be
left out (priority for construct validity) or could
be kept (priority for content validity). In the con-
text of a second-order construct, leaving the non-
significant dimensions out is certainly not appro-
priate, as first-order parts are convincingly valid
and contain substantial information. Instead,
the authors advise, both for practitioners and
academics, a more nuanced approach by including
the five dimension scores separately in models
and/or reports, rather than combining them in a
single index. In such a way, sufficient detail for
each of the subdimensions can be maintained
and they can be analyzed in relation to each other
and to other variables.

TABLE 3. Correlations at the Individual Level and at the Organizational Level Between the
Dimension Scores of Governance Quality.

                            External Stakeholder      Consistent Planning       Structures and            Continuous Improvement
                            Involvement                                         Procedures
                            Individual  Organization  Individual  Organization  Individual  Organization  Individual  Organization
Consistent Planning         .528∗       .871∗         —           —             —           —             —           —
Structures and Procedures   .429∗       .721∗         .570∗       .809∗         —           —             —           —
Continuous Improvement      .541∗       .844∗         .616∗       .875∗         .587∗       .772∗         —           —
Leadership Team Dynamics    .507∗       .762∗         .535∗       .703∗         .565∗       .792∗         .657∗       .851∗
∗Significance at p < .05.
From the multilevel variance analysis, the
conclusion can be made that a substantial part of
individual perceptions regarding the subdimen-
sions of governance quality is shared with others
of the same organization (approximately a quar-
ter to one third). However, a larger part of the
variance of respondents’ scores is attributable to
the individual level. In addition, these subdimen-
sions correlate both at the individual level and
to a slightly stronger extent at the organizational
level. Given the fact that judgments of organi-
zational practices are subject to differing indi-
vidual perceptions, the authors suggest incorpo-
rating several judgments of different respondents
when organizations are the main unit of analysis.
However, even when taking different judgments
into account, opinions can still vary substantially
and to a different extent for different organi-
zations. This makes computational aggregation
of individual judgments (e.g., by summing or
averaging) less appropriate. In this context, the
authors endorse Lynn et al. (2000), who state
that when studying organizational characteris-
tics such as nonprofit governance, performance,
and/or effectiveness, based on individual judg-
ments, a multilevel approach is most suitable.
Especially when dependent variables are based
on individual judgments, variance occurring at
different levels should be controlled for, rather
than neglected or turned into unreliable measure-
ments by using aggregated or single respondent
judgments.
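The concern about aggregation can be made concrete with a small sketch. The Python example below uses invented 7-point ratings (not the study's data) for two hypothetical organizations whose averaged judgments coincide although the level of agreement among their raters differs sharply. The names `ratings` and `summarize` are illustrative assumptions.

```python
from statistics import mean, pstdev

# Hypothetical 7-point ratings by four leaders per organization.
ratings = {
    "org_a": [4, 4, 4, 4],  # full agreement among raters
    "org_b": [1, 7, 1, 7],  # strong disagreement among raters
}


def summarize(scores):
    """Mean and population standard deviation of one organization's ratings."""
    return {"mean": mean(scores), "sd": pstdev(scores)}


summary = {org: summarize(scores) for org, scores in ratings.items()}
```

Both organizations obtain the same mean, so an averaged score alone would treat them as identical; the within-organization spread, which a multilevel approach models explicitly, is what distinguishes them.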
Furthermore, for the purpose of validation, we
have tested the data structure based on SEM. As
such, the assumption is made that the model is
additive, meaning that the weighted sum of items
and dimensions constitutes the overall concepts
(and their scores). We have, however, paid little
attention to the magnitude of first- and second-
order loadings (but we did look at the signif-
icance of them as a matter of validation). For
the validation of the construct, the magnitude
of loadings has in fact little importance, as they
are mainly determined by the summarizing re-
flective items (included for validation purposes).
Other summarizing items could result in other
loading values. Nevertheless, from a practical
point of view, important considerations have to
be made on how actual dimension scores are cal-
culated for further practitioner and scientific pur-
poses. Traditionally, from a research perspective,
an additive approach is taken. This means that
unique variance of items (or common variance,
in the case of reflective scales) is added based on
a weighted sum, and subsequently the variance
of this sum is studied in relation to variance of
other concepts. By doing so, the (linear) related-
ness between this and other concepts can be stud-
ied. In contrast, from a practitioner perspective,
the consideration could be made to take a multi-
plicative approach. In such cases, scores of items
and/or dimensions could be multiplied to assess
and benchmark the governance quality of differ-
ent organizations. By taking this approach, the
conditionality of each dimension of governance
quality is stressed. For example, an organization
can score high on all but one dimension, but
extremely low on the remaining dimension (e.g., close to
0). From an additive perspective, this would re-
sult in a fairly positive outcome. However, when
multiplying the dimension scores, a low value
would be obtained for the overall concept (also
close to 0). Such an approach expresses the fact
that an organizational crisis could happen when
the organization encounters governance prob-
lems in at least one dimension, and thus it fo-
cuses more on strong shortcomings, rather than
on averaged achievements. Focus on these short-
comings is particularly interesting from a man-
agement perspective as it enables direct actions
where most needed.
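The contrast between the two aggregation approaches can be sketched as follows. This Python example is illustrative only: it assumes dimension scores rescaled to the interval [0, 1], uses a weighted mean as the additive variant so that both results stay on the same scale, and applies equal weights by default. The function names and the scores are hypothetical.

```python
import math


def additive_score(scores, weights=None):
    """Weighted mean of dimension scores (equal weights unless given)."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def multiplicative_score(scores):
    """Product of dimension scores; one score near 0 pulls the result toward 0."""
    return math.prod(scores)


# Hypothetical organization: high on four dimensions, close to 0 on the fifth.
scores = [0.9, 0.9, 0.9, 0.9, 0.05]

additive = additive_score(scores)
multiplicative = multiplicative_score(scores)
```

For these invented scores the additive result remains fairly high (about .73), whereas the multiplicative result is close to 0 (about .03), which reflects the conditionality of each dimension described above.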
In sum, the development of the GQI opens
opportunities for researchers and practitioners
in the nonprofit and/or social services domains.
The five constructs, one for each subdimension
of governance quality, can be used, separately
or in combination, to quantify governance qual-
ity. Given the fact that it has been developed
and validated for a heterogeneous set of non-
profit and social services organizations, the GQI
could be used in various settings for further re-
search enabling comparison and generalization
of contemporary insights. Furthermore, given its
formative nature and the extensive description
provided through the definitions of the over-
all construct and its subdimensions (see the
Appendix), the GQI can be used by practition-
ers to assess, follow up, and/or benchmark their
self-assessed governance quality. However, as
the indicators are answered by individuals who
give their opinion on the quality of organiza-
tional characteristics, special attention is war-
ranted as perceptions within organizations can
vary substantially. Either a reliability perspective
can be taken, where a minimal level of agreement
among different respondents for a single organi-
zation would indicate a reliable judgment of that
organization’s governance quality, or differences
with respect to opinions within an organization
can be the subject of further research. Various
dynamics and settings within nonprofit and/or
social services organizations could be at the base
for different levels of agreement within each or-
ganization (Lynn et al., 2000). Substantial oppor-
tunities remain in the current research domain to
explain differences in opinions within and across
nonprofit and social services organizations.
CONCLUSION
This article started by acknowledging the
need for exploring subdimensions of governance
quality and for developing a reliable scale to
measure it. In addition, the necessity is recog-
nized that such a scale should be applicable to
a broad and heterogeneous range of nonprofit
and social services organizations to support ver-
ification and falsification of current theoretical
and qualitative insights. Three important aspects
were dealt with when developing the GQI. First,
a formative approach, in contrast to a reflective
one, was applied given the inherent nature of the
concept of governance quality. Second, gover-
nance quality was operationalized as a second-
order construct, given the multiple conceptual
layers that it comprises in academic and prac-
titioner literature. Third, answers of multiple
raters per organization were obtained based on
individual judgments of organizational charac-
teristics.
Through an extensive qualitative research de-
sign, the authors explored the subdimensions and
the indicators of nonprofit governance quality.
Such an approach ensured the content valid-
ity of the construct and indicators for a broad
range of nonprofit and/or social services orga-
nizations. Based on quantitative data of 526 re-
spondents from 52 different nonprofit organi-
zations, reliability and construct validity were
tested. Good validity and reliability were found
for the separate subdimensions of governance
quality: a) External Stakeholder Involvement,
b) Consistent Planning, c) Structures and Pro-
cedures, d) Continuous Improvement, and e)
Leadership Team Dynamics. As a result, each
of the subdimensions can be used as separate
constructs in further research. However, for the
combined second-order construct grouping the
five dimensions into one single formative latent
variable, sufficient construct validity could not
be found. The authors propose to use the five
dimensions as separate constructs in future re-
search on governance quality in nonprofit and
social services organizations. Further analyses
can reveal to what extent each of these
subdimensions relates to other concepts relevant for
these organizations.
Further, this study looked into the role of
unique individual perceptions versus shared per-
ceptions of respondents of the same organiza-
tion when assessing the organization’s gover-
nance quality. Although a substantial part of
shared variances and covariances at the organi-
zational level has been found, the main share is
still situated at the individual level. As expected,
opinions on practices within nonprofit organiza-
tions differ largely and thus cannot be ignored
in future research. Given the subjective nature
of quality and performance with respect to the
assessment of the achievement of nonprofit and
social services goals (DiMaggio, 2001; Stone &
Ostrower, 2007), the authors stress the neces-
sity of multiple raters for the assessment of gov-
ernance quality and similar concepts for those
organizations.
In this article, the authors considered gov-
ernance quality as a standalone construct for
reasons of conciseness. However, future work
can substantially add to this by relating quanti-
fied governance quality to other organizational
characteristics and to individual characteristics
of the raters. Building on previous contributions,
contemporary insights could be verified or fal-
sified when governance quality is studied in re-
lation to organizational effectiveness (Mi, 2007;
D. H. Smith & Shen, 1996), board effectiveness
(Green & Griesinger, 1996), or perceived effec-
tiveness by outside stakeholders (Babiak, 2009;
Balser & McClusky, 2005). Less explored is the
potential impact of individual characteristics of
respondents, though relevant in the context of
nonprofit and social services organizations. Ex-
amples are tenure of the respondent, whether
or not the respondent is on the board, whether
or not the respondent is the chairman or CEO,
experience in other organizations, being a vol-
unteer for the organization, etc. As very few
contributions dealing with organizational char-
acteristics of nonprofit and social services orga-
nizations have been based on multiple opinions
within these organizations, opportunities are am-
ple to study individual characteristics in rela-
tion to a respondent’s opinion on the governance
quality of these organizations.
ACKNOWLEDGEMENTS
The authors thank Nikolay Dentchev (Vrije
Universiteit Brussel), Cind Du Bois (Belgian
Royal Military Academy), Benny Geys (Vrije
Universiteit Brussel), and Marc Labie (Université
de Mons) for their valuable suggestions.
REFERENCES
Anheier, H. K. (2005). Nonprofit organizations: Theory, management, policy. New York, NY: Routledge.
Babiak, K. M. (2009). Criteria of effectiveness in multiple cross-sectoral interorganizational relationships. Evaluation & Program Planning, 32(1), 1–12.
Balser, D., & McClusky, J. (2005). Managing stakeholder relationships and nonprofit organization effectiveness. Nonprofit Management & Leadership, 15(3), 295–315.
Baxter, R. (2009). Reflective and formative metrics of relationship value: A commentary essay. Journal of Business Research, 62(12), 1370–1377.
Beck, T. E., Lengnick-Hall, C. A., & Lengnick-Hall, M. L. (2008). Solutions out of context: Examining the transfer of business concepts to nonprofit organizations. Nonprofit Management & Leadership, 19(2), 153–171.
Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110(2), 305–314.
Borsboom, D., Mellenbergh, G. J., & van Heerden, J. (2003). Theoretical status of latent variables. Psychological Review, 110(2), 203–219.
Boyer, K. K., & Verma, R. (2000). Multiple raters in survey-based operations management research: A review and tutorial. Production and Operations Management, 9(2), 128–140.
Bradshaw, P., Hayday, V., Armstrong, R., Leveque, J., & Rykert, L. (1998). Nonprofit governance models: Problems and prospects. Paper presented at the annual Association for Research on Nonprofit Organizations and Voluntary Action (ARNOVA) Conference, Seattle, WA.
Bradshaw, P., Murray, V., & Wolpin, J. (1992). Do nonprofit boards make a difference? An exploration of the relationships among board structure, process and effectiveness. Nonprofit and Voluntary Sector Quarterly, 21(13), 227–249.
Bunger, A. C. (2010). Defining service coordination: A social work perspective. Journal of Social Service Research, 36(5), 385–401.
Campbell, D. A. (2002). Outcomes assessment and the paradox of nonprofit accountability. Nonprofit Management & Leadership, 12(3), 243–259.
Chaves, M., & Tsitsos, W. (2001). Congregations and social services: What they do, how they do it, and with whom. Nonprofit and Voluntary Sector Quarterly, 30(3), 660–683.
Cornforth, C. (2001). What makes boards effective? An examination of relationships between board inputs, structures, processes and effectiveness in non-profit organizations. Corporate Governance: An International Review, 9(3), 217–227.
Cornforth, C. (2003). The changing context of governance—emerging issues and paradoxes. In C. Cornforth (Ed.), The governance of public and non-profit organizations: What do boards do? (pp. 1–19). London, UK: Routledge Studies in the Management of Voluntary and Non-Profit Organizations.
Cornforth, C., & Edwards, C. (1999). Board roles in strategic management of non-profit organizations: Theory and practice. Corporate Governance: An International Review, 7(4), 346–362.
Cutler, T., & Waine, B. (2003). Advancing public accountability? The social services ‘star’ ratings. Public Money & Management, 23(2), 125–128.
Dawson, I., & Dunn, A. (2006). Governance codes of practice in the not-for-profit sector: Paternalistic of participatory governance? Corporate Governance: An International Review, 14(1), 33–42.
de Leeuw, E. D. (2005). To mix or not to mix data collection modes in surveys. Journal of Official Statistics, 21(2), 233–255.
DeVellis, R. F. (2003). Applied Social Research Methods Series: Vol. 26. Scale development: Theory and applications (2nd ed.). London, UK: Sage Publications.
Diamantopoulos, A., & Siguaw, J. A. (2006). Formative versus reflective indicators in organizational measure development: A comparison and empirical illustration. British Journal of Management, 17(4), 263–282.
Diamantopoulos, A., & Winklhofer, H. M. (2001). Index construction with formative indicators: An alternative to scale development. Journal of Marketing Research, 38(2), 269–277.
DiMaggio, P. (2001). Measuring the impact of the nonprofit sector on society is probably impossible but possibly useful: A sociological perspective. In P. Flynn & V. Hodgkinson (Eds.), Measuring the impact of the nonprofit sector (pp. 249–272). New York, NY: Kluwer Academic/Plenum Publishers.
Edwards, C., & Cornforth, C. (2003). What influences the strategic contributions of boards? In C. Cornforth (Ed.), The governance of public and non-profit organizations: What do boards do? (pp. 79–96). London, UK: Routledge Studies in the Management of Voluntary and Non-Profit Organizations.
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5), 378–383.
Foddy, W. (1993). Constructing questions for interviews and questionnaires: Theory and practice in social research. Cambridge, UK: Cambridge University Press.
Forbes, D. P. (1998). Measuring the unmeasurable: Empirical studies of nonprofit organization effectiveness from 1977 to 1997. Nonprofit and Voluntary Sector Quarterly, 27(2), 183–202.
Gamer, M., Lemon, J., Fellows, I., & Singh, P. (2010). Various coefficients on interrater reliability and agreement (R package). Retrieved from http://www.r-project.org
Gill, M., Flynn, R. J., & Raising, E. (2005). The governance self-assessment checklist: An instrument for assessing board effectiveness. Nonprofit Management & Leadership, 15(3), 271–294.
Green, J. C., & Griesinger, D. W. (1996). Board performance and organizational effectiveness in nonprofit social services organizations. Nonprofit Management & Leadership, 6(4), 381–402.
Groves, R. M., & Heeringa, S. G. (2006). Responsive design for household surveys: Tools for actively controlling survey errors and costs. Journal of the Royal Statistical Society: Series A, 169(3), 439–457.
Herman, R. D., & Renz, D. O. (1998). Nonprofit organizational effectiveness: Contrasts between especially effective and less effective organizations. Nonprofit Management & Leadership, 9(1), 23–38.
Herman, R. D., & Renz, D. O. (1999). Theses on nonprofit organizational effectiveness. Nonprofit and Voluntary Sector Quarterly, 28(2), 107–126.
Herman, R. D., & Renz, D. O. (2004). Doing things right: Effectiveness in local nonprofit organizations: A panel study. Public Administration Review, 64(6), 694–704.
Herman, R. D., Renz, D. O., & Heimovics, R. D. (1997). Board practices and board effectiveness in local nonprofit organizations. Nonprofit Management & Leadership, 7(4), 373–385.
Hitt, M. A., Beamish, P. W., Jackson, S. E., & Mathieu, J. E. (2007). Building theoretical and empirical bridges across levels: Multilevel research in management. Academy of Management Journal, 50(6), 1385–1399.
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55.
Hwang, H., & Powell, W. W. (2009). The rationalization of charity: The influences of professionalism in the nonprofit sector. Administrative Science Quarterly, 54(2), 268–298.
Jackson, D. K., & Holland, T. P. (1998). Measuring effectiveness of nonprofit boards. Nonprofit and Voluntary Sector Quarterly, 27(2), 159–182.
Jarvis, C. B., Mackenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30(2), 199–218.
Jegers, M. (2008). Managerial economics of non-profit organizations (Routledge studies in the management of voluntary and non-profit organizations). London, UK: Routledge.
Jegers, M. (2009). ‘Corporate’ governance in nonprofit organizations: A nontechnical review of the economic literature. Nonprofit Management & Leadership, 20(2), 143–164.
Johnston, E. (2010). Governance infrastructures in 2020. Public Administration Review, 70(Suppl. 1), s122–s128.
Jones, K., & Subramanian, S. (2009). Developing multilevel models using MLwiN 2.1: A training manual. On file with author.
Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351), 631–639.
Kong, E. (2007). The strategic importance of intellectual capital in the non-profit sector. Journal of Intellectual Capital, 8(4), 721–731.
LeRoux, K. (2009). Paternalistic or participatory governance? Examining opportunities for client participation in nonprofit social service organizations. Public Administration Review, 69(3), 504–517.
Lynn, L. E., Jr., Heinrich, C. J., & Hill, C. J. (2000). Studying governance and public management: Challenges and prospects. Journal of Public Administration Research and Theory, 10(2), 233–261.
Maas, C. J. M., & Hox, J. (2005). Sufficient sample sizes for multilevel modeling. Methodology, 1(3), 86–92.
Marsh, H. W., Hau, K., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis-testing approaches to setting cutoff values for fit indexes and dangers in overgeneralizing Hu and Bentler’s (1999) findings. Structural Equation Modeling, 11(2), 320–341.
Maxwell, J. A. (2004). Qualitative research design: An interactive approach. London, UK: Sage.
McCambridge, R. (2004). Underestimating the power of nonprofit governance. Nonprofit and Voluntary Sector Quarterly, 33(2), 346–354.
McClusky, J. E. (2002). Re-thinking nonprofit organization governance: Implications for management and leadership. International Journal of Public Administration, 25(4), 539–560.
McDonald, C., & Marston, G. (2002). Patterns of governance: The curious case of non-profit community services in Australia. Social Policy & Administration, 36(4), 376–391.
Mehrotra, S. (2006). Governance and basic social services: Ensuring accountability in service delivery through deep democratic decentralization. Journal of International Development, 18(2), 263–283.
Mi, C. S. (2007). Assessing organizational effectiveness in human service organizations: An empirical review of conceptualization and determinants. Journal of Social Service Research, 33(3), 31–45.
Parker, L. D. (2007). Internal governance in the nonprofit boardroom: A participant observer study. Corporate Governance: An International Review, 15(5), 923–934.
Petter, S., Straub, D., & Rai, A. (2007). Specifying formative constructs in information systems research. Management Information Systems Quarterly, 31(4), 623–656.
Powell, W. W., & Steinberg, R. (2006). The non-profit sector: A research handbook (2nd ed.). New Haven, CT: Yale University Press.
Rasbash, J., Charlton, C., Browne, W. J., Healy, M., & Cameron, B. (2005). MLwiN Version 2.02 [Computer software]. Center for Multilevel Modeling, University of Bristol, Bristol, UK.
Romzek, B. S., & Johnston, J. M. (2005). State social services contracting: Exploring the determinants of effective contract accountability. Public Administration Review, 65(4), 436–449.
Saidel, J. R., & Harlan, S. L. (1998). Contracting and patterns of nonprofit governance. Nonprofit Management & Leadership, 8(3), 243–259.
Sarstedt, M., & Schloderer, M. P. (2010). Developing a measurement approach for reputation of non-profit organizations. International Journal of Nonprofit and Voluntary Sector Marketing, 15(3), 276–299.
Siciliano, J. I. (1997). The relationship between formal planning and performance in nonprofit organizations. Nonprofit Management & Leadership, 7(4), 387–403.
Smith, D. H., & Shen, C. (1996). Factors characterizing the most effective nonprofits managed by volunteers. Nonprofit Management & Leadership, 6(3), 271–289.
Smith, S. R. (2008). The challenge of strengthening nonprofits and civil society. Public Administration Review, 86(Suppl. 1), S132–S145.
Stark, A. (2010). The distinction between public, nonprofit, and for-profit: Revisiting the ‘core legal’ approach. Journal of Public Administration Research and Theory, 21(1), 3–26.
Steane, P. D., & Christie, M. (2001). Nonprofit boards in Australia: A distinctive governance approach. Corporate Governance: An International Review, 9(1), 48–58.
Steenkamp, J. E. M., De Jong, M. G., & Baumgartner, H. (2010). Socially desirable response tendencies in survey research. Journal of Marketing Research, 47(2), 199–214.
Stone, M. M., & Ostrower, F. (2007). Acting in the public interest? Another look at research on nonprofit governance. Nonprofit and Voluntary Sector Quarterly, 36(3), 416–438.
Walker, P. (2002). Understanding accountability: Theoretical models and their implications for social service organizations. Social Policy & Administration, 36(1), 62–75.
Yin, R. K. (1993). Applications of case study research. London, UK: Sage.
APPENDIX
Definition, Subdimensions, and Indicators
of Nonprofit Governance Quality
In this Appendix, the overall definition of nonprofit governance
quality is given as used in this study. As a part of this
definition, the five subdimensions of nonprofit governance quality
are dealt with in detail. Each subdimension is operationalized by
either five or six formative indicators. Each indicator was scored
by the respondents on a 7-point Likert scale (strongly disagree to
strongly agree). VIF scores and standardized loadings for each of
the indicators are given based on the separate MIMIC models.
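For readers less familiar with the statistic, a variance inflation factor (VIF) expresses how well one indicator can be linearly predicted from the other indicators in its block: VIF_j = 1 / (1 − R_j²), where R_j² comes from regressing indicator j on the remaining indicators. Values close to 1, such as those reported below, point to little collinearity. The following Python sketch is not the authors' procedure or software. It is a minimal illustration that uses the identity that the VIFs equal the diagonal of the inverse correlation matrix. All function names and the response data are hypothetical.

```python
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Pearson correlations between equally long columns of scores."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        m, s = mean(col), pstdev(col)  # assumes no column is constant
        standardized.append([(x - m) / s for x in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
            for ci in standardized]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    # Augment the matrix with the identity matrix on the right.
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("indicators are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif_scores(columns):
    """VIF of each indicator: matching diagonal entry of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Made-up 7-point Likert responses, one list per indicator (illustration only).
indicator_a = [1, 2, 3, 4, 5, 6, 7]
indicator_b = [2, 1, 4, 3, 6, 5, 7]
indicator_c = [4, 6, 2, 5, 3, 7, 1]
print(vif_scores([indicator_a, indicator_b, indicator_c]))
```

With real survey data, each list would hold the responses to one indicator of a single subdimension, so that the VIFs are computed within a block, as in the per-subdimension values listed below.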
Nonprofit governance quality groups a
broad series of requirements that should be satis-
fied, conditions that should be met, and practices
that should be applied by nonprofit leadership
teams to optimally enhance the achievement of
their organizations’ mission and vision. These
requirements, conditions, and practices deal with
five distinct subdimensions: a) External Stake-
holder Involvement, b) Consistent Planning, c)
Structures and Procedures, d) Continuous Im-
provement, and e) Leadership Team Dynamics.
We give a definition of each of them below.
External Stakeholder Involvement is the di-
mension of governance quality that assesses the
interaction of the leadership team with the ex-
ternal stakeholders. On the one hand, this di-
mension looks at the opportunities for external
stakeholders to provide feedback to the lead-
ers of the organization on their needs and pref-
erences. On the other hand, this dimension
assesses the transparency, responsibility, and ac-
countability of the leadership team toward these
external stakeholders regarding their decisions
and actions taken. The primary external stake-
holders that a leadership team has to deal with
are beneficiaries, funder(s), and other active or-
ganizations in the field. To be fully supportive of
these external stakeholders, the leadership team
should be well informed on the organization’s
operational processes and should be sufficiently
embedded in the field in which the organization
is active.
1. This organization should report its decisions and achievements
much more to all external stakeholders (VIF = 1.139; Loading =
–.163; Reversed).
2. The beneficiaries of this organization have multiple ways to
get their concerns discussed by the leadership team of this
organization (VIF = 1.102; Loading = .221).
3. The leadership team of this organization is fully aware of the
responsibilities resulting from their decisions (VIF = 1.462;
Loading = .169).
4. This organization maintains close strategic links with the
other players in the field (VIF = 1.245; Loading = .323).
5. The leadership team of this organization has a strong sense of
accountability toward those who fund the organization (VIF =
1.407; Loading = .192).
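Items marked "Reversed" are negatively worded, which is consistent with their negative loadings. The Appendix does not state whether these items were recoded before analysis. A reader who wants all indicators of a subdimension to point in the same direction could mirror the scores around the scale midpoint, as in this minimal Python sketch. The function name and the sample responses are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # 7-point Likert scale, as described in this Appendix

def reverse_code(score, low=SCALE_MIN, high=SCALE_MAX):
    """Mirror a Likert score around the scale midpoint (1 <-> 7, 2 <-> 6, ...)."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the {low}-{high} scale")
    return low + high - score

# Made-up responses to a negatively worded item (illustration only).
responses = [1, 4, 7, 2]
recoded = [reverse_code(s) for s in responses]
print(recoded)
```

Recoding changes only the sign of an item's relationship with the construct, so a loading estimated on recoded scores would be expected to have the opposite sign from the one reported here.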
Consistent Planning is the dimension of governance quality that
evaluates the consistency of the leadership’s approach in dealing
with governance and management duties. This includes, on the one
hand, a systematic deployment (planning, execution, and control)
of the organization’s mission and vision in a midterm strategy (or
a similar statement), which in turn is translated into short-term
goals and targets. On the other hand, it includes being prepared
to deal with both financial and nonfinancial risks. Through an
iterative process of reflection, discussion, and adequate
decisions, the leadership team and the organization should become
resilient to unforeseen crises, so as to avoid a situation where
the leadership team is constantly occupied with solving emerging
problems that interfere with a sustainable continuation of the
planned actions.
6. This organization has built sufficient financial reserves
(VIF = 1.034; Loading = .137).
7. We deliberately aim to achieve an optimal balance between
informative issues and decisive issues in board and management
meetings (VIF = 1.251; Loading = .264).
8. The leadership team of this organization consciously frames
important decisions with statements from our mission and vision
(VIF = 1.336; Loading = .300).
9. Every year, we use a set of predefined criteria to allocate
resources to our activities, projects, and processes (VIF =
1.289; Loading = .231).
10. As a part of the leadership team, I have the feeling that we
are constantly firefighting (VIF = 1.170; Loading = –.293;
Reversed).
Structures and Procedures is the dimension of governance quality
that appraises the formal development and documentation of
governance and management bodies within the organization and how
they relate to each other. Structures and procedures should be
developed and updated in such a way that they optimally induce
objective practices and decisions that are independent of the
preferences of (some of) the leaders. If they do not, flaws should
be detected and corrected in a timely manner. In addition, these
formal structures and procedures should be well understood and
documented for existing and new members of the leadership team.
11. We have a system that protects, classifies, and backs up the
organization’s important documents and information (VIF = 1.245;
Loading = .271).
12. For new people who join this leadership team, we provide the
necessary documentation on our way of working (VIF = 1.461;
Loading = .170).
13. We have formal job descriptions for the different roles in
this leadership team (chairman, board members, executive managers,
committee members, etc.; VIF = 1.234; Loading = .211).
14. The role of the board versus the executive management is well
understood by all members of the leadership team of this
organization (VIF = 1.371; Loading = .241).
15. Meetings often have to stop unfinished because of time
constraints (VIF = 1.148; Loading = –.178; Reversed).
Continuous Improvement is the dimension
of governance quality that looks at those as-
pects related to continuously organizing ac-
tivities that improve the organization’s perfor-
mance and effectiveness. It includes respond-
ing innovatively to changes in the organization’s
environment. In addition, such actions should
result from evaluating the quality of organi-
zational and leadership achievements. It re-
quires the ability of the leadership team to for-
mulate the right actions when changes to the
organization’s environment occur or when per-
formance and effectiveness are evaluated as
insufficient. Actions should be followed up
carefully, and the right conditions should be created to complete
these actions in the most appropriate way.
16. At least once a year, the leadership team of this organization
discusses its own engagement and achievements (VIF = 1.230;
Loading = .149).
17. This leadership team has problems in defining the right
actions when actual performance is not as expected (VIF = 1.548;
Loading = –.175; Reversed).
18. New strategic decisions are always followed up by a member of
the leadership team or by a dedicated committee (VIF = 1.363;
Loading = .178).
19. This organization has a creative approach regarding initiating
new projects (VIF = 1.449; Loading = .321).
20. This organization is constantly behind regarding the proper
management practices (VIF = 1.663; Loading = –.197; Reversed).
21. With the leadership team, we regularly discuss the new
evolutions in the field (VIF = 1.464; Loading = .129).
Leadership Team Dynamics is the dimension of governance quality
that evaluates the composition of, and the personal interactions
within, the actual group of people that constitutes the leadership
team. This dimension looks at the particular composition and
alignment of the members of the leadership team, each with their
own background, motivation, skills, etc. In combination, it
considers the personal interactions within the leadership team and
how these support effective organizational decision making. It
deals with how each member maintains a professional attitude
toward all other members, with team discipline in meetings and
promised engagement, and with how the combination of this
particular group of people is a lever for each other’s ideas and
efforts.
22. With this leadership team, we achieve much more together for
the organization compared with what we each would achieve
separately (VIF = 1.410; Loading = .272).
23. All members of the leadership team are driven by the same
motives to be part of this organization (VIF = 1.701; Loading =
.104).
24. Within this leadership team, we share the same ideas on what
practices to apply to manage this organization properly (VIF =
1.736; Loading = .293).
25. Within the leadership team, there is a strong drive to inform
ourselves regularly on new trends regarding governance and
management practices (VIF = 1.290; Loading = .255).
26. From time to time, I have the feeling that some people in the
leadership team have a hidden agenda (VIF = 1.382; Loading =
–.192; Reversed).
FIGURE 2. Visualized Scale From the Questionnaire for the Reflective Indicator to Assess Overall
Governance Quality.
Reflective Items
The reflective items used, per subdimension and
for overall “governance quality,” are:
Stakeholder Involvement.
1. Stakeholder involvement in this organiza-
tion is strongly developed.
2. This organization is a good example of
how to deal with external stakeholders.
Consistent Planning.
1. This organization is a best-practice exam-
ple for other organizations regarding con-
sistent planning.
2. This organization always acts fully consis-
tently with its strategy, mission, and val-
ues.
Internal Structures and Procedures.
1. This organization has an efficient set of
well-defined internal structures and pro-
cedures.
2. The internal structures and procedures that
have been developed in this organization
are advanced compared with many other
organizations.
Continuous Improvement.
1. This leadership team is strong in continu-
ously deploying actions that improve the
performance and effectiveness of the or-
ganization.
2. This organization is good at continuously
improving its way of working.
Leadership Team Dynamics.
1. The dynamics within this leadership team
are supportive for an effective output of
the organization.
2. The way this leadership team works to-
gether could be a model for the leaders of
many other organizations.
In General for ‘Governance Quality’.
1. Please rate on a scale from 0 to 10 how
well your organization is doing regarding
“good governance.”
2. In the case that we would categorize
all organizations from “bad governance”
(left) to “good governance” (right), where
would you place your organization (de-
scriptions of the six categories are given
below)? (1. Major shortcomings in the
way the organization is governed; 2. Bad
practice, but already some small initial
achievements; 3. Close to average, but still
below; 4. Close to average, but already
above; 5. Good practices, but some room
for improvement; 6. Best-practice example in the field). See
Figure 2 for the visual scale that was given to judge these items.