ArticlePDF Available

Abstract

Multilevel modeling is a technique that has numerous potential applications for social and personality psychology. To help realize this potential, this article provides an introduction to multilevel modeling with an emphasis on some of its applications in social and personality psychology. This introduction includes a description of multilevel modeling, a rationale for this technique, and a discussion of applications of multilevel modeling in social and personality psychological research. Some of the subtleties of setting up multilevel analyses and interpreting results are presented, and software options are discussed.
© 2008 The Author
Journal Compilation © 2008 Blackwell Publishing Ltd
Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
An Introduction to Multilevel Modeling
for Social and Personality Psychology
John B. Nezlek*
College of William & Mary
Abstract
Multilevel modeling is a technique that has numerous potential applications
for social and personality psychology. To help realize this potential, this article
provides an introduction to multilevel modeling with an emphasis on some of its
applications in social and personality psychology. This introduction includes a
description of multilevel modeling, a rationale for this technique, and a discussion
of applications of multilevel modeling in social and personality psychological
research. Some of the subtleties of setting up multilevel analyses and interpreting
results are presented, and software options are discussed.
Once you know that hierarchies exist, you see them everywhere.
(Kreft and de Leeuw, 1998, 1)
Whether by design or nature, research in personality and social psychology
and related disciplines such as organizational behavior increasingly involves
what are often referred to as multilevel data. Sometimes, such data sets are
referred to as ‘nested’ or ‘hierarchically nested’ because observations (also
referred to as units of analysis) at one level of analysis are nested within
observations at another level. For example, in a study of classrooms or
work groups, individuals are considered to be nested within groups.
Similarly, in diary-style studies, observations (e.g., diary entries) are nested
within persons. What is particularly important for present purposes is
that when you have multilevel data, you need to analyze them using
techniques that take into account this nesting. As discussed below (and in
numerous places, including Nezlek, 2001), the results of analyses of
multilevel data that do not take into account the multilevel nature of the
data may (or perhaps will) be inaccurate.
This article is intended to acquaint readers with the basics of multilevel
modeling. For researchers, it is intended to provide a basis for further
study. I think that a lack of understanding of how to think in terms of
hierarchies and a lack of understanding of how to analyze such data
inhibits researchers from applying the ‘multilevel perspective’ to their
work. For those simply interested in understanding what multilevel
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 843
modeling is and how to interpret the results of such analyses, it should
help them understand what was done and why it was done as they
encounter multilevel analyses in their scholarly activities.
This article describes the multilevel perspective as it pertains to the
types of research social and personality psychologists frequently conduct.
This description includes some of the types of data that can be concep-
tualized within a multilevel framework and some basic principles underlying
the analyses of multilevel data. This article is intended as an introduction
to multilevel modeling, and as such, it only touches the surface of various
topics. Readers who are interested in the details of multilevel modeling
should consult any of the following, all of which are intended as introduc-
tions: Raudenbush and Bryk (2002), Snijders and Boskers (1999), and Kreft
and de Leeuw (1998). Readers interested in the application of multilevel
modeling to specific research designs may want to consult papers mentioned
in the section on potential applications.
The Importance of Understanding Levels of Analysis
Multilevel analyses are appropriate when data have been collected at
multiple levels simultaneously. In this instance, ‘levels’ refers to how the
data are organized, and more important statistically, to whether observations
are dependent (or not independent), a topic discussed below. The traditional
(and recommended) way to refer to these levels is by number (level 1,
level 2, etc.), with larger numbers indicating levels that are higher in the
hierarchy.1 For example, in a study in which persons are organized in
groups (e.g., work groups), measures describing individuals (e.g., productivity)
would constitute the level 1 data, and measures describing the groups
(e.g., position in an organization) would constitute the level 2 data. Studies
in which multiple observations are taken from individuals (e.g., a diary
study) can also be thought of as multilevel data, with observations (e.g.,
daily diary entries) constituting the level 1 data and individual characteristics
(e.g., personality traits) constituting the level 2 data.
One of the important characteristics of such data is that the level 1
observations are not independent. People in a group share whatever
characteristics the group has, and the diary entries people provide have in
common the characteristics of the person. This lack of independence
means that traditional ordinary least-squares (OLS) techniques such as
multiple regression in which level 1 observations are treated as independent
observations cannot be used because such analyses violate a fundamental
assumption – the independence of observations. For example, in a study
of groups, it is fundamentally wrong to conduct a single level regression
analysis in which group level measures are assigned to the individual
members of groups and are then used in the analysis as if they were
individual level measures. Similarly, in a diary-style study, it is fundamentally
wrong to conduct an analysis in which daily observations are the units of
844 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
analysis (i.e., data entries) and individual (person level) measures such as
personality characteristics are assigned to each day a person maintained a
diary and then used as if they were daily level measures. In both cases, the
observations are not independent.
In addition to violating the independence assumption (which by itself
is sufficient to render them flawed), single-level analyses that ignore the
hierarchical structure of the data can provide misleading results. For example,
assume a data set in which there are three groups with five persons in
each group, and each person provides two measures, x and y (see Table 1).
If the data in Table1 were analyzed by treating the observations as 15
independent observations, the correlation between the two measures
would be 0.73. It is clear from looking at the data, however, that the
relationship within each group is negative, making suspect any procedure
that estimates a positive relationship.
Some have argued that such problems can be solved by using a variable
indicating group membership, what is sometimes referred to as ‘least
squares dummy-codes’ analysis. Including such terms can reduce the
influence on results of mean differences between level 2 units of analysis,
although such analyses still assume that relationships between variables are
constant across groups. The possibility that level 1 (e.g., within-group)
relationships vary between level 2 units of analysis (e.g., groups) can be
examined by including interaction terms between such dummy codes and
level 1 variables; however, even when such dummy codes are included,
the analyses violate important assumptions about sampling error. See
Nezlek (2001) for a discussion of the shortcomings of various types of
OLS analyses of multilevel data.
Illustrative Applications
As Kreft and de Leeuw (1998) noted, understanding what hierarchies are
often leads researchers to recognize the hierarchical nature of data they
have collected or are considering collecting. Before describing multilevel
Tab l e 1 Differences across groups in within-group relationships
Group 1 Group 2 Group 3
X
yxyxy
18413918
275121017
366111116
457101215
5 4 8 9 13 14
366111116
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 845
modeling analyses per se, I will describe applications of multilevel modeling
to help readers think about how multilevel modeling might be applied
to their specific interests. Note that these descriptions are meant to be
illustrative, not exhaustive.
Within contemporary social and personality psychology, perhaps the
most common use of multilevel modeling is the analysis of various types
of diary data. A discussed by Wheeler and Reis (1991), diary-style studies
(including what are sometimes referred to as intensive repeated measures
or experience sampling designs) can be categorized in terms of the basis
on which data are collected. In interval-contingent studies, data are collected
at certain intervals, which may be fixed (e.g., end-of-day studies) or random
(e.g., ‘beeper’ studies). In event-contingent studies, data are collected whenever
a certain type of event occurs. Social interaction diary studies, frequently
relying on some variant of the Rochester Interaction Record (Wheeler &
Nezlek, 1977), are examples of event-contingent data structures. In both
cases, individual observations (diary entries, social interactions, etc.) can
be thought of as level 1 observations that are nested within persons (level
2 observations). See Nezlek (2001) for a general discussion of multilevel
modeling analyses of such data and Nezlek (2003) for a more specific
discussion of multilevel analyses of social interaction diary data. Multilevel
modeling may also be appropriate for personality psychologists who
are interested in within-person variability in psychological states (e.g.,
Fleeson, 2001). In such studies, state level observations are nested within
persons. See Nezlek (2007a, 2007b) for discussions of multilevel analyses
of such data structures.
The analyses of data collected in groups also calls for multilevel modeling.
In studies in which participants are in groups, individuals are the level 1
observations and groups are the level 2 observations (i.e., persons are
nested within groups). See Nezlek and Zyzniewski (1998) for an intro-
duction to using multilevel modeling to analyze data collected in groups,
and see Kenny, Mannetti, Pierro, Livi, and Kashy (2002) for a more
detailed discussion of this topic.
The use of multilevel modeling in cross-cultural studies, which often
concern topics of interest to social and personality psychologists, is
increasing. In cross-cultural research, samples can be collected in different
countries, and in such studies, persons can be treated as nested within
countries, and for such cross-cultural studies, some considerations may be
more important than they are for other types of multilevel designs. For
example, differences in sample sizes may not correspond to differences in
populations, and depending upon the target of inference, in such cases,
analysts may want to conduct some type of weighted analysis that takes
into account differences in the populations of each country. Given the
difficulties inherent in collecting data in multiple countries, researchers
might be able to collect data in only a few countries. In such cases, it
might be difficult to conduct the types of multilevel analyses described in
846 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
this article, a topic discussed below. See Nezlek (forthcoming) for a more
detailed discussion of the use of multilevel modeling in cross-cultural
research.
Using multilevel modeling to analyze data in which participants
respond to different stimuli is relatively underutilized. Just as we can
consider the days of a diary as nested within persons, responses to a series
of stimuli (e.g., ratings of 30 situations) can be considered as nested within
persons. In such cases, the types of techniques discussed in this article can
be used to estimate within-person relationships among ratings, and the
estimates provided by these techniques will be more accurate than
estimates provided by other means such as within-person correlations or
regression coefficients. Moreover, the present techniques provide better
estimates than comparable OLS techniques of between-person differences
in such within-person relationships. Along these lines, multilevel analyses
can be used to analyze reaction time data collected in experiments,
and interested readers should consult Richter (2006) for an excellent
discussion of such applications.
Sampling Error within Multilevel Data Structures
In a multilevel data structure, units of observations are randomly sampled
from populations at different levels simultaneously. For example, in a study
of group decision-making, individual participants are sampled to provide
a basis for making inferences about people. At the same time, the groups
that are formed are meant to provide a basis for making inferences about
groups. Similarly, in a diary-style study, days in a person’s life are sampled
to provide a basis for making inferences about each person, and persons
are sampled to provide a basis for making inferences about people.
What is significant about such multiple sampling is that the error
associated with sampling at each level of analysis needs to be estimated. If
the people involved in a group process study were used to form different
groups, the relationships that were found would probably be similar to,
but not the same as the relationships found in the first sampling. Similarly,
the within-person relationships found in a sample of days from a person’s
life would probably be similar to, but not exactly the same as, the rela-
tionships found in another sample of days. In both the group process and
diary study, there are two sources of sampling error.
The need to take into account simultaneously the sampling error at
each level of analysis is why it is not appropriate to use a series of OLS
analyses to estimate within-group (or within-person) relationships and
then use such coefficients in other analyses. For example, it is not appropriate
to calculate measures of within-group processes and then use these measures
in a group-level analysis. Similarly, it is not appropriate to calculate within-
person measures such as correlations or regression coefficients and then
use such measures in person level analyses. Such ‘two-stage’ least-squares
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 847
techniques (even what are sometimes called weighted least squares) do not
provide estimates of relationships that are as accurate as those provided by
the random coefficient techniques discussed in this article. This is due in
part to the fact that with two-stage least-squares and similar techniques,
the errors at the two levels of analysis are necessarily separate. Multilevel
random coefficient modeling relies on maximum likelihood algorithms
that allow for the simultaneous estimation of multiple unknowns (i.e.,
error terms).
The techniques described in this article provide numerous advantages
over comparable OLS techniques, and these advantages are pronounced
under two conditions. First, when hypotheses of interest concern within-
unit relationships (e.g., within-group relationships in a group study,
within-person relationships in a diary study, etc.). Second, when the
data structure is irregular (e.g., when groups differ in size, when people
provide different numbers of diary entries, etc.). These advantages are
the result of the specific statistical techniques that should be used to
analyze multilevel data, and this is discussed in some detail in Nezlek
(2001).
What Is Multilevel Modeling?
The present discussion of multilevel modeling concerns what is technically
referred to as ‘multilevel random coefficient modeling’ (MRCM). The
term ‘random’ reflects the fact the technique models (or estimates)
random coefficients, something discussed below. Multilevel modeling
is relatively straightforward conceptually. For each level 2 unit (e.g., a
group in group studies, a person in diary studies), a level 1 (e.g., within-group
or within-person) model is estimated. Such models are functionally
equivalent to a standard OLS regression. For example, in a study of
groups, the dependent measure might be productivity and the independent
measures (predictors) could be motivation, personality, etc. For each
group, a model (a regression equation with coefficients) is estimated
describing the relationships among these variables. These coefficients
then become the dependent measures at the next level of analysis.
This last sentence describes the heart of the matter – relationships (or
coefficients describing relationships), not only means, become dependent
measures.2
The simplest model is what is usually called a ‘totally unconditional’ or
‘null’ model – there are no predictors at either level of analysis. The
equations for such a model are below. In standard MRCM nomenclature,
level 1 coefficients are referred to as βs, (subscripted 0 for the intercept,
1 for the first coefficient, 2 for the second, etc.), and the basic level 1
model is:
yij
0j+rij
848 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
In this model, there are ‘i’ level 1 observations nested within ‘j’ level 2 units
of a continuous variable y that are modeled as a function of the intercept
for each level 2 unit (β0j, the mean of y) and error (rij), and the variance of rij
is the level 1 random variance. Level 1 coefficients are then modeled (analyzed)
at level 2, and level 2 coefficients are referred to as γs. There is a separate
level 2 equation for each level 1 coefficient. The basic level 2 model is:
β0j
00 +u0j
In this model, the mean of y for each of j units of analysis (β0j) is modeled
as a function of the grand mean (γ00) and error (u0j), and the variance of
u0j is the level 2 variance. Such models are referred to as unconditional at
level 2 because β0j is not modeled as a function of another variable at level
2. In other words, no hypotheses (other than testing if the mean, γ00, is
significantly different from 0) are tested. Although such unconditional
models do not test hypotheses per se, they are useful first steps because
they provide estimates of the level 1 and 2 variances (the variances of rij
and u0j). Such estimates may indicate where further analyses would be
informative (i.e., ‘where the action is’ in a data set).
Hypotheses are tested by adding variables at either or both levels of
analysis. For example, if a researcher wanted to know if group productiv-
ity was related to how long a group had been together, the following
model could be examined:
yij
0j+rij
β0j
00
01(Time) +u0j
In this analysis, y would be an individual-level measure of productivity
(e.g., how many widgets a worker produced), β0j would be mean pro-
ductivity (for j groups), and the hypothesis would be tested via the
significance of the γ01 coefficient at level 2. If γ01 was positive and significantly
different from 0, then we could conclude that there was a positive
relationship between productivity and the time a group had been
together. If γ01 was significant and negative, the relationship would be
negative, and so forth.
Those unfamiliar with MRCM might ask: ‘Why not simply calculate
a mean for each group and then use these means in a group level analysis?’
Although the question is reasonable, the answer is clear – MRCM
provides better estimates of group means than the simple average. This
superiority is due to the fact that in MRCM, the coefficients describing
a group are weighted by the reliability of the scores (how consistent the
scores are in a group) and by the number of observations in the group –
something called ‘precision weighting’. Not all means are created equal.
The mean of 9, 9, 9 and 1, 1, 1 is 5. The mean of 4, 4, 4 and 6, 6, 6 is
also 5. The second 5 is a ‘better’ 5 (i.e., more representative) than the first
5. MRCM takes such differences into account.
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 849
Nevertheless, the advantages of MRCM over OLS-based analyses are
even more pronounced when hypotheses of interest concern within-unit
relationships. For example, assume a cross-cultural study in which
measures of political conservatism and independent self-construal have
been collected in numerous countries, and the hypothesis of interest
concerns the within-country relationship between these two measures.
Within a multilevel framework, such a hypothesis would be tested with
the following model:
yij
0j
1j(Self-construal) +rij
β0j
00 +u0j
β1j
10 +u1j
In the level 1 model, the relationship between conservatism and self-
construal is represented by the β1j coefficient (called a slope to distinguish
it from an intercept), and a slope would be estimated for each of j
countries. The hypothesis is tested at level 2 by the significance of the
γ10 coefficient – is the mean slope significantly different from 0? Such an
analysis is much more accurate than a comparable OLS analysis such as
calculating a correlation for each country, converting them using a r to Z
transform, and then doing a t-test of the mean correlation coefficient. As
discussed in Nezlek (2001), there are various reasons for this, mostly
having to do with how MRCM incorporates into the estimates of
coefficients the unreliability of the covariances underlying the coefficients.
By extension, MRCM also excels (compared with OLS techniques)
when hypotheses of interest concern how slopes vary across level 2 units
of analysis. For example, does the relationship between conservatism and
self-construal vary as a function of industrialization as measured by gross
domestic product (GDP)? Such a hypothesis would be tested at level 2
with the following model:
β0j
00
01(GDP) +u0j
β1j
10
11(GDP) +u1j
The possibility that the conservatism–construal slope varies as a function
of GDP is tested by the γ11 coefficient. This is sometimes called a ‘cross-
level interaction’ (because a level 1 relationship varies as a function
of level 2 variable) or a ‘slopes as outcomes’ analysis (because a slope
from level 1 becomes the outcome or dependent measure at level 2). As
discussed below, interpreting such results is best done by generating
predicted or estimated values (e.g., estimating conservatism-construal
slopes for countries that are ±1 SD on GDP). Note that GDP is also
included in the equation for the intercept. The norm in MRCM is
to include, at least in initial models, the same level 2 predictors for all
coefficients.
850 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Non-linear and Categorical Outcomes
The preceding discussion has focused on continuous outcomes (e.g.,
scores on scales or instruments) in part because for most researchers, most
of their dependent measures will probably be continuous. Nevertheless,
many outcomes of interest may be categorical (pass-fail, accepted-rejected,
etc.) or non-linearly distributed. Such outcomes violate an important
assumption underlying multilevel (and single level OLS) analyses the
independence of the mean and the variance. For example, the variance of
a binomial outcome is npq, where n is the number of observations, p is
the probability, and q=1–p. Analyzing such data requires the use of
specific techniques that eliminate this dependence (and other problems).
The specific ways in which the analyses of non-linear and categorical
outcomes deal with violations of assumptions varies depending upon the
nature of the outcome, and a discussion of these differences is well beyond
the scope of this paper. Interested readers should consult a text on
multilevel modeling such as Snijders and Boskers (1999) or Raudenbush
and Bryk (2002) for more details.
Centering
Centering refers to the reference value used to estimate an intercept. For
OLS regression, this reference value is invariably the sample mean. The
intercept represents the expected value for an observation that is at the
mean on all the predictors in a model. In MRCM, there are other options.
At level 1, predictors can be uncentered (the intercept represents the expected
value for an observation with a score of 0 on a predictor), group-mean centered
(the intercept represents the expected value for an observation with a score
at the mean for its groups), or grand-mean centered (the intercept represents
the expected value for an observation with a score at the grand mean). At
level 2, predictors can be either uncentered or grand-mean centered.
Broadly speaking, centering at level 1 tends to influence parameter
estimates more than centering at level 2 in part because centering at
level 1 changes the meaning of the intercept and can change the error
structure. In terms of selecting different options, group-mean centering
at level 1 eliminates level 2 differences in predictors, whereas grand-mean
centering at level 1 introduces level 2 differences into the estimates of
level 1 parameters. Using the previous example of conservatism and
self-construal, if self-construal is group-mean centered, differences
across countries in self-construal do not influence parameter estimates.
This is the closest (conceptually) to conducting a regression analysis for
each country and then analyzing these coefficients in another analysis.
If self-construal is entered grand-mean centered, then whatever differences
exist between countries in self-construal will influence the parameter
estimates (e.g., slopes or within-country relationships).
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 851
As discussed by Nezlek (2001) and Raudenbush and Bryk (2002),
choices about centering should reflect substantive issues – what does the
analyst want the intercept to represent and why? Blanket recommendations
to center one way or another should be ignored because no single rule
can cover all cases. Unfortunately, space limitations do not permit detailed
consideration in this article of centering options in MRCM, and inter-
ested readers should consult Enders and Tofighi (2007) for a detailed
discussion of centering options and should consult basic multilevel texts
for advice regarding this issue.
Random Error and Specifying Error Structures
For analysts whose primary training has concerned OLS analyses such as
single-level regression, there can be some confusion regarding the definition
of random variation. In OLS analyses, there is only one variance estimate,
which is referred to as error, residual, or random variation. In contrast, in
MRCM, the variance of any parameter is divided into true variance and
random variance, and each coefficient can have its own random error
term. Moreover, tests of significance of the fixed coefficients are based on
true variance, not random variance.
In MRCM, coefficients can be estimated in three ways: (i) randomly
varying, if an accompanying random error term is estimated; (ii) fixed, if
no accompanying random error term is estimated; and (iii) non-randomly
varying, if no accompanying random error term is estimated, but differences
in the coefficient are modeled with a variable at the next level. In the
previous cross-cultural example, if the coefficient representing the within-
country relationship between conservatism and self-construal (β1j) was
modeled without a random error term (i.e., the u1jt term was dropped),
it would be described as a fixed coefficient. If the random term was
included, it would be described as a random coefficient.
Random error terms are tested for significance, and the result of
this test indicates if there is enough information to separate true and
random error. Multilevel modelers typically recommend using a more
liberal significance level for such tests (at least 0.10) because in most
instances, coefficients are theoretically or conceptually random, and to
the extent that it is possible, the model should reflect this. Nonetheless,
when a random error term is obviously not significant (e.g., p>0.20),
most modelers recommend dropping the term – there is no need to
use the information in the data to estimate something that cannot be
estimated.
Some interpret the absence of a random error term to mean that a
coefficient does not vary. This is not really the case. Without a random
error term, the model is not estimating the random variability of a
coefficient; however, non-random variability in a coefficient can still be
examined. For the cross-cultural example, GDP could be included in the
852 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
level 2 equation for the level 1 conservatism-construal slope regardless of
whether the error term was estimated. When a level 2 variable is used to
predict a level 1 coefficient that has been fixed, the level 1 coefficient is
described as being modeled as ‘non-randomly varying’.
Although the error structure for a model rarely tests hypotheses per
se, the error structure needs to be properly specified before conducting
tests of the fixed effects because including or excluding random error
terms can affect the results of significance tests of the fixed effects. More
detailed discussion of specifying error structures in multilevel models
can be found in Nezlek (2001) and in any introductory multilevel
modeling text.
Model Building
Although there are no true, hard, and fast rules about how to build a
model in MRCM (i.e., how to add terms, specify error structures, etc.),
there are some widely accepted guidelines. First, most multilevel modelers
recommend finalizing (or building) level 1 models before adding terms at
level 2. For example, in a study of decision-making in groups in which
persons are nested within groups, individual differences such as measures
of personality should be included at level 1 before examining how
relationships (slopes) between such individual differences, and decisions
vary as a function of group characteristics. The ‘final’ level 1 model
should include predictors of interest and should specify whether the
coefficients representing these predictors are modeled as random or fixed
(i.e., the error structure needs to be specified).
Second, there is broad agreement among multilevel modelers that
analysts should use forward, rather than backward-stepping, procedures,
particularly when building level 1 models. When using forward stepping,
individual variables are added to a model and checked for significance, and
if the coefficient for an individual variable is not significant, the variable
is deleted. When using backward stepping, all variables of interest are
added simultaneously at the beginning, and a variable is eliminated if the
accompanying coefficient is not significant.
Forward stepping is recommended over backward stepping for MRCM
because of the number of parameters that need to be estimated when a
variable is included. For example, assume a level 1 model with a single
predictor. In such a model, five parameters are estimated: (i) the fixed and
random terms for the intercept; (ii) the fixed and random terms for the
slope; and (iii) the covariance between the two random error terms. If a
second predictor is added, nine parameters are estimated: a fixed a random
term for the intercept and the two predictors (n = 6) and the covariances
among the three random terms (n = 3). If a third predictor is added, 14
parameters are estimated: a fixed and random term for the intercept and
the three predictors (n = 8) and the covariances among the four random
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 853
terms (n = 6). As variables and parameters are added, models may begin
to tax what is sometimes called the ‘carrying capacity’ of a data set – the
number of parameters a data set can estimate.
Analysts who are accustomed to including all their variables in a single
analyses (the ‘Soldier of Fortune’ philosophy: ‘Kill ‘em all and let God
sort them out’) will need to either gather large data sets that provide a
basis for including large numbers of predictors simultaneously, or they will
need to build their models more carefully. The norm among many
multilevel modelers is to build tight, parsimonious models that include
only those variables that have explanatory power rather than larger models
that will provide less precise estimates for a large number of variables that
may vary considerably in importance and explanatory power.
Interpreting and Reporting Results
The results of a MRCM analyses can be bewildering, particularly when
using a program that was not designed specifically to conduct MRCM
(see section on software below). Nevertheless, the following will help
researchers navigate the numerical jungle regardless of the program that
has created it.
1. For almost all purposes, analysts should focus on tests of fixed coefficients
(sometimes called fixed effects); for example, is a slope significantly
different from 0. Almost all hypotheses will concern fixed coefficients
of one sort or another. Frequently, analysts over-interpret random error
terms, particularly non-significant random terms. Frequently, researchers
claim that a coefficient is constant (does not vary) across units of
analysis because the random error term associated with that coefficient
is not significant. It is important to note that although the absence of
a significant random error term is consistent with the hypothesis that a
coefficient does not vary, it cannot prove such a hypothesis. Moreover, as
noted above, coefficients can vary non-randomly. Simply because a
random error term is not estimated for a coefficient (which some
would take to mean that there is no variability in the coefficient) does
not mean that variability in that coefficient cannot be analyzed.
2. To interpret and report results, many multilevel modelers recommend
estimating predicted values. For example, assume that a level 2 variable
moderates a level 1 slope (i.e., a relationship between two level 1
variables varies as a function of a level 2 variable). If the level 2 variable
is categorical, estimated slopes for each category or groups can be
estimated. If the level 2 variable is continuous, estimated slopes for
specific values of the level 2 variables (e.g., ±1 SD) can be estimated.
3. All coefficients estimated in a MRCM analysis are unstandardized.
Attempting to standardize coefficients (e.g., dividing a fixed effect by
some type of error term) is risky at best. Analysts who are concerned
854 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
about the effect of differing variances on their analyses or who want
some type of standardized metric should transform their data before
analysis.
4. Estimating effect sizes based on variances estimates within the mul-
tilevel context is a bit complex, particularly compared with OLS
analyses (see Raudenbush & Bryk, 2002, for an informed discussion).
In fact, some multilevel modelers (e.g., Kreft & deLeeuw, 1998, p. 119)
recommend avoiding altogether the use of variance estimates to describe
effect sizes. Part of this difficulty stems from the fact that residual
variance estimates and significance tests of individual coefficients are
estimated separately within MRCM, whereas within OLS analyses,
significance tests are based upon reductions in residual variances.
Within OLS regression, the F-ratio for a variable that is added to a
model is directly related to changes in residual variance. Within
MRCM, it is possible to add a significant predictor to a model with
either no change (or even an increase) in residual variance, something
that is not possible in OLS.
Within this context, the following guidelines may be helpful. Similar
to OLS, effect sizes can be estimated by estimating the reduction in
residual variance resulting from the inclusion of predictors. My experience
is that adding a single predictor and comparing the residual variance to
the residual variance from a model with no predictors seems to produce
fairly reliable estimates. It also seems that the problems described above
are more pronounced at level 1 than at level 2. Finally, some editors
will insist that authors provide effect size estimates regardless of (or
without a knowledge of) these problems. In such cases, authors may
want to report such estimates with a simple caveat that they may not be
totally accurate.
Design Considerations
The preceding has focused on conducting and describing the results of
multilevel modeling analyses. When planning a study (or evaluating an
existing study), other considerations can be important, and I will briefly
discuss two of these: the number of levels that should be used and sample
size and power. In most cases, the levels needed for an analysis will be
suggested rather straightforwardly by the data. For example, two level
models would be appropriate when students are nested within classrooms
or days of a diary are nested within persons. Sometimes, however, an
additional level of nesting may need to be considered. Should classrooms
be nested within schools? If data are collected on multiple occasions each
day, should occasions be nested within days and days nested within
persons? Such questions can be answered in various ways. First and foremost,
are there enough observations to constitute a level of analysis? Only two
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 855
schools or only two occasions of measurement per day do not provide a
good basis for estimating between-school or within-day parameters.
Second, and related, can the data estimate random effects at all levels of
analysis? If not, perhaps an additional level of analysis stretches the data
too thinly. Sometimes, one type of nesting or another might seem
appropriate (e.g., students within classes or within schools). In such cases,
analysts need to determine what the psychologically meaningful hierarchy
is, something that may be suggested previous research.
A related question is power: How many observations are needed at each
level of analysis? Unfortunately, estimating power for multilevel designs is
considerably more complicated than estimating power for single level
analyses. In a multilevel design, power depends on the number of observa-
tions at each level of analysis, the variance distributions of measures, the
type of effect being examined (e.g., within levels or cross-levels), to mention
a few. Such complexity makes it impossible to provide hard and fast rules.
Nevertheless, some guidance is available, and two good sources for advice
are Maas and Hox (2005) and Richter (2006).
Software Options
The mathematical and statistical theories underlying MRCM have been
available for some time – at least 20 or 30 years depending on which
aspects are being considered. What has changed dramatically in the last 10
years has been the availability of software to conduct these analyses. When
considering software options, researchers should be aware that different
software will provide the same (or very, very, similar) results if exactly the
same models are tested. This caveat is particularly important because different
software packages implement different options such as centering in different
ways. Inexperienced (and even experienced) modelers may unwittingly
run improper or misleading models not because they do not know what
model they want to run, but because they do not know how to implement
the model they want to run using a certain software package.
With this in mind, I recommend that inexperienced modelers begin
with the program HLM (Raudenbush, Bryk, Cheong, & Congdon,
2004), a popular and widely used program. Setting up the analyses
requires the preparation of a data set for each level of analysis, something
I think helps keep things clear. The program has a convenient interface
that makes it easy to change different aspects of the analysis such as
centering and error terms. It is also easy to specify analyses of non-linear
variables such as categorical outcomes. For more sophisticated users, there
are options to conduct latent variable analyses and to model complex
error structures such as autocorrelation, among others. It is important to
keep in mind that HLM can be used for only multilevel models. It cannot
do other types of analyses such as calculating correlations, nor can it
transform data. Moreover, it has some limitations compared to other
856 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
programs in terms of the sophistication of the models that can be tested
(e.g., error covariances cannot be set to a constant value) and in terms of
the number of levels (HLM is limited to 3).
MlwiN (Rabash, Browne, Steele, & Prosser, 2005) is another single
purpose program for conducting MRCM. MlwiN has somewhat more
flexibility than HLM in terms of data transformation and modeling error
structures, although in my opinion, the interface is not as easy to use and
the output is not as accessible as HLM is. Nevertheless, along with HLM,
MlwiN is a standard setter for programs designed to conduct MRCM.
Another option is MPlus (Muthén & Muthén, 2007), a multipurpose
modeling program that can conduct structural equation modeling and
MRCM. MPlus is very powerful and has numerous options that may be
attractive to more advanced analysts.
Having noted this, researchers may want to use a package with which
they are more familiar such as SAS, SPSS, and more recently, a series of
procedures in R. SAS is particularly powerful in terms of its options
(particularly PROC MIXED; Littell, Milliken, Stroup, & Wolfinger 1996;
Singer, 1998), and for many, SPSS has the advantage of familiarity. R is
an open source software package that is becoming more and more
popular (http://www.r-project.org/), and like SAS, has numerous options
that may be appealing for the more sophisticated user. There is also a
multilevel option in LISREL for analysts who are familiar with this
package.
There are important caveats when using more general-purpose programs
(and when using MlwiN). When using these programs, analysts need to
construct in advance various terms such as those representing cross-level
interactions or variables may need to be centered prior to analysis. In
HLM, centering and the creation of cross-level interaction terms are done
by the program. Another advantage of HLM is that it explicitly separates
coefficients representing effects at different levels of analysis and coefficients
representing cross-level interactions. Other programs do not distinguish
coefficients as clearly. Once again, I would caution users (particularly
novice users) to be particularly careful when setting up models using all-
purpose programs. Unless an analyst is familiar with exactly how options
are specified and how the output is organized, serious problems in model
specification and interpretation of results can occur. See Singer (1998)
for a similar caution.
When Are Multilevel Analyses Appropriate?
To me, the answer to this question is simple. Multilevel analyses are
appropriate when the data are multilevel. Various authors (literally too
many to list – this is an ongoing discussion) suggest that multilevel models
are not appropriate when something called the intraclass correlation
(ICC) is low (or 0). The ICC is a measure of the relative distribution of
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 857
between- and within-group variance of a measure, and a low ICC means
that there is relatively little variance between groups. For example, in a
study of groups, if all the groups had the same mean on a measure, the
ICC would be 0. Similarly, in a diary study, if all participants had the same
mean on a daily measure, the ICC would be 0.
Of particular importance is the fact that researchers sometimes use a
low or zero ICC to justify a decision not to use multilevel modeling – on
the grounds that because there is no (or very little) between-group variance
in the dependent measure, the grouped (or nested) structure of the data
can be ignored. This is a dangerous assumption that is not justifiable.
Frequently (or almost invariably), researchers are interested in relationships
between measures. The fact that there is little or no between-group
variance in a measure does not mean that the relationship between this
measure and another measure is the same across all groups, something that
is assumed if one conducts and analysis that ignores the grouped structure
of the data. By extension, even if there is no between-group variance for
all of the measures of interest, it cannot be assumed that relationships
between or among these measures do not vary across groups.
Such a possibility is illustrated by the data in Table 2. For groups 1, 2,
and 3, the relationship between x and y is negative, whereas for groups 3,
4, and 5, it is positive. Moreover, the ICC for both measures is 0 – the
means for x and y are identical across all groups. If the data are analyzed
without taking the groups into account, the resulting correlation is 0 –
clearly not an appropriate estimate.
The assumption underlying this article is that analysts should use
multilevel modeling when they have a multilevel data structure – pure and
simple. When I am asked for advice regarding whether or not multilevel
modeling is appropriate, my first question concerns the nature of the data
structure. If there is a meaningful nested hierarchy to the data, my advice
is to use multilevel modeling, irrespective of distracting arguments about
ICCs and so forth.
On the other hand, consistency is the hobgoblin of small minds, and
no single rule covers all cases. For two level data, it is difficult (and it may
Tab l e 2 Varying within-group relationships when intraclass correlations are 0
Group 1 Group 2 Group 3 Group 4 Group 5 Group 6
Xy x y xy xy Xy xy
15 15 15 11 11 11
24 24 24 22 22 22
33 33 33 33 33 33
42 42 42 44 44 44
51 51 51 55 55 55
858 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
not be appropriate) to use multilevel modeling when there are a limited
number of observations at level 2. For example, a researcher may collect
data in three countries and may want to use multilevel modeling to
analyze these data, treating persons as nested within countries. Although
conceptually accurate, such an analysis would not be practical because
there are not enough countries to provide a basis for making an inference
about countries. A sample of three is simply too small. In such cases,
analysts should consider a technique such as regression by groups in which
separate regression equations are calculated for each country and then
compared.
Discussing such a possibility raises questions about how many units of
analysis are required to justify multilevel modeling. When can, or should,
hierarchies be ignored? Unfortunately, there are no hard and fast rules that
can be used to answer such questions. Certainly, three, four, or five level
2 units will probably not provide a basis for making inferences about some
population, whereas 10 or more probably will. This leaves a gray area,
for which I cannot offer any advice other than to conduct preliminary
analyses to determine if coefficients of interest can be estimated with any
reliability. Inferences based on small samples are always less stable than
those based on larger samples (holding constant other factors such as
variances), and this is also true for multilevel modeling. In addition, see
discussion of power in the previous section on Design Considerations.
Conclusions
Multilevel analyses provide researchers with powerful tools to investigate
phenomena with greater precision than that provided by other techniques;
however, this increased power comes at a price – increased complexity of
the analyses in terms of options and interpretation. Nevertheless, if
researchers invest a modest amount of time reading about multilevel
modeling and learning how to conduct multilevel analyses, I think such
investments will reward them handsomely. Understanding how to think
hierarchically and how to analyze hierarchically nested data may provide
important insights for many researchers, and I hope that this article has
provided a starting point for those who are interested but not knowledgeable.
Short Biography
John Nezlek’s primary research interests are naturally occurring social
behavior and naturally occurring variability in psychological states and the
statistical methods needed to analyze such data, specifically multilevel
modelling. He has written numerous articles and chapters describing how
to use multilevel modeling to analyze the types of data social and personality
psychologists collect and has published numerous papers describing the
results of multilevel modelling analyses. In addition, he has also conducted
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Multilevel Analyses 859
multilevel analysis workshops worldwide. He received his PhD from the
University of Rochester, and since then, he has been on the faculty of
the College of William & Mary, where he is now a Professor of Psychology.
Endnotes
* Correspondence address: Department of Psychology, PO Box 8795, College of William &
Mary, Williamsburg, VA 23187-8795, USA. Email: john.nezlek@wm.edu.
1For the sake of directness, the article discusses multilevel modeling in terms of two level
models. It is important to note that theoretically, an analysis can have any number of levels.
Nevertheless, analysts are encouraged to think parsimoniously about their models, particularly
in terms of the number of levels they use. In this instance, less is truly more.
2Describing MRCM as a series of multiple regression analyses in which coefficients are ‘passed
up’ from one level of analysis to the next follows the treatment offered by Raudenbush and
Bryk (2002). Although in fact, coefficients at all levels of analysis are estimated simultaneously,
I think that Bryk and Raudenbush’s treatment represents a pedagogical breakthrough. Thinking
of multilevel models as a series of hierarchically nested regression equations helps to maintain
distinctions among effects at and between different levels of analysis. Treatments in which
coefficients from all levels of analysis are discussed in terms of a single equation, although
technically accurate, tend to blur distinctions between coefficients at different levels of analysis,
and such distinctions that are particularly critical for those who are not familiar with multilevel
analyses.
References
Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel
models: A new look at an old issue. Psychological Methods, 12, 121–138.
Fleeson, W. (2001). Toward a structure-and process-integrated view of personality: States as
density distributions of traits. Journal of Personality and Social Psychology, 80, 1011–1027.
Kenny, D. A., Mannetti, L., Pier ro, A., Livi, S., & Kashy, D. A. (2002). The statistical analysis
of data from small groups. Journal of Personality and Social Psychology, 83, 126–137.
Kreft, I. G. G., & de Leeuw, J. (1998). Introducing Multilevel Modeling. Newbury Park, CA: Sage
Publications.
Littell, R. C., Milliken, G. A., Stroup, W. W., & Wolfinger, R. D. (1996). SAS System for Mixed
Models. Cary, NC: SAS Institute.
Maas, C. J. M., & Hox, J. J. (2005). Sufficient sample sizes for multilevel modeling. Methodology,
1, 86–92.
Muthén, L. K., & Muthén, B. O. (2007). MPlus User’s Guide (4th ed.). Los Angeles: Muthén
& Muthén.
Nezlek, J. B. (2001). Multilevel random coefficient analyses of event and interval contingent
data in social and personality psychology research. Personality and Social Psychology Bulletin, 27,
771–785.
Nezlek, J. B. (2003). Using multilevel random coefficient modeling to analyze social interaction
diary data. Journal of Social and Personal Relationships, 20, 437–469.
Nezlek, J. B. (2007a). A multilevel framework for understanding relationships among traits,
states, situations, and behaviors. European Journal of Personality, 21, 1–23.
Nezlek, J. B. (2007b). Multilevel modeling in research on personality. In R. Robins, R. C. Fraley
& R. Krueger (Eds.), Handbook of Research Methods in Personality Psychology (pp. 502–523).
New York: Guilford.
Nezlek, J. B. (forthcoming). Multilevel modeling and cross-cultural research. In D. Matsumoto
& A. J. R. van de Vijver (Eds.), Cross-Cultural Research Methods in Psychology. Oxford: Oxford
University Press.
Nezlek, J. B., & Zyzniewski, L. E. (1998). Using hierarchical linear modeling to analyze
grouped data. Group Dynamics: Theory, Research, and Practice, 2, 313–320.
860 Multilevel Analyses
© 2008 The Author Social and Personality Psychology Compass 2/2 (2008): 842–860, 10.1111/j.1751-9004.2007.00059.x
Journal Compilation © 2008 Blackwell Publishing Ltd
Rabash, J., Browne, W., Steele, F., & Prosser, B. (2005). A Users’ Guide to MLwiN (2nd ed.).
London: Institute of Education.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models (2nd ed.). Newbury Park,
CA: Sage Publications.
Raudenbush, S., Bryk, A., Cheong, Y. F., & Congdon, R. (2004). HLM 6: Hierarchical Linear
and Nonlinear Modeling. Lincolnwood, IL: Scientific Software International.
Richter, T. (2006). What is wrong with ANOVA and multiple regression? Analyzing sentence
reading times with hierarchical linear models. Discourse Analysis, 41, 221–250.
Singer, J. D. (1998). Using SAS PROC MIXED to fit multilevel models, hierarchical models,
and individual growth models. Journal of Educational and Behavioral Statistics, 23, 323 –355.
Snijders, T., & Bosker, R. (1999). Multilevel Analysis. London: Sage Publications.
Wheeler, L., & Nezlek, J. (1977). Sex differences in social participation. Journal of Personality
and Social Psychology, 35, 742–754.
Wheeler, L., & Reis, H. (1991). Self-recording of everyday life events: Origins, types, and uses.
Journal of Personality, 59, 339–354.
... Some scholars recommend the construction of a multilevel model irrespective of the size of the group effect; the structure of the underlying data is regarded as the decisive element. 189 For the purpose of this research project, the data structure was the determining element for the construction of a multilevel model; Throughout I report, the likelihood test statistic and assess the changes for their significance using (10). Additionally AIC and BIC are reported. ...
... Some scholars recommend a minimum of 20 study sites to perform a multilevel analysis, 186 others advocate for the data structure as the decisive element. 189 The advice of the latter was adopted for this study, guided by the exploratory nature of the survey and the fact that the hierarchical structure of the data restricted the application of traditional, linear regression models. 186 However, the validity of the findings would have benefitted greatly from an inclusion of additional study sites. ...
... (p.26) Simulation studies have demonstrated that a low sample size at level 1 is unlikely to cause any severe issues, while a sample size of n<50 at level 2 can result in biased estimates of the standard error at this level.188 However, some scholars advocate for the data structure as the decisive element, that is, if the data are organised in a multilevel structure, a multilevel model should be applied.189 In our case, sample sizes at level two were n=12 and n=15, thus, it needs to be assessed whether an increasinglycomplex model (i.e. ...
Article
Background Mental health conditions, such as depression and anxiety affect a large proportion of the population in England; it has been estimated that one-in-six adults show symptoms of common mental health disorders in any given week. Additionally, the outbreak of COVID-19 in March 2019 and the measures that have been implemented to curb the spread of the disease have negatively affected some individuals’ mental health and wellbeing. Therefore, to tackle the ongoing mental health crisis, the development of support mechanisms that are easy to access and embedded in the primary care network, is required. Community pharmacies are accessible without the need for an appointment and pharmacists are recognised as currently under-utilised, yet highly skilled primary healthcare providers. Thus, community pharmacy presents as an ideal candidate for establishing an alternative source of mental health support within the primary care network. Additionally, preliminary evidence suggests that pharmacy-recorded transactional data, as registered on loyalty cards, can be indicative of underlying health conditions, including mental health issues. Therefore, the tracking and analysing of these data could facilitate the identification of individuals at risk and, in turn, enable pharmacists to offer targeted support. However, currently there is limited evidence pertaining to public attitudes towards mental health support provided in pharmacies and the utilisation of transactional data to identify individuals at risk. Aim To evaluate public attitudes towards mental health support provided in community pharmacy using purchasing data as a tool to identify individuals at risk of developing mental health issues. Methods This study adopted an explanatory, sequential mixed methods research design, encapsulating two separate research streams. In research stream one, the views of pharmacy users towards mental health support provided in pharmacies were investigated. Research stream two evaluated the views of university students and pharmacy users towards utilising transactional data to identify individuals at risk of developing mental health conditions. Both research streams commenced with the development and subsequent distribution of surveys amongst the population of interest, in order to describe individuals’ attitudes quantitatively. The obtained data were subjected to descriptive and inferential statistical analyses performed in Stata (Release 16). The results informed the subsequent qualitative research phases. Semi-structured interviews were conducted with some pharmacy users (n=9) and university students (n=17), to provide an in-depth understanding of individuals’ stances towards both topics. The obtained narrative data were analysed thematically, utilising the software NVivo (Version 12) to aid with data management. Results Pharmacy users’ attitudes towards mental health support provided in pharmacies ranged from scepticism to being moderately supportive in 2019 (n=3449) and 2020 (n=1474), respectively. Individuals who reported higher levels of trust in community pharmacists exhibited more positive attitudes; self-reporting a diagnosis of depression and/or anxiety was found to be predictive of more negative attitudes. Qualitatively, the importance of trust for public acceptance of mental health support provided in pharmacies was reiterated, and factors influencing individuals’ stances were identified, such as facilitators, advantages and barriers for pharmacy provided mental health care. In research stream two, university students as well as pharmacy users exhibited greater support for the utilisation of aggregate-level loyalty card data in health research than utilising these data to identify individuals specifically. Based on the student interviews, a preliminary framework of factors affecting individuals’ stances was developed. First, aspects pertaining to the data provider, the prospective data user and the nature of the data itself were found influence students’ attitudes. Secondly, university students performed a risk-benefit assessment, and in the instance that the expected benefits outweigh potential risks, students supported the utilisation of loyalty card data for the proposed purpose. Thirdly, greater understanding and trust in the prospective data user acted as facilitators in university students’ thought-process. Pharmacy users’ acceptance considerations appeared to be influenced by similar aspects. Conclusions and recommendations There is public support for establishing community pharmacy as an alternative source for mental health support within the primary care network; especially a role for pharmacists as an information hub and intermediary between pharmacy users and other healthcare professionals was endorsed. However, equipping pharmacists with the necessary toolkit to fulfil this role is crucial, e.g. by offering pharmacy-specific mental health first aid classes, or expanding existing services, such as the new medicines service and the community pharmacy consultation service. Secondly, trust between pharmacy users and pharmacists was found to be fundamental for public acceptance of new services in pharmacies. Therefore, pharmacy-practice research, which evaluates potential trust-enhancing mechanisms, is required; the results should guide future policy. Thirdly, there appears to be public support for the tracking and analysing of transactional data, especially if the benevolence of the approach is emphasised. Likewise, obtaining trust is fundamental for obtaining public acceptability. The importance of trust is widely recognised in the digital health landscape, and the implementation of trust-enhancing measures is a focal point of current policy. Pharmacy practice research and policymaking should draw lessons from these developments, if the tracking and analysing of transactional data in the realm of pharmacy-provided mental healthcare, is sought after.
... Although the effects of group language learning on students' learning attitudes and the effects of trust on WTC in a second language have been examined extensively, no study has examined group-level factors in a group-language learning setting. In the context of group language work, individuals are nested within groups (Nezlek, 2008). The relationships between the psychological factors are usually calculated based on individual learners as the independent ones. ...
... The assumption is that researchers should use multilevel modeling when they have a multilevel data structure. If there is a meaningful nested hierarchy in the data, multilevel modeling is recommended (Nezlek, 2008). Table 2 shows the effect of trust in group members on WTC in English using a hierarchical linear model. ...
... Therefore, unique group-level trust is formed. Although the ICC for WTC was not significant, the group project had a multilevel data structure, and multilevel modeling could be used (Nezlek, 2008). As a result of a hierarchical linear model, students who trusted group members were willing to discuss and present in English among group members, and groups who trusted group members were willing to discuss and present in English. ...
Article
Full-text available
This study examined the effects of individual-level and group-level trust on willingness to communicate (WTC) in a second language, targeting Japanese university students in a group-language (English) learning setting. Although the effects of group language learning on students’ learning attitudes and the effects of trust on WTC in a second language have been examined extensively, no study has examined group-level factors in a group-language learning setting. A questionnaire survey was conducted thrice per semester. Multilevel analysis found that individual-level trust in group members positively influenced individual-level WTC in English, and group-level trust in group members also positively influenced group-level WTC in English repeatedly through one semester. Moreover, the degree of group-level WTC in English changed after the mid-semester. This study contributes to the literature on group-language learning, and has implications for language education where educators must be mindful not only of each student’s characteristics but also of each group’s characteristics to enhance their performance.
... Statistical analyses were run in the same manner as in Study 1 using the lme4 package with days (N = 21) nested within persons (N = 217). All variables were again standardized, scaled, and participant-mean centered, except for the prosocial act variable due to its binary nature (Nezlek, 2011). Descriptive statistics and both within-and between-person correlations for all relevant measures are included in Table 4 below. ...
... An alternative approach to modelling the binary prosocial variable is to person-mean center the variable and separate out the within-person and between-person effects (Bolger & Laurenceau, 2013), as we had in Study 1 with the continuous prosociality variable. However, other authors recommend leaving binary predictor variables un-centered (Nezlek, 2011), and indeed, extracting out within-person and between-person effects for a binary variable does complexify interpretation of our effects. Therefore, we chose to leave the prosocial act variable in Study 2 un-centered, while still controlling for the between-person effect of prosociality by accounting for prosocial proportion (Weinstein & Ryan, 2010). ...
Article
Full-text available
Prosocial behaviors benefit others, but what benefits do they hold for those who enact them? Prosociality can enhance the actor’s well-being, yet whether it is one’s sense of happiness or meaning that is impacted, and how this plays out in everyday life, has received limited attention. We address this gap in knowledge by examining how prosociality is related to daily meaning and happiness across two large daily diary studies in two countries. Study 1 (N = 1140) revealed that changes in one’s subjective sense of prosociality was uniquely associated with both daily meaning and happiness. Study 2 (N = 217) found that self-reported prosocial behavior was also clearly linked to increases in daily meaning, and modestly associated with daily happiness. Altogether, our findings suggest that the subjective sense of prosociality is associated with meaning and happiness, and that performing prosocial acts may be particularly relevant to experiencing meaning.
... The ICC for the model when the outcome variable is negative emotions is 0.67, while the ICC for the model when the outcome is positive emotions is 0.72. These results indicated that HLM is an appropriate method to explore our hypotheses (Nezlek, 2011). We used the random intercept in Model 2 in Table 3 due to the non-significant result of the LRT (χ 2 (2) = 5.73, p = 0.057). ...
Article
Full-text available
Although we are surrounded by various kinds of rumors during the coronavirus disease pandemic, little is known about their primary content, what effect they might have on our emotions, and the potential factors that may buffer their effect. Combining qualitative (study 1 extracted 1907 rumors from top rumor-refuting websites using the Python Web Crawler and conducted content analysis) and quantitative (study 2 conducted an online survey adopting a three-wave design, N = 444) research methods, the current study revealed that government-related rumors accounted for the largest proportion of rumors during the outbreak stage of the pandemic and were positively associated with the public’s negative emotions. We also found that trust in government negatively moderated the relationship between government-related rumors and negative emotions. Specifically, when people had low trust in government, exposure to government-related rumors was positively associated with negative emotions. However, when people had high trust in government, the association was non-significant. For positive emotions, we found no significant effects of government-related rumors. The findings highlight the importance of rumor control during public emergencies and cultivating public trust in government in the long run.
... Still, at the multilevel level, we will explore whether leader's gender has an effect on workplace affect. Multilevel random coefficient models (Nezlek, 2008) are applied to test the second research question. ...
Article
Full-text available
The purpose of this study was to analyze the role of emotional skills and gender in the workplace affect through non-hierarchical level and multilevel analyses. Data were collected through self-reported questionnaires completed by 19 directors and 128 subordinates working in interdisciplinary teams of Centers of Educational and Counselling Support. At non-hierarchical level, higher-order emotional skills (use of emotions and cognitive reappraisal) were positively related to positive affect in the workplace. Negative affect was negatively associated with the ability to regulate emotions and positively associated with the strategy of emotional suppression. From the multilevel analyses, cognitive reappraisal was a significant predictor for both positive and negative affect (inversed) of subordinates. Gender differences were observed in multilevel analyses. We also explored possible tensions between emotional competencies and emotional outcomes (workplace affect) at non-hierarchical and managerial level in interdisciplinary teams. We suggest training programs that promote group emotional empowerment and focus on higher-order emotional skills. At managerial level, we suggest training programs that penetrate cognitive reappraisal of emotions since this is an emotion regulation strategy that promotes safer emotional affect in the workplace.
... This model allowed us to account for the cross-classified structure of the data where words were nested in categories and treated as random effects that were crossed with participants (also treated as random). This analysis method also allows for within-participant differences in response speeds of individual trials to be accounted for, reducing error variance and increasing precision of estimates (see, e.g., Judd et al., 2012;Nezlek, 2008;Richter, 2010) and allows a test of the interaction between a within-participant variable (pairing) and between-participant variable (vanity vs. control). Condition (control = −1, vanity = 1), word pairing (attractiveness with morality = −1, attractiveness with immorality = 1), and their interaction were treated as fixed effects. ...
Article
For almost 50 years, psychologists have understood that what is beautiful is perceived as good. This simple and intuitively appealing hypothesis has been confirmed in many ways, prompting a wide range of studies documenting the depth and breadth of its truth. Yet, for what is arguably one of the most important forms of "goodness" that there is-moral goodness-research has told a different story. Although greater attractiveness is associated with a host of positive attributes, it has been only inconsistently associated with greater perceived morality (or lesser immorality), and meta-analyses have suggested the total effect of beauty on moral judgment is near zero. The current research documents one plausible reason for this. Across nine experiments employing a variety of methodological and measurement strategies, we show how attractiveness can be perceived as both morally good and bad. We found that attractiveness causally influences beliefs about vanity, which translates into beliefs that more attractive targets are less moral and more immoral. Then, we document a positive association between attractiveness and sociability-the nonmoral component of warmth-and show how sociability exerts a countervailing positive effect on moral judgments. Likewise, we document findings suggesting that vanity and sociability mutually suppress the effects of attractiveness on each other and on moral judgments. Ultimately, this work provides a comprehensive process account of why beauty seems good but can also be perceived as less moral and more immoral, highlighting complex interrelations among different elements of person perception. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
... p < .001). Given the dependence of the data (i.e., repeated measures within individuals), we proceeded with analyses despite all of the variability being at Level 1 (Nezlek, 2008). ...
Article
Using data from an online survey assessing perceptions of 12 target1 photos of Black and White males2 and females, the current study examined a moderated mediation model of sexual orientation perceptions (N = 310). Results indicated that perceived masculinity/femininity was not a mediator in the association between target gender and perceived sexual orientation. Black male targets were perceived to be more heterosexual than White male targets, Black female targets were perceived as less heterosexual than White female targets, and Black targets were perceived as more masculine than White targets. These findings may indicate rigid gender- and sexuality-related expectations for Black men. Black women who are perceived as gay may be more at risk of related interpersonal harm than White women. Black sexual minority individuals may face challenges in navigating identity concealment or disclosure. Future research should explore the potential antecedents and consequences of perceptions of Black individuals’ sexual orientation.
... A typical question for the withinperson level is whether or not variation in one variable is associated with future variation in another variable. Technically, knowing the correlations between two variables at the between-person level does not tell us anything about their associations at the within-person level (Nezlek, 2011), and there are numerous reasons for a lack of equivalence between the two levels of analysis (Voelkle et al., 2014). The main goal of the present study was to use longitudinal data to examine the direction of associations between social well-being and personality traits. ...
Article
Full-text available
This study examined the temporal relationships between social well-being and the Big Five personality traits (i.e., neuroticism, extraversion, agreeableness, conscientiousness, and openness to experience), using a sample of 6452 American adults collected at 3 time points over 2 decades. The random-intercept cross-lagged panel model was used, which allows associations between variables to be examined at the between-person and within-person levels. At the between-person level, neuroticism was negatively associated and the other traits were positively associated with social well-being. At the within-person level, increases or decreases in trait levels did not predict subsequent increases or decreases in social well-being. However, increased (i.e., higher-than-usual) social well-being was associated with increased future levels of extraversion and conscientiousness. Thus, sustained improvements in social well-being may precede and predict increases in extraversion and conscientiousness.
Thesis
Full-text available
This study, inspired by the conceptual framework adapted into school settings by Hoy, investigated teacher and school level factors predicting school mindfulness, and employed an explanatory sequential mixed-methods design. Quantitative data came from 1354 teachers nested in 69 middle schools in Ankara. The role of teacher (gender, age, level of education, length of time with principal, years of experience in school, total years of experience in teaching, number of days attended professional development activities) and school level factors (school size, mode of schooling, class size, organizational trust, collective teacher efficacy, academic press) on the school mindfulness variation within and between schools was explored by HLM. Results revealed that 12,7% of the total variance in school mindfulness originates from between school variations, length of time with principal is a significant yet negative predictor at teacher level, and teacher trust in principal, teacher trust in colleagues and collective efficacy in student discipline are significant predictors at school level. All significant teacher and school level predictors explain 96.9% of the between school variation in school mindfulness. In the second phase, an embedded single case design was adopted, and 12 semi-structured interviews were made with school principals, who scored the highest and the lowest in principal mindfulness subscale of the M-Scale, based on teachers' responses within the former phase. Qualitative data was analyzed by theoretical thematic analysis method. For triangulation, unstructured observations and school websites were also analyzed. This study concludes that trust, dynamic collaboration and mutual communication in schools enhance mindfulness by increasing the capacity for problem-solving, sharing decision-making and learning together.
Article
Full-text available
The authors elaborate the complications and the opportunities inherent in the statistical analysis of small-group data. They begin by discussing nonindependence of group members’ scores and then consider standard methods for the analysis of small-group data and determine that these methods do not take into account this nonindependence. A new method is proposed that uses multilevel modeling and allows for negative nonindependence and mutual influence. Finally, the complications of interactions, different group sizes, and differential effects are considered. The authors strongly urge that the analysis model of data from small-group studies should mirror the psychological processes that generate those data.
Article
Full-text available
An important problem in multilevel modeling is what constitutes a sufficient sample size for accurate estimation. In multilevel analysis, the major restriction is often the higher-level sample size. In this paper, a simulation study is used to determine the influence of different sample sizes at the group level on the accuracy of the estimates (regression coefficients and variances) and their standard errors. In addition, the influence of other factors, such as the lowest-level sample size and different variance distributions between the levels (different intraclass correlations), is examined. The results show that only a small sample size at level two (meaning a sample of 50 or less) leads to biased estimates of the second-level standard errors. In all of the other simulated conditions the estimates of the regression coefficients, the variance components, and the standard errors are unbiased and accurate.
Article
Three experience-sampling studies explored the distributions of Big-Five-relevant states (behavior) across 2 to 3 weeks of everyday life. Within-person variability was high, such that the typical individual regularly and routinely manifested nearly all levels of all traits in his or her everyday behavior. Second, individual differences in central tendencies of behavioral distributions were almost perfectly stable. Third, amount of behavioral variability (and skew and kurtosis) were revealed as stable individual differences. Finally, amount of within-person variability in extraversion was shown to reflect individual differences in reactivity to extraversion-relevant situational cues. Thus, decontextualized and noncontingent Big-Five content is highly useful for descriptions of individuals' density distributions as wholes. Simultaneously, contextualized and contingent personality units (e.g., conditional traits, goals) are needed for describing the considerable within-person variation.
Article
Cross-cultural psychologists, and other scholars who are interested in the joint effects of cultural and individual-level constructs, often collect data and are interested in hypotheses that involve multiple levels of analysis simultaneously. For example, in cross-cultural research, it is not uncommon to collect data from numerous individuals in numerous countries (or cultures). Such data structures are frequently referred to as multilevel or hierarchically nested, or simply nested data structures because observations at one level of analysis (e.g., individuals) are nested within observations at another (e.g., culture). Within a multilevel framework, questions of interest could be couched in terms of cultural differences in means of individual-level measures such as Life Satisfaction, within-culture relationships between individual-level measures such as Life Satisfaction and Individualism, and between-cultural differences in such within-culture relationships. When analyzing such nested data structures, the possibility that relationships among constructs can vary across levels of analysis must be taken into account. That is, relationships between two variables at the between-country level (e.g., relationships among country-level aggregates, sometimes referred to as ecological correlations) may or may not be the same as the relationships between these two variables within countries (e.g., individual-level correlations). In fact, relationships at the two levels of analysis are mathematically independent (e.g., Nezlek, 2001), and it is inappropriate to draw conclusions about within-culture relationships from between-culture analyses. This inappropriateness is highlighted by the possibility that within-country (i.e., individual-level) relationships may vary across countries, undermining the validity of any estimate of “the” individual-level relationship, simply because there may not be a single, uniform individual-level relationship.
Article
SAS PROC MIXED is a flexible program suitable for fitting multilevel models, hierarchical linear models, and individual growth models. Its position as an integrated program within the SAS statistical package makes it an ideal choice for empirical researchers and applied statisticians seeking to do data reduction, management, and analysis within a single statistical package. Because the program was developed from the perspective of a "mixed" statistical model with both random and fixed effects, its syntax and programming logic may appear unfamiliar to users in education and the social and behavioral sciences who tend to express these models as multilevel or hierarchical models. The purpose of this paper is to help users familiar with fitting multilevel models using other statistical packages (e.g., HLM, MLwiN, MIXREG) add SAS PROC MIXED to their array of analytic options. The paper is written as a step-by-step tutorial that shows how to fit the two most common multilevel models: (a) school effects models, designed for data on individuals nested within naturally occurring hierarchies (e.g., students within classes); and (b) individual growth models, designed for exploring longitudinal data (on individuals) over time. The conclusion discusses how these ideas can be extended straighforwardly to the case of three level models. An appendix presents general strategies for working with multilevel data in SAS and for creating data sets at several levels.
Article
SAS PROC MIXED is a flexible program suitable for fitting multilevel models, hierarchical linear models, and individual growth models. Its position as an integrated program within the SAS statistical package makes it an ideal choice for empirical researchers and applied statisticians seeking to do data reduction, management, and analysis within a single statistical package. Because the program was developed from the perspective of a "mixed" statistical model with both random and fixed effects, its syntax and programming logic may appear unfamiliar to users in education and the social and behavioral sciences who tend to express these models as multilevel or hierarchical models. The purpose of this paper is to help users familiar with fitting multilevel models using other statistical packages (e.g., HLM, MLwiN, MIXREG) add SAS PROC MIXED to their array of analytic options. The paper is written as a step-by-step tutorial that shows how to fit the two most common multilevel models: (a) school effects models, designed for data on individuals nested within naturally occurring hierarchies (e.g., students within classes); and (b) individual growth models, designed for exploring longitudinal data (on individuals) over time. The conclusion discusses how these ideas can be extended straighforwardly to the case of three level models. An appendix presents general strategies for working with multilevel data in SAS and for creating data sets at several levels.