When Fit Is Fundamental: Performance Evaluations and Promotions of
Upper-Level Female and Male Managers
Karen S. Lyness
Baruch College, City University of New York
Madeline E. Heilman
New York University
Using archival organizational data, the authors examined relationships of gender and type of position
(i.e., line or staff) to performance evaluations of 448 upper-level managers, and relationships of
performance evaluations to promotions during the subsequent 2 years. Consistent with the idea that there
is a greater perceived lack of fit between stereotypical attributes of women and requirements of line jobs
than staff jobs, women in line jobs received lower performance ratings than women in staff jobs or men
in either line or staff jobs. Moreover, promoted women had received higher performance ratings than
promoted men and performance ratings were more strongly related to promotions for women than men,
suggesting that women were held to stricter standards for promotion.
Keywords: gender bias, gender stereotypes, performance appraisal, promotion, glass ceiling
Although their presence in the management ranks is increasing,
women continue to be underrepresented in senior management at
many large private-sector companies (Lyness, 2002; Powell,
1999). One possible explanation is that there are gender differ-
ences in performance evaluations, and how women managers’
performance is evaluated relative to men’s performance influences
women’s subsequent career success. Yet, relatively little is known
about whether or when performance evaluations of women in
upper-level management jobs differ from those of their male
counterparts, or if female managers’ performance ratings differ
depending on the types of jobs they hold. Moreover, it is not clear
from prior research how performance evaluations are related to the
actual career progress of men and women managers. The present
study addresses these issues.
The perceived lack of person–job fit has been used to explain
the occurrence of gender bias against women in organizational
decisions about managers, including performance evaluations. That
is, the perceived incongruity between stereotypically based at-
tributes ascribed to women (e.g., kind, caring, and relationship-
oriented) and the attributes ascribed to men (e.g., tough, forceful,
and achievement-oriented) believed to be necessary for success at
male gender-typed jobs, is thought to give rise to expectations that
women will perform poorly in these positions, and the greater the
perceived lack of fit, the more negative the expectations (Heilman,
1983, 1995, 2001). The lack-of-fit model further asserts that these
expectations play a key role in evaluative processes because there
is a tendency to perpetuate and confirm them. These expectations
become the lens through which information is filtered, including
what behavior is attended to, how that behavior is interpreted, and
whether it is remembered when critical decisions are made. As a
consequence, the negative expectations resulting from perceptions
of lack of fit detrimentally affect how women are regarded and
how their work is evaluated when they are in traditionally male jobs.
There is empirical evidence to support the lack-of-fit model as
it relates to performance evaluations. A meta-analysis of labora-
tory studies indicated that there was greater gender bias against
women in performance ratings on masculine tasks than on femi-
nine tasks (Swim, Borgida, Maruyama, & Myers, 1989). Another
meta-analysis that focused on evaluations of leaders indicated that
the devaluation of female leaders was more pronounced for roles
occupied mainly by men than for roles occupied more equally by
both sexes (Eagly, Makhijani, & Klonsky, 1992).
Nevertheless, questions can be raised about whether these find-
ings generalize to field settings and performance evaluations of
managers. As Bartol (1999) has noted, there are relatively few
prior field research studies about gender differences in perfor-
mance evaluations, especially for managers, and a review of stud-
ies that have been done indicates that their results have been
inconsistent. Moreover, because the prior field studies that have
examined gender differences in managers’ performance ratings
either have been conducted in government settings (e.g., Powell &
Butterfield, 1994) or have used self-reported ratings or ratings
collected exclusively for research purposes for their analyses (e.g.,
Cannings & Montmarquette, 1991; Tsui & Gutek, 1984), they are
not necessarily informative about gender differences in perfor-
mance ratings that occur in business organizations. Therefore, our
first objective for the present study was to test predictions based on
lack-of-fit ideas as they relate to actual performance evaluations of
upper-level corporate managers.
Karen S. Lyness, Department of Psychology, Baruch College, City
University of New York; Madeline E. Heilman, Department of Psychology,
New York University.
We thank Michael Judiesch for his helpful comments about an earlier
version of the paper and Roger Millsap for his suggestions about analyses.
In April 2002, portions of this study were presented at the Annual Meeting
of the Society for Industrial and Organizational Psychology, Toronto,
Correspondence concerning this article should be addressed to Karen S.
Lyness, Department of Psychology, Baruch College, City University of
New York, One Bernard Baruch Way, Box B8-215, New York, NY
10010-5585. E-mail: Karen_Lyness@baruch.cuny.edu
Journal of Applied Psychology
2006, Vol. 91, No. 4, 777–785
Copyright 2006 by the American Psychological Association
Key to understanding the role of lack of fit in the different
evaluations of male and female managers is the gender-typing of
the job of manager. Jobs are thought to become gender-typed as
male or female based on either (a) job responsibilities that are
believed to be gender-linked (Heilman, 1983, 1995) or (b) the sex
of the usual job-holder (Cejka & Eagly, 1999; Krefting, Berger, &
Wallace, 1978). Managerial jobs, especially at upper levels, have
traditionally been considered to be male gender-typed because the
high levels of organizational authority, responsibility, and status
that are characteristic of these jobs have typically been associated
with men, rather than with women, in our society (e.g., Ragins &
Sundstrom, 1989), thus adding to the manager-as-male image (e.g.,
Schein & Mueller, 1992; Schein, Mueller, Lituchy, & Liu, 1996).
In addition, although women have made progress at moving into
management positions in recent years, gender segregation of or-
ganizational hierarchies persists (Blau, Ferber, & Winkler, 1998;
Jacobs, 1992), with women often concentrated in lower- and
middle-level management positions rather than the more salient
upper-level positions (Lyness, 2002; Powell, 1999). In fact, a
recent study of the highest management levels found that women
held less than 16% of the corporate officer positions in Fortune
500 companies (Catalyst, 2002). The view that managerial jobs are
male gender-typed, first reported over 30 years ago (Schein, 1973,
1975), has persisted over time, with men and managers consis-
tently being described more similarly than women and managers
(Heilman, Block, Martell, & Simon, 1989; Powell, Butterfield, &
Parent, 2002; Schein, 2001).
However, when superiors evaluate managers in actual organi-
zational settings, we suspect that the category manager will not be
so broadly drawn. According to recent research, people attend
more to detail when they are close up than when they are psycho-
logically remote from a target (Trope & Liberman, 2003). There-
fore, it is likely that when direct supervisors are conducting per-
formance evaluations, the global concept of manager will give way
to more fine-grained conceptions. With their extensive knowledge
about organizational job requirements, these superiors are likely to
make distinctions among job definitions and, consequently, in their
perceptions about the degree of male gender-typing for particular jobs.
In the case of upper-level management jobs, we expect that line
positions are likely to be viewed as more strongly male gender-
typed than staff positions. Not only is our reasoning based on the
typical gender proportions of men and women in upper-level line
and staff jobs—as recently as 2002, women corporate officers in
Fortune 500 companies held less than 10% of the line jobs, and
70% of these women held staff jobs (Catalyst, 2002)—but also on
the different work contents and organizational roles of line and
staff jobs. Managers in line jobs direct and control essential orga-
nizational activities, such as producing or selling products or
services, whereas managers in staff jobs provide support and
expertise to line managers (Hellriegel, Jackson, & Slocum, 2002).
Furthermore, managers in line positions typically have greater
organizational power and influence than managers in staff posi-
tions (Ragins & Sundstrom, 1989). Thus, the responsibilities of
line management jobs appear to be highly consistent with the
forcefulness and achievement orientation associated with men, but
the responsibilities of staff jobs, even at managerial levels, appear
to be less exclusively male in character.
Because line jobs are likely to be viewed as more strongly male
gender-typed than staff jobs, they should give rise to a greater
perceived lack of fit for women than staff jobs. Female managers
in line jobs should therefore be more disadvantaged, receiving
lower performance evaluations, not only when compared to male
managers in line jobs, but also when compared to female managers
in staff jobs. However, we would not anticipate similarly differ-
ential effects for evaluations of men in upper-level line jobs and
staff jobs. Regardless of whether they are line roles or staff roles,
upper-level managerial jobs are considered to be male in gender-
type. Line roles and staff roles cause deviations in the degree of
male gender-typing of a particular position, but they do not cause
a managerial job to be perceived as female. As a result, there is
always a fit between perceptions of male managers’ attributes and
perceived requirements of their jobs, and we do not expect that
male managers’ performance evaluations will differ depending on
whether they hold line jobs or staff jobs.
Although we are not aware of any published research involving
performance evaluations of women and men managers that has
focused on line and staff jobs, there is a prior field study that
provides some support for our ideas about the relationship of
degree of male gender typing of jobs to performance ratings for
men and women. Pazy and Oron (2001) examined relationships
between gender composition of work groups and performance
ratings of Israeli military officers in noncombat positions, and
found that women’s performance ratings varied but men’s did not,
depending on the proportion of women in the work unit. Women
received lower performance ratings when they worked in groups
with less than 10% women, and women’s average performance
ratings increased as the proportion of women in their work groups
increased. These findings can be interpreted as showing that even
in what the researchers termed “a male–gendered environment,”
(Pazy & Oron, 2001, p. 691) it was possible for jobs to differ in
degree of perceived male gender-typing (as measured by the
proportion of women in the work unit), which was reflected in
women’s performance ratings. In addition, Pazy and Oron’s find-
ing that male officers’ average performance ratings were not
related to gender composition of their work groups also is consis-
tent with our contention that male jobs may differ in degree of
perceived male gender-typing and yet still be perceived as con-
gruent with male attributes.
Thus, based on theory and research about perceived job–gender
incongruity, we predicted that performance ratings for upper-level
managers would be related to the interaction of gender with type of
managerial position (i.e., line or staff). Specifically, we expected
that women in line jobs would receive lower performance ratings
than all other groups:
Hypothesis 1: Female managers in line jobs will be evaluated
more negatively than female managers in staff jobs and than
male managers in both line and staff jobs.
Performance Evaluation and Promotion
The lack-of-fit model also suggests that differing levels of
performance would be required of men and women if they were to
be seen as worthy of a promotion to a male gender-typed position.
That is, greater evidence of competence is needed to overcome the
negative performance expectations that besiege women (but not
men) in male gender-typed job domains if they are to be judged as
warranting advancement. Foschi’s (1992, 1996, 2000) theory
about double standards for competence is also relevant to this
issue. According to Foschi, different standards are used in scruti-
nizing equal performance by a man and a woman to make judg-
ments about competence, with performance by members of lower
status groups (e.g., women) assessed by stricter standards than
similar performance by members of higher status groups (e.g.,
men). Furthermore, double standards are thought to be particularly
likely when the status characteristic (e.g., gender) is perceived to
be relevant to performance, as in the case of male gender-typed
jobs. These ideas suggest that in order to be promoted, women
would have to outperform their male counterparts. Our second
objective for the present study was to test this idea using a
longitudinal design to examine the relationship between perfor-
mance ratings and actual promotions received during the subse-
quent 2 years. We predicted:
Hypothesis 2: Women who have been promoted will have
received higher performance ratings than men who have been promoted.
These ideas also suggest that men will be given more leeway
when promotion decisions are made. That is, there will be more
flexibility in standards and a greater tendency to give the benefit of
the doubt to men than to women when their performance is not
exemplary. Thus, we also expected that there would be a stronger
relationship between performance evaluations and promotions for
women than for men. Specifically, we predicted:
Hypothesis 3: Promotions will be related to the interaction of
gender with performance ratings, such that performance rat-
ings will be more strongly related to promotions for women
than for men.
Because we did not have information about the types of subse-
quent jobs for which women and men were considered as promo-
tion candidates, we could not use lack-of-fit principles to make
predictions about job type as a factor in promotion decisions. We
did, however, explore initial job type, that is, line or staff, as a
potential moderator of the effects predicted in Hypotheses 2 and 3.
Participants and Procedures
The sample included 489 upper-middle-level and senior-level managers
from U.S. offices of a large multinational financial services corporation.
Leaders of business units and staff functions evaluated these managers on
nine performance dimensions as part of a succession planning process in
the first quarter of 1998. There were 177 raters whose gender was identi-
fiable, including 28 women and 149 men. (Rater gender was missing for 52
ratees.) Raters evaluated an average of two managers (M = 2.46, mode =
1). We computed subsequent promotions by comparing participants’ levels
in the management hierarchy on January 1, 1998 to their levels on January
1, 2000. We obtained all data from organizational databases.
Other than rater gender, complete 1998 data were available for 448
managers, and we used this sample to test our predictions. Twenty-two
percent (n = 100) of the sample was women, and the sample was rather
evenly divided between line managers (n = 225) and staff managers (n =
223). As expected, a larger proportion of the men (54%) than the women
(37%) were in line positions, χ²(1, N = 448) = 9.00, p < .01. The average
age of the women was 45 years versus 47 years for the men, t(446) =
−2.44, p < .05, the women averaged about 11 years of organizational
tenure versus about 14 years for the men, t(446) = −2.84, p < .01, and a
larger proportion of the men (71%) than the women (61%) held graduate
degrees, χ²(1, N = 448) = 3.60, p = .06. Forty-two percent of the women
compared to 65% of the men held senior level management positions, and
the others were in upper-middle-level management positions, χ²(1, N =
448) = 17.47, p < .001. We controlled for these demographic variables
that covaried with gender by using age, organizational tenure, education,
and whether managers were in senior-level management positions as
control variables in the analyses.
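The reported difference in line-position representation can be rechecked from the percentages given above. The sketch below is only an illustration: the cell counts (37 of 100 women and 188 of 348 men in line jobs) are reconstructed from the rounded percentages and are an assumption, not figures taken from the organizational data.

```python
import math

# Cell counts reconstructed from the reported percentages (an assumption):
# 37% of 100 women and 54% of 348 men held line positions.
women_line, women_staff = 37, 63
men_line, men_staff = 188, 160


def pearson_chi_square(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


chi2 = pearson_chi_square(women_line, women_staff, men_line, men_staff)
print(f"chi2(1, N = 448) = {chi2:.2f}, p = {p_value_df1(chi2):.4f}")
```

Under these reconstructed counts the statistic comes out close to the reported value of 9.00, and the line and staff totals match the reported 225 and 223.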
Managers’ performance was rated in 1998 on
nine dimensions developed by the organization: operating results; customer
effectiveness; personal, business, and technical proficiency; execution
skills; leadership; professional standards; relationships; global effectiveness;
and social responsibility. All performance ratings were made on
three-point scales coded from 1 (low) to 3 (high). A confirmatory factor
analysis with EQS for Windows 6.1 (Bentler & Wu, 1995) indicated that
a one-factor model provided a good fit for the nine performance dimensions,
χ²(27, N = 485) = 60.0, p < .001, χ²/df = 2.22, comparative fit
index = .96, root mean square error of approximation = .05. A one-factor
model is also consistent with recent meta-analytic research indicating that
there is a general factor in ratings of job performance (Viswesvaran,
Schmidt, & Ones, 2005). We created a composite performance scale by
summing the nine dimension scores, and the alpha coefficient for the
composite performance scale was .79.
Each participant’s position was coded as staff (0) or line (1)
based on information provided by organizational human resource officers.
Examples of line functions included business management, operations
management, and sales; examples of staff functions included human resources,
administration, and external affairs.
Participant gender was coded female (0) or male (1).
Participants’ age and organizational tenure were measured
in years, as of January 1, 1998. Education was measured as
highest degree earned, coded graduate degree (1) and bachelor’s degree
(0). Thirty-seven participants had complete data except for education
so we repeated our regression analyses without education as a
control variable, using the larger sample (N = 485), and reached the
same conclusions.
Participants spanned five levels in the management
hierarchy, coded from low (1) to high (5), with Level 1 considered
upper-middle-level management and Levels 2 through 5 considered senior-level
management, according to organizational human resource officers.
For ease of presentation of results, we dichotomized organizational levels
into whether or not participants held senior-level management positions on
January 1, 1998, coded yes (1, n = 269) or no (0, n = 179). However, we
also repeated our regression analyses with four dummy variables representing
the five management levels (contrasting Levels 2 to 5 with Level
1), and reached the same conclusions about tests of our hypotheses.
A promotion was defined as a move to a higher salary
level in the management hierarchy. Promotions were calculated by subtracting
each manager’s organizational level (using the five original management
levels) on January 1, 1998 from his or her level on January 1,
2000. Because only 2% of the sample moved up more than one level, we
treated promotion as a dichotomous variable, coded yes (1, n = 77) or no
(0, n = 371).
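The measure coding described in this section can be sketched as follows. This is a minimal illustration, not the procedure actually run on the organizational databases: the function names and the four toy ratees are hypothetical, and the alpha function uses the standard coefficient alpha formula.

```python
from statistics import pvariance

N_DIMENSIONS = 9  # nine performance dimensions, each rated 1 (low) to 3 (high)


def composite_score(ratings):
    """Sum the nine dimension ratings; the composite ranges from 9 to 27."""
    if len(ratings) != N_DIMENSIONS or any(r not in (1, 2, 3) for r in ratings):
        raise ValueError("expected nine ratings coded 1, 2, or 3")
    return sum(ratings)


def cronbach_alpha(rows):
    """Coefficient alpha for a list of ratees, each a list of item ratings."""
    k = len(rows[0])
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def promoted(level_1998, level_2000):
    """Dichotomous promotion: 1 if the manager moved up at least one level."""
    return 1 if level_2000 - level_1998 > 0 else 0


def senior_level(level_1998):
    """Levels 2 through 5 are senior-level management; Level 1 is upper-middle."""
    return 1 if level_1998 >= 2 else 0


# Hypothetical ratees, invented purely for illustration.
toy_ratings = [
    [3, 3, 2, 3, 3, 3, 2, 3, 3],
    [2, 2, 2, 3, 2, 2, 2, 2, 3],
    [1, 2, 1, 2, 2, 1, 2, 2, 2],
    [3, 2, 3, 3, 3, 2, 3, 3, 3],
]
print([composite_score(r) for r in toy_ratings])
print(round(cronbach_alpha(toy_ratings), 2))
print(promoted(1, 2), promoted(3, 3), senior_level(1), senior_level(4))
```

Treating any upward move as a single promotion mirrors the dichotomization described above, where moves of more than one level were too rare to analyze separately.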
Preliminary regression analyses indicated that neither the main effect for
rater gender, β = −.02, p = .62, nor the interaction between rater gender
and ratee gender, β = −.06, p = .67, was significantly related to performance
ratings so we combined the data for all raters in subsequent
analyses. Similarly, a preliminary logistic regression analysis indicated that
neither the main effect for rater gender, b = .21, p = .60, nor the
interaction between rater gender and ratee gender, b = .46, p = .58, was
significantly related to promotion decisions so we also combined those data
for all raters.
We tested Hypothesis 1 by carrying out a linear regression analysis with
the nine-dimension composite performance scale as the dependent variable,
and controls for human capital (age, organizational tenure, education, and
organizational level) in Step 1; gender in Step 2 (to see if there was a
significant main effect); job type in Step 3; and the gender by job type
interaction in Step 4. We then conducted three planned orthogonal con-
trasts, as suggested by Strube and Bobko (1989) for testing an ordinal
interaction when it is predicted that differences are due to one cell, as in our
case where we predicted that women in line jobs would receive lower
performance ratings than the other three groups. To control for study-wise
error, we set the significance level for each of the three contrasts at α =
.05/3 = .017.
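The logic of the three planned contrasts and the adjusted significance level can be sketched as below. The contrast weights are an assumption: the text does not list them, and these are simply the equal-cell-size weights that correspond to the three comparisons described. With unequal cell sizes, the weighting used for the pooled comparisons could differ.

```python
from itertools import combinations

# Cell order (an illustrative convention): women-line, women-staff, men-line, men-staff.
contrasts = {
    "men_line_vs_men_staff": (0, 0, 1, -1),
    "women_staff_vs_pooled_men": (0, 2, -1, -1),
    "women_line_vs_other_three": (3, -1, -1, -1),
}


def dot(u, v):
    """Dot product of two weight vectors."""
    return sum(a * b for a, b in zip(u, v))


# Each contrast sums to zero, and every pair has a zero dot product.
sums = {name: sum(w) for name, w in contrasts.items()}
pairwise = {
    (m, n): dot(contrasts[m], contrasts[n]) for m, n in combinations(contrasts, 2)
}

# Per-contrast significance level: the overall level divided by the number of contrasts.
familywise_alpha = 0.05
per_contrast_alpha = familywise_alpha / len(contrasts)

print(sums)
print(pairwise)
print(round(per_contrast_alpha, 3))
```

The third contrast isolates the single cell that the ordinal interaction is predicted to set apart, which is why support for the hypothesis rests on that contrast being significant while the other two are not.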
We tested Hypothesis 2 with a linear regression analysis predicting
composite performance ratings, with control variables and the main effects
for job type in Step 1, and gender in Step 2, and we carried out an
exploratory analysis of initial job type as a moderating variable by entering
the gender by initial job type interaction in Step 3. We limited the sample
for these analyses to the 77 managers who had received promotions. For all
regression analyses we used simultaneous entry of all variables within each
step, and significance of the results was determined from the significance
of beta coefficients and change statistics.
Finally, we tested Hypothesis 3 with a logistic regression analysis
because promotion was a dichotomous variable. We entered the previously
used control variables and job type as an additional control variable
(because promotional opportunities might vary in line and staff positions)
in Step 1, gender in Step 2 (to see if there was a significant main effect),
performance rating in Step 3, and the interaction of gender and perfor-
mance rating in Step 4. We carried out an exploratory analysis of initial job
type as a moderating variable by entering the additional two-way interac-
tions in Step 5; and the three-way interaction of gender, performance
ratings, and initial job type in Step 6. We centered the composite perfor-
mance ratings in the logistic regression analyses to eliminate unnecessary
multicollinearity between this variable and the interaction term (Cohen,
Cohen, West, & Aiken, 2003). Then we conducted separate logistic re-
gression analyses for female managers and male managers. Because some
managers left the organization during the 2 years after the performance
evaluations, we conducted all analyses of promotion data first with the
1998 sample (N = 448) and then with only managers who were employed
for the entire 2-year period (n = 382).
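The structure of the logistic model with a gender by performance interaction can be sketched as follows. The coefficient values are hypothetical, chosen only to show the pattern predicted in Hypothesis 3, and `controls` is a placeholder for the combined contribution of the control variables. None of these numbers are estimates from the study.

```python
import math


def log_odds(b, gender, performance_c, job_type=0, controls=0.0):
    """Linear predictor of a logistic model with a gender x performance term.

    gender: 0 = female, 1 = male; performance_c: mean-centered composite rating.
    `controls` stands in for the summed contribution of the control variables.
    """
    return (
        b["intercept"]
        + controls
        + b["job_type"] * job_type
        + b["gender"] * gender
        + b["performance"] * performance_c
        + b["gender_x_performance"] * gender * performance_c
    )


def probability(eta):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def performance_slope(b, gender):
    """Change in log-odds of promotion per one-point rise in centered performance."""
    return b["performance"] + b["gender_x_performance"] * gender


# Hypothetical coefficients, chosen only to illustrate the predicted pattern.
b = {"intercept": -1.5, "job_type": 0.0, "gender": 0.0,
     "performance": 0.40, "gender_x_performance": -0.25}

slope_women = performance_slope(b, gender=0)
slope_men = performance_slope(b, gender=1)
print(slope_women, slope_men)
print(math.exp(slope_women), math.exp(slope_men))  # odds ratios per rating point
```

Because gender is coded female (0) and male (1), the performance coefficient is the slope for women, and a negative interaction coefficient means a weaker performance-promotion relationship for men. An odds ratio is the exponentiated coefficient, so a coefficient of −.54 corresponds to an odds ratio of about .58.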
The means, standard deviations, and intercorrelations of the
variables are shown in Table 1. Examination of these results shows
that gender, in and of itself, was not significantly correlated with
either the nine-dimension performance composite (r = −.01) or
promotion (r = −.03).
The results of the linear regression analysis testing Hypothesis
1, predicting that the interaction of gender by job type would be
related to composite performance ratings, are presented in Table 2.
Among the control variables, there was a positive relationship of
performance ratings to organizational tenure, β = .12, p < .05.
The main effect for gender was not statistically significant, but we
found a significant main effect for job type (i.e., line or staff), β =
−.34, p < .01, indicating that managers in staff positions received
higher ratings than managers in line positions. As we predicted, the
gender by job type interaction was significantly related to performance
ratings, β = .24, ΔR² = .01, p < .05.
Figure 1 depicts the interaction of gender and job type in the
mean composite performance ratings of female and male manag-
ers. Consistent with Hypothesis 1, this figure shows that women in
line jobs received the lowest performance ratings of all four groups.
We conducted a set of three planned orthogonal contrasts to test
our specific prediction that women managers in line jobs would be
rated significantly lower than managers in the other three groups.
The contrasts indicated that (a) performance ratings did not differ
significantly for men in line jobs and men in staff jobs, F(1,
446) = 2.57, p = .110; (b) performance ratings for women in staff
jobs did not differ significantly from pooled ratings for men in line
jobs and staff jobs, F(1, 446) = 2.64, p = .105; and (c) performance
ratings for women in line jobs were significantly lower than
pooled ratings for the other three groups, including women in staff
jobs and men in line jobs and staff jobs, F(1, 446) = 5.85, p =
.016. These findings are therefore supportive of our reasoning that
women in line jobs were more disadvantaged in performance
ratings relative to women in staff jobs and men in both line jobs
and staff jobs.
Table 1
Means, Standard Deviations, and Correlations
[Table body not recoverable from the source text. Surviving row labels: 1. Performance rating composite; 4. Organizational tenure; 6. Senior management level; 7. Job type.]
Note. N = 448. Senior management level: 0 = no; 1 = yes. Gender: 0 = female; 1 = male. Education: 0 = bachelor’s degree; 1 = graduate degree. Job type: 0 = staff; 1 = line.
* p < .05. ** p < .01.
Performance Evaluation and Promotion
We tested Hypothesis 2, predicting that promoted women would
have received higher performance ratings than promoted men,
using performance rating as the dependent variable and limiting
the analysis to managers who were promoted (n = 77). These
results, shown in Table 3, were supportive of our prediction.
Gender was negatively related to performance ratings, indicating
that promoted female managers had higher performance ratings
than promoted male managers, step β = −.26, ΔR² = .06, p < .05,
with gender explaining 6% of the variance in these ratings. When
we restricted the sample to the 69 promoted managers who were
employed for the entire 2-year period, we found similar results,
with higher performance ratings among promoted women managers
than promoted men managers, step β = −.29, ΔR² = .07, p <
.05. Our additional analyses showed that the gender by initial job
type interactions were not statistically significant with either sample,
indicating that initial job type did not have a moderating effect.
Figure 2 shows the mean composite performance ratings for
female managers and male managers who were promoted and not
promoted. Promoted women’s composite performance ratings av-
eraged over 1.5 points higher than promoted men’s composite
performance ratings (M = 23.89 for women, 22.31 for men).
The results of the logistic regression analysis to test Hypothesis
3, predicting that subsequent promotions would be related more
strongly to performance ratings for women than men, are presented
in Table 4. Among the control variables, only senior-level man-
agement had a significant relationship to promotion, indicating that
managers in senior management jobs were less likely to be promoted
than those in lower-level jobs, step b = −.54, odds ratio =
.58, p < .05. The main effect for gender was not significant, but (as
would be expected) the main effect for performance ratings was
positively related to promotion, step b = 1.60, odds ratio = 4.97,
χ²(1, N = 448) = 13.34, p < .001. The addition of the interaction
of gender by performance ratings to the model in Step 4 resulted
in a significant improvement to the model, χ²(1, N = 448) = 3.88,
p < .05, according to the likelihood ratio test, supporting our
prediction. Although the Wald test indicated that the beta coeffi-
Figure 1. Mean composite ratings for female managers and male managers in staff jobs and line jobs. Y-axis: Composite Performance Rating.
Table 2
Linear Regression Analyses Predicting Performance Ratings With Gender and Job Type
[Table body not recoverable from the source text. Column headings: Step; Variable; Step β. Surviving row label: Gender by job type.]
Note. N = 448. Entries are standardized beta weights. Education: 0 = bachelor’s degree; 1 = graduate degree. Senior management level: 0 = no; 1 = yes. Gender: 0 = female; 1 = male. Job type: 0 = staff; 1 = line.
† p < .10. * p < .05. ** p < .01.