University of Warsaw
Faculty of Psychology
Girls with guns. Does culture shape
perceptions of females in the military?
Warsaw, June 2020
GIRLS WITH GUNS 3
The stereotype content model (SCM) proposes that groups are perceived on two main dimensions:
warmth and competence. Previous research demonstrated that women in the military are usually
perceived as less competent yet warmer than men. In this study, we examine the differences in
perceptions of women in the military between Israel and Poland. Israel is the only country in the
world with compulsory military service for women. We hypothesized that due to their unique
cultural experience, Israeli participants will perceive women in the military as (1) more
competent and (2) warmer than Polish participants and that (3) this effect will be moderated by
participants’ scores on ambivalent sexism scales. A sample of 455 participants (51% Polish)
indicated to what extent they perceive female characters presented in the pictures as competent
or warm. Analyses showed mixed results with weak support for the hypotheses. Measurement
invariance analysis of dependent variable scales revealed a lack of scalar invariance, indicating
that mean comparisons between groups might be biased. A brief confounding factors analysis is
presented as well as recommendations for further study directions.
Keywords: stereotype content model, gender stereotypes, military, intercultural study,
“Military service has traditionally been considered one of the most distinctive signs of
full citizenship, and the exclusion of women from military service has been inseparable from
their lower civic status”
Stereotypes shape the way we see the world. Social perceptions of gender roles influence
who participates, and how, in sports (Chalabaev, Sarrazin, Fontayne, Boiché, & Clément-Guillotin,
2013) and politics (Inglehart & Norris, 2003). The same is true for the military. In
most countries, women are rarely, if ever, seen on the front line or in higher
leadership positions in the army (Bensahel, Barno, Kidder, & Sayler, 2015). Israel
holds a special position in this regard, as it is the only country in the world
with compulsory military service for women (Sasson-Levy, 2011). The history of women in the
Israeli army is long and complex. It can be traced back to 1908, when the earliest Jewish
self-defence organization, a group known as Ha-shomer, was set up in Palestine, which was
under Ottoman rule at the time (Van Creveld, 2000). Over the years, the role of women in the Israeli
military has changed, and most of the time it was not free from gender inequalities at some
level. In 2000, an amendment to the Defense Service Law was adopted, “stipulating
that every woman has the right, equal to that of men, to serve in any position during her army
service unless the inherent nature of the position demand otherwise” (Gittleman, 2018, para. 24).
Since the late 1990s, the number of female soldiers serving in clerical positions has dropped by over
13%, while the number of female combat soldiers has risen sharply: it rose by almost 600% between
2005 and 2017 (Gittleman, 2018). The Israeli experience seems to be unique. Obligatory female
conscription leads to a relatively higher frequency of encountering women in decisive roles in the
military than in many other countries, including Poland. Polish legislation allows women to
serve as professional soldiers (Ustawa o służbie wojskowej żołnierzy zawodowych, 2003) and
mentions the possibility of mandatory conscription for women having “medical or veterinary
qualifications” in case of emergency (Ustawa o zmianie ustawy o powszechnym obowiązku
obrony Rzeczypospolitej Polskiej, 2003), but military service is not and has never
been obligatory for women. The difference in the scale of women’s military service
between Israel and Poland is evident: in 2018 women
constituted 5.98% of active soldiers in the Polish army (BIP MON, n.d.), while this number
reached as high as 48% in Israel (Israel Defense Forces, 2020).
Stereotype Content Model
The Stereotype Content Model (SCM; Fiske, Cuddy, Glick, & Xu, 2002)
postulates that stereotypes can be described on two basic dimensions: warmth and competence.
Extensive research has demonstrated that the SCM is relatively universal (Cuddy, Fiske, & Glick, 2008)
and relatively stable across cultures (Cuddy, Fiske, Kwan, Glick, Demoulin et al., 2009).
The basic dimensions proposed in the model can be related to intergroup stereotypes
(Fiske et al., 2002; Fiske, 2008; Cuddy et al., 2009) as well as to interpersonal relationships
(Fiske, Cuddy, & Glick, 2007; Russell & Fiske, 2008). Studies related to the latter used fake
social interactions (Russell & Fiske, 2008) or photographed faces (Harris & Fiske, 2006) to elicit
stereotyped judgements and reactions.
The SCM has also been applied in the military context. For instance, Boldry, Wood, & Kashy
(2001) demonstrated that women in a cadet school were perceived as less competent, yet
higher in warmth than men, although their actual competence was not significantly different from
that of their male counterparts. This result is congruent with findings presented in the vast literature on
the Stereotype Content Model: in many contexts women are perceived as less competent, yet
warmer than men. In the current study, we focus on another aspect of warmth-competence
comparisons, namely cultural differences.
Note. The percentages of women in the Polish and Israeli armies reported above refer to all soldiers, not only combat soldiers.
The current study
The aim of this study is to examine differences between Polish and Israeli participants in
their perceptions of women in the military. To the best of our knowledge, no such comparison
has been reported in the literature so far.
Being aware of recent debates regarding stereotype measurement and its predictive
validity (Forscher, Lai, Axt, Ebersole, Herman et al., 2019; Oswald, Mitchell, Blanton, Jaccard,
& Tetlock, 2013; Kurdi, Seitchik, Axt, Carroll, Karapetyan, et al., 2019), we propose to measure
warmth and competence in two ways rather than one. First, we follow many of the
earlier SCM works and use items describing traits (e.g., competent, friendly; Fiske, 2018).
Second, we use items related to specific roles that require competence and warmth: the roles of a
manager and a parent (Willemsen, 2002; Gmür, 2006; Fuegen & Biernat, 2014; Connor & Fiske,
2018). We argue that such a role-level assessment is a better proxy for participants’ real-life
assessments and behaviors than abstract trait-level evaluations, as it activates participants’ own
experiences and experience-based knowledge (Taylor, 1981).
We see conclusions from this study as potentially helpful in deepening our understanding
of how cultural convictions regarding gender roles may influence policies and how changes in
policies may, in turn, impact cultural scripts. This is of special significance now, when group
inequalities are tearing societies apart in various regions of the world.
We hypothesize that (1) because Israelis encounter women in decisive roles in the army
relatively often, the perceived competence of women in the military will be higher in
Israel than in Poland. Further, we predict that (2) a similar pattern will appear for warmth:
although the military context might increase perceived competence and decrease perceived warmth,
leading to a decline in liking (Phelan & Rudman, 2010; Connor & Fiske, 2018; Galinsky &
Schweitzer, 2015), in Israelis this process will be mitigated by the mere exposure effect (Zajonc,
1968). As demonstrated by Zajonc (1968, 2001), people develop a preference for objects that they
are familiar with, and the level of familiarity is positively related to the frequency of exposure. In the
context of this study, there are two parallel ways in which the mere exposure effect can be at work:
(a) women in uniform and women carrying guns are a common sight in Israeli public space,
creating a strong impression of familiarity, and (b) most people’s sisters, daughters and/or female
friends served, serve or will serve in the army at some point in their lives, so a female soldier is
someone familiar. Finally, we predict that (3) the differences in perceived (a) competence and (b)
warmth between countries will be moderated by participants’ hostile and benevolent sexism
scores (Glick & Fiske, 1996).
Data for this study was collected online via two internet panels: Ariadna
(https://panelariadna.pl/) in Poland and Midgam (https://www.midgampanel.com/) in Israel. The
study was developed using Qualtrics software. Participants received either a small monetary
compensation for their participation (9 NIS in Israel) or reward program points (in Poland).
This study was conducted as part of a larger research project. This means that the
tasks and scales used in this study were presented among other tasks and scales within the same
large questionnaire. A list of tasks and scales can be found in the Appendix.
A total sample of 602 participants (mean age: 42.62 years; 53% female; 58% Polish) took
part in the study. After applying the exclusion criteria, 455 observations (mean age: 43.07 years;
53% female) were used in the main analyses. The final sample (N = 455) consisted of 231 Polish
(mean age: 41.93 years; 52% female) and 224 Israeli participants (mean age: 44.25 years; 54% female).
Three exclusion criteria were applied to the data: (1) all incomplete trials were
removed; (2) all participants who spent less than 10 minutes or more than 60 minutes on the full
questionnaire were excluded from the sample; and (3) all participants who had zero variance on
the SDO scale and zero variance on the main task were excluded from analyses. The rationale
for the third criterion is twofold. First, since the SDO scale used in the study contains one reverse-scored item,
having zero variance on this scale means that the participant assigned the same score (e.g., the maximum or minimum)
to contradicting items, which is a potential indicator of low attention during the task.
Second, the probability that the participant was not paying attention to the tasks increases given
that (s)he also had zero variance on all dependent variable scales.
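As an illustration only, the three exclusion criteria described above could be applied as in the following minimal sketch. The record fields (complete, minutes, sdo_items, task_ratings) and the example values are hypothetical and do not correspond to the variables or data of the actual study.

```python
from statistics import pvariance

# Hypothetical record layout; field names are assumptions made for illustration.
def keep_participant(record, min_minutes=10, max_minutes=60):
    """Return True if a participant passes all three exclusion criteria."""
    # (1) Remove incomplete trials.
    if not record["complete"]:
        return False
    # (2) Remove participants who were too fast or too slow on the full questionnaire.
    if not (min_minutes <= record["minutes"] <= max_minutes):
        return False
    # (3) Remove participants only if BOTH the SDO items and the main-task
    #     ratings show zero variance.
    sdo_flat = pvariance(record["sdo_items"]) == 0
    task_flat = pvariance(record["task_ratings"]) == 0
    if sdo_flat and task_flat:
        return False
    return True

# Made-up example records.
sample = [
    {"id": "a", "complete": True, "minutes": 25,
     "sdo_items": [2, 5, 3, 6], "task_ratings": [5, 4, 6, 5, 3, 4]},
    {"id": "b", "complete": True, "minutes": 6,
     "sdo_items": [2, 5, 3, 6], "task_ratings": [5, 4, 6, 5, 3, 4]},
    {"id": "c", "complete": True, "minutes": 30,
     "sdo_items": [7, 7, 7, 7], "task_ratings": [4, 4, 4, 4, 4, 4]},
    {"id": "d", "complete": True, "minutes": 30,
     "sdo_items": [7, 7, 7, 7], "task_ratings": [4, 5, 4, 6, 3, 4]},
    {"id": "e", "complete": False, "minutes": 30,
     "sdo_items": [2, 5, 3, 6], "task_ratings": [5, 4, 6, 5, 3, 4]},
]

kept = [r["id"] for r in sample if keep_participant(r)]
print(kept)
```

Note that participant "d" is retained in this sketch: zero variance on the SDO items alone does not trigger exclusion, because the third criterion requires zero variance on both sets of ratings.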
Two dependent variables were measured in this study: (1) competence and (2) warmth,
each in two operationalizations: (a) trait-level and (b) role-level.
Including tasks and items external to the current study. More details can be found in the
See Scales section for more details.
Trait-level warmth and competence. What we refer to as trait-level operationalizations
are those known from the classical SCM literature. We asked participants to indicate how warm and
how competent the person presented in the picture was. The labels used to assess perceived
competence and warmth of targets (competent, able, warm, friendly) were based on previous
SCM research (Fiske et al., 2002; Cuddy, Fiske, & Glick, 2008; Fiske, 2018).
Role-level warmth and competence. Role-level operationalizations, on the other hand,
refer to assessments of how well a depicted person would fit a role requiring high competence
(an efficient manager) or high warmth (a good parent) rather than how well she fits trait-level
attributions. Research shows that competence is among the highest rated attributes constituting
manager stereotypes (Willemsen, 2002; Gmür, 2006), while warmth is strongly associated with
parenting, especially in the case of women (Fuegen & Biernat, 2014; Connor & Fiske, 2018).
Instrumentation. A total of four dependent measures were collected in this study: (1)
trait-level competence, (2) trait-level warmth, (3) role-level competence and (4) role-level
warmth. Trait-level warmth and trait-level competence were measured on 7-point Likert scales
with two items per scale. Role-level warmth and role-level competence were measured by single
items, also on 7-point Likert scales.
The scales for trait-level competence and trait-level warmth displayed good to acceptable
reliabilities as measured by the Spearman-Brown formula for 2-item scales (Eisinga, Grotenhuis, &
Pelzer, 2013): (a) competence: ρ = .87, (b) warmth: ρ = .71. To check whether items loaded on the
expected factors, a factor analysis was conducted. The data revealed good to mediocre sampling
adequacy (average MSA = .65) (Hair, Black, Babin, Anderson, & Tatham, 2006). Bartlett’s test
of sphericity was significant (χ2(6) = 674.24, p < .001), indicating that the correlation matrix is
not an identity matrix. The determinant of the correlation matrix was larger than 10⁻⁵ (det = 0.225),
indicating a lack of excessive multicollinearity in the data (Hair et al., 2006). A two-factor solution
was suggested by parallel analysis (Humphreys & Montanelli, 1975). An analysis with two factors
was conducted using ‘oblimin’ rotation. Factor loadings for this solution mirrored the intended scale
structure, with high values for both factors: competence (.89, .87) and warmth (.74, .74). At
the same time, all cross-loadings were very small (< .03).
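For readers unfamiliar with the Spearman-Brown coefficient for two-item scales, the following minimal sketch shows how it is obtained from the inter-item Pearson correlation r as ρ = 2r / (1 + r). The item scores below are made up for illustration and are not the study data; the helper names are our own.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_brown(r):
    """Reliability of a two-item scale given the inter-item correlation r."""
    return 2 * r / (1 + r)

def implied_r(rho):
    """Inverse of the formula: inter-item correlation implied by a given rho."""
    return rho / (2 - rho)

# Made-up ratings on two items of a hypothetical 7-point scale.
item_1 = [5, 6, 4, 7, 3, 5, 6, 2]
item_2 = [4, 6, 5, 7, 3, 4, 6, 3]

r = pearson_r(item_1, item_2)
rho = spearman_brown(r)
print(round(r, 3), round(rho, 3))
```

Under this formula, a coefficient of ρ = .87 corresponds to an inter-item correlation of about .77, and ρ = .71 to about .55 (see implied_r).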
Correlations between operationalizations. The association between trait-level and role-level
competence was of moderate strength (Cohen, 1988), as expressed by the Pearson correlation
coefficient: r(453) = .38, p < .001; the relationship between trait-level and role-level warmth was
slightly stronger: r(453) = .46, p < .001.
Ambivalent sexism was measured using five items based on the Ambivalent Sexism
Inventory (ASI; Glick & Fiske, 1996) adapted by Mikołajczak (2016), where two items measured
benevolent sexism, two other items measured hostile sexism and the remaining item measured
beliefs related to motherhood. The items were measured on a 7-point Likert scale, where
1 = strongly disagree, 5 = strongly agree. This selection of questions was previously used in Stefaniak &
Winiewski (2019) and displayed decent reliability as measured using Cronbach’s alpha:
α = .68 for the hostile and α = .78 for the benevolent
subscale (Mikołajczak in: Stefaniak & Winiewski, 2019). In the current study,
reliabilities were slightly higher for hostile sexism (α = .74) and slightly lower for benevolent
sexism (α = .68).
The main task of this study was to evaluate a person in the picture on dimensions of
warmth and competence. Stimuli used in the study were collected from various internet sources,
including the Pinterest platform (https://www.pinterest.com) and the Google image search engine
(https://www.google.com). Criteria for stimuli selection included: (a) character facing the
camera, (b) character’s head size comparable to other pictures, (c) character’s age and level of
attractiveness similar to those of other depictions. All stimuli are presented in Figure 1.
Figure 1. Stimuli used in the study
In the main task of the study, participants were presented with one randomly selected
picture of a woman in a military uniform and asked to indicate how warm and how competent
the depicted person was, using the provided scales. To minimize the effects of social desirability
(Cuddy et al., 2009; Fiske et al., 2002), we asked participants to consider how people like them,
rather than they themselves, might think about the person in the picture (“please consider how people like
you can think about the person in the picture”). Below the picture, dependent variable scales
were provided, measuring participants’ perceptions of the depicted person on the dimensions of trait-level
competence, trait-level warmth, role-level competence and role-level warmth.
The main task was preceded by a welcome screen, where participants agreed to
participate in the study, followed by external tasks and scales and the moderator scales. After the main task, a
series of demographic questions was asked and the study finished.
For each dependent variable operationalization (trait-level competence, trait-level
warmth, role-level competence and role-level warmth) a separate analysis was conducted. The
data was analyzed using a multilevel modelling framework (Kahn, 2011). In each model, random
effects of picture (random intercepts) were estimated to partial out the variance related to idiosyncratic
effects of individual pictures.
To test the first hypothesis, two models were fitted, one for each operationalization of
competence (trait-level vs role-level). The architecture was identical in both models: the
dependent variable was regressed on participant’s country and gender (covariate) with random
effects of picture id.
Participant’s country was a significant predictor of perceived trait-level competence (b =
0.47, t = 2.959, p = .003). The direction of this effect was opposite to that predicted in
Hypothesis 1. The total proportion of variance explained (including fixed and random effects), as expressed by Nakagawa’s
conditional R2 (Nakagawa & Schielzeth, 2013), was low and amounted to 4% (R2 = 0.04).
Nakagawa’s marginal R2 was very low (R2 = 0.022), indicating that the fixed effects alone
explained about 2% of the variance, with the remainder of the explained variance due to
random effects of pictures. No other predictors were significant.
No predictors were significant for role-level competence.
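The distinction between marginal and conditional R2 can be illustrated with a minimal sketch for a Gaussian random-intercept model, following the definitions of Nakagawa and Schielzeth (2013): marginal R2 relates the variance of the fixed-effect predictions to the total variance, while conditional R2 additionally counts the random-intercept variance as explained. The variance components below are hypothetical values chosen for illustration, not estimates from the models reported here.

```python
def nakagawa_r2(var_fixed, var_random, var_residual):
    """Marginal and conditional R2 for a Gaussian random-intercept model.

    var_fixed    -- variance of the fixed-effect predictions
    var_random   -- variance of the random intercepts (here: pictures)
    var_residual -- residual variance
    """
    total = var_fixed + var_random + var_residual
    marginal = var_fixed / total
    conditional = (var_fixed + var_random) / total
    return marginal, conditional

# Hypothetical variance components (assumed values, for illustration only).
marginal, conditional = nakagawa_r2(var_fixed=0.05, var_random=0.30, var_residual=1.65)
print(round(marginal, 3), round(conditional, 3))
```

The share of explained variance attributable to the random effects is the difference between the two coefficients, which is why both are usually reported together.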
The second hypothesis was tested analogously to the first one: two models were fitted,
one for trait-level warmth and one for role-level warmth. The architecture was the same as before:
the dependent variable was regressed on participant’s country and gender. Random effects of picture
id were estimated.
Participant’s country of origin was a significant predictor of perceived trait-level warmth
(b = 0.35, t = 2.507, p = .013). The direction of this effect was opposite to that predicted in
Hypothesis 2. The total proportion of variance explained was moderate, with conditional R2 =
0.18. Marginal R2 was low (R2 = 0.02), indicating that most of the explained variance was due to
random effects of pictures. Gender was on the verge of significance (b = 0.25, t = 1.759, p = .08),
suggesting that male participants perceived female targets as warmer than female participants did.
No predictors were significant for role-level warmth.
To test Hypothesis 3, four models were fitted (for trait-level competence, role-level
competence, trait-level warmth, and role-level warmth, respectively). In each case, the dependent
variable was regressed on participant’s country, gender and their ASI scores. Two interaction
terms between country and sexism scores (one for hostile and one for benevolent sexism) were
added. Random effects of picture id were estimated as in the previous steps.
Trait-level competence (model with ASI interaction)
No predictors were significant for trait-level competence.
Role-level competence (model with ASI interaction)
Overall fit of the model including ASI moderation was significantly better than the fit of
the respective model without such moderation (Δχ2 = 12.653, df = 4, p = .013). There were two
significant main effects for role-level competence: (a) hostile sexism (b = -0.21, t = -3.208, p
= .001) and (b) benevolent sexism (b = 0.2, t = 3.502, p < .001). Additionally, there were two
non-significant effects at a trend level: (a) main effect of country (b = -0.75, t = -1.684, p = .093)
and (b) interaction effect between country and hostile sexism (b = 0.17, t = 1.78, p = .08). The
main effect of country was in the direction predicted by Hypothesis 1. A trend-level interaction
effect is presented in Figure 2.
Figure 2. Effect of interaction between hostile sexism and country for role-level competence
Note. This effect was at a trend level (p = .08).
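The p-value of the model comparison reported above (Δχ2 = 12.653, df = 4) can be checked with a short sketch, assuming that the comparison is an ordinary likelihood-ratio (chi-square difference) test. For an even number of degrees of freedom the upper-tail probability of the chi-square distribution has a closed form, so no statistical library is needed; the function name is our own.

```python
from math import exp, factorial

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))

# Chi-square differences as reported in the text for the role-level models.
p_competence = chi2_sf_even_df(12.653, 4)
p_warmth = chi2_sf_even_df(7.833, 4)
print(round(p_competence, 3), round(p_warmth, 3))  # 0.013 0.098
```

The second value corresponds to the analogous comparison for role-level warmth (Δχ2 = 7.833, df = 4); both results agree with the p-values given in the text.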
Trait-level warmth (model with ASI interaction)
Three main effects were significant for trait-level warmth: (a) hostile sexism (b = -0.15, t
= -2.234, p = .03), (b) benevolent sexism (b = 0.12, t = 2.017, p = .04), and (c) gender (b = 0.30, t
= 2.057, p = .04), the last suggesting that male participants saw target females as warmer than female
participants did. No interaction effect was significant.
Role-level warmth (model with ASI interaction)
Overall fit of the model including ASI moderation was better than the fit of the respective
model without such moderation, but the difference did not reach significance at the customary
level of p < .05 (Δχ2 = 7.833, df = 4, p = .098). Two main effects were significant for role-level
warmth: (a) hostile sexism (b = -0.2, t = -3.212, p = .001), and (b) benevolent sexism (b = 0.17, t
= 3.150, p = .002). Additionally, there was one significant interaction, between country and
hostile sexism (b = 0.2, t = 2.34, p = .002). This interaction supports Hypothesis 3 and is
visualized in Figure 3. It is also worth mentioning that introducing the interaction terms
reversed the sign of the main effect of country (b = -0.69, t = -1.524, p = .13). Although this
effect remained non-significant, its direction changed to that predicted by Hypothesis 2.
Figure 3. Effect of interaction between hostile sexism and country for role-level warmth
Hypothesis 1 stated that perceived competence of women in the military will be higher in
Israel than in Poland. Overall results for competence are mixed, providing limited evidence
against Hypothesis 1. The main effect of country was significant for trait-level competence,
in the direction opposite to that predicted, but not for role-level competence.
In Hypothesis 2 we predicted that the level of perceived warmth of female military targets
will be higher in Israel than in Poland. Overall results are, again, mixed. Country was a
significant predictor of trait-level warmth. The direction of this effect was opposite to that
predicted in the hypothesis. No predictors were significant for role-level warmth.
Hypothesis 3 predicted that relationships between (a) country and perceived competence
and (b) country and perceived warmth will be moderated by participants’ scores on hostile and
benevolent sexism scales. The results partially support these predictions, but only for role-level
operationalizations of the dependent variables. The interaction between country and hostile sexism was
significant for role-level warmth and on the verge of significance for role-level competence.
The interaction between country and benevolent sexism was not predictive of the dependent
variables.
Interaction effects of hostile sexism and country are particularly interesting, showing that
hostile sexism had a much stronger influence on the assessment of warmth and competence in Israelis
than in Poles. Notably, Israeli participants with a low level of hostile sexism rated targets as more
competent and warmer than Polish participants with the same level of hostile sexism did.
Additional and exploratory analyses
There might be a couple of reasons for the mixed results in the main analyses.
Since the scales measuring dependent variables were not culturally adapted beforehand, it might
be the case that they do not measure the same concept across groups. In addition, the pictures
presented to participants could have elicited culture-specific effects. Although random
effects of picture were estimated in the analyses, these effects did not take any intergroup
differences into account. Therefore, culture-dependent picture effects might have biased the
results of this study.
All intergroup comparisons pose a risk that a given instrument does not measure the same
concept across the tested groups. Measurement invariance analysis aims to test whether groups can be
meaningfully compared on a given dimension. To assess measurement invariance for the dependent
variables in the current study, an analysis was conducted following the 4-step method within the
structural equation modeling (SEM) framework (Putnick & Bornstein, 2016). There are four
levels of invariance that can be assessed using this method: (1) configural invariance, confirming
that the factor structure is the same across compared groups, (2) weak (metric) invariance,
indicating whether a given construct has the same meaning across compared
groups, (3) strong (scalar) invariance, indicating whether mean-based comparisons between
groups are justified, and (4) strict invariance, testing whether individual items’ errors are the same across
groups (Gregorich, 2006).
Measurement invariance analyses for trait-level competence, trait-level warmth and ASI
are presented below. Measurement invariance is not relevant for single-item scales; therefore,
no measurement invariance analyses are provided for role-level competence and role-level
warmth. It is also worth noting that trait-level warmth and trait-level competence are
tested using a single model. The same is true for hostile and benevolent sexism. Since each of
the mentioned constructs (trait-level competence, trait-level warmth, benevolent sexism, hostile
sexism) was measured using only two items, it was necessary to test these concepts at the scale
rather than the subscale level to attain model identification.
Trait-level warmth and competence
Configural invariance. Configural invariance tests whether the same items measure the same
construct in two groups. To test configural invariance, a model was fitted using the ‘lavaan’
package (Rosseel, 2012). Mardia’s test (Mardia, 1970) demonstrated that the data did not follow
a multivariate normal distribution (MVN; all p-values < .001). To address the MVN violation, a robust
estimator (MLR) was used. Overall model fit was very good (χ2(2) = 0.458, p = .796; CFI = 1;
SRMR = .001; RMSEA < 0.003), confirming that the same items measure the same constructs in
Polish and Israeli subsamples.
Weak (metric) invariance. To assess metric invariance, a model was fitted with factor loadings
constrained to be the same in both groups and compared against the configural model. The
overall fit was good (χ2(4) = 4.730, p = .316; CFI = .998; SRMR = .021; RMSEA = 0.028).
The model was not significantly worse than the configural model (Δχ2 = 4.5437, p = .1) in terms of the
Satorra-Bentler test (Satorra & Bentler, 2001), and the CFI difference between the models was
smaller than .01 (ΔCFI = .002), indicating that metric invariance holds (Cheung & Rensvold,
2002). The result suggests that Israeli and Polish participants attributed the same meaning to both
analyzed dependent variables.
Strong (scalar) invariance. If scalar invariance is attained, mean comparisons between groups
are justified. To assess scalar invariance, a model was fitted with loadings and intercepts
constrained and compared against the metric model. The overall model fit was
acceptable (χ2(6) = 16.796, p = .001; CFI = .971; SRMR = .037; RMSEA = 0.089), but
significantly worse than that of the metric invariance model (Δχ2 = 13.174, p = .001). The CFI difference was
greater than .01 (ΔCFI = .029), indicating a lack of scalar invariance (Putnick & Bornstein,
2016; Cheung & Rensvold, 2002). This result suggests that group means of trait-level warmth
and trait-level competence cannot be meaningfully compared between Israeli and Polish samples.
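The ΔCFI decision rule used in the steps above (a drop in CFI larger than .01 between nested models indicates non-invariance; Cheung & Rensvold, 2002) can be summarized in a minimal sketch. The function name and the dictionary layout are our own assumptions; the sketch presumes that the configural model itself fits acceptably, which has to be judged separately from its absolute fit indices. It does not replace the chi-square difference tests, and because the CFI values are rounded, the differences computed here need not match the reported ΔCFI exactly.

```python
def highest_invariance_level(cfi_by_level, threshold=0.01):
    """Return the most restrictive invariance level supported by the Delta-CFI rule.

    cfi_by_level maps level names to the CFI of the corresponding nested model.
    A level is rejected when CFI drops by more than `threshold` relative to the
    previous, less restrictive model.
    """
    order = ["configural", "metric", "scalar", "strict"]
    reached = None
    previous_cfi = None
    for level in order:
        if level not in cfi_by_level:
            break
        cfi = cfi_by_level[level]
        # Rounding guards against floating-point noise in the subtraction.
        if previous_cfi is not None and round(previous_cfi - cfi, 6) > threshold:
            break
        reached = level
        previous_cfi = cfi
    return reached

# CFI values as reported in the text for the trait-level warmth and competence model.
trait_cfi = {"configural": 1.000, "metric": 0.998, "scalar": 0.971}
print(highest_invariance_level(trait_cfi))  # metric
```

Applied to the values above, the rule stops at metric invariance, in line with the conclusion that group means of the trait-level scales should not be compared directly.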
Ambivalent Sexism Inventory (ASI)
The measurement invariance procedure for ASI followed the exact same steps as the
procedure for trait-level warmth and competence. Mardia’s test (Mardia, 1970) indicated
violation of multivariate normality in skewness (p = .002), therefore a robust estimator (‘MLR’)
was used in further steps.
Configural invariance. A configural invariance model demonstrated an excellent fit (χ2(2) =
2.415, p = .299; CFI = .999; SRMR = .009; RMSEA = 0.03) indicating that configural invariance
holds for Ambivalent Sexism Inventory.
Weak (metric) invariance. The model with constrained loadings was fitted to test metric
invariance. The fit was acceptable (χ2(4) = 20.032, p = .316; CFI = .951; SRMR = .049; RMSEA
= 0.133), but significantly worse than for the configural model (Δχ2 = 17.108, p < .001). The CFI
difference between the two models was larger than .01 (ΔCFI = .048), indicating a lack of metric
invariance (Cheung & Rensvold, 2002). This result suggests that loadings for particular items
related to hostile and benevolent sexism differ between Polish and Israeli participants. Therefore, the
concepts of sexism are not equivalent between these two groups, which potentially makes group-level
comparisons not meaningful.
The main task of this study was to assess the warmth and competence of a person shown in
the picture. This task was designed to elicit stereotypical judgements related to gender roles in
participants. Nonetheless, regardless of group membership, each person is also assessed by
others on warmth and competence dimensions as an individual and not only as a group member
(Fiske et al., 2007). Assessments of warmth and competence at the individual level are driven by the
assessor’s personal experience but also by cultural factors, and might be biased towards familiar-looking
faces (Zebrowitz, Bronstad, & Lee, 2007; Zajonc, 2001). Therefore, pictures that looked
more familiar to participants from each culture might have been assessed more positively than
those looking less familiar. Although picture effects were controlled in this study, this control did
not take into account potential cultural differences in picture assessments between groups.
A brief analysis of random effects of pictures revealed different patterns for Israeli and
Polish participants. Results of this analysis are presented in Figure 4 (trait-level competence) and
Figure 5 (trait-level warmth). An interesting trend can be observed for the Polish sample. The person
depicted in picture 1 (called ‘FM1’ in the plots) was systematically assessed less positively than the
remaining two. One possible explanation could be that the person depicted in this photograph is
dark-haired, a characteristic not stereotypically associated with a Slavic appearance, which
could elicit gender-unrelated stereotypes in some of the Polish participants. Other explanations
are also plausible: picture 1 is the only one in which firearms appear, and Polish participants,
being less familiar with seeing women with guns, could assess the presented person as less
warm (backlash effect) and less competent (traditional hostile stereotype).
Figure 4. Random effects of pictures on trait-level competence in Israeli and Polish samples
Figure 5. Random effects of pictures on trait-level warmth in Israeli and Polish samples
This study aimed to answer the question of whether cultural differences between Israel and
Poland influence perceptions of women in the military on two basic dimensions: competence and
warmth. This question remains largely unanswered.
The results of the current study are mixed, providing weak and conditional evidence
against Hypotheses 1 and 2 and partial support for Hypothesis 3. Estimates of the main effects of
country were unstable across analyses and sensitive to the operationalization of the
dependent variables and the presence of interaction terms. There are several potential explanations
for this issue.
First, additional analyses revealed a lack of scalar invariance in the dependent variables
and lack of metric invariance in the moderator, indicating that comparisons between the two
groups were not reliable (Davidov, Meuleman, Cieciuch, Schmidt, & Billiet, 2014; Schoot,
Schmidt, Beuckelaer, Lek, & Zondervan-Zwijnenburg, 2015). Second, a brief random effects
analysis demonstrated that individual pictures were rated differently across groups, which could
further distort the results. Third, as indicated by Fiske et al. (2007), warmth and competence
assessments might be more strongly correlated when people judge individuals rather than groups. In such
situations, a halo effect may occur, dimming the elicited group stereotypes.
That being said, models with interaction terms provided partial support for the hypotheses.
Israeli participants with low levels of hostile sexism rated women in the military as warmer
and more competent than Polish participants with the same score on the hostile sexism scale did.
Notably, hostile sexism displayed a much stronger relationship with role-level competence and
role-level warmth in the Israeli sample than in the Polish sample. For Polish participants,
hostile sexism seems to be essentially unrelated to the ratings of competence and warmth. This
is a surprising result, considering the extensive research on the backlash effect (Phelan &
Rudman, 2010; Connor & Fiske, 2018; Galinsky & Schweitzer, 2015). The backlash effect occurs
when women behave in a way that does not follow stereotypical expectations: “women who pursue
or possess power are at risk for backlash, defined as social and economic penalties for defying
stereotypic expectations” (Phelan & Rudman, 2010, p. 807; see also: Rudman, 1998). Since sexism
levels in Poland are relatively
high (Mikołajczak, 2016), this result seems difficult to explain. One explanation could be that
individual characteristics of the depicted women dimmed the stereotypical judgements. Another
could be that Polish participants interpreted the women in the pictures as serving in the military
in more traditional roles, e.g. in clerical or medical positions. The latter explanation would be
congruent with the results of the picture effects analysis: in the Polish sample, the only picture
with systematic negative effects on warmth and competence was the one containing firearms. This
might suggest that backlash effects appeared in the Polish sample but were diluted by
stereotype-coherent interpretations of the two other pictures.
Another interesting conclusion that can be drawn from the presented analyses is that the two
operationalizations – trait-level and role-level – led to different results. This could be due to the
fact that traits (friendly, competent) may prime different attributions than roles (manager, parent)
(Graham & Folkes, 1990; Fiske & Taylor, 1991), but it can also be seen in the broader context of
recent debates on how stereotypes are measured and what the predictive validity of these measures is
(Forscher et al., 2019; Oswald et al., 2013; Kurdi et al., 2019).
The current study had several limitations that future research could address. First, the
lack of scalar invariance for the dependent variables and the lack of metric invariance for the ASI
should be addressed – ideally, all scales should be pretested and culturally adapted before
conducting the study. Second, appropriate stimulus sampling should be applied (Wells & Windschitl,
1999). The picture effects analysis suggested that culture-specific picture effects may have been
present in the current study, biasing its results. It is also worth mentioning that all the
photographs in the current study depicted young Caucasian individuals. They might have evoked
different levels of identification in participants identifying with different social groups (e.g.
based on age or ethnicity). These pictures might not have reflected the multi-ethnic structure of
Israeli society
(Lewin-Epstein & Cohen, 2019), which could have further distorted the study’s results, in
particular by leading to an in-group favouritism effect among Polish participants (Cuddy et al.,
2009). Third, role-level operationalizations should be measured by more than one item. One-item
scales may have low content validity, and their reliability is impossible to estimate (McIver &
Carmines, 1981). Finally, a comparison of perceptions of women in the military between the
countries should be benchmarked against a control group. Without a control group, it is not clear
whether the observed effects are specific to women in the military or apply to women (or people
regardless of their gender) in general.
The presented analyses do not give a definite answer to the question posed in the title of this
study. The results are mixed. Israeli participants with low scores on the hostile sexism scale
perceived women presented in the pictures as warmer and more competent than Polish
participants with the same level of hostile sexism did. This effect was conditional and depended
on the operationalization of the dependent variable. When levels of sexism were not controlled
for, Polish participants rated the targets as warmer and more competent than Israelis did, but –
again – this effect was unstable between the operationalizations of the dependent variables.
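The conditional pattern summarized above is what an interaction term in a regression model encodes: the difference between countries is not a single number but a function of the hostile sexism score. The following minimal sketch assumes a model of the form rating = b0 + b_country * israel + b_hs * hs + b_interaction * israel * hs, where israel is a dummy variable. The coefficient values are hypothetical and are not estimates from this study.

```python
def country_gap(b_country: float, b_interaction: float, hostile_sexism: float) -> float:
    """Predicted rating difference (Israel minus Poland) at a given hostile sexism score.

    Assumes rating = b0 + b_country * israel + b_hs * hs + b_interaction * israel * hs,
    where israel is 1 for Israeli and 0 for Polish participants.
    """
    return b_country + b_interaction * hostile_sexism


# Hypothetical coefficients (NOT estimates from the study).
B_COUNTRY = 1.5       # country gap when hostile sexism equals 0
B_INTERACTION = -0.5  # change in the gap per scale point of hostile sexism

# Evaluate the gap at a low, a middle and a high hostile sexism score.
gaps = {hs: country_gap(B_COUNTRY, B_INTERACTION, hs) for hs in (1.0, 3.0, 5.0)}
for hs, gap in gaps.items():
    print(f"hostile sexism = {hs:.0f}: Israel - Poland = {gap:+.1f}")
```

With these illustrative values the gap favours Israeli participants at low hostile sexism scores and reverses at high scores, so a comparison that ignores sexism would depend on how hostile sexism scores are distributed in each sample.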
Additional analyses revealed a lack of scalar invariance of the dependent variables and a
lack of metric invariance for the moderator scales, indicating that comparisons between the
analyzed groups could have been biased. A brief analysis of picture effects demonstrated that the
pictures used as stimuli in this study could have elicited different assessment patterns in the
compared groups, further distorting the study’s results.
Suggestions for further research are fourfold: cultural adaptation of scales, proper
stimulus sampling, an increase in the number of items for the role-level operationalization of the
dependent variables, and inclusion of a control group in the study design.
Altemeyer, B. (1981). Right-wing authoritarianism. Winnipeg: University of Manitoba Press.
Bales, R. F. (1950). A set of categories for the analysis of small group interaction. American
Sociological Review, 15, 257–263.
Barak-Erez, D. (2007). The Feminist Battle for Citizenship: Between Combat Duties and
Conscientious Objection. Cardozo Journal of Law and Gender, 13, 537.
Bensahel, N., Barno, D., Kidder, K., & Sayler, K. (2015). Battlefields and boardrooms: women’s
leadership in the military and the private sector. Retrieved (May 22, 2020) from:
BIP MON (n.d.) Wojskowa służba kobiet. https://archiwum2019-bip.mon.gov.pl/przydatne-
Boldry, J. G., Wood, W., & Kashy, D. (2001). Gender Stereotypes and the Evaluation of Men and
Women in Military Training. Journal of Social Issues, 57(4), 689–705.
Chalabaev, A., Sarrazin, P., Fontayne, P., Boiché, J., & Clément-Guillotin, C. (2013). The
influence of sex stereotypes and gender roles on participation and performance in sport
and exercise: Review and future directions. Psychology of Sport and Exercise, 14(2),
Connor, R. A., & Fiske, S. T. (2018). Warmth and competence: A feminist look at power and
negotiation. In C. B. Travis, J. W. White, A. Rutherford, W. S. Williams, S. L. Cook, &
K. F. Wyche (Eds.), APA handbooks in psychology®. APA handbook of the psychology
of women: History, theory, and battlegrounds (pp. 321–342). American Psychological
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing
measurement invariance. Structural Equation Modeling, 9(2), 233–255.
Cuddy, A.J., Fiske, S.T., Kwan, V.S., Glick, P., Demoulin, S., Leyens, J., Bond, M.H., Croizet,
J., Ellemers, N., Sleebos, E., Htun, T.T., Kim, H., Maio, G., Perry, J., Petkova, K.G.,
Todorov, V., Rodríguez-Bailón, R., Morales, E., Moya, M., Palacios, M., Smith, V.,
Pérez, R., Vala, J., & Ziegler, R. (2009). Stereotype content model across cultures:
Towards universal similarities and some differences. British Journal of Social
Psychology, 48(1), 1–33.
Cuddy, A. J. C., Fiske, S. T., & Glick, P. (2008). Warmth and competence as universal
dimensions of social perception: The stereotype content model and the BIAS map. In M.
P. Zanna (Ed.), Advances in experimental social psychology (Vol. 40, pp. 61–149).
Elsevier Academic Press.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Routledge.
Davidov, E., Meuleman, B., Cieciuch, J., Schmidt, P., & Billiet, J.B. (2014). Measurement
Equivalence in Cross-National Research. Annual Review of Sociology, 40, 55–75.
Eisinga, R., Grotenhuis, M., & Pelzer, B. (2013). The reliability of a two-item scale: Pearson,
Cronbach, or Spearman-Brown? International Journal of Public Health, 58(4), 637–642.
Ferguson, C. J. (2009). An effect size primer: A guide for clinicians and researchers.
Professional Psychology: Research and Practice, 40, 532-538.
Fiske, S. T. (2018). Stereotype Content: Warmth and Competence Endure. Current Directions in
Psychological Science, 27(2), 67–73. https://doi.org/10.1177/0963721417738825
Fiske, S. T., Cuddy, A. J. C., Glick, P., & Xu, J. (2002). A model of (often mixed) stereotype
content: Competence and warmth respectively follow from perceived status and
competition. Journal of Personality and Social Psychology, 82(6), 878–902.
Fiske, S. T., Cuddy, A. J., & Glick, P. (2007). Universal dimensions of social cognition: warmth
and competence. Trends in cognitive sciences, 11(2), 77–83.
Fiske, S. T., & Taylor, S. E. (1991). McGraw-Hill series in social psychology. Social cognition
(2nd ed.). McGraw-Hill Book Company.
Forscher, P. S., Lai, C. K., Axt, J. R., Ebersole, C. R., Herman, M., Devine, P. G., & Nosek, B.
A. (2019). A meta-analysis of procedures to change implicit measures. Journal of
Personality and Social Psychology, 117(3), 522–559.
Fuegen, K., & Biernat, M. (2014). Gender-based standards of competence in parenting and work
roles. In M. Ryan & N. Branscombe (Eds.), The SAGE handbook of gender and psychology (pp.
131–147). SAGE Publications, Ltd. https://doi.org/10.4135/9781446269930.n9
Galinsky, A. D., & Schweitzer, M. (2015). Friend and foe: When to cooperate, when to compete,
and how to succeed at both. Random House Business.
Gittleman, I. S. (2018). Women’s Service in the IDF: Between a ‘People’s Army’ and Gender
Equality. The Israel Democracy Institute. https://en.idi.org.il/articles/24554
Glick, P., & Fiske, S. T. (1996). The Ambivalent Sexism Inventory: Differentiating hostile and
benevolent sexism. Journal of Personality and Social Psychology, 70(3), 491–512.
Gmür, M. (2006). The gendered stereotype of the 'good manager': Sex role expectations towards
male and female managers. Management Revue, 17(2), 104–121.
Graham, S., & Folkes, V. S. (Eds.). (1990). Applied social psychology. Attribution theory:
Applications to achievement, mental health, and interpersonal conflict. Lawrence
Erlbaum Associates, Inc.
Gregorich, S. E. (2006). Do self-report instruments allow meaningful comparisons across diverse
population groups? Testing measurement invariance using the confirmatory factor
analysis framework. Medical Care, 44(11 Suppl 3), 78–94. https://doi.org/10.1097
Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate
data analysis (Vol. 6). Upper Saddle River, Pearson Prentice Hall.
Harris, L. T., & Fiske, S. T. (2006). Dehumanizing the Lowest of the Low: Neuroimaging
Responses to Extreme Out-Groups. Psychological Science, 17(10), 847–853.
Ho, A. K., Sidanius, J., Kteily, N., Sheehy-Skeffington, J., Pratto, F., Henkel, K. E., Foels, R., &
Stewart, A. L. (2015). The nature of social dominance orientation: Theorizing and
measuring preferences for intergroup inequality using the new SDO7 scale. Journal of
Personality and Social Psychology, 109(6), 1003-1028.
Humphreys, L. G., & Montanelli, R. G. (1975). An investigation of the parallel analysis
criterion for determining the number of common factors. Multivariate Behavioral
Research, 10, 193–205.
Imhoff, R., Dotsch, R., Bianchi, M., Banse, R., & Wigboldus, D. H. J. (2011). Facing Europe:
Visualizing Spontaneous In-Group Projection. Psychological Science, 22(12), 1583–
Inglehart, R., & Norris, P. (2003). Rising tide: Gender equality and cultural change around the
world. Cambridge, UK: Cambridge University Press.
Israel Defense Forces. (2020, May 22). Wikipedia.
Kahn, J. H. (2011). Multilevel modeling: Overview and applications to research in counseling
psychology. Journal of Counseling Psychology, 58(2), 257–271.
Kenny, D. A. (2020, May 18). Identification. davidkenny.net.
Koch, A., Imhoff, R., Dotsch, R., Unkelbach, C., & Alves, H.W. (2016). The ABC of stereotypes
about groups: Agency/socioeconomic success, conservative-progressive beliefs, and
communion. Journal of personality and social psychology, 110(5), 675–709.
Kurdi, B., Seitchik, A. E., Axt, J. R., Carroll, T. J., Karapetyan, A., Kaushik, N., Tomezsko, D.,
Greenwald, A. G., & Banaji, M. R. (2019). Relationship between the Implicit Association
Test and intergroup behavior: A meta-analysis. American Psychologist, 74(5), 569–586.
Kurpius, S. E. R., & Lucart, A. L. (2000). Military and civilian undergraduates: Attitudes toward
women, masculinity, and authoritarianism. Sex Roles: A Journal of Research, 43(3-4),
Lewin-Epstein, N., & Cohen, Y. (2019). Ethnic origin and identity in the Jewish population of
Israel. Journal of Ethnic and Migration Studies, 45(11), 2118-2137.
Marcinkiewicz-Gołaś, A. (2006). Ochotnicza Legia Kobiet 1918-1922. PAT.
Mardia, K. V. (1970). Measures of multivariate skewness and kurtosis with applications.
Biometrika, 57(3), 519–530.
McIver, J. P., & Carmines, E. G. (1981). Quantitative Applications in the Social Sciences:
Unidimensional scaling. SAGE Publications, Inc. 10.4135/9781412986441
Mikołajczak, M. (2016). The structure of ambivalence toward women – Legitimization of gender
discrimination and prospects for change.
Nakagawa, S., & Schielzeth, H. (2013). A general and simple method for obtaining R² from
generalized linear mixed-effects models. Methods in Ecology and Evolution, 4(2), 133–142.
https://doi.org/10.1111/j.2041-210x.2012.00261.x
Oswald, F. L., Mitchell, G., Blanton, H., Jaccard, J., & Tetlock, P. E. (2013). Predicting ethnic
and racial discrimination: A meta-analysis of IAT criterion studies. Journal of
Personality and Social Psychology, 105(2), 171–192. https://doi.org/10.1037/a0032734
Phelan, J. E., & Rudman, L. A. (2010). Prejudice toward female leaders: Backlash effects and
women's impression management dilemma. Social and Personality Psychology Compass,
4(10), 807–820. https://doi.org/10.1111/j.1751-9004.2010.00306.x
Pratto, F., Sidanius, J., Stallworth, L. M., & Malle, B. F. (1994). Social dominance orientation: A
personality variable predicting social and political attitudes. Journal of Personality and
Social Psychology, 67(4), 741–763. https://doi.org/10.1037/0022-3514.67.4.741
Putnick, D. L., & Bornstein, M. H. (2016). Measurement invariance conventions and reporting:
The state of the art and future directions for psychological research. Developmental
Review, 41, 71–90. https://doi.org/10.1016/j.dr.2016.06.004
Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of
Statistical Software, 48(2), 1–36.
Rudman, L. A. (1998). Self-promotion as a risk factor for women: The costs and benefits of
counterstereotypical impression management. Journal of Personality and Social
Psychology, 74(3), 629–645. https://doi.org/10.1037/0022-3514.74.3.629
Russell, A. M. T., & Fiske, S. T. (2008). It's all relative: Competition and status drive
interpersonal perception. European Journal of Social Psychology, 38(7), 1193–1201.
Sagy, S., Orr, E. & Bar-On, D. (1999). Individualism and Collectivism in Israeli Society:
Comparing Religious and Secular High-School Students. Human Relations, 52, 327–348.
Sasson-Levy, O. (2011). Research on Gender and the Military in Israel: From a Gendered
Organization to Inequality Regimes. Israel Studies Review, 26(2), 73-98. Retrieved from
Satorra, A., & Bentler, P. M. (2001). A scaled difference chi-square test statistic for moment
structure analysis. Psychometrika, 66(4), 507–514. https://doi.org/10.1007/BF02296192
Schoot, R.V., Schmidt, P.F., Beuckelaer, A.D., Lek, K.M., & Zondervan-Zwijnenburg, M.
(2015). Editorial: Measurement Invariance. Frontiers in Psychology, 6.
Stefaniak, A., & Winiewski, M. (Eds.). (2019). Uprzedzenia w Polsce 2017. Oblicza przemocy
międzygrupowej. Liberi Libri.
Taylor, S. E. (1981). A categorization approach to stereotyping. In D. L. Hamilton (Ed.),
Cognitive Processes in Stereotyping and Intergroup Behavior (1st ed., pp. 82–114).
Taylor, S. E., Fiske, S. T., Etcoff, N. L., & Ruderman, A. J. (1978). Categorical and contextual
bases of person memory and stereotyping. Journal of Personality and Social Psychology,
36(7), 778–793. https://doi.org/10.1037/0022-3514.36.7.778
Ustawa o służbie wojskowej żołnierzy zawodowych (Dz.U.) 2003 nr 179 poz. 1750 (Poland)
Ustawa o zmianie ustawy o powszechnym obowiązku obrony Rzeczypospolitej Polskiej (Dz.U.)
2003 nr 210 poz. 2036 (Poland)
Van Creveld, M. (2000). Armed But Not Dangerous: Women in the Israeli Military. War in
History, 7(1), 82–98. https://doi.org/10.1177/096834450000700105
Wells, G. L., & Windschitl, P. D. (1999). Stimulus sampling and social psychological
experimentation. Personality and Social Psychology Bulletin, 25(9), 1115–1125.
Willemsen, T. M. (2002). Gender typing of the successful manager-A stereotype reconsidered.
Sex Roles: A Journal of Research, 46(11-12), 385–391.
Zajonc, R. B. (1968). Attitudinal effects of mere exposure. Journal of Personality and Social
Psychology, 9(2, Pt.2), 1–27. https://doi.org/10.1037/h0025848
Zajonc, R. B. (2001). Mere Exposure: A Gateway to the Subliminal. Current Directions in
Psychological Science, 10(6), 224–228. https://doi.org/10.1111/1467-8721.00154
Zebrowitz, L. A., Bronstad, P. M., & Lee, H. K. (2007). The contribution of face familiarity to
ingroup favoritism and stereotyping. Social Cognition, 25(2), 306–338.
Scales and tasks included in the project
As mentioned in the main text, this study was a part of a larger scientific project. The
following tasks were used in the project (elements used in the current study are marked with bold
font): (a) SDO scale (as used in Stefaniak & Winiewski, 2019), (b) RWA scale (as used in
Stefaniak & Winiewski, 2019), (c) items related to national identification, (d) items related to
locus of control and morality, (e) items regarding perceptions and emotions related to the COVID-19
pandemic, (f) ASI scale, (g) items related to personal and collective victimhood and trauma, (h)
items related to perceptions, emotions and attitudes towards various national or ethnic groups, (i)
a task including a rape depiction, in which a man is the offender and a woman is the victim, (j)
items related to the perception of a person depending on his or her political views, (k) the main
task of this study, (l) demographic questions. The original ordering is preserved.
Introduction screen content
Polish version (translated from Polish):
We invite you to take part in a study on social attitudes, conducted by students and staff of
the Faculty of Psychology. Participation in the study is entirely voluntary, and you can
withdraw from it at any time. The study takes about 20 minutes. Clicking "next" signifies
consent to take part in the study. If you have any questions, please contact us at
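Hebrew version (translated from Hebrew):
The research team from the Academic College of Emek Yezreel invites you to take part in a study
of social attitudes. Some of the questions in this study deal with difficult issues that may
evoke negative emotions. It is important for us to note that your participation in the study is
voluntary and that you can stop participating at any moment. Completing the study questionnaire
takes about 25 minutes. Clicking "Next" means that you give your consent to take part in the
study. If you have any questions, please contact us at: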
Dependent variables measurement
Instructions used in the main task of the study
Polish version (translated from Polish): Please consider how people like you might think about
the person shown in the picture. Please answer using the scales provided below.
Hebrew version (translated from Hebrew): Please think about how people like you perceive the
character that appears in the picture. Please indicate your answer using the scale below.
English version: Please think about how people like you can perceive the character in the
picture. Please indicate your answer using the
Items (7-point Likert scales)
Items used in the main task of the study – trait-level items.
Item | Polish version (translated) | Hebrew version (translated) | English version
1 | (1) unfriendly – (7) | not friendly – friendly | (1) friendly – (7) unfriendly
2 | (1) incapable – (7) capable | not capable – capable | (1) unable – (7) able
3 | (1) incompetent – (7) | not professional – professional | (1) incompetent – (7)
4 | (1) cold – (7) warm | cold – warm | (1) cold – (7) warm
Items used in the main task of the study – role-level items; 1 = Strongly disagree; 7 = Strongly agree.
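Item | Polish version (translated) | Hebrew version (translated) | English version
1 | This person would be an effective | This character would be a good manager | This person would be an
2 | This person would be a good | This character would be a good parent | This person would be a good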