JOURNAL OF RESEARCH IN SCIENCE TEACHING VOL. 44, NO. 10, PP. 1461–1478 (2007)
Psychometric Reevaluation of the Women in Science Scale (WiSS)
Steven V. Owen (1), Mary Anne Toepperwein (2,4), Linda A. Pruski (2,4), Cheryl L. Blalock (2,4), Yan Liu (2,4), Carolyn E. Marshall (2,4), Michael J. Lichtenstein (2,3,4)
(1) Department of Epidemiology and Biostatistics, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229-3900
(2) Barshop Institute for Aging and Longevity Studies, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229-3900
(3) Division of Geriatrics and Gerontology, Department of Medicine, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, San Antonio, TX 78229-3900
(4) Frederic C. Bartter General Clinical Research Center, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Drive, MC 7891, San Antonio, TX 78229-3900
Received 20 November 2005; Accepted 19 October 2006
Abstract: The Women in Science Scale (WiSS) was first developed in 1984, and is still being used in
contemporary studies, yet its psychometric properties have not been evaluated with current statistical
methods. In this study, the WiSS was administered in its original 27-item form to 1,439 middle and high
school students. Confirmatory factor analysis based upon the original description of the WiSS was modestly
supportive of the proposed three-factor structure, but the claimed dimensions showed substantial
redundancy. Therefore, we split our sample and performed exploratory factor analyses on one half. The
most satisfactory solution, a two-factor model, was then applied to the crossvalidation sample with a
confirmatory factor analysis. This two-factor structure was supported with a total of 14 items. Factor 1,
Equality, contains eight items, and factor 2, Sexism, six items. Although our data are limited to adolescents,
the WiSS, with improved psychometric properties, may be used descriptively to assess attitudes toward
women in science and, with additional stability and repeatability testing, may be used in evaluation research.
The shortened WiSS should result in shorter administration time, fewer missing data, and increased
acceptance among survey administrators in classroom settings. © 2007 Wiley Periodicals, Inc. J Res Sci
Teach 44: 1461–1478, 2007
Keywords: general science; gender/equity; evaluation and theory; secondary
Contract grant sponsor: Science Education Partnership Award; Contract grant number: R25-RR-18549 [National
Center for Research Resources (NCRR) and the National Institute on Aging (NIA)]; Contract grant sponsor:
a Minority K-12 Initiative for Teachers and Students; Contract grant number: R25-HL-075777 [National
Heart, Lung, and Blood Institute (NHLBI)]; Contract grant sponsor: the Frederic C. Bartter General Clinical
Research Center; Contract grant number: MO1-RR-01346.
Correspondence to: S.V. Owen; E-mail: OwenSV@uthscsa.edu
DOI 10.1002/tea.20187
Published online 20 March 2007 in Wiley InterScience (www.interscience.wiley.com).
Attitudes toward women in science have been of interest since gender equity issues first arose,
motivating implementation and assessment of intervention programs aimed to enhance women’s
science achievement. Attitudes related to gender seem to influence institutional policies, and thus
contribute to the underrepresentation of women in science and the subsequent loss of talent in
this critical field (Simpson, Koballa, Oliver, & Crawley, 1994). The contributions of science are
enormous, and such a loss of talent has far-reaching societal impact. Attitudes toward science
are also related to long-term commitment to science (Simpson et al., 1994). If, as Gardner (1975,
p. 22) suggested, ‘‘sex is probably the single most important variable related to pupils’ attitudes to
science,’’ then a dependable instrument that can measure attitudes toward women in science
should be valuable for evaluating the effectiveness of interventions aimed at reversing the
underrepresentation of women in science.
The Women in Science Scale (WiSS) is an attitude survey developed in 1984 (Erb & Smith,
1984) and still in use today. Review of the original validation and subsequent use of the WiSS
poses questions related to its dimensional structure and scoring. This study reevaluated the WiSS
using factor analysis techniques that were not readily available when the survey was developed.
A shortened instrument with a strong factor structure and improved scoring would be of benefit in
developing and assessing intervention programs related to gender equity issues. This introduction
summarizes gender issues in United States public education, challenges related to measurement of
scientific attitudes, and the development of the WiSS. This summary provides the rationale for
a psychometric reevaluation of the WiSS.
Background
For the United States to remain a technological leader, its educational system must prepare all
students to be literate in science, engineering, and technology [Congressional Commission on the
Advancement of Women and Minorities in Science, Engineering and Technology Development
(CAWMSET), 2000; Darke, Clewell, & Sevo, 2002; Morella, 2002; Strauss, 1988]. However,
gender and racial disparities in education hinder progress toward a scientifically literate public.
Gender equity emerged as a major issue in 1972, with the adoption of Title IX to prohibit sex
discrimination in federally funded education programs, followed in 1974, by the Women’s
Education Equity Act (Hammrich, Richardson, & Livingston, 2001). In 1980, the Science and
Technology Equal Opportunities Act directed the National Science Foundation to compile and
evaluate data on the status of women and minorities in science and engineering fields and to report
biannually to Congress on their findings (Rosser & Lane, 2002). These legislative acts raised
awareness, identified specific needs, and spawned numerous federally funded science education
programs for women. Major educational interventions resulted in significant improvement in
opportunities for young women [Darke et al., 2002; Rosser & Lane, 2002; U.S. Dept. of Education
(USDOE), 2004]. As progress was made, the operational definition of gender equity as equality of
access evolved into equality of outcomes (Clewell, 2002). This shift altered both the goals and
assessment of gender-bias intervention programs. Although much has been accomplished, current
studies indicate more must be done to improve participation at all educational levels with respect
to gender [American Association of University Women (AAUW), 1998; National Coalition for
Women and Girls in Education (NCWGE), 2002; USDOE, 2004].
Attitudes toward Science versus Achievement Gap
Although they differ about the extent of the gender gap in math and science, both Title IX at 30
(NCWGE, 2002) and the USDOE’s (2004) Trends in Educational Equity of Girls and Women
report improvement in National Assessment of Educational Progress (NAEP) scores in math
and science. But the narrowing of the gender gap in math and science achievement has not extended to the
affective domain, which may be as important as cognitive issues in steering educational and career
choices (Koballa, 1988). The affective domain relative to science is complex, and involves
constructs such as attitudes, values, beliefs, opinions, interests, and motivation (Simpson et al.,
1994). Over time, research into the affective domain has revealed greater understanding of terms
and concepts; for example, the concept of attitude ‘‘can be subdivided into three dimensions:
feeling, cognition, and behavior’’ (Simpson et al., 1994, p. 212). Negative attitudes toward school
are prevalent in both genders, but girls’ attitudes decline faster over time, and dislike of
math and science increases with age (AAUW, 1992; USDOE, 2004). To the extent that having a
more positive attitude toward science influences motivation, performance, and career choice,
improving student attitudes towards science and scientists is a goal of science educators (Koballa,
1989; Laforgia, 1988). As further sharpening of terms motivates new research, a clearer picture of
student attitudes toward science should emerge.
Girls’ Success and Attitudes in Science
Early learning experiences have a major impact on later achievement in math, science, and
technology (Strauss, 1988; Huang & Du, 2002). Stereotypical gender bias alienates girls from
these fields early in their education (Darke et al., 2002), and even 6-year-olds of both genders
associate science with males (Hughes, 2002). As occupational stereotypes have a profound effect
on educational and career choices (Hughes, 2002), the effort to eliminate gender bias in education
and the workplace should attend to early schooling.
Gender bias often occurs with course placement or selection, an important predictor of long-
term commitment to science (Clewell & Burger, 2002; NCWGE, 2002; USDOE, 2004). At the
elementary level, girls outnumber boys in gifted programs, but by 10th grade, membership is
reversed (Sadker, 2000). Boys are more likely to be in gifted math and science classes, whereas
more girls populate gifted language arts programs (Sadker, 2000). Secondary boys take more
advanced courses, and are more likely to take all three core science courses: biology, chemistry,
and physics (AAUW, 1998; Clewell, 2002), although this trend is slowing (NCWGE, 2002;
USDOE, 2004). Fewer postsecondary females enroll in physics, calculus, and engineering, and far
fewer major in math, computer science, engineering, and the traditional physical sciences
(Clewell & Burger, 2002; USDOE, 2004).
Females still face sex discrimination in educational programs and underrepresentation in
higher paying science, engineering, and technology (SET) jobs (NCWGE, 2002). Almost three-
fourths of women surveyed in the 2001 International Study of Women in Physics [American
Institute of Physics Statistical Research Center (AIPSRC), 2001] reported that their interest in
physics developed ‘‘before or during the secondary school level,’’ implying the importance of
providing both opportunities and encouragement in the study of physics (and other SET fields)
early in the educational system.
Although the mandate to eliminate gender bias in federally funded education is over 30 years
old, and improvements have been made, more needs to be done to prepare teachers to deal
effectively with this issue in their classrooms. The American Association of University Women
(1998) reported that teachers receive little training on gender issues; a 1993–1994 survey revealed
that an average of only 2 hours per semester was spent on the topic. Progress has been made in some
areas, but as the workplace advances, teacher training, and thus education, often falls behind
(AAUW, 1998).
Summary of Measurement of Attitudes toward Science
Attitude is difficult to define and assess. It is measured indirectly, usually by self-report
(Cross, 2005; Koballa, 1988, 1989). Beliefs are the cognitive basis for attitudes, which are learned
and thus plastic; attitudes, in turn, form cognitive relationships that probably predispose behaviors
(Cross, 2005; Osborne, Simon, & Collins, 2003). Attitudes toward science have proven difficult to
measure because they do not form a simple, unidimensional construct (Laforgia, 1988). A clear
description of attitudes toward science did not exist until the 1970s, when a distinction was made
between attitudes toward science and science attitudes; the latter referring to science as a process
(Laforgia, 1988; Osborne et al., 2003; Ramsden, 1998).
Numerous subconstructs contribute to students’ attitudes toward science, and these must be
carefully defined and distinguished (Osborne et al., 2003; Ramsden, 1998). Evidence for a
measure’s dimensionality is best obtained through factor analysis, and once a structure is
declared, one hopes that the structure is invariant over time and across populations. For example, if
an instrument is standardized on New England students, and the authors show good evidence for
three subdimensions, the same dimensionality should be expected when the measure is given to
Southwestern youngsters. Moreover, if there are multiple dimensions operating, a ‘‘total’’ scale is
entirely misleading (Gardner, 1995, 1996; Laforgia, 1988; Osborne et al., 2003), yet this ‘‘basic
principle of psychometrics is frequently ignored’’ (Gardner, 1996, p. 914). Many instruments have
been produced to measure attitudes toward science. In a comprehensive review, Munby (1980)
reported that few were theoretically or statistically acceptable.
Summary of Development and Use of WiSS
During the 1970s, many intervention programs were implemented and evaluated to address
underrepresentation of women in the math and SET (science, engineering, and technology) fields;
early adolescence emerged as an important developmental period to affect career choices (Darke
et al., 2002; Rosser & Lane, 2002). The multidimensional Women in Science Scale (WiSS) was
created to quantify attitudes of adolescent girls and boys toward women in science (Erb & Smith,
1984), and was subsequently used to evaluate the effectiveness of COMETS Science (Career
Oriented Modules to Explore Topics in Science), a career-related curriculum targeting early
adolescent students in grades 5–9 (Smith & Erb, 1986).
A review of the literature revealed that in addition to the aforementioned studies, the WiSS has
been used in four peer-reviewed investigations and three dissertations. A systematic review of
these papers evaluated study population, study design, theoretical description, and instrument
reliability, validity, and usability; the systematic review of each paper is summarized in
Appendices A and B. The WiSS was used as a conceptual template in the creation of the Early
Childhood Women in Science Scale (ECWiSS), a modified instrument to measure the attitudes of
young children toward women in science (Mulkey, 1989). Factor analysis indicated that 15 items
should be dropped from the ECWiSS, but the identified items were not removed. Furthermore, a
factor analysis showed five dimensions, but two of these were uninterpretable. Evans, Whigham,
and Wang (1995) used the WiSS to evaluate the effectiveness of a 3-day role model intervention
that was more effective for girls than for boys. Problematic issues raised in this study include a
pretest that was modified before use as the posttest and the blending of items from the Women in
Science Scale (Erb & Smith, 1984) and two previously published attitude scales (Evans et al.,
1995).
Stake (2003) used six items from the WiSS to evaluate a science enrichment program that
provided positive information about women in science. Stake (personal communication, August
31, 2005) explained that the items were chosen to reflect attitudes toward the value and worth of
women in science. However, these concepts were not the original dimensions of the WiSS, and
these six items were subsumed into a larger instrument containing items from the Science Self-
Concept Scale (Campbell, 1991). Wyer (2003) also used the WiSS in a study of persistence in
science majors. A summary of the study can be found in Appendix A.
Appendix B summarizes three dissertation studies using the WiSS. Other studies were
included in our original search, but upon examination were found to be unrelated to our work.
Wyer (2003) credited two additional dissertations with using the WiSS; first, Ridgill (1975) used
the Attitudes Toward Women Scale (Spence, Helmreich, & Stapp, 1975), but not the WiSS, and
second, as part of a dissertation study, a new scale called the Women’s Internalization of
Stereotypes Scale (WISS) was created (Giacobbi, 1998), which is not the same WiSS reviewed in
this paper. As detailed in Appendix B, Al-Munea (1994), Bailer (1998), and She (1994) reported
the validation process used by Erb and Smith (1984). Al-Munea (1994) translated the WiSS into
Arabic, but did not validate the translated instrument.
Since its development, the WiSS has been treated as a valid measure of its construct with little
review of its psychometric properties. Erb and Smith (1984, p. 391) asserted that the WiSS is
based upon ‘‘three dimensions of attitudes toward women in science.’’ Those three dimensions
were: (1) women possess characteristics that enable them to be successful in science careers, (2)
women’s roles as mother and wife are compatible with successful science careers, and (3) women
and men should have equal opportunities to pursue science careers.
In their validation of the WiSS, Erb and Smith (1984) first modified the instrument based upon a
readability pilot test with fifth graders; a second modification from the analysis of 612 junior, middle,
and high school surveys resulted in the elimination of three items with poor part–whole correlations.
Another 612 surveys from the original sample were used to crossvalidate the final 27-item
instrument; for the WiSS overall score, the alpha reliability estimate was .92 and test–retest (8-week
interval) reliability was found to be .82 (Erb & Smith, 1984). Construct validity was evaluated using
contrasting groups and convergent validity coefficients. Erb and Smith (1984, p. 394) reported that
the WiSS could ‘‘distinguish between males and females regardless of age within the range of
interest.’’ Erb and Smith (1984) concluded that the WiSS delivers reliable and valid data that assess
attitudes of early adolescents toward women in science. We should point out that in the development
of the WiSS, Erb and Smith (1984) discussed three ‘‘dimensions’’ spanning the 27 items, but every
subsequent analysis was based upon a single composite score, the sum of responses over all items.
They reported no factor studies of the WiSS. Given some of the methodological issues related to use
of the WiSS, we sought to reevaluate the psychometric properties and factor structure of the scale in a
sample of middle and high school children in San Antonio, TX.
Methods
Participants
The WiSS was administered to a convenience sample of 1,439 middle and high school students from 16
teachers in nine schools within four school districts in San Antonio, TX, at the beginning of the
2004–2005 school year. The distribution of boys and girls by grade is shown in Table 1. All but one
school in the study were majority–minority (Mexican–American) schools. Reliable assessment
of ethnic or racial group on the basis of surnames in this sample was not possible. This project was
approved by the sponsoring university’s Institutional Review Board.
Instrument
The WiSS was administered as a 27-item survey printed on two pages. We used the instrument
items as originally published by Erb and Smith (1984). Respondents were asked to circle a number
on a 6-point Likert scale that reflected their true feelings for each item. The response format ranged
from Strongly Agree (1) to Strongly Disagree (6), with no option of a neutral response.
Analyses
Missing Data and Outliers. Preliminary analyses were conducted to assess whether
responses were missing at random through the data set, and if so, to impute missing data. The
pattern of missing data is shown in Table 2. We planned to impute missing data; however, any
imputation process assumes, at worst, that data are missing at random (MAR). To assess MAR
initially in the full data set, dichotomous groups were created: one with complete data across all 27
survey items, and the other with missing responses on one or more of the items. With SPSS v14.02,
discriminant function analysis was then used to predict missingness group status. The only complete
data outside the WiSS items were grade level and gender, so these variables were used to predict
missingness. The result of the discriminant analysis was Wilks’ λ = .99, χ²(2) = 3.94, p = .14. In
short, missingness was unrelated to grade and gender, so data were tentatively assumed to be MAR.
In the WiSS survey, the first 16 items were printed on one page, and the remaining 11 items on
the second page. A check of item frequencies showed a jump in missing data beginning with item
17. Because all second page items consisted of Erb and Smith’s (1984) third dimension (‘‘Women
and men should have equal opportunities to pursue science careers’’), we decided to eliminate 89
students who omitted their second page completely. This reduced the sample to 1,350.
The remaining data were searched for multivariate outliers, defined as cases that had
excessive influence on the statistical solution. These additional analyses actually took place within
a portion of the main analyses. Every confirmatory factor analysis (CFA) was inspected for
potential multivariate outliers, defined as those with significant (p < .0001) Mahalanobis distances
from the centroid of all relevant items in a given analysis. Each CFA produced three to five
suspected outliers, and was then rerun without the suspected outliers. In no instance did omission
of suspected outlier cases affect the various indices of close fit, so no cases were removed from
those main analyses.
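The outlier screen described above can be illustrated with a minimal Python sketch. It is not the procedure run in the study software: it assumes only two item scores per case, so that the squared Mahalanobis distance can be referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability is exactly exp(-D²/2). The study itself screened all relevant items in each CFA. The function names and the toy data are hypothetical.

```python
import math

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) case from the sample centroid."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Unbiased sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d' S^-1 d written out for a 2 x 2 covariance matrix.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

def flag_outliers(points, alpha=0.0001):
    """Indices of cases whose distance is significant at the given alpha."""
    flagged = []
    for i, d2 in enumerate(mahalanobis_sq_2d(points)):
        p = math.exp(-d2 / 2)  # chi-square upper tail; valid for df = 2 only
        if p < alpha:
            flagged.append(i)
    return flagged

# Toy, centered scores (not study data): 40 ordinary cases plus one extreme case.
toy = [(1, 1), (1, -1), (-1, 1), (-1, -1)] * 10 + [(8, 0)]
suspects = flag_outliers(toy)
```

In this toy set only the last case is flagged. Rerunning a model with and without such cases, as the authors did, is what shows whether the flagged cases actually influence fit.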
Table 1
Distribution of Boys and Girls (Number and Column Percent) Taking the WiSS

                     Gender
Grade^a    Boys N (%)     Girls N (%)    Total N (%)
6^a        55             55             110 (7.7%)
7^a        448            534            982 (68.5%)
8^a        94             122            216 (15.1%)
9          54             71             125 (8.7%)
Totals     651 (45.4%)    782 (54.5%)    1433 (100.0%)

^a There were 1,439 students in the original sample. Gender and/or grade level were missing for six.
Preliminary Factor Model. In their paper describing the development of the WiSS, Erb and
Smith (1984) reported no empirical study of the scale’s dimensionality. To evaluate the internal
structure of the WiSS, we arranged a confirmatory factor analysis around the three logical scales
reported by Erb and Smith (1984). This analysis used the entire sample of 1,350.
Factor Analyses. Because the preliminary CFA was not supportive, the central analyses
involved exploring and modifying the factor structure of the WiSS for a female sample, and then
testing whether the modified factor structure held for both females and males. For factorial validity
evidence, we adopted some suggestions by Smith, McCarthy, and Anderson (2000) and Marsh,
Ellis, Parada, Richards, and Heubeck (2005). Given the size of our sample, we were able to split it
twice. The first, a random approximate 50% split, resulted in two samples: one called validation
(n = 690) and the other, crossvalidation (n = 660). Next, each of these two samples was split into
female and male subsamples. The resulting subgroups were female validation (n = 373); male
validation (n = 317); female crossvalidation (n = 373); and male crossvalidation (n = 287).
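The two-stage split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the function name, and the seed are hypothetical, and each case is assigned independently with probability .5, which is one common way to obtain an "approximate" 50% split. The authors do not describe their exact randomization.

```python
import random

def split_sample(cases, seed=2007):
    """Randomly assign each case to a half, then group by gender within half."""
    rng = random.Random(seed)  # fixed seed (hypothetical) so the split is reproducible
    groups = {}
    for case in cases:
        half = "validation" if rng.random() < 0.5 else "crossvalidation"
        groups.setdefault((case["gender"], half), []).append(case)
    return groups

# Hypothetical records; only the gender totals mirror the reported subsamples
# (373 + 373 females, 317 + 287 males).
cases = ([{"id": i, "gender": "female"} for i in range(746)]
         + [{"id": 746 + i, "gender": "male"} for i in range(604)])
groups = split_sample(cases)
sizes = {key: len(members) for key, members in groups.items()}
```

Because assignment is random, the four subgroup sizes will be close to, but not exactly, half of each gender total, which matches the unequal halves reported above.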
In developing factorial validity evidence, we took the position that the baseline factorial
structure should be based on a sample of females. We assumed that feminist philosophy during the
last several decades was more relevant to females than to males. Also, the WiSS itself is about
attitudes toward women in science, and it contains items motivated by feminism (e.g., equal
educational and employment opportunity for women). Thus, initial exploratory factor work was
based on the female validation sample, and the invariance of that structure was tested on the male
validation sample. To assess the generalizability of the factor structure, a second multigroup
confirmatory analysis was performed with both crossvalidation groups. Finally, comparisons were
made to discover whether science attitudes differed by gender or grade level.

Table 2
Data Completeness for the WiSS (The WiSS contains 27 items)

# Questions Answered    n       %
27                      1028    71.43%
26                      214     14.88%
25                      65      4.51%
24                      15      1.04%
23                      10      .69%
22                      4       .28%
21                      3       .21%
20                      2       .14%
19                      1       .07%
18                      2       .14%
17                      2       .14%
16                      52      3.61%
15                      8       .56%
14                      6       .42%
13                      4       .28%
12                      2       .14%
11                      4       .28%
10                      0       0
9                       0       0
8                       1       .07%
7                       1       .07%
6                       1       .07%
5                       1       .07%
4                       0       0
3                       1       .07%
2                       0       0
1                       1       .07%
0                       11      .76%
TOTAL                   1439    100%
Results
Preliminary Analyses: Data Imputation
All initial analyses began with the reduced data set (N = 1,350 after deleting those who
omitted the second survey page). Missing data pose no special challenge for confirmatory factor
analysis, which uses an expectation maximization algorithm (full information maximum
likelihood) to create a full covariance matrix before it calculates parameter estimates. But because
exploratory factor analysis programs default to listwise deletion, omitted survey items can sharply
reduce the sample size. (In our reduced data set, the eligible sample would have been reduced by
another 322 cases, or 23.8%.) Once MAR had been confirmed (see methods above), we used
SYSTAT v.11.0 to impute missing data by way of expectation maximization (EM). The EM
procedure created a full data matrix that was used in all remaining analyses.
Preliminary Analyses: Initial Confirmatory Factor Analysis
Using the entire sample, a CFA was arranged for a three-factor model, using the item subsets
proposed by Erb and Smith (1984). The model was unsatisfactory, with χ²(321) = 1975.5, p < .001.
However, the χ² is vulnerable to large samples, so we focused on two widely used measures of
close fit, Bentler’s comparative fit index (CFI; an acceptable model should have at least .90) (Hu &
Bentler, 1999), and the root mean square error of approximation (RMSEA) (Steiger & Lind,
1980), which estimates the proportion of error per parameter estimated; the usual criterion is .08 or
less. In our data, the CFI was .86 and the RMSEA was .06 (.90 CI = .059–.064). Even more
problematic were the intercorrelations among the three factors: .89, .90, and .90, suggesting
extreme redundancy among the claimed dimensions (see Figure 1).
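For readers who want to see how the close-fit indices relate to the model chi-square, the following Python sketch uses the standard single-group formulas. It is a hedged illustration rather than the computation performed by the authors' software: the function names are hypothetical, the N - 1 convention for RMSEA is an assumption (some programs use N), and the CFI function needs the chi-square of the independence (null) model, which is not reported in the text, so no CFI value is reproduced here.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA for a single-group model (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2, df, chi2_null, df_null):
    """Comparative fit index; requires the independence (null) model chi-square."""
    d_model = max(chi2 - df, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 - d_model / d_null if d_null > 0 else 1.0

# Values reported above for the three-factor model.
three_factor_rmsea = rmsea(1975.5, 321, 1350)  # roughly .06, in line with the text
relative_chi2 = 1975.5 / 321                   # chi-square divided by its df
```

Under these assumptions the reported chi-square, degrees of freedom, and sample size give an RMSEA that falls inside the reported confidence interval.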
Main Analyses: Exploratory Factor Analyses
Because the CFA was not strongly supportive of the three-factor model, we staged a series of
exploratory factor analyses (EFAs), beginning with an almost certain overextraction of six factors,
and then successively reducing the dimensions to a final analysis of one factor. SPSS was used for
the factor procedures, with a principal axis extraction and oblique (direct oblimin) rotation. These
EFAs were all carried out on the female validation subsample.
Using a loading criterion of .35, pattern matrices were inspected, and the simplest structure
came about with the one- and two-factor solutions. The two-factor solution showed overlapping
factors, correlated at r = .40. With the female validation data, both factor solutions were then
submitted to confirmatory factor analyses, using AMOS 6.0. In terms of fit indices, the two-factor
model was superior, with the most obvious discrepancy in the relative χ² (χ² divided by its degrees
of freedom), 2.85 in the one-factor model and 2.25 in the two-factor model. In the two-factor
model, the dimensions were readily interpretable; we labeled one Equality (16 items) and the
other, Sexism (9 items; two original items did not survive the exploratory factor analyses).
However, the CFI was substandard, at .83, suggesting that the model could be markedly improved.
We set about modifying the model by first deleting five highly redundant items and then by
removing five items with standardized loadings less than .50 (e.g., ‘‘Both men and women can
combine careers with family life’’ was removed; ‘‘Women can combine successful careers with
successful marriages’’ was retained). This resulted in a substantially reduced structure (Figure 2),
with Equality now containing eight items, and Sexism, six items, with a correlation of .78 between
factors. Predictably, the simplified model also fit the data better, with CFI = .93 and RMSEA = .06
(.90 CI = .04–.07).
As a further test of the adequacy of the reduced model, we compared internal consistency
reliability estimates for the original two factors with estimates for the factors with fewer items.
Table 3 summarizes these reliabilities, and includes Spearman-Brown estimates for what the
reliability should have been by reducing the number of items for each scale score (Nunnally &
Bernstein, 1994). Because the reduced versions of each subscale showed higher reliabilities than
predicted by the Spearman-Brown formula, they were retained without further modification.
There was little loss of internal consistency by reducing the number of items.
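The Spearman-Brown prediction used in this comparison can be written as a one-line function. The sketch below is illustrative; the function name is hypothetical, and the inputs are the original alphas and item counts listed in Table 3.

```python
def spearman_brown(alpha, n_old, n_new):
    """Predicted reliability when a scale changes length from n_old to n_new items."""
    k = n_new / n_old  # length ratio; k < 1 means the scale was shortened
    return k * alpha / (1 + (k - 1) * alpha)

equality_pred = spearman_brown(0.84, 16, 8)  # Table 3 lists .72
sexism_pred = spearman_brown(0.77, 8, 6)     # Table 3 lists .72
```

Both predictions round to the .72 shown in Table 3, and the observed alphas for the reduced scales (.78 and .75) exceed them, which is the basis for retaining the shortened subscales.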
Main Analyses: Confirmatory Factor Analyses (CFA)—Validation in the Male Sample
The next step was to conduct a multigroup CFA, imposing the female factorial structure on the
male validation data set, and evaluating model fit for both groups simultaneously. This was
arranged as a series of four successive models, each imposing more constraints than the previous
one. The first was unconstrained, meaning that specific items were designated for one or the other
factor, but the factor loadings were free to vary within the female and male samples. The second
model constrained the unstandardized factor loadings to be exactly the same for both samples. The
third model added an equality constraint for the model covariance (that is, the association between
the two factors). Finally, the fourth model added equivalent error variances (item measurement
residuals) for males and females. Because each of the final three models is nested within
the previous model, statistical comparisons may be made between the models. There was a
nonsignificant change between the unconstrained model and the constrained-loadings model,
Δχ²(12) = 19.85, p = .07. Each later model comparison was significant. In summary, we concluded
the validation portion with a demonstration that the female factor structure fit the male data
acceptably well, even with the imposition of equal loadings. Further evidence of fit was that the
loadings-constrained multigroup model gave CFI = .92, with RMSEA = .04 (.90 CI = .03–.05).

Figure 1. Erb and Smith’s (1984) logical WiSS dimensions.
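The nested-model comparison reported above rests on referring a chi-square difference to a chi-square distribution whose degrees of freedom equal the number of added constraints. A minimal standard-library sketch follows. It relies on the closed-form upper-tail probability that holds only for even degrees of freedom, which happens to cover the 12-df comparison here; the function name is hypothetical, and a statistics library would be the usual choice for the general case.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate; closed form for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term, total = 1.0, 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1, built up term by term.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Difference test for the equal-loadings constraint in the validation samples.
p_loadings = chi2_sf_even_df(19.85, 12)
```

The resulting probability agrees with the reported p = .07, so the equal-loadings constraint does not significantly worsen fit at the conventional .05 level.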
Main Analyses: Confirmatory Factor Analyses—Crossvalidation
For the assessment of factorial invariance, the same multigroup model was applied to the
female and male crossvalidation data sets. In this series of nested models, a similar result emerged:
a nonsignificant change occurred between the unconstrained model and the constrained-loadings
model, Δχ²(12) = 13.17, p = .36. This means that, for the crossvalidation data sets, the original
female model fit the male data with the additional requirement that the unstandardized loadings
on the two factors be identical for both females and males. The multigroup close fit indices
showed marginal fit, with CFI = .89, and RMSEA = .05 (.90 CI = .04–.06). Table 4 shows item
stems and standardized loadings for the crossvalidated two-factor model.

Figure 2. Validated and crossvalidated two-dimensional WiSS structure.

Table 3
Reliability Estimates for WiSS Subscales, Female Validation Sample

Factor      Original Scale Alpha    Reduced Scale, Spearman-Brown Prediction    Reduced Scale, Actual Alpha
Equality    .84 (16 items)          .72 (8 items)                               .78 (8 items)
Sexism      .77 (8 items)           .72 (6 items)                               .75 (6 items)
Main Analyses: Gender and Grade Comparisons
Having demonstrated the relative invariance of the two-dimensional structure of the reduced
WiSS, the final step was to do gender and grade comparisons. Scale scores were created to
represent the Equality and Sexism factors. These were constructed as means of the relevant items
within each factor. Then, a 2 (Gender) × 4 (Grade) factorial MANOVA was arranged to evaluate
both scale scores simultaneously. For this analysis, all subgroups were recombined into a total
group data set (N = 1349; one student was missing grade information).
The multivariate gender by grade interaction was nonsignificant (p = .49). Both main effects
were significant. For gender, Wilks’ $\lambda$ = .93, F(2, 1,340) = 41.95, p < .001, and for grade, Wilks’
$\lambda$ = .97, F(6, 2,680) = 6.08, p < .001. In terms of multivariate effect sizes, the gender
difference was plainly the larger, with a squared canonical correlation of .07; for grade, it was only
.01. The interpretation is that gender grouping explained 7% of the total variation in the combined
scales of Equality and Sexism.
Table 4
Final WiSS Model, Item Stems, and Standardized Loadingsᵃ

| Equality Subscale | Loading |
|---|---|
| WiSS 1. Women can be as good in science careers as men can. | .56 |
| WiSS 5. Women can make important scientific discoveries. | .55 |
| WiSS 7. Women are not reliable enough to hold top positions in scientific and technical fields. | .61 |
| WiSS 11. A woman with a science career will have an unhappy life (reverse scored) | .55 |
| WiSS 18. A woman should have the same job opportunities in science careers as a man. | .66 |
| WiSS 20. Women should not have the same chances for advancement in science careers as men do (reverse scored) | .57 |
| WiSS 21. Women should have the same educational opportunities as men. | .60 |
| WiSS 27. A successful career is as important to a woman as it is to a man. | .48 |

| Sexism Subscale | Loading |
|---|---|
| WiSS 10. A woman’s basic responsibility is raising children. | .56 |
| WiSS 13. A wife should spend more effort to help her husband’s career than she spends on her own. | .57 |
| WiSS 22. Women have less need to study math and science than men do. | .58 |
| WiSS 24. Men need more math and science careers than women do. | .60 |
| WiSS 25. It is better for a woman to study home economics than chemistry. | .59 |
| WiSS 26. It is wrong for women to seek jobs when there aren’t enough jobs for all the men who want them. | .63 |

ᵃ Standardized loadings are from female crossvalidation CFA, in which unstandardized loadings for males and females were
constrained to equivalence. All item stems are from the original WiSS.

One-way ANOVAs were used to probe the multivariate results. The subscale means involved
in these post hoc tests are shown in Table 5. For the gender comparison, both subscales showed
differences between males and females, with Equality giving F(1, 1341) = 78.66, p < .001
and Sexism, F(1, 1341) = 84.62, p < .001. Effect sizes, $\eta^2$, were both .06 [SPSS incorrectly uses
partial $\eta^2$, an upwardly biased estimate in the factorial ANOVA, as its effect size (see Pierce,
Block, & Aguinis, 2004). The $\eta^2$ values reported here were recalculated from SPSS’s ANOVA
source tables.] Females showed a mean Equality score of 1.74 (on a 6-point scale where
1 = Strongly Agree), whereas males lagged at 2.32. These means are nearly three-fourths of a
standard deviation apart. Thus, females as a group showed stronger endorsement of gender
equality in scientific study, careers, and behaviors. The Sexism mean scores were further apart,
with females at 4.76 and males, 4.09, representing a difference of .88 standard deviation. (Note
that higher scores reflect more disagreement.)
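The distinction between classical and partial eta-squared drawn above can be made concrete. The sketch below uses a made-up sums-of-squares table (the values in `ss` are hypothetical, not the study's source table) and, separately, the reported Equality F ratio for gender. All function names are illustrative.

```python
def eta_squared(ss_effect: float, ss_total: float) -> float:
    """Classical eta-squared: share of total variation due to the effect."""
    return ss_effect / ss_total


def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta-squared: other effects are removed from the denominator."""
    return ss_effect / (ss_effect + ss_error)


def partial_eta_squared_from_f(f: float, df_effect: int, df_error: int) -> float:
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return f * df_effect / (f * df_effect + df_error)


# Hypothetical 2 x 4 factorial source table (illustrative values only).
ss = {"gender": 60.0, "grade": 20.0, "gender_x_grade": 2.0, "error": 918.0}
ss_total = sum(ss.values())  # simplification: assumes the terms sum to the total

classical = eta_squared(ss["gender"], ss_total)
partial = partial_eta_squared(ss["gender"], ss["error"])
print(round(classical, 4), round(partial, 4))

# Reported gender effect on Equality: F(1, 1341) = 78.66.
partial_from_f = partial_eta_squared_from_f(78.66, 1, 1341)
print(round(partial_from_f, 3))
```

Because the partial denominator omits the other effects, partial eta-squared can never be smaller than the classical value for the same effect, which is the upward bias noted in the text.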
Grade level effects were also significant for both subscales, with Equality showing F(3,
1341) = 7.61, p < .001, and Sexism, F(3, 1341) = 8.80, p < .001. Effect sizes, $\eta^2$, were both .02,
so grade level, compared to gender, explained far less variation in the subscale scores. Tukey HSD
pairwise comparisons were used to probe the grade level effects. As seen in Table 5, different
subscale patterns emerged for the various grade levels. For Equality, there are no differences
among grades 6 to 8, but ninth graders, showing stronger Equality attitudes, are split off from the
other grades. For Sexism, there is a general linear pattern, with sixth graders different from (and
lower than) all other grades. Seventh and eighth graders are indistinguishable, but are lower than
ninth graders. In short, for both attitude subscales, there appear to be weak developmental trends,
with somewhat more accepting attitudes as grade levels increase.
Discussion
Exploratory and confirmatory factor analyses formed the central portion of this reappraisal of
the psychometric properties of the WiSS. We discovered that, contrary to the suggested scoring of
the scale, there were two related dimensions underlying the items. These dimensions generally
reflected positive (Equality) and negative (Sexism) attitudes about women in science. Items
reflecting the two dimensions were reduced in number without sacrificing factor clarity or
reliability. Also, the two dimensions showed a relatively invariant structure across genders and
across random subsamples of the data.
Compared to males, females showed more positive attitudes about equality and less sexist
attitudes, with a larger difference on the Sexism subscale. The effect sizes we found for gender
differences were more than triple those of Smith and Erb (1986; calculated from their Table IV, p.
672). Although they used an overall WiSS score, compared to our two subscale scores, it is
worrisome that over the past two decades, gender differences in attitudes toward women in science
may have increased. On the other hand, their sample was collected from five Midwestern and
Southwestern states, whereas ours was limited to a single metropolitan area in south Texas, so the
larger gender differences may be peculiar to our sample.

Table 5
WiSS Subscale Means, Standard Deviations, and Alpha Coefficients for Gender and Grade Groups

| Group (n) | Equality Mean | Equality SD | Equality Alpha | Sexism Mean | Sexism SD | Sexism Alpha |
|---|---|---|---|---|---|---|
| Gender | | | | | | |
| Females (746) | 1.74 | .71 | .78 | 4.76 | .91 | .75 |
| Males (604) | 2.32 | .90 | .79 | 4.09 | 1.07 | .77 |
| Grade | | | | | | |
| Grade 6 (104) | 2.19ₐ | .77 | .69 | 4.13 | 1.04 | .75 |
| Grade 7 (933) | 2.00ₐ | .86 | .81 | 4.40ₐ | 1.04 | .79 |
| Grade 8 (188) | 2.15ₐ | .96 | .85 | 4.40ₐ | 1.06 | .81 |
| Grade 9 (124) | 1.77 | .75 | .79 | 4.78 | .93 | .78 |

Note: For each subscale, grades with the same subscript do not differ from each other.
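The size of the gender gap on the Equality subscale can be approximated from the summary statistics in Table 5. This is a minimal sketch that assumes a pooled standard deviation as the standardizer; the paper does not say which standardizer the authors used, and the function names are hypothetical.

```python
import math


def pooled_sd(sd1: float, n1: int, sd2: float, n2: int) -> float:
    """Pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)


def standardized_difference(m1, sd1, n1, m2, sd2, n2) -> float:
    """Mean difference expressed in pooled standard deviation units."""
    return (m1 - m2) / pooled_sd(sd1, n1, sd2, n2)


# Equality subscale, Table 5: males (mean 2.32, SD .90, n 604)
# versus females (mean 1.74, SD .71, n 746).
d_equality = standardized_difference(2.32, 0.90, 604, 1.74, 0.71, 746)
print(round(d_equality, 2))  # prints 0.72
```

Under this assumption the result agrees with the description of the Equality means as nearly three-fourths of a standard deviation apart.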
Grade level differences were statistically significant but their effect sizes were substantially
smaller, only hinting at improved attitudes as students get older. Of course, developmental change
cannot be clearly demonstrated with cross-sectional data, so a longitudinal study is needed to
verify this.
In Erb and Smith’s (1984) paper describing the original WiSS development, they reported
collecting data from junior high, middle, and high schools, but did not describe the grade levels.
Instead, they summarized data by age level, ranging from 10 to 16 years of age; this approximates a
span from 5th through 11th grade. Their grade level span was thus apparently wider than ours,
which contained sixth through ninth graders. Our data showed a small improvement in attitudes
with increasing grade levels. Erb and Smith found a significant age effect, but because they used a
conservative pairwise comparison method, they could not specify which ages were different from
one another. An examination of their age-level scores shows an inexplicable pattern, with ages 10,
14, and 16 showing the most positive attitudes.
Early psychometric specialists (e.g., Edwards, 1957) typically recommended creating scale
scores by summing individual item scores. Although that practice is still widely used today, it
creates two problems. The first is interpretive. For example, our WiSS Equality subscale contains
eight items, and the Sexism scale, six items. Given a six-point response scale, the possible range of
values for a summative score is 8 to 48 for Equality and 6 to 36 for Sexism. Thus, the scoring
frames of reference are different for the two scales, which makes it harder to compare scores
between the two scales. The more serious problem, though, is that summative scores are incorrect
in the face of missing data. A respondent who omits one item on the eight-item Equality scale has a
lower summative score, but it is simply an artifact of missing data and does not represent a lower
attitude. Put another way, two respondents can end up with equivalent summative scores when one
has no missing data but a poor attitude, and the other has a better attitude but missing data. We
recommend that users of the WiSS (and other survey measures) create mean scores instead of
summative scores. Mean scores have the dual advantage of sitting in a recognizable frame of
reference (the original response scale), and compensating for missing data so that scores are not
artificially lowered because of omitted items.
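The scoring recommendation above can be illustrated with two made-up respondents. This is a minimal sketch: the response vectors are hypothetical, missing answers are coded as `None`, and the `min_answered` threshold is an assumption added for illustration rather than a rule stated by the authors.

```python
from typing import Optional, Sequence

Responses = Sequence[Optional[int]]


def sum_score(responses: Responses) -> int:
    """Summative score: omitted items silently contribute nothing."""
    return sum(r for r in responses if r is not None)


def mean_score(responses: Responses, min_answered: int = 1) -> Optional[float]:
    """Mean of the answered items, on the original 1-6 response scale.

    Returns None when fewer than min_answered items were answered
    (hypothetical rule, added for illustration).
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered:
        return None
    return sum(answered) / len(answered)


# Two hypothetical respondents on the eight-item Equality subscale who give
# the same answer to every item they complete; the second omits one item.
complete = [2, 2, 2, 2, 2, 2, 2, 2]
one_missing = [2, 2, 2, None, 2, 2, 2, 2]

print(sum_score(complete), sum_score(one_missing))    # prints 16 14
print(mean_score(complete), mean_score(one_missing))  # prints 2.0 2.0
```

The summative scores differ only because of the omitted item, while the mean scores match and stay on the original response scale.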
In summary, this psychometric reevaluation of the WiSS demonstrated that the questionnaire
could be successfully shortened from 27 to 14 items with two underlying factors, Equality and
Sexism. The instrument may be used descriptively, such as assessing classroom attitudes about
women in science to assist in planning lessons. Or with additional stability and repeatability
testing, it may be used in evaluation research, as, for example, an outcome measure in a school
intervention study. The shortened version, as described here, should result in shorter
administration times, fewer missing data, and increased acceptance among persons administering
the WiSS in classroom settings (i.e., teachers).
The National Center for Research Resources (NCRR), National Institute on Aging (NIA),
and National Heart, Lung, and Blood Institute (NHLBI) are all part of the National
Institutes of Health (NIH). The contents of this report are solely the responsibility of the
authors and do not necessarily represent the official views of NCRR, NIA, NHLBI, or NIH.
We thank the administration, teachers, and students from Edgewood ISD, Northside ISD,
NorthEast ISD, and South San Antonio ISD in San Antonio, Texas. Our special thanks go
to Linda Bononcini, Assistant Superintendent Edgewood ISD; George Colon, Principal
and Teresa Gatell, Language Arts Teacher at Truman MS. From Northside ISD thanks to
Dr. Phil Linerode, Evaluation Specialist; Alice Fiedler, Science Curriculum Specialist;
John Folks, Superintendent; Priscilla Shaver, Gifted/Talented Coordinator; Javier
Martinez, Principal and Lorenda Segura, Science Teacher at Zachry MS; and Martha
Campbell, Principal and Della Nagle, Reading Teacher from Jordan MS. From NorthEast
ISD, we thank Richard Middleton, Superintendent; Alicia Thomas, Associate Superintendent;
Mark Sheffler, Associate Superintendent; Don Dalton, Executive Director, Curriculum
& Instruction; Pattie Castellano, Science Curriculum Coordinator; Francene Tharp,
Health Services; Thalia Cheney, Principal and Nelda Charles and Kim Stelter, Science
Teachers at Nimitz MS; Randy Hoyer, Principal and Tamisine Neal and Melissa Moody,
Science Teachers from Bush MS; Michael Kerenan, Principal and Jocelyn Eckerman,
Science Teacher from Lee HS. From San Antonio ISD, our thanks go to Ruben Olivares,
Superintendent; Bill Vinal, Science Director; Anita Chavera, Principal and Josephine Rose,
Language Arts Teacher from Irving MS; Armando Gutierrez, Principal, Elizabeth Aguilar-
Cruz, Deborah Friesenhan, Jessica Diaz, and Francisco Lara, Science Teachers from Lowell
MS; Sylvia Lopez, Principal and Veronica Kanthu, Mathematics Teacher from Longfellow
MS. The authors also wish to thank Dr. Marilyn A. Winkleby, Department of Medicine,
Stanford University, for her valuable insight and additions to this paper.
Appendix A
Peer-Reviewed Studies Using the Women in Science Scale (WiSS)

| | Erb and Smith (1984) | Smith and Erb (1986) | Mulkey (1989) |
|---|---|---|---|
| Study population | | | |
| Size | 1,224 students | 286 students | 791 students |
| Age (years) | 10–16 | Not specified | Not specified |
| Gender | 611 F, 613 M | 156 F, 130 M | Not specified |
| Race/ethnicity | Not specified | White, undefined minority | White, hispanic, black, Asian, other |
| Grade levels | 6–12 | 5–8 | K-4 |
| Study sites | Multiple | Multiple | Multiple |
| Regional area | Not specified | Multiple states | Within country |
| Study design | | | |
| Assembly | Convenience | Convenience | Convenience |
| Analysis | Cross sectional | PPTI, quasi experimental | Cross sectional |
| Reliability | | | |
| Internal consistency | .92 | .91 | WiSS = .90; Guttman = .87ᵃ |
| Test-retest | r = .82, 2 months | None reported | None reported |
| SEM | None reported | None reported | None reported |
| Validity | | | |
| Content | None reported | None reported | None reported |
| Discriminant | Career choice, ISSS r = .00 to .04 | None reported | None reported |
| Convergent | Career choice, ISSS r = .18 to .46 | None reported | Occupation survey and observational perspective r = .51 |
| Contrasting groups | ANOVA, t-test, Gender | ANCOVA | ANOVA Known Groups |
| Exploratory factor analysis | None reported | None reported | Yes, 5 factors |
| Confirmatory factor analysis | None reported | None reported | None reported |
| Usability | | | |
| Number of items | 27 | 27 | 27 |
| Multiple subscales | 3 | None reported | 3 |

Appendix A (Continued)

| | Evans et al. (1995) | Stake (2003) | Wyer (2003) |
|---|---|---|---|
| Study population | | | |
| Size | 964 students | 317 students | 285 students |
| Age (years) | Not specified | Not specified | Not specified |
| Gender | Not specified | Not specified | 155 F, 130 M |
| Race/ethnicity | Not specified | White, black, Asian, other | White, hispanic, black, Asian, other |
| Grade levels | 9 | 9–12 | College Biology and Engineering |
| Study sites | Multiple | Multiple | Single (University) |
| Regional area | Within state | Within country | Not specified |
| Study design | | | |
| Assembly | Convenience | Convenience | Convenience |
| Analysis | PPTI, comparison groups for sampling | PPTI, comparison groups for sampling | Cross sectional |
| Reliability | | | |
| Internal consistency | None reported | .75 pre, .76 post | None reported |
| Test-retest | None reported | None reported | None reported |
| SEM | None reported | None reported | None reported |
| Validity | | | |
| Content | None reported | None reported | None reported |
| Discriminant | None reported | Science self-concept regression | Weak odds ratio from regression |
| Convergent | None reported | None reported | None reported |
| Contrasting groups | ANOVA | ANOVA, t-test | t-test, gender |
| Exploratory factor analysis | PCA 7 factors | None reported | None reported |
| Confirmatory factor analysis | None reported | None reported | None reported |
| Usability | | | |
| Number of items | 67 pre-test, 27 post-test | 6 | 7 WiSS, 7 ATWS, 4 ISS |
| Multiple subscales | 7 | None reported | None reported |

PPTI, Pre/Post test intervention; M, Male; F, Female.
ᵃ Guttman split half.

Appendix B
Dissertation Studies Using the Women in Science Scale (WiSS)

| | Al-Munea (1994) | She (1994) | Bailer (1998) |
|---|---|---|---|
| Study population | | | |
| Size | 504 | 18 | 213 |
| Age (years) | 13–16 | Not specified | Not specified |
| Gender | 243 F, 252 M | 7 F, 11 M | 54 F, 54 M |
| Race/ethnicity | Not specified | Not specified | White, black, hispanic |
| Grades | 7–10 | 5–6 | 7 |
| Study sites | Multiple | Single | Multiple |
| Regional area | Within metro area | Within district | Within district |

(Continued)
Appendix B (Continued)

| | Al-Munea (1994) | She (1994) | Bailer (1998) |
|---|---|---|---|
| Study design | | | |
| Assembly | Convenience | Convenience | Convenience |
| Analysis | Comparison groups for sampling | Cross sectional, quasi-experimental | PPTI, quasi-experimental |
| Reliability | | | |
| Internal consistency | .88 | .92 | Reported in Erb and Smith (1984) |
| Test-retest | Reported in Erb and Smith (1984) | Reported in Erb and Smith (1984) | Reported in Erb and Smith (1984) |
| SEM | None reported | None reported | None reported |
| Validity | | | |
| Content | 5 experts | None reported | None reported |
| Discriminant | None reported | None reported | None reported |
| Convergent | None reported | WiSS, ISSS | None reported |
| Contrasting groups | t-test | Intercorrelation matrix | Regression |
| Exploratory factor analysis | None reported | None reported | None reported |
| Confirmatory factor analysis | None reported | None reported | None reported |
| Usability | | | |
| Number of items | 27 | 27 | 27 |
| Multiple subscales | Reported in Erb and Smith (1984) | Reported in Erb and Smith (1984) | Reported in Erb and Smith (1984) |

References
Al-Munea, S. (1994). A descriptive study of the attitude toward women in science among the
adolescent students of different sex, grade and socio-economic status (ses) in Saudi Arabia.
Dissertation Abstracts International, 55, 2650 (UMI No. 9431501).
American Association of University Women (AAUW) Educational Foundation. (1992). How
schools short-change girls: Executive summary. Washington, DC: Author.
American Association of University Women (AAUW) Educational Foundation. (1998).
Gender gaps: Where schools still fail our children. Washington, DC: Author.
American Institute of Physics Statistical Research Center (AIPSRC). (2001). Women
physicists speak: 2001 international study of women in physics. Retrieved September 3, 2005,
from http://www.aip.org/statistics/trends/reports/iupap.pdf
Bailer, J. (1998). The effects of “Women are Scientists, Too” program on middle school
students’ perceptions of scientists and their attitudes toward women in science. Dissertation
Abstracts International, 59, 775 (UMI No. 9828307).
Campbell, J.R. (1991). The roots of gender inequity in technical areas. Journal of Research in
Science Teaching, 32, 243–257.
Congressional Commission on the Advancement of Women and Minorities in Science,
Engineering and Technology Development (CAWMSET). (2000). Land of plenty: Diversity as
America’s competitive edge in science, engineering, and technology. Report of the Commission.
Washington, DC: Author.
Clewell, B.C. (2002). Equity, diversity, and retention: Using concepts of equity to achieve
diversity by increasing the retention of women and underrepresented minorities in the science
and engineering pipeline. National Science Foundation Learning and Education: Building
knowledge, understanding its implications, article 1. Retrieved August 20, 2005, from
http://prospectassoc.com/NSF/Clewelll.htm
Clewell, B.C., & Burger, C.J. (2002). At the crossroads: Women, science, and engineering.
Journal of Women and Minorities in Science and Engineering, 8, 249–253.
Cross, R.M. (2005). Exploring attitudes: The case for Q methodology. Health Education
Research, 20, 206–213.
Darke, K., Clewell, B., & Sevo, R. (2002). Meeting the challenge: The impact of the National
Science Foundation’s Program for Women and Girls. Journal of Women and Minorities in Science
and Engineering, 8, 285–303.
Edwards, A.L. (1957). Techniques of attitude scale construction. New York: Appleton-
Century-Crofts.
Erb, T.O., & Smith, W.S. (1984). Validation of the Attitude Toward Women in Science Scale
for early adolescents. Journal of Research in Science Teaching, 21, 391–397.
Evans, M., Whigham, M., & Wang, M. (1995). The effect of a role model project upon the
attitudes of ninth-grade science students. Journal of Research in Science Teaching, 32, 195–204.
Gardner, P. (1995). Measuring attitudes to science: Unidimensionality and internal
consistency revisited. Research in Science Education, 25, 283–289.
Gardner, P. (1996). The dimensionality of attitude scales: A widely misunderstood idea.
International Journal of Science Education, 18, 913–919.
Gardner, P.L. (1975). Attitudes to science: A review. Studies in Science Education, 2, 1–41.
Giacobbi, C. (1998). Assessment of internalization of negative female stereotypes among
Caucasian women. Dissertation Abstracts International, 58, 5117 (UMI No. 9809118).
Hammrich, P.L., Richardson, G., & Livingston, B. (2001). The sisters in science program: A
three year analysis. U.S. Department of Education. Retrieved on September 19, 2005, from
http://www.ed.psu.edu/CI/Journals/2001aets/s1_04_hammrich_richardson.rtf
Hu, L.T., & Bentler, P.M. (1999). Cutoff criteria for fit indexes in covariance structure
analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.
Huang, G., & Du, J. (2002). Computer use at home and at school: Does it relate to academic
performance? Journal of Women and Minorities in Science and Engineering, 8, 201–217.
Hughes, W. (2002). Gender attributions of science and academic attributes: An examination
of undergraduate science, mathematics and technology majors. Journal of Women and Minorities
in Science and Engineering, 8, 53–65.
Koballa, T.R. (1988). Attitude and related concepts in science education. Science Education,
72, 115–126.
Koballa, T.R. (1989). Changing and measuring attitudes in the science classroom. Research
Matters—To the Science Teacher, a publication of the National Association for Research in
Science Teaching (NARST), 8901, 1–8. Retrieved on August 20, 2005, from
http://www.educ.sfu.ca/narstsite/publications/research/attitude.htm
Laforgia, J. (1988). The affective domain related to science education and its evaluation.
Science Education, 72, 407–421.
Marsh, H.W., Ellis, L., Parada, L., Richards, G., & Heubeck, B.G. (2005). A short version of
the Self Description Questionnaire II: Operationalizing criteria for short-form evaluation with
new applications of confirmatory factor analyses. Psychological Assessment, 17, 81–102.
Morella, C. (2002). Recognizing a threat to American economy. Journal of Women and
Minorities in Science and Engineering, 8, 377–380.
Mulkey, L. (1989). Validation of early childhood attitudes toward women in science scale
(ECWiSS): A pilot administration. Journal of Research in Science Teaching, 26, 737–753.
Munby, H. (1980). An evaluation of instruments which measure attitudes to science. In C.P.
McFadden (Ed.), World trends in science education (pp. 266–275). Halifax, Nova Scotia: Atlantic
Institute of Education.
National Coalition for Women and Girls in Education (NCWGE). (2002). Title IX at 30:
Report card on gender equity. Washington, DC: Author.
Nunnally, J.C., & Bernstein, I.H. (1994). Psychometric theory (3rd ed.). New York: McGraw-
Hill.
Osborne, J., Simon, S., & Collins, S. (2003). Attitudes towards science: A review of the
literature and its implications. International Journal of Science Education, 25, 1049–1079.
Pierce, C.A., Block, R.A., & Aguinis, H. (2004). Cautionary note on reporting eta-squared
values from multifactor ANOVA designs. Educational and Psychological Measurement, 64,
916–924.
Ramsden, J. (1998). Mission impossible?: Can anything be done about attitudes to science?
International Journal of Science Education, 20, 125–137.
Ridgill, I. (1975). Women employed in traditional and nontraditional occupations: A
comparison of their attitudes toward women and their work values. Dissertation Abstracts
International, 48, 839 (UMI No. 8714993).
Rosser, S., & Lane, E. (2002). A history of funding for women’s programs at the National
Science Foundation: From individual POWRE approaches to the advance of institutional
approaches. Journal of Women and Minorities in Science and Engineering, 8, 327–346.
Sadker, D. (2000). Gender equity: Still knocking at the classroom door. Equity and
Excellence in Education, 33, 80–83.
She, H.-C. (1994). The impact of biochemistry workshop on gifted children’s image of
science and scientists, women in science, and class participation. Dissertation Abstracts
International, 54, 2528 (UMI No. 9327866).
Simpson, R.D., Koballa, T.R., Oliver, J.S., & Crawley, F.E. (1994). Research on the affective
dimension of science learning. In D. Gabel (Ed.), Handbook of research in science teaching and
learning (pp. 211–234). New York: Macmillan.
Smith, G.T., McCarthy, D.M., & Anderson, K.G. (2000). On the sins of short-form
development. Psychological Assessment, 12, 102–111.
Smith, W.S., & Erb, T.O. (1986). Effect of women science career role models on early
adolescents’ attitudes toward scientists and women in science. Journal of Research in Science
Teaching, 23, 667–676.
Spence, J.T., Helmreich, R., & Stapp, J. (1975). A short version of the Attitude toward Women
Scale (AWS). Princeton, NJ: Educational Testing Service.
Stake, J. (2003). Understanding male bias against girls and women in science. Journal of
Applied Social Psychology, 33, 667–682.
Steiger, J.H., & Lind, J.M. (1980, May). Statistically based tests for the number of common
factors. Paper presented at the annual meeting of the Psychometric Society, Iowa City, IA.
Strauss, M.J. (1988). Gender bias in mathematics, science and technology: The report card
#3. Mid-Atlantic Equity Center. Retrieved on August 15, 2005, from
http://www.enc.org/topics/equity/articles/document.shtm?input=ACQ-111578-1578
U.S. Department of Education, National Center for Education Statistics. (2004). Trends in
educational equity of girls & women: 2004 (NCES 2005-016). Washington, DC: U.S. Government
Printing Office.
Wyer, M. (2003). Intending to stay: Images of scientists, attitudes toward women, and gender
as influences on persistence among science and engineering majors. Journal of Women and
Minorities in Science and Engineering, 9, 1–16.