SAT and ACT predict college GPA after removing g☆
Thomas R. Coyle⁎, David R. Pillow
University of Texas at San Antonio, USA
Article history: Received 14 May 2007; received in revised form 9 April 2008; accepted 9 May 2008; available online 18 June 2008.

Abstract
This research examined whether the SAT and ACT would predict college grade point average
(GPA) after removing g from the tests. SAT and ACT scores and freshman GPAs were obtained
from a university sample (N=161) and the 1997 National Longitudinal Study of Youth (N=8984).
Structural equation modeling was used to examine relationships among g, GPA, and the SAT
and ACT. The g factor was estimated from commercial cognitive tests (e.g., Wonderlic and
Wechsler Adult Intelligence Scale) and the computer-adaptive Armed Services Vocational
Aptitude Battery. The unique variances of the SAT and ACT, obtained after removing g, were
used to predict GPA. Results from both samples converged: While the SAT and ACT were highly
g loaded, both tests generally predicted GPA after removing g. These results suggest that the
SAT and ACT are strongly related to g, which is related to IQ and intelligence tests. They also
suggest that the SAT and ACT predict GPA from non-g factors. Further research is needed to
identify the non-g factors that contribute to the predictive validity of the SAT and ACT.
© 2008 Elsevier Inc. All rights reserved.
Keywords: g factor; SAT; ACT; Grade point average (GPA); Intelligence; IQ; Structural equation modeling (SEM); National Longitudinal Study of Youth (NLSY); Armed Services Vocational Aptitude Battery (ASVAB)
The g factor refers to a latent construct that represents
variance common to a large and diverse set of cognitive tests. A
cognitive test's loading on the g factor is directly related to its
predictive validity (Jensen, 1998, pp. 274–294). Cognitive tests
with high g loadings reliably predict important life outcomes
(e.g., school grades or job performance). Cognitive tests with
low g loadings less reliably predict such outcomes. The g factor
largely explains a cognitive test's validity coefficient, which
typically drops sharply in magnitude when g is controlled (e.g.,
Jensen, 1998, Tables 2.2 and 2.3, pp. 29–30).
The research reported here examines the g loading and
predictive validity of the SAT (formerly Scholastic Assessment
Test and, prior to that, Scholastic Aptitude Test) and the ACT
(formerly American College Testing Program Assessment).1
Both tests are given each year to millions of high school
students in the United States. The tests are four-hour
☆ This research was supported by the San Antonio Area Foundation; a
University of Texas at San Antonio Faculty Research Award; UTHSCSA GCRC
(M01-RR-01346); and the Human Brain Mapping Project, which is jointly
funded by NIMH and NIDA (P20 MH/DA52176). We thank Deborah Coyle,
Douglas Detterman, Wendy Johnson, Nathan Brody, Annette Fields, and an
anonymous reviewer for constructive comments.
⁎ Corresponding author. University of Texas at San Antonio, Department of
Psychology, One UTSA Circle, San Antonio, Texas 78249, USA. Tel.: +1 210 458
7407; fax: +1 210 458 5728.
E-mail address: Thomas.Coyle@utsa.edu (T.R. Coyle).
1 The SAT and ACT have been revised several times. The research
presented here examined the recentered SAT I, which includes verbal and
math subtests; and the enhanced ACT, which includes English, math,
science, and reading subtests. The SAT I was administered from March 1994
to March 2005. It was preceded by an earlier version of the SAT (called the
Scholastic Aptitude Test), which also included verbal and math subtests. The
SAT I has been replaced by a new version of the SAT (called the SAT
Reasoning Test), which includes math, critical reading, and writing subtests.
The enhanced ACT was first administered in October 1989 and is currently in
use. (“Enhanced” has been dropped from its title.) It was preceded by an
earlier version of the ACT, which also included English, math, science, and
reading subtests. The earlier version adjusted for grade level, a practice that
was discontinued in the mid-1980s.
assessments of academic skills. The SAT measures verbal
comprehension and math skills; the latest version also
measures writing skill. The ACT measures competence in
four academic areas (English, math, reading, and science); a
writing component is optional.
The SAT and ACT predict college grade point average (GPA),
itself a g loaded variable (e.g., Coyle, 2006), during the first year
of college (e.g., Bridgeman, McCamley-Jenkins, & Ervin, 2000;
Noble, 2000). Both tests have been called achievement tests,
which are used to assess prior learning and knowledge, and
aptitude tests, which are used to assess learning potential in a
specific domain (e.g., academics). The tests have also been
called intelligence tests (Frey & Detterman, 2004), but the
usage is peculiar. Neither test is marketed as an intelligence
test, which is assumed to assess general mental ability on all
mental tasks (for a discussion of problems with verbal
definitions of intelligence, see Jensen, 1998, pp. 46–49).
Detterman and colleagues have examined the g loading
of the SAT (Frey & Detterman, 2004) and the ACT (Koenig,
Frey, & Detterman, 2008). Test scores for the SAT, ACT, Armed
Services Vocational Aptitude Battery (ASVAB), and several
mental ability tests were obtained from the 1979 National
Longitudinal Study of Youth (NLSY79; N=12,686). The
ASVAB included 10 subtests of different aptitudes (e.g.,
science, arithmetic reasoning, word knowledge, coding
speed). The mental ability tests included the Differential
Aptitude Test and Otis–Lennon Mental Ability Test, among
others. SAT and ACT scores were correlated with the mental
ability tests and with g factor scores extracted from the
ASVAB. SAT and ACT scores correlated strongly and sig-
nificantly with all mental ability tests (mean r=.74,
range=.56 to .82) and with g factor scores from the ASVAB
(r=.82 and .77 for the SAT and ACT, respectively). Subsequent
analyses indicated that SAT and ACT scores from indepen-
dent university samples were significantly correlated with
Raven's Advanced Progressive Matrices (r corrected for range
restriction=.72 and .75 for the SAT and ACT, respectively), a
test of non-verbal reasoning that is often highly g loaded
(e.g., Marshalek, Lohman, & Snow, 1983).
The current research attempts to confirm the strong g
loadings of the SAT and ACT obtained by Detterman and
colleagues (Frey & Detterman, 2004; Koenig et al., 2008)
with different sets of cognitive tests. The g loading of a
specific test is not fixed but can vary with the set of tests
used to estimate g. Such variability is possible even though g
factor scores from broadly constructed test batteries are
often completely correlated (r=1.00, Johnson, Bouchard,
Krueger, McGue, & Gottesman, 2004; see also, Johnson, te
Nijenhuis, & Bouchard, 2008). If the high g loadings of the
SAT and ACT obtained by Detterman and colleagues are
robust and reliable, then those loadings should be replicable
with other test batteries.
Although the g loading of the SAT and ACT has been
estimated by Detterman and colleagues (Frey & Detterman,
2004; Koenig et al., 2008), the extent to which g explains
SAT and ACT correlations with college GPA is, surprisingly,
unknown. The SAT is typically moderately correlated
with college GPA (r=.34 to .35, Bridgeman et al., 2000,
Table 1, p. 4), as is the ACT (median r=.27 to .36, Noble, 2000,
Table 2, p. 12). The SAT and ACT are, as noted above, strongly
correlated with g (Frey & Detterman, 2004; Koenig et al.,
2008), which predicts college GPA (r=.55 to .62, Coyle, 2006,
Table 3, p. 20; see also, Jensen, 1998, p. 280). Because
removing g from cognitive tests typically produces sharp
declines in the tests' validity coefficients, removing g from
the SAT or ACT should sharply reduce SAT and ACT validity
coefficients for GPA.
The central question of the current research is whether
the SAT and ACT predict college GPA after g is removed
from the tests. SAT and ACT scores, as well as college GPAs,
were obtained from a university sample (N=161) and the
1997 National Longitudinal Study of Youth (NLSY97;
N=8984). SAT scores were the sum of the verbal and
math subtests. ACT scores were the ACT composite scores,
which represent overall performance on the ACT. SAT and
ACT composite scores have been shown to predict college
GPA better than SAT and ACT subtest scores (e.g., Bridgeman
et al., 2000).
The two samples received different sets of cognitive tests.
The university sample received selected subtests from the
Wechsler Adult Intelligence Scale (Coding, Digit Span,
Information, Picture Completion), Wonderlic Personnel
Test, and Raven's Advanced Progressive Matrices. The
NLSY97 received 12 subtests from a computer-adaptive
version of the ASVAB. Each set of cognitive tests included
items with diverse content (e.g., verbal vs. non-verbal) and
processing demands (e.g., memory demands vs. no memory
demands).
Structural equation modeling (SEM) was used to examine
relationships among g, GPA, and the SAT and ACT. SEM was
also used to determine if the SAT and ACT predicted college
GPA after removing g. It was predicted that (a) the SAT and
ACT would be highly g loaded, and (b) the SAT and ACT would
not predict college GPA after removing g. Prediction (a) was
based on research by Detterman and colleagues (Frey &
Detterman, 2004; Koenig et al., 2008), who found the SAT and
ACT to be highly g loaded using the ASVAB with the NLSY79.
Prediction (b) was based on the general finding that removing
g from cognitive tests neutralizes their predictive validity for
academic achievement (e.g., Jensen, 1998, Tables 2.2 and 2.3,
pp. 29–30).
1. Study 1: University sample
1.1. Methods
1.1.1. Participants
Participants (M age=19.07 years, SD=.78 years) were 45
males and 116 females who were recruited from the
introductory psychology subject pool at the University of
Texas at San Antonio (UTSA). The sample included 83 whites,
36 Hispanics, 20 Asians, 12 blacks, 4 American Indians, 4
people of mixed race, and 2 people who did not specify a race.
All participants (N=161) had taken the recentered SAT I. A
subset of these participants (n=88) had also taken the
enhanced ACT test. In the event that participants took either
test more than once, only test scores obtained on the first
testing attempt were used. Cognitive test scores and college
GPAs for the first semester of the freshman year were
available for all participants. College GPAs for the second
semester of the freshman year were available for 158
participants.
1.1.2. Variables
SAT and ACT. SAT and ACT scores were obtained from
university records. Following convention and prior research
(Frey & Detterman, 2004), SATscores were the sum of the SAT
math and verbal subtests. ACTscores were the ACTcomposite
scores, which represent aggregate performance across the
ACT subtests (English, math, reading, and science).
Cognitive tests. A diverse set of cognitive tests was used to
estimate g. The set included selected subtests from the
Wechsler Adult Intelligence Scale, Third Edition (WAIS-III;
Harcourt Assessment, San Antonio, TX). The WAIS subtests
were Digit Symbol Coding (Coding), Digit Span, Information,
and Picture Completion. The set also included the Wonderlic
Personnel Test (Wonderlic, Libertyville, IL) and Raven's Advanced Progressive Matrices (Raven; Harcourt Assessment, San Antonio, TX).
Fig. 1. SEM path coefficients for the SAT model (1A) and the ACT model (1B) for the university sample (Study 1). All path coefficients are significant at p<.05, except for the path coefficients from g to Coding (both models) and from ACT unique variance to GPA (ACT model).
Table 1
Study 1: Correlations among, and means and standard deviations for, SAT, ACT, and cognitive variables (university sample)

Variable                  1      2      3      4      5      6      7      8      9     10        M      SD    n
1. SAT                    –                                                                  1032.17  149.84  161
2. ACT                  .83**   –                                                              21.75    3.62   88
3. GPA1                 .37**  .25*    –                                                        2.87     .75  161
4. GPA2                 .29**  .22*   .61**   –                                                 2.88     .90  158
5. Coding               .14    .07    .16*   .12     –                                         86.76   12.20  161
6. Digit span           .23**  .28**  .11    .12    .12     –                                  17.53    3.64  161
7. Information          .49**  .45**  .10    .14   −.05    .10     –                           16.68    4.31  161
8. Picture completion   .20*   .24*   .16*   .13    .05    .21**  .32**   –                    16.33    3.38  161
9. Raven                .38**  .42**  .19*   .18*  −.03    .12    .29**  .23**   –              5.66    1.95  161
10. Wonderlic           .70**  .71**  .15    .11    .11    .26**  .39**  .18*   .29**   –      23.86    4.79  161

Note. N=161. * p<.05; ** p<.01.
A short form of the Raven was used to
reduce administration time. The short form presented every
third item from the Raven (12 items total), which contains 36
items in full form.
Each cognitive test measured different skills or aptitudes.
Coding measured the speed of copying symbols paired with
numbers from a key. Digit Span measured immediate au-
ditory memory for digits. Information measured knowledge
and retrieval of facts from long term memory. Picture
Completion measured perceptual organization and knowl-
edge of objects and scenes. Raven measured matrix reason-
ing and pattern recognition for complex line drawings.
Wonderlic measured general cognitive ability. Test content
was homogeneous for all tests except Wonderlic, which
contained 50 items of diverse content (e.g., number series,
logic problems, vocabulary questions). The Wonderlic
correlates highly with WAIS full-scale IQ (r ≥ .85, Wonderlic,
1999).
The cognitive tests were administered individually to
participants by a trained experimenter, and were scored
according to testing manual instructions. Following WAIS
recommendations (Wechsler, 1997, pp. 36–38), the tests were
presented in a fixed order with the general stipulation that
tests involving mostly verbal (Information) and non-verbal
(Picture Completion) stimuli not be presented contiguously.
Statistical analyses were performed using raw, unadjusted
scores (viz., items correct), which were available for all tests.
GPA. GPAs were obtained from university records. GPAs
were the average number of grade points earned during the
first (GPA1) and second (GPA2) terms of college for students
on a semester schedule (i.e., two terms per academic year).
The SAT and ACT have been shown to predict college GPA
during the freshman year (e.g., Bridgeman et al., 2000; Noble,
2000).
1.1.3. Statistics and SEM models
SEM with maximum likelihood estimation was used to
model relationships among g, GPA, and the SAT and ACT. The
g factor was estimated as a first order factor using the
cognitive tests and the SAT and ACT. Path coefficients from g
to SAT and ACT represented the g loading of the tests (cf.
Brodnick & Ree, 1995). Path coefficients from the unique
variances of the SAT and ACT to GPA estimated the validity
coefficients of the SAT and ACT for GPA with g removed from
the tests (for a similar approach involving inhibition con-
structs, see Friedman & Miyake, 2004).
The SEM models partitioned SAT and ACT effects into a g
loading, represented by a path from g to SAT or ACT; and a
unique variance loading, represented by a path from SAT or
ACT unique variance to the tests themselves (e.g., Fig. 1A and
B). This partitioning isolated the SAT and ACT g loadings from
their unique (non-g) variances, which were used to predict
GPA. Isolating the SAT and ACT unique variances permitted a
direct test of whether the SAT or ACT would predict GPA after
removing g—the central question of the study. Moreover,
isolating the SAT and ACT unique variances (from their g
loadings) avoided the problem of high multicollinearity
between g and the SAT or ACT (g loadings ≥ .90; Fig. 1A and
B). Such high multicollinearity would have distorted path
estimates from g to GPA if the paths from the SAT or ACT
unique variances to GPA had been omitted from the models
and the SAT or ACT had loaded directly on GPA.2
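This partitioning can be written down compactly in lavaan-style syntax. The sketch below uses the open-source semopy package for Python; it is an illustrative reconstruction under assumed column and latent names, not the authors' code. The SAT loads on g alongside the cognitive tests; a single-indicator latent (uSAT) absorbs the SAT's residual, is kept orthogonal to g, and predicts GPA together with g.

```python
# Minimal sketch of the Study 1 SAT model (Fig. 1A) in semopy.
# All names (columns, latents) are illustrative assumptions.
# uSAT is a single-indicator latent for the SAT's unique (non-g) variance:
# the SAT residual is fixed to zero so uSAT carries all of it, and uSAT is
# kept orthogonal to g, so its path to GPA is the g-free validity coefficient.
import pandas as pd
from semopy import Model

MODEL_DESC = """
g =~ SAT + Coding + DigitSpan + Information + PictureCompletion + Raven + Wonderlic
uSAT =~ SAT
SAT ~~ 0*SAT
uSAT ~~ 0*g
GPA =~ GPA1 + GPA2
GPA ~ g + uSAT
"""

df = pd.read_csv("study1.csv")   # hypothetical file of standardized scores
model = Model(MODEL_DESC)
model.fit(df)                    # maximum likelihood estimation
print(model.inspect())           # path estimates, standard errors, p values
```

The ACT model swaps ACT for SAT; in both, the path from the unique-variance latent to GPA is the test's validity coefficient with g removed.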
Following Kline (2005, pp. 135–142), SEM model fit was
evaluated with the χ² statistic, Comparative Fit Index (CFI), and
Root Mean Square Error of Approximation (RMSEA). Because
the χ² tests for perfect population fit of the specified model, it is
considered an overly stringent test of fit, especially for large
samples (N > 200). The CFI and RMSEA do not assume perfect
population fit. The CFI estimates the improvement in fit of the
specified model compared to the independence model, which
assumes zero population covariances among observed vari-
ables. CFI values range from 0 to 1, and values greater than .90
suggest reasonably good fit (Hu & Bentler, 1999). The RMSEA
estimates the error of approximation for the specified model
and independence model, adjusting for degrees of freedom.
RMSEA values less than or equal to .05 suggest close fit, values
between .05 and .08 suggest reasonable fit, and values greater
than .10 suggest poor fit (Browne & Cudeck, 1993).
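Both descriptive indices can be computed directly from the model and independence-model chi-squares; the formulas below are the standard ones. As a check, plugging the Study 2 SAT values from Table 3 (χ²=1940.297, df=88, N=1591) into the RMSEA formula reproduces the reported .115; the independence-model chi-square passed to the CFI call is a made-up placeholder, since the paper does not report it.

```python
# Standard CFI and RMSEA formulas (Hu & Bentler, 1999; Browne & Cudeck, 1993).
import math

def cfi(chi2_m, df_m, chi2_null, df_null):
    d_m = max(chi2_m - df_m, 0.0)
    d_null = max(chi2_null - df_null, d_m)
    return 1.0 - d_m / d_null

def rmsea(chi2_m, df_m, n):
    return math.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))

print(round(rmsea(1940.297, 88, 1591), 3))        # -> 0.115 (Table 3, Analysis 1)
print(round(cfi(1940.297, 88, 11400.0, 105), 3))  # null chi-square is hypothetical
```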
Because variables with different scaling properties can
distort model fit, all variables were standardized prior to SEM.
Missing data were handled using full information maximum
likelihood (FIML), which yields less biased estimates than
pairwise or listwise deletion (Enders & Bandalos, 2001).
Standardized coefficients are reported for all SEM analyses.
Approximate values for standardized coefficients and fit
statistics are represented using the ‘≈’ symbol, which
indicates the statistics differed ±.02 from their exact values
(for exact values, see Tables 1–3 and Figs. 1 and 2). Borrowing
from Cohen (1988), standardized coefficients around .10, .30,
and .50 were interpreted as weak, moderate, and strong,
respectively. Significant effects are reported at p<.05.
1.2. Results
1.2.1. Preliminary analyses
Table 1 reports correlations of SAT and ACT scores with the
cognitive tests and GPAs. (Correlations with missing data
were computed using pairwise deletion.) The mostly positive
correlations among the variables indicate positive manifold,
which suggests a common latent factor (g). SAT and ACT
scores were highly and significantly correlated with each
other (r=.83). SAT and ACT scores were also significantly
correlated (to varying degrees) with all cognitive variables
except Coding, which was not reliably related to SAT or ACT
2 Such distortions were observed in alternative SEM models for the
university sample and the NLSY97. These models were structurally identical
to the ones reported in this article (Figs. 1 and 2), with one exception: They
replaced the path from SAT or ACT unique variance to GPA with a direct path
from the SAT or ACT to GPA. The models indicated that the SAT and ACT were
significantly (p<.05) related to g (M coefficient=.84) and to GPA (M
coefficient=.49), with one exception. (The exception concerned the
university sample ACT model, in which the ACT was not significantly related
to GPA.) However, g was never reliably related to GPA (M coefficient=−.11,
ps ≥ .21). The latter finding is inconsistent with research indicating that g
is typically significantly and positively related to GPA (e.g., Coyle, 2006). The
non-significant relationships between g and GPA can be attributed to high
multicollinearity between g and the SAT and ACT. This multicollinearity
suppressed and, in most cases, inverted, relationships between g and GPA
after removing SAT or ACT variance from g. The SEM models used in this
article (Figs. 1 and 2) avoided the multicollinearity problem by separating
the SAT and ACT g loadings from their unique (non-g) variances, which were
used to predict GPA.
scores. The mean correlation (Mr) of SAT and ACT scores with
each cognitive variable was computed. SAT and ACT scores
had the strongest correlation with Wonderlic, a highly g
loaded test, and the weakest correlation with Coding, a
weakly g loaded test (Mr=.11, .26, .31, .26, .47, .22, .40, and .71
for Coding, Digit Span, GPA1, GPA2, Information, Picture
Completion, Raven, and Wonderlic, respectively). The means
and standard deviations of the SAT and ACT and cognitive
Table 2
Study 2: Correlations among, and means and standard deviations for, SAT, ACT, and cognitive variables (NLSY97)

Variable                       1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16        M      SD     N
1. SAT                         –                                                                             1023.23  199.11   980
2. ACT                       .87    –                                                                          21.53    4.80   898
3. GPA1                      .36  .33    –                                                                      2.99     .71  1591
4. GPA2                      .35  .29  .63    –                                                                 3.03     .67  1373
5. Arithmetic reasoning      .69  .67  .24  .23    –                                                             .00    1.00  1370
6. Assembling objects        .47  .54  .21  .18  .58    –                                                        .00    1.00  1360
7. Automobile information    .28  .29  .05  .07  .40  .26    –                                                   .00    1.00  1364
8. Coding speed              .30  .36  .15  .15  .42  .39  .18    –                                              .00    1.00  1350
9. Electronics information   .48  .47  .09  .12  .52  .42  .52  .25    –                                         .00    1.00  1365
10. General science          .66  .67  .17  .18  .65  .48  .49  .32  .67    –                                    .00    1.00  1371
11. Mathematics knowledge    .60  .59  .23  .25  .76  .53  .33  .50  .48  .62    –                               .00    1.00  1368
12. Mechanical comprehension .53  .52  .14  .18  .59  .55  .47  .27  .60  .65  .52    –                          .00    1.00  1362
13. Numerical operations     .33  .38  .17  .18  .48  .28  .18  .52  .22  .32  .58  .25    –                     .00    1.00  1350
14. Paragraph comprehension  .62  .62  .25  .26  .63  .51  .33  .37  .50  .63  .60  .51  .35    –                .00    1.00  1368
15. Shop information         .43  .45  .05  .09  .46  .39  .53  .15  .59  .60  .38  .63  .14  .37    –           .00    1.00  1363
16. Word knowledge           .65  .66  .19  .23  .59  .42  .41  .35  .59  .74  .60  .54  .35  .67  .50    –      .00    1.00  1370

Note. N=1591. All correlations are significant at p<.01 with three exceptions: automobile information with GPA1 (p=.10) and GPA2 (p=.03), and shop information with GPA1 (p=.08). GPAs are converted to a standard 4.0 scale. ASVAB scores are converted to z scores.
Table 3
Study 2: Model fit statistics and standardized path coefficients for the NLSY97

Analysis   Model        χ² (df)          CFI    RMSEA    g→GPA   Unique→GPA   g→SAT/ACT
1          SAT          1940.297 (88)    .836   .115     .315    .288         .784
2          ACT          1910.500 (88)    .837   .114     .311    .276         .755
Supplemental analyses
3          SAT-Unique    501.007 (66)    .962   .064     .311    .293         .795
4          ACT-Unique    470.374 (66)    .964   .062     .297    .302         .752
5          SAT-Group    1005.891 (85)    .919   .083     .309    .285         .790
6          ACT-Group     960.480 (85)    .922   .080     .304    .284         .770
7          SATM         1962.424 (88)    .832   .116     .317    .206         .727
8          SATV         2047.811 (88)    .826   .118     .313    .278         .734
9          ACTE         1968.078 (88)    .830   .116     .315    .256         .690
10         ACTM         1955.905 (88)    .832   .116     .307    .243         .699
11         ACTR         1942.197 (88)    .832   .115     .314    .203         .676

Note. N=1591. All χ²s and standardized coefficients are significant at p<.001.
Model: SAT = SAT model; ACT = ACT model; Unique = correlations among tests' unique variances taken into account; Group = g estimated as a second order factor with group factors determined by exploratory factor analysis.
Model fit statistics: CFI = comparative fit index; RMSEA = root mean square error of approximation.
Standardized coefficients: g→GPA = g to GPA; Unique→GPA = SAT or ACT unique variance to GPA; g→SAT/ACT = g loading of SAT or ACT.
variables did not signal problems with variance compression.
Such problems were not expected; data were obtained from a
university with high admission rates (99% for first-time, full-
time new undergraduates; UTSA Fact Book, n.d.), which
minimizes range restriction problems.3
1.2.2. SEM
The SEM models are depicted in Fig. 1A and B for the SAT
and ACT, respectively. The g factor was estimated as a first
order factor using the cognitive tests and the SAT or ACT. The
GPA factor was estimated using the two semester GPAs (GPA1
and GPA2). The unique variances (i.e., u's in Fig. 1A and B)
represented a test's unique variance after removing g. Three
paths measured relationships of interest. Paths from the
unique variance of the SAT or ACT to GPA measured the
predictive validity of the SAT or ACT for GPA with g removed
from the tests. Paths from g to SAT or ACT measured the g
loading of the SAT or ACT. Paths from g to GPA measured the
predictive validity of g for GPA. SEM was performed
separately for the SAT and ACT models.
Fig. 1A and B report standardized path coefficients for the
SAT and ACT models. Model fit was acceptable in the SAT
model (χ²=37.97, df=25, p=.05; CFI=.96, RMSEA=.06) and
the ACT model (χ²=32.51, df=25, p=.14; CFI=.97,
RMSEA=.04). Path coefficients (i.e., factor loadings) from g
to SAT or ACT were significant and very strong (coeffi-
cients≈.91). Path coefficients from g to the cognitive tests
were also significant (M coefficient=.41, range=.12 to .78),
with one exception. The exception was the path from g to
Coding, which was not significant in either model. Path
coefficients from g to GPA were significant and moderate in
magnitude (coefficients=.28 and .31 for SAT and ACT models,
respectively). Path coefficients from SAT or ACT unique
Fig. 2. SEM path coefficients for the SAT model (2A) and the ACT model (2B) for the NLSY97 (Study 2). All path coefficients are significant at p<.001.
3 SAT and ACT correlations with first and second semester college GPA
were corrected for restriction of range. The standard deviations used to
correct the correlations were obtained from Dorans (1999), whose large
sample (N=103,525) took both the SAT and ACT. The corrected SAT
correlations were higher than, but not significantly different from (Fisher's
r to z difference test, ps>.05), the uncorrected correlations for first semester
GPA (.46 and .37, z=.97) or second semester GPA (.38 and .29, z=.90). The
corrected ACT correlations were also higher than, but not significantly
different from, the uncorrected correlations for first semester GPA (.34 and
.25, z=.64) or second semester GPA (.30 and .22, z=.56). These data suggest
that restriction of range had no material effect on SAT or ACT correlations
with GPA. The overall pattern of results for the university sample was
replicated in the large, nationally representative NLSY97 (Study 2). The
NLSY97 showed no evidence of range restriction (SDs=199 and 4.8 for SAT
and ACT, respectively; see Table 3) compared to the large Dorans (1999)
sample (SDs=194 and 4.9 for SAT and ACT, respectively).
724
T.R. Coyle, D.R. Pillow / Intelligence 36 (2008) 719–729
Page 7
variance to GPA were significant in the SAT model only
(coefficients=.36 and .10 for SAT and ACT models, respectively).
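The corrections reported in footnote 3 can be reproduced with the standard univariate range-restriction formula (Thorndike's Case 2) and Fisher's r-to-z difference test. The paper does not name the formula it used, but Case 2 with the Table 1 and Dorans (1999) standard deviations recovers the reported corrected values; a sketch:

```python
# Range-restriction correction (Thorndike Case 2) and Fisher r-to-z test.
# With r = .37 (SAT with first semester GPA), restricted SD = 149.84
# (Table 1), and unrestricted SD = 194 (Dorans, 1999), the correction
# returns the reported .46.
import math

def correct_range_restriction(r, sd_unrestricted, sd_restricted):
    k = sd_unrestricted / sd_restricted
    return (r * k) / math.sqrt(1 - r**2 + (r * k) ** 2)

def fisher_z_diff(r1, r2, n1, n2):
    z1, z2 = math.atanh(r1), math.atanh(r2)
    return (z1 - z2) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))

r_corrected = correct_range_restriction(0.37, 194.0, 149.84)
print(round(r_corrected, 2))                   # -> 0.46
z = fisher_z_diff(r_corrected, 0.37, 161, 161)
print(round(z, 2))  # close to the reported z=.97; differences reflect rounding
```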
It was possible that unique variance from the SAT or any
cognitive test would reliably predict GPA. Such a finding
would call into question the discriminant validity of the
significant path from SAT unique variance to GPA. Addres-
sing this issue were supplemental SEM analyses, which
computed path coefficients to GPA from the unique variance
of each cognitive test (Raven, WAIS subtests, or Wonderlic).
These analyses used models identical in structure to the
previous SAT model (Fig. 1A) with one exception: Each
analysis removed the path to GPA from SAT unique variance
and added a path to GPA from the unique variance of each
cognitive test. (Because ACT unique variance did not reliably
predict GPA, parallel analyses were not performed with the
ACT model.) The analyses were performed separately for
each of the six cognitive tests, resulting in a total of six
analyses.
The results of the analyses indicated that the path
coefficients to GPA from the cognitive tests' unique variances
were lower in magnitude than the analogous path coefficient
from the SAT's unique variance (coefficients=.14, .03, −.11, .09,
.05, and −.28 for Coding, Digit Span, Information, Picture
Completion, Raven, and Wonderlic, respectively). The only
cognitive test with a significant path coefficient was the
Wonderlic, whose unique variance was negatively related to
GPA. The negative relationship could be attributed to the
strong relationship of the Wonderlic and g (coefficient=.78,
Fig. 1A) and the moderate relationship of g and GPA
(coefficient=.28, Fig. 1A). These relationships suppressed
and inverted the Wonderlic's weak and positive relationship
with GPA (r≈.13, Table 1) after g was removed from the
Wonderlic.
The results of Study 1 suggest that the SAT and ACT are
highly g loaded (loadings≈.91). The significant path from SAT
unique variance to GPA suggests that the predictive validity of
the SAT is partly attributed to non-g variance. The non-
significant path from ACT unique variance to GPA could be
attributed to the relatively small number of cases with ACT
scores (88 of 161; all cases had SAT scores), which produced a
high percentage of cases with missing ACT data (45%). Such a
high percentage of cases with missing data (in a relatively
small sample) could have biased parameter estimates and
yielded non-representative findings (e.g., Enders & Bandalos,
2001, Table 9, p. 451).4 Study 2 examines relationships among
g, GPA, and the SAT and ACT using the large, nationally
representative NLSY97, which received a different set of
cognitive tests.
2. Study 2: NLSY97
2.1. Methods
2.1.1. Participants
Participants were sampled from the NLSY97 (http://www.
bls.gov/nls/nlsy97.htm), which was obtained from the Inter-
university Consortium for Political and Social Research
(ICPSR; http://www.icpsr.org). The NLSY97 provides demo-
graphic, educational, and occupational data from a nationally
representative sample of youth in the United States. The full
sample consists of 8984 people (4599 males) age 12–16 years
in 1996 (M age=14.00 years, SD=1.40 years). The sample
includes 4665 whites (non-black or non-Hispanic), 2335
blacks, 1901 Hispanics, and 83 people of mixed race.
The following variables were selected from the NLSY97:
SAT and ACT scores, first and second semester college GPAs,
and cognitive test scores from the computer-adaptive Armed
Services Vocational Aptitude Battery (CAT-ASVAB). (As in
Study 1, SAT scores were recentered SAT scores and ACT scores
were enhanced ACT scores.) A total of 7127 participants had
scores on at least one CAT-ASVAB subtest, 3867 had first
semester college GPAs, 3142 had second semester college
GPAs, 1401 had SAT scores, and 1301 had ACT scores.
2.1.2. Variables
2.1.2.1. SAT and ACT. As in Study 1, SAT scores were the sum of
the SAT math and verbal subtests. ACT scores were the ACT
composite scores, which represent aggregate performance
across all ACT subtests.
2.1.2.2. CAT-ASVAB. The g factor was estimated using the CAT-
ASVAB (http://www.bls.gov/nls/quex/r6/y97r6cbkapp10.pdf).
The CAT-ASVAB consisted of 12 subtests: Arithmetic Reason-
ing (AR), which measured the ability to solve math word
problems; Assembling Objects (AO), which measured spatial
problem solving ability; Auto Information (AI), which mea-
sured knowledge of automotive maintenance and repair;
Coding Speed (CS), which measured the speed of matching
numbers with words from a key; Electronics Information (EI),
which measured knowledge of electricity and electronics;
General Science (GS), which measured knowledge of biologi-
cal and physical science; Mathematics Knowledge (MK),
which measured knowledge of high school math principles;
Mechanical Comprehension (MC), which measured knowl-
edge of mechanical devices and properties of materials;
Numerical Operations (NO), which measured the speed of
computing simple numerical operations; Paragraph Compre-
hension (PC), which measured the ability to understand
written material; Shop Information (SI), which measured
knowledge of wood and metal shop practices; and Word
Knowledge (WK), which measured the ability to identify
synonyms and antonyms.
Two CAT-ASVAB subtests (CS and NO) were speeded tests;
the others were power tests. The power subtests were
administered using an adaptive testing procedure, which
matched the difficulty of test items to the ability level of
participants. Because participants answered different sets of
items, correct responses could not be used to compare
participants. Instead, subtest scores based on item response
4 Such bias was apparent in a separate SEM analysis of cases with ACT
scores (n=88), which omitted cases with SAT scores only. (All 161
participants had SAT scores.) New ACT fit statistics and path coefficients
were estimated using a model identical in structure to the prior model (Fig.
1B). Model fit was excellent (χ²=23.03, df=25, p=.58; CFI=1.00,
RMSEA=.01). The path coefficient from ACT unique variance to GPA (.19)
was still non-significant but nearly doubled in magnitude compared to the
prior coefficient (.10, Fig. 1B). The path coefficients from g to GPA (.24) and
from g to ACT (.89) varied less in magnitude compared to the prior
coefficients (.31 and .92 for g to GPA and g to ACT, respectively, Fig. 1B).
theory were computed on a comparable scale for each
participant. Lower scores indicated poorer performance;
higher scores indicated better performance. The speeded
tests were administered in a non-adaptive format, where
participants answered test items in the same order. Scores on
the speeded subtests were based on the proportion of correct
responses, adjusting for guessing and screen presentation
time.
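For context, IRT scoring rests on an item-level model that links responses to a common ability scale, so that estimates are comparable even when examinees see different items. A common form (the three-parameter logistic; the paper does not specify the CAT-ASVAB's exact model) gives the probability that a person of ability $\theta$ answers item $i$ correctly as

$$P_i(\theta) = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}},$$

where $a_i$ is the item's discrimination, $b_i$ its difficulty, and $c_i$ its guessing floor; the ability estimate $\hat{\theta}$, not the number of correct responses, serves as the subtest score.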
2.1.2.3. GPA. GPAs were obtained from student self-reports, which
are typically highly correlated with GPAs from college records
(r=.90, Kuncel, Credé, & Thomas, 2005, Table 1, p. 73). GPAs
were the average number of grade points earned during the first
and second terms of college for students on a semester schedule
(i.e., two terms per academic year). GPA distributions for both
semesters deviated from normality (skewness > 1; kurtosis > 3).
Squaring the GPAs of individual subjects normalized the
distributions (skewness < 1; kurtosis < 1). Because SEM is
sensitive to distributional distortions, transformed (squared)
GPAs were used in all analyses.
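A minimal sketch of the normality check and transform described above (illustrative, not the authors' code; the file and column names are assumed, and scipy's kurtosis() is used on its default excess-kurtosis scale since the paper does not state its convention):

```python
# Flag non-normal GPA distributions, then apply the squaring transform.
import pandas as pd
from scipy.stats import kurtosis, skew

df = pd.read_csv("nlsy97.csv")     # hypothetical file of NLSY97 cases
gpa = df["GPA1"].dropna()
print(skew(gpa), kurtosis(gpa))    # before: deviated from normality
gpa_squared = gpa ** 2             # transform used in all Study 2 analyses
print(skew(gpa_squared), kurtosis(gpa_squared))  # after: closer to normal
```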
2.1.3. Statistics and case sampling
As in Study 1, SEM with maximum likelihood estimation
was used to determine if the SAT and ACT predicted college
GPA after removing g. All statistical analyses were performed
using only cases with SAT or ACT scores and with semester
GPAs (i.e., GPAs based on two terms per academic year)
(N=1591). Cases with semester GPAs were much more
common than cases with other GPAs (83%, 11%, 3%, and 3%
for semester, quarter, trimester, and other, respectively).
Using such cases ensured that GPAs (for each term) were
based on equivalent time units. All variables were standar-
dized prior to SEM, and missing data were handled using
FIML. Significant effects are reported at pb.05.
2.2. Results
2.2.1. Preliminary analyses
Table 2 reports correlations of SAT and ACT scores with
GPA and the CAT-ASVAB subtests. (Correlations with missing
data were computed using pairwise deletion.) The positive
correlations among the variables indicate positive manifold,
which suggests a common latent factor (g). SAT and ACT
scores were highly and significantly correlated with each
other (r=.87) and moderately to highly correlated with first
and second semester GPA (Mr=.33) and the ASVAB subtests
(Mr=.51). SAT and ACT scores correlated strongly (Mr=.64)
with ASVAB subtests of academic aptitudes (AR, GS, MK, PC,
WK), and moderately (Mr=.42) with ASVAB subtests of non-
academic (or less obviously academic) aptitudes (AO, AI, CS,
EI, MC, NO, SI). The mean difference between SAT and ACT
correlations with the ASVAB subtests and GPAs was trivial (M
difference=−.01).
2.2.2. SEM
The SEM models are depicted in Fig. 2A and B for the SAT
and ACT, respectively. The models are structurally equivalent
to the previous models (Fig. 1A and B), except that g was
estimated using the CAT-ASVAB subtests.
Fig. 2A and B report standardized path coefficients for the
SAT and ACT models. Model fit statistics and selected path
coefficients are reported in Table 3 (Analyses 1 and 2). Model
fit was poor in both models (CFIs=.84, RMSEAs≈.11). The poor
fits could be attributed to correlations among the ASVAB
subtests' unique variances, which are examined in the next
section (Supplemental Analyses of NLSY97). The pattern of path
coefficients was generally consistent with the pattern in
Study 1, except that ACT unique variance now significantly
predicted GPA. Path coefficients from SAT or ACT unique
variance to GPA were significant and moderate in magnitude
(coefficients≈.28),5 as were path coefficients from g to GPA
(coefficients≈.31).6 Path coefficients from g to SAT and ACT
were significant and strong (coefficients≈.77), as were path
coefficients from g to the ASVAB subtests (M coefficient=.68,
range=.46 to .85).7
Additional SEM analyses examined the discriminant
validity of the paths from SAT and ACT unique variances to
GPA vis-à-vis the paths from the ASVAB subtests' unique
variances to GPA. These analyses used models identical in
structure to the previous models (Fig. 2A and B) with one
exception: Each analysis removed the path to GPA from SAT or
ACT unique variance and added a path to GPA from each
ASVAB subtest's unique variance. The analyses were per-
formed separately for each ASVAB subtest using the SAT and
ACT SEM models, resulting in a total of 24 analyses (12 ASVAB
subtests×2 SEM models). The path coefficients to GPA from
each ASVAB subtest's unique variance were lower (M coef-
ficient=−.02, range=−.18 to .14) than the path coefficients to
GPA from SAT or ACT unique variance (Fig. 2A and B). The
5 Path coefficients from the SAT and ACT unique variances to GPA were
replicated using an alternative SEM model, which included both the SAT and
ACT. This model was structurally identical to the ones reported in this article
(Fig. 2A and B), with one exception: It included both the SAT and ACT and
their unique variances. The SAT and ACT unique variances were linked by a
double-sided arrow, which represented their covariance (covariance=.69).
Separate paths from the SAT and ACT unique variances were drawn to GPA.
These paths were used to examine whether the tests predicted GPA after
removing g. The path coefficients from the SAT and ACT unique variances to
GPA were small but significant (coefficients=.15 and .17 for the SAT and ACT,
respectively). These coefficients represented only direct effects. Using SEM
tracing rules, the total effects were .27 for both tests. (Total effects were
based on the SAT or ACT unique variance's direct effect to GPA plus the
product of the path linking both tests' unique variances and the path linking
the other test's unique variance to GPA.) The difference between the total
effects based on the alternative model and the analogous effects based on
the separate models in this article was trivial (difference±.02).
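In tracing-rule terms, each test's total effect on GPA is its direct path plus the route through the unique-variance covariance and the other test's direct path, using the coefficients reported in the footnote above:

$$\text{total}_{\text{SAT}} = .15 + (.69)(.17) \approx .27, \qquad \text{total}_{\text{ACT}} = .17 + (.69)(.15) \approx .27.$$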
6 It was possible that the predictive validity of g (for GPA) depended partly
on its association with the SAT or ACT, which were strongly related to g and
moderately related to GPA. Arguing against this possibility was an
alternative SEM model, which estimated g and its relation to GPA using
the ASVAB subtests only. This SEM model was structurally identical to the
original models (Fig. 2A and B), except that g was estimated without the SAT
or ACT. The results indicated that the new path coefficient from g to GPA
(coefficient=.31) was nearly identical to the prior path coefficients from g to
GPA (M coefficient≈.31, Fig. 2A and B).
7 A total of 220 (of 1591) cases were missing all ASVAB data, and 21
additional cases were missing some (but not all) ASVAB data. These cases
could have distorted FIML estimates of SEM parameters. Addressing this
possibility were two supplemental SEM models. These models were
structurally identical to the prior models (Fig. 2A and B), except that only
cases with no missing ASVAB data were analyzed. The results of these
analyses were almost identical to the prior analyses (Table 3, Analyses 1 and
2). Fit statistics (CFI and RMSEA) differed only fractionally (i.e., less than
±.02) from the prior analyses. Paths from SAT and ACT unique variance to
GPA were still moderate in magnitude (coefficients≈.28), as were the paths
from g to GPA (coefficients≈.31). The paths from g to SAT or ACT were still
very strong (coefficients≈.76).
Electronics Information subtest's unique variance had the
highest (absolute) magnitude path coefficient to GPA (coeffi-
cient=−.18) in both the SAT and ACT analyses.
The results of Study 2 confirm and extend those of Study 1
with a larger sample of participants and cognitive tests. The
significant and strong paths from g to SATor ACT suggest that
the SAT and ACT are strongly related to g. The significant and
moderate paths from SAT and ACT unique variance to GPA
suggest that the predictive validity of the SATand ACT for GPA
can be partly attributed to non-g variance. The next section
reports supplemental analyses of the NLSY97, which confirm
and extend these results.
2.2.3. Supplemental analyses of NLSY97
Relations among the unique variances of the ASVAB
subtests could have inflated error and reduced model fit
(Reddy, 1992). Supplemental analyses examined this possibi-
lity by correlating the unique variances of ASVAB subtests
with similar attributes. The correlations were identified in
two exploratory factor analyses, one with the SAT and ASVAB
subtests and another with the ACT and ASVAB subtests. These
analyses used maximum likelihood extraction and oblique
promax rotation. Both analyses yielded a face-valid concep-
tual structure with the same three factor solution. The factors
were labeled (tests with highest loadings on each factor):
Mechanical Aptitude (AI, EI, MC, SI), Math Aptitude (AO, AR,
CS, MK, NO), and Academic Aptitude (GS, PC, WK, SAT, ACT).
New SAT and ACT models were created by correlating the
unique variances of all tests that loaded highest on each
factor. The models were structurally identical to the original
models (Fig. 2A and B), except that they included correlations
among the unique variances.
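A sketch of this factor-analytic step with the factor_analyzer package (illustrative; the paper reports only the extraction method, rotation, and three-factor solution, so the package, column names, and factor labeling below are assumptions):

```python
# Exploratory factor analysis: maximum likelihood extraction, oblique promax
# rotation, three factors, over the 12 ASVAB subtests plus the SAT
# (a parallel run substitutes the ACT). Column names are assumed.
import pandas as pd
from factor_analyzer import FactorAnalyzer

tests = ["AR", "AO", "AI", "CS", "EI", "GS", "MK", "MC", "NO", "PC", "SI", "SAT"]
df = pd.read_csv("nlsy97.csv")   # hypothetical file of test scores
fa = FactorAnalyzer(n_factors=3, method="ml", rotation="promax")
fa.fit(df[tests].dropna())
loadings = pd.DataFrame(fa.loadings_, index=tests)  # factor labels assigned post hoc
# Assign each test to the factor on which it loads highest; the unique
# variances of tests sharing a factor are then allowed to correlate in SEM.
print(loadings.round(2))
```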
The new SAT and ACT models were used to estimate new
fit statistics and path coefficients (Table 3, Analyses 3 and 4).
Fit statistics were acceptable (CFIs=.96, RMSEAs=.06). The
acceptable fits suggest that correlations among the unique
variances of the tests contributed to the poor fits of the
original models (Table 3, Analyses 1 and 2), which ignored
these correlations. The mean difference between the path
coefficients of the new models (Table 3, Analyses 3 and 4) and
original models (Table 3, Analyses 1 and 2) was trivial (M
difference=.01).
It was also possible that the poor fits of the original models
could be attributed to modeling g as a first order factor rather
than a second order factor. Modeling g as a first order factor
produces a traditional “Spearman model” (Jensen, 1998, pp.
75–76), named after the discoverer of g. Such models ignore
“group” factors that estimate variance among tests with
common attributes. Modeling g as a second order factor
considers group factors, which can improve model fit (Jensen,
1998, pp. 78–81). More generally, modeling g in different
ways can be used to test the reliability of path coefficients,
which can change with the model of g.
Supplemental analyses modeled g as a second order factor,
which included group factors. The group factors were taken
from the previously described exploratory factor analyses,
which yielded three factors (Mechanical, Math, and Academic
Aptitude). (The SAT and ACT loaded highest on Academic
Aptitude.) These factors were used to create new SEM models
for the SAT and ACT. The new models were structurally similar
to the original models (Fig. 2A and B), except that g was
modeled as a second order factor. Paths were drawn from g to
each group factor, from each group factor to tests with the
highest loading on each factor, from SAT or ACT unique
variance to GPA, and from g to GPA.
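In the same lavaan-style syntax as the Study 1 sketch, the second-order specification looks roughly as follows (illustrative; subtest abbreviations as in the text, with the SAT's unique variance handled as in the first-order sketch):

```python
# Hierarchical (second-order) g model for the NLSY97 SAT analysis:
# g loads on the three group factors; the SAT's unique variance (uSAT)
# still carries its own path to GPA. Fit as before: Model(DESC).fit(df).
SECOND_ORDER_DESC = """
Mech =~ AI + EI + MC + SI
Math =~ AO + AR + CS + MK + NO
Acad =~ GS + PC + WK + SAT
g =~ Mech + Math + Acad
uSAT =~ SAT
SAT ~~ 0*SAT
uSAT ~~ 0*g
GPA =~ GPA1 + GPA2
GPA ~ g + uSAT
"""
```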
The new SAT and ACT models were used to estimate new
fit statistics and path coefficients with g modeled as a second
order factor (Table 3, Analyses 5 and 6). Fit was at the low end
of the acceptability range in both models (CFIs=.92,
RMSEAs=.08), but was better than in the original models
(Table 3, Analyses 1 and 2), which modeled g as a first order
factor. The g factor was almost perfectly correlated with
Academic Aptitude (coefficients≈1.00) and strongly corre-
lated with the other group factors (M coefficient=.84,
range=.83 to .85). The mean difference between the path
coefficients of the new models (Table 3, Analyses 5 and 6) and
original models (Table 3, Analyses 1 and 2) was trivial (M
difference=.01).
Finally, it was possible that the predictive validity of SAT or
ACT composite scores (for GPA) could be attributed to the SAT
or ACT subtest scores. Analyzing this possibility were supple-
mental analyses, which replaced SAT and ACT composite
scores with subtest scores. Subtest scores were available for
the SAT verbal (SATV) and math (SATM) subtests, and the ACT
English (ACTE), math (ACTM), and reading (ACTR) subtests.
(The ACT science subtest was not included in the ICPSR
NLSY97 dataset.) The SAT and ACT subtests were used to
create five new models, one for each available SAT and ACT
subtest. These new models were structurally identical to the
original models (Fig. 2A and B), except that composite scores
were replaced with subtest scores, which were used to predict
GPA after removing g.
Selected path coefficients from the SAT and ACT subtest
models are reported in Table 3 (Analyses 7–11). Consistent with
the original analyses (Table 3, Analyses 1 and 2), path
coefficients from the SAT and ACT subtests' unique variances
to GPA were significant and showed little variability in
magnitude (range=.20 to .28, Table 3, Analyses 7–11). Path
coefficients to GPA from SATV and ACTR unique variance were
highest and lowest (in magnitude), respectively. Path coeffi-
cients from g to GPA were significant and similar in magnitude
to the composite score coefficients (range=.31 to .32), as were
path coefficients from g to the SAT and ACT subtests (range=.68
to .73). The mean difference between SAT and ACT subtest score
coefficients and their respective composite score coefficients
was very small (M difference=.04).
The similarity of subtest and composite score findings
could be attributed to the high correlations among the two
sets of scores. SAT subtest scores were highly correlated with
SAT composite scores (r=.93 and .92 for SATM and SATV,
respectively), and ACT subtest scores were highly correlated
with ACT composite scores (r=.90, .86, and .90 for ACTE,
ACTM, and ACTR, respectively).
3. Discussion
This research examined whether the SAT and ACT would
predict college GPA after removing g from the tests. SAT and
ACT scores and college GPAs were obtained from a university
sample (N=161) and the NLSY97 (N=8984). SEM was used to
estimate relationships among g, GPA, and the SAT and ACT.
The unique variances of the SAT and ACT, obtained after
removing g, were used to predict GPA. Results from the two
samples converged: While the SAT and ACT were highly g
loaded, both tests generally predicted college GPA after
removing g. The former finding suggests that the SAT and
ACT are strongly related to g, which is strongly related to IQ
and intelligence tests (Frey & Detterman, 2004; Koenig et al.,
2008). The latter finding suggests that the SAT and ACT
predict GPA from non-g factors.
The strong g loading of the SAT and ACT confirms and
extends research by Detterman and colleagues (Frey &
Detterman, 2004; Koenig et al., 2008). Detterman and
colleagues correlated the SAT and ACT with the g factor,
which was extracted using principal factor analysis (PFA) of a
paper-and-pencil ASVAB with the NLSY79. The research
presented here also correlated the SAT and ACT with the g
factor, which was extracted using SEM of the CAT-ASVAB with
the NLSY97. Despite differences in factor extraction (PFA vs.
SEM), ASVAB administration (paper and pencil vs. computer
adaptive) and NLSY cohort (1979 vs. 1997), the results from
this research and Detterman's research were very similar. The
g factor was strongly related to the SAT (coefficients=.78 and
.82 for Fig. 2A and Frey & Detterman, 2004, respectively) and
the ACT (coefficients=.76 and .77 for Fig. 2B and Koenig et al.,
2008, respectively). Given the strong g loading of intelligence
tests, it could be argued that the SAT and ACT (which are also
strongly g loaded) are intelligence tests.
The generally significant path coefficients from SAT and
ACT to college GPA after removing g were not expected. The
SAT and ACT correlated moderately with college GPAs
(Mr=.31, Tables 1 and 2) and strongly with g (M coeffi-
cient=.84, Figs. 1 and 2). The g factor correlated moderately
with college GPA (M coefficient=.30, Figs.1 and 2). Given the
strong g loading of the SAT and ACT, and the moderate
correlation of g and GPA, one might predict that SAT and ACT
relationships with GPA, removing g, would be near floor. But
this was not the case: The SAT and ACT correlated moderately
and, with one exception, significantly, with GPA after
removing g (M coefficient=.26, Figs. 1 and 2). The exception
was the non-significant path coefficient from ACT to GPA,
removing g, in the university sample (coefficient=.10, Fig. 1B).
This non-significant coefficient could be attributed to the
small number of cases with ACT scores (88 of 161), which
could have biased estimates and yielded non-representative
results (see Footnote 4). Consistent with this possibility,
parallel analysis of thelargerand morerepresentative NLSY97
yielded a larger magnitude (and significant) path coefficient
from ACT to GPA after removing g (coefficient=.26, Fig. 2B).
This is the first time that the SAT and ACT have been shown
to predict college GPA after removing g from the tests. The effect
was replicated using (a) samples from a regional university and
the NLSY97, (b) GPAs from university records (university
sample) and student self-reports (NLSY97), and (c) g estimated
from commercial cognitive tests (university sample) and the
CAT-ASVAB (NLSY97). Path coefficients from other cognitive
tests (besides SAT or ACT) provided discriminant validity.
Whereas path coefficients from SAT or ACT to GPA, removing
g, were generally moderate in magnitude (M coefficient=.26,
Figs. 1 and 2), path coefficients from other cognitive tests (e.g.,
ASVAB or WAIS subtests) to GPA, removing g, were generally
small in magnitude (M coefficient=−.02). Moreover, path
coefficients from SAT or ACT to GPA, removing g, were similar
in magnitude to those from g to GPA (M coefficient=.30, Figs. 1
and 2). The latter finding indicates that non-g variance from the
SAT or ACT predicted GPA about as well as g. This finding is
remarkable in light of prior research indicating that non-g
factors usually have limited predictive validity after g is taken
into account (e.g., Schmidt & Hunter, 1998).
Supplemental analyses of the NLSY97 replicated the
pattern of path coefficients despite wide variability in
model fit. Model fit was poor with g estimated as a first
order factor (CFIs≈.84, Table 3, Analyses 1 and 2), barely
acceptable with g estimated as a second order factor
(CFIs≈.92, Table 3, Analyses 5 and 6), and acceptable with
correlations among the tests' unique variances accounted for
(CFIs≈.96, Table 3, Analyses 3 and 4). Despite this variability in
model fit, path coefficients from g (or non-g variance) to other
variables were very stable, as indicated by their small range
across different models (range for each set of coefficients ≤ .04,
Table 3, Analyses 1–6). Such stability is not surprising. The g
factors obtained in different models are typically completely
correlated despite differences in model fit (e.g., Johnson et al.,
2004), increasing the likelihood of replication with different
models of g.
It was possible that the predictive validity of the SAT or
ACT for college GPA, removing g, could be attributed to SAT
and ACT subtest scores (SAT verbal and math; ACT English,
math, and reading), which were used to create the composite
scores (SAT verbal plus math; ACT composite). Arguing
against this possibility were supplemental analyses of the
NLSY97, which examined the subtests' predictive validity for
GPA after removing g. The results of these analyses were very
similar to the results of the composite score analyses. In
particular, the subtests were significantly and (mostly)
moderately related to GPA after removing g (M coeffi-
cient=.24, Table 3, Analyses 7–11). The similarity of the
subtest and composite score results is not surprising. SAT and
ACT subtest scores are typically highly correlated with their
composite scores (Mr=.90, this research; Mr=.91, Dorans,
1999, Table A.4, p. 18), which increases the likelihood of
replication with either set of scores.
An important question is whether the findings obtained
here would be replicated if g were estimated using different
cognitive tests. The tests used to estimate g were fairly
numerous and diverse in both samples. The NLSY97 received
the CAT-ASVAB, which included 12 diverse cognitive tests
(e.g., general science, mechanical knowledge, coding speed).
The university sample received a battery of commercial tests,
which included seven diverse scales (e.g., WAIS and Raven).
Because diverse batteries of cognitive tests are generally
strongly g loaded, and because g factors derived from such
batteries are typically highly or completely correlated
(r=1.00, Johnson et al., 2004; see also, Johnson et al., 2008),
the prospect of replication with different sets of tests seems
high.
Further research is needed to identify the sources of non-g
variance that contribute to the predictive validity of the SAT
and ACT for college GPA. Such sources may include (a)
domain-specific aptitudes such as ability tilt (i.e., perfor-
mance differences on math and verbal items), which predicts
later achievements but not general mental ability (Park,
Lubinski, & Benbow, 2007); (b) proxies of academic motivation
such as preparing for high-stakes tests or studying for college
courses, which predicts college grades (e.g., Stinebrickner &
Stinebrickner, 2007); and (c) personality traits such as
conscientiousness and openness, which also predict college
grades (O'Connor & Paunonen, 2007). Because some of these
variables, notably personality traits, may be g loaded (e.g.,
Moutafi, Furnham, & Crump, 2006), g must be removed from
these variables before examining their predictive validity as
non-g factors.
The research presented here examined the recentered SAT,
which includes math and verbal subtests. The latest version of
the SAT includes three subtests: critical reading, which is
similar to the verbal subtest of the recentered SAT (sans
analogies); math, which is similar to the math subtest of the
recentered SAT; and writing, a new subtest that requires
writing essays and answering questions about grammar and
composition. (The ACT has not recently been revised.) An
important question is whether the results obtained here will
be replicated with the latest version of the SAT. Given the very
high correlations between the latest and prior versions of the
SAT (rs=.95 to .97; Kobrin & Schmidt, 2005), the prospect of
replication with the latest version of the SAT seems high.
References
Bridgeman, B., McCamley-Jenkins, L., & Ervin, N. (2000). Predictions of
freshman grade-point average from the revised and recentered SAT I:
reasoning test (College Board Report No. 2000-1). New York: College
Entrance Examination Board.
Brodnick, R. J., & Ree, M. J. (1995). A structural model of academic
performance, socioeconomic status, and Spearman's g. Educational and
Psychological Measurement, 55, 583−594.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A.
Bollen, & J. S. Long (Eds.), Testing structural equation models (pp. 136−162).
Newbury Park, CA: Sage.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences, (2nd ed.)
Mahwah, NJ: Erlbaum.
Coyle, T. R. (2006). Test–retest changes on scholastic aptitude tests are not
related to g. Intelligence, 34, 15−27.
Dorans, N. J. (1999). Correspondences between ACT and SAT I scores (College
Board Report No. 99-1). New York: College Entrance Examination Board.
Enders, C. K., & Bandalos, D. L. (2001). The relative performance of full
information maximum likelihood estimation for missing data in
structural equation models. Structural Equation Modeling, 8, 430−457.
Frey, M. C., & Detterman, D. K. (2004). Scholastic assessment or g? The
relationship between the Scholastic Assessment Test and general
cognitive ability. Psychological Science, 15, 373−378.
Friedman, N. P., & Miyake, A. (2004). The relations among inhibition and
interference control functions: A latent-variable analysis. Journal of
Experimental Psychology: General, 133, 101−135.
Hu, L. -T., & Bentler, P. M. (1999). Cutoff criteria for fit indices in covariance
structure analysis: Conventional criteria versus new alternatives. Struc-
tural Equation Modeling, 6, 1−55.
Jensen, A. R. (1998). The g factor: the science of mental ability. Westport, CT:
Praeger.
Johnson, W., Bouchard, T. J., Krueger, R. F., McGue, M., & Gottesman, I. I.
(2004). Just one g: Consistent results from three test batteries. Intelli-
gence, 32, 95−107.
Johnson, W., te Nijenhuis, J., & Bouchard, T. J., Jr. (2008). Still just 1 g:
Consistent results from five test batteries. Intelligence, 36, 81−95.
Kline, R. B. (2005). Principles and practice of structural equation modeling, (2nd
ed.) New York: Guilford.
Kobrin, J. L., & Schmidt, A. E. (2005). The research behind the new SAT (College
Board Research Summary RS-11). New York: College Entrance Examina-
tion Board.
Koenig, K. A., Frey, M. C., & Detterman, D. K. (2008). ACT and general cognitive
ability. Intelligence, 36, 153−160.
Kuncel, N. R., Credé, M., & Thomas, L. L. (2005). The validity of self-reported
grade point averages, class ranks, and test scores: A meta-analysis and
review of the literature. Review of Educational Research, 75, 63−82.
Marshalek, B., Lohman, D. F., & Snow, R. E. (1983). The complexity continuum
in the radex and hierarchical models of intelligence. Intelligence, 7,
107−127.
Moutafi, J., Furnham, A., & Crump, J. (2006). What facets of openness and
conscientiousness predict fluid intelligence score? Learning and Indivi-
dual Differences, 16, 31−42.
Noble, J. (2000). Effects of differential prediction in college admissions for
traditional and nontraditional-aged students (ACT Research Report Series
2000-9). Iowa City, IA: ACT.
O'Connor, M. C., & Paunonen, S. V. (2007). Big Five personality predictors of
post-secondary academic performance. Personality and Individual Differ-
ences, 43, 971−990.
Park, G., Lubinski, D., & Benbow, C. P. (2007). Contrasting intellectual patterns
predict creativity in the arts and sciences: Tracking intellectually
precocious youth over 25 years. Psychological Science, 18, 948−952.
Reddy, S. K. (1992). Effects of ignoring correlated measurement errors in
structural equation models. Educational and Psychological Measurement, 52,
549−570.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection
methods in personnel psychology: Practical and theoretical implications
of 85 years of research findings. Psychological Bulletin, 124, 262−274.
Stinebrickner, T. R., & Stinebrickner, R. (2007). The causal effect of studying on
academic performance (NBER Working Paper No. 13341). Cambridge, MA:
National Bureau of Economic Research.
UTSA Fact Book (n.d.). Retrieved August 22, 2007, from the University of Texas
at San Antonio, Office of Institutional Research Web site: http://www.
utsa.edu/ir/Factbook/Fact_Book_2006.pdf
Wechsler, D. (1997). WAIS-III administration and scoring manual. San Antonio,
TX: The Psychological Corporation.
Wonderlic (1999). Wonderlic personnel test and scholastic level exam: user's
manual. Libertyville, IL: Wonderlic, Inc.