MANKIND QUARTERLY 2020 60:4 525-538
Pupil Size and Intelligence: A Large-Scale Replication
Emil O. W. Kirkegaard*
Ulster Institute for Social Research, London, UK
Helmuth Nyborg
Aarhus University (1968-2007), Denmark
*Corresponding author. Email:
A recent study by Tsukahara et al. (2016) found correlations
between pupil size and measures of intelligence, with r values
around .30. We attempted to replicate this association in a large
dataset of US military personnel (n = 4,462). General intelligence, g,
was extracted from 19 diverse tests. We first confirmed that right and
left eye pupil size measures are strongly correlated (r = .97),
suggesting high measurement reliability for this phenotype.
However, unlike Tsukahara et al., we could establish only small to
nonexistent associations between cognitive ability and pupil size (r’s
from -.01 to .06; r with g specifically = .05). Regression analyses,
controlling for multiple covariates, revealed that the association in
this large representative sample was entirely attributable to
confounding with race/ethnicity. Mean pupil size (mm) was 3.56,
3.35, and 3.23 for whites, Hispanics, and blacks, respectively.
Relative to whites, this corresponds to effect sizes of 0.22 and 0.34
d. It is unclear why our results differ from those reported by
Tsukahara et al. (2016), but the ethnic size rank order suggests an
evolutionary explanation in terms of geo-bio-climatic selection for
pupil size.
Key Words: Intelligence, Cognitive ability, Pupil size, Pupil
diameter, Evolution, Replication
Physical correlates of intelligence have long been of interest to researchers.
Galton observed a positive relationship between intelligence and height as early
as the 19th century (Galton, 1869). More than a century later, Jensen and
Sinha (1993), in a seminal book chapter, presented a 100-page review of links
between intelligence and a wide variety of physical traits. Of main interest to
researchers have been the relationships of intelligence to height and to measures
of brain size and its proxies (Rushton & Ankney, 2009). Less obvious
correlates have also been reported (e.g., blood groups, compatibility of blood
groups between mother and child or between twins, and serum uric acid/gout).
Recently, a study by Tsukahara et al. (2016) attracted considerable attention
(46 citations at the time of writing) by reporting fairly large correlations between
pupil size and intelligence, or rather, selected aspects of intelligence. In the first
sample in their study, they compared 20 subjects with high working memory
capacity (WMC), as measured by 3 tests, to 20 subjects with low WMC. They
measured the pupil size at baseline and found a 1.10 d (0.97 mm) difference in
size. In the second sample, they recruited 114 subjects (roughly half each with
high and low WMC) and measured their pupil sizes three times, weeks apart. At
time 1 (first measurement), the high WMC group had 0.62 mm (0.57 d) larger
pupils at baseline than the low WMC group. Measurement reliability or stability
was fairly high: the correlations between measurements were approximately .80.
In the third sample, they examined 337 subjects with 6 mental tests as well as
pupil size, and split their cognitive tests into two measures, WMC and fluid ability.
Both correlated with pupil size at baseline (r = .24 and .35, respectively).
Controlling for various demographic characteristics (ethnicity, age, and sex) did
not eliminate these relationships.
This potentially interesting relationship between pupil size and intelligence
appeared fairly robust across the three samples, but the sample sizes were small
and the cognitive testing was limited in scope. Accordingly, we considered the
findings to deserve a large-scale replication. We therefore analyzed the
relationship between pupil size and intelligence in a much larger dataset with an
extensive battery of very diverse cognitive tests.
Subjects, Methods, and Tests
Archival data were taken from the Vietnam Experience Study (VES). The VES is a large
US military study in which a large sample of men were examined at enlistment
between 1965 and 1971 and followed up with an intensive physical,
psychological, psychiatric, and socio-economic examination between 1985 and
1986 (Centers for Disease Control Vietnam Experience Study, 1988a,b,c).
Though the dataset is legally in the public domain (not protected by copyright or
confidentiality agreements), there is no public repository for it. However, various
intelligence researchers have obtained copies of it and used it to test numerous
hypotheses, primarily in the area of cognitive epidemiology (relationships
between health and intelligence; Batty et al., 2008; Gale et al., 2008; Phillips et
al., 2009), but other uses include examining Spearman’s hypothesis (Nyborg &
Jensen, 2000), the longitudinal stability of intelligence (Larsen et al., 2008), and
the predictive validity of intelligence for income and education in different
racial/ethnic groups (Nyborg & Jensen, 2001).
Race/ethnicity in the VES breaks down as follows: 3,654 whites, 525 blacks,
200 Hispanics, 49 Native Americans, and 34 Asians. Mean age at the follow-up
examination was 38 (SD 2.5). Due to the modest sample size of some of these
groups, we concentrated on analyzing only data for the white, black and Hispanic
subsamples. Subjects were given a total of 19 cognitive tests:
1. Grooved Pegboard Test (GPT), right hand: A measure of manual dexterity
and fine motor speed (Ruff & Parker, 1993). The speed score is the reciprocal
of the number of seconds taken to place a set of pegs in a grooved hole as
quickly as possible.
2. GPT, left hand.
3. Paced Auditory Serial Addition Test (PASAT): A measure of mental control,
speed, and computational and attentional abilities (Tombaugh, 2006). The
subject mentally adds a sequence of numbers in rapid succession. Score is
the total number of correct responses.
4. Rey-Osterrieth Complex Figure Drawing (CFD): A measure of visuospatial
ability and memory (Shin et al., 2006). The direct copy score (CFDD) is given
from a subject reproducing a complex spatial figure while the figure is in full
view.
5. CFD, copy from immediate recall: The immediate recall score (CFDI) is given
from a subject reproducing a complex spatial figure immediately after being
shown it.
6. CFD, copy from delayed recall: The delayed recall score (CFDL) is given
from a subject being exposed to a complex spatial figure and, after 20
minutes of other activities, drawing it.
7. Wechsler Adult Intelligence Scale-Revised (WAIS-R), general information: A
test of general knowledge (Leckliter et al., 1986).
8. WAIS-R, block design: A test of spatial ability.
9. Word List Generation Test (WLGT): A measure of verbal fluency. The
subject generates as many words as possible which begin with the letters F,
A, and S for 60 seconds. The score is the total number of words generated.
10. Wisconsin Card Sort Test (WCST): A measure of executive function (Greve
et al., 2005). The score is the ratio of correct responses to countable
11. Wide Range Achievement Test (WRAT): Measures ability to read aloud a
list of single words (untimed) (Witt, 1986).
12. California Verbal Learning Test (CVLT): A measure of verbal learning and
memory (Elwood, 1995). The subject recalls a list of 16 words over 5
repeated learning trials. The score is the total correct over 5 trials.
13. Army Classification Battery (ACB): A verbal test administered at induction
(VE time 1) (Bayroff & Fuchs, 1970).
14. ACB verbal: Administered at the follow-up interview (VE time 2).
15. ACB arithmetic reasoning test: An arithmetic test administered at induction
(AR time 1).
16. ACB arithmetic: Administered at the follow-up interview (AR time 2).
17. Pattern Analysis Test (PAT): A measure of pattern recognition administered
at induction.
18. General Information Test (GIT): A test of general knowledge administered
at induction.
19. Armed Forces Qualification Test (AFQT): A general aptitude battery. This
measure is the total score on four subtests (word knowledge, paragraph
comprehension, arithmetic reasoning, mathematics knowledge)
administered at induction.
Five of the tests (13, 15, 17-19) were given at induction and the remainder at
the follow-up interview.
Pupil size of both eyes was measured as part of further testing for visual
performance and problems during the full examination in 1985-1986. Trained
personnel used the semiautomatic Optec 2000 stereoscopic instrument, in which
all tests are concealed in a closed housing (i.e. no windows to outside or other
rooms), allowing only the tester and the subject being tested to view the targets.
This was a controlled environment in which lighting was the same for all subjects,
thus avoiding confounding by light levels, which would otherwise affect pupil
size. The illumination inside the apparatus was activated only when the subject
maintained steady forehead pressure during testing. The appendix contains an
excerpt of the manual provided by the CDC about the machine and the visual
testing (Centers for Disease Control, 1989a, p. 331ff). The appendix also provides
a photo of the machine. Finally, during data collection, a small subset of subjects
was remeasured on many variables by different observers, and the resulting scores
were tested for inter-observer variability. No such variability was found for the
vast majority of variables examined, including pupil size (Centers for Disease
Control, 1989b, Table 20).
The correlation between the pupil sizes of the two eyes may be taken as an
estimate of measurement reliability, though it includes method variance related
to the context and personnel. We found a nearly perfect correlation of .97 between
the two eyes. We then took the average pupil size as the single best measure. It
should be noted that pupil size was measured in whole millimeters, so the data
were only quasi-continuous. Figure 1 shows a histogram of pupil size.
Figure 1. Histogram of pupil size (mm) by eye.
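If the two eyes are treated as parallel measurements of the same quantity (an assumption made here for illustration; the paper does not report this calculation), the Spearman-Brown formula gives a rough estimate of the reliability of their average. A minimal Python sketch, with a function name chosen for this example:

```python
def spearman_brown(r: float, k: int = 2) -> float:
    """Reliability of the mean of k parallel measurements
    whose pairwise correlation is r (Spearman-Brown formula)."""
    return k * r / (1 + (k - 1) * r)

# Between-eye correlation reported in the text.
r_between_eyes = 0.97
reliability_of_mean = spearman_brown(r_between_eyes, k=2)
print(f"Estimated reliability of the two-eye mean: {reliability_of_mean:.3f}")
```

Because both eyes were measured in the same session by the same personnel, this figure is best read as an upper bound: shared method variance inflates the between-eye correlation.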
As in prior studies (Nyborg & Jensen, 2000, 2001), we extracted a g factor
from all the tests and scored individuals on it. The subtest g loadings spanned .33
to .85, and the g factor accounted for 42% of the total variance. We similarly
computed g scores for the early and later test sessions. The test battery has been
found to be free of racial/ethnic bias with respect to the white, Hispanic, and black
samples (Lasker et al., in prep.). Table 1 provides the correlations among the
cognitive tests, g, and pupil size.
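To illustrate what "g loadings" and "share of total variance" mean, the sketch below extracts the first principal component of a correlation matrix by power iteration. This is a stand-in chosen for simplicity, not the extraction method used in the paper, and principal component loadings tend to run somewhat higher than factor-analytic ones. The 4 x 4 correlation matrix is hypothetical and is not taken from the VES data.

```python
import math

def first_principal_component(corr, iterations=500):
    """Return (loadings, share_of_variance) for the first principal
    component of a correlation matrix, found by power iteration."""
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p  # unit-length starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        eigenvalue = math.sqrt(sum(x * x for x in new))  # norm of corr @ vec
        vec = [x / eigenvalue for x in new]
    loadings = [math.sqrt(eigenvalue) * x for x in vec]
    return loadings, eigenvalue / p

# Hypothetical correlations among four tests (illustration only).
toy_corr = [
    [1.00, 0.60, 0.50, 0.30],
    [0.60, 1.00, 0.45, 0.25],
    [0.50, 0.45, 1.00, 0.20],
    [0.30, 0.25, 0.20, 1.00],
]
toy_loadings, toy_share = first_principal_component(toy_corr)
print([round(x, 2) for x in toy_loadings], round(toy_share, 2))
```

The uniform starting vector works here because a matrix of all-positive correlations has a leading eigenvector with all-positive entries; a general-purpose routine would use a library eigen-decomposition instead.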
Table 1. Correlation matrix for cognitive tests, g, and pupil size. Time 1 =
enlistment (1965-1971), time 2 = follow-up interview (1985-1986).
[Table body not recoverable; surviving variable labels: Pupil size right, Pupil
size left, Pupil size mean, VE time1, AR time1, VE time2, AR time2, Copy direct,
Copy immediate, Copy delayed, GPT left, GPT right, g time1, g time2.]
The overall pattern of correlations in Table 1 indicates that pupil size is
positively but very weakly related to cognitive ability, no matter how it is estimated.
The standard error is approximately 0.015, so any correlation with an absolute
value above 0.03 has p<.05. We also fitted regression models to see whether
confounding from multiple diverse variables was an issue. Table 2 presents the
results.
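The standard error quoted above can be checked with the usual large-sample approximation for a correlation near zero, SE = 1/sqrt(n - 3) on the Fisher z scale. The sketch below assumes the full sample of n = 4,462; the function names are introduced here for illustration, and the exact n behind each correlation in Table 1 may be slightly smaller.

```python
import math

def correlation_standard_error(n: int) -> float:
    """Approximate standard error of a Pearson correlation near zero
    (Fisher z scale)."""
    return 1.0 / math.sqrt(n - 3)

def critical_r(n: int, z_crit: float = 1.96) -> float:
    """Smallest |r| that reaches two-sided p < .05 under the
    Fisher z approximation."""
    return math.tanh(z_crit * correlation_standard_error(n))

n = 4462  # full sample size reported in the abstract
se = correlation_standard_error(n)
threshold = critical_r(n)
print(f"SE = {se:.4f}, critical |r| = {threshold:.4f}")
```

With samples this large, even correlations of a few hundredths are statistically detectable, which is why the text stresses the size of the associations rather than their significance.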
Table 2. Regression model results. Dependent variable = pupil size mean
(across eyes, in mm). Standard errors in parentheses. * p<.01; ** p<.005; ***
p<.001. Betas are not standardized. Nonlinear effects modeled with a spline.
[Table body not recoverable; surviving labels: race (reference category = 0),
past year, per month, per day, pupil hour, nonlinear (repeated four times),
R2 adj.]
Our regressions revealed that the association between pupil size and
intelligence was entirely due to confounding with race/ethnicity. There was no
association within groups, whether or not covariates were included. Figure 2
shows the scatterplots of pupil size by race/ethnicity.
Table 3 provides means and standard deviations of key variables by
race/ethnicity. The standardized effect sizes (d) for pupil size are 0.22 and 0.34
for Hispanics and blacks, respectively, compared with whites.
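The confounding argument can be made concrete with a small calculation: if pupil size and g are uncorrelated within every group, the group mean differences alone still produce a positive correlation in the pooled sample. The sketch below uses the group sizes from the Subjects section and the means and standard deviations from Table 3. It assumes that the Table 3 columns are ordered whites, Hispanics, blacks and that the unlabeled last row is g; the function name is chosen for this example.

```python
import math

# (label, n, pupil mean, pupil SD, g mean, g SD)
# Assumption: Table 3 columns are ordered whites, Hispanics, blacks.
GROUPS = [
    ("white", 3654, 3.56, 0.96, 0.00, 1.00),
    ("Hispanic", 200, 3.35, 0.85, -0.78, 0.89),
    ("black", 525, 3.23, 0.96, -1.24, 0.87),
]

def implied_pooled_r(groups):
    """Pooled correlation produced by group mean differences alone,
    assuming the correlation within every group is exactly zero."""
    total_n = sum(g[1] for g in groups)
    mean_p = sum(n * mp for _, n, mp, _, _, _ in groups) / total_n
    mean_g = sum(n * mg for _, n, _, _, mg, _ in groups) / total_n
    cov = var_p = var_g = 0.0
    for _, n, mp, sp, mg, sg in groups:
        w = n / total_n
        cov += w * (mp - mean_p) * (mg - mean_g)     # between-group part only
        var_p += w * (sp ** 2 + (mp - mean_p) ** 2)  # within + between variance
        var_g += w * (sg ** 2 + (mg - mean_g) ** 2)
    return cov / math.sqrt(var_p * var_g)

r_implied = implied_pooled_r(GROUPS)
print(f"Pooled r implied by group means alone: {r_implied:.3f}")
```

Under these assumptions the implied pooled correlation comes out close to .05, in line with the reported overall correlation between pupil size and g.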
Figure 2. Scatterplots of relations between pupil size (average across eyes) and
g by race/ethnicity. There are no statistically significant relationships in the plots.
Table 3. Mean ± standard deviation of pupil size (mm) and g by race.
                    Whites         Hispanics      Blacks
Pupil size left     3.56 ± 0.97    3.34 ± 0.84    3.23 ± 0.98
Pupil size right    3.56 ± 0.96    3.36 ± 0.87    3.23 ± 0.95
Pupil size mean     3.56 ± 0.96    3.35 ± 0.85    3.23 ± 0.96
g                   0.00 ± 1.00    -0.78 ± 0.89   -1.24 ± 0.87
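The effect sizes quoted in the text can be reproduced from the mean pupil sizes in Table 3. The sketch below uses Cohen's d with a pooled standard deviation; the paper does not state which standardizer was used, so the pooled SD is an assumption, as is the column order (whites, Hispanics, blacks).

```python
import math

def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (a minus b) using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)

# Mean pupil size (mm), SD, and n for each group.
d_hispanic = cohens_d(3.56, 0.96, 3654, 3.35, 0.85, 200)
d_black = cohens_d(3.56, 0.96, 3654, 3.23, 0.96, 525)
print(f"d (whites vs Hispanics) = {d_hispanic:.2f}")
print(f"d (whites vs blacks)    = {d_black:.2f}")
```

Rounded to two decimals, these values match the 0.22 and 0.34 reported in the text.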
A recent study by Tsukahara et al. (2016) reported that baseline pupil size is
relatively strongly related to intelligence, correlating about .30. We carried out
a replication study of this potentially interesting observation in a large sample, but
were unable to replicate this relationship. In fact, we found correlations hovering
around zero, with small standard errors. We see no obvious reasons for the
discrepancy.
One might suspect that it is due to the relatively crude measure of pupil size
in the VES compared with modern equipment, but the high correlation
across eyes (.97) speaks against this interpretation. Moreover, other analyses of
the same data did not yield null results (Silva et al., 2012; Vanderploeg et al.,
2005, 2007), nor did studies by others using the same equipment (e.g. Liou &
Chiu, 2001).
The rounding of the pupil size to whole millimeters in our study does not explain
the lack of relationship, as this discretization of the data would be expected to
reduce the correlation only slightly, as can be demonstrated by the Interactive
Statistics Simulator (
discretization). The control for race/ethnicity does not explain the discrepancy,
because Tsukahara et al. also conducted a regression analysis controlling for
race/ethnicity and still found a significant association between pupil size and
measures of intelligence. They found that race/ethnicity was related to pupil size,
but reported only ANOVA results, so the direction of the effect cannot be determined.
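The claim that rounding to whole millimeters attenuates a correlation only slightly can be checked with a short simulation. The sketch below is not the simulator referred to above. It assumes bivariate normal data with a true correlation of .30, a pupil size mean of 3.5 mm, and an SD of 0.96 mm (close to the values in Table 3), and compares the result with the attenuation expected when rounding adds roughly 1/12 mm² of error variance. All function names and parameter values are chosen for this example.

```python
import math
import random

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_rounding(true_r=0.30, mean=3.5, sd=0.96, n=100_000, seed=1):
    """Correlate ability with pupil size before and after rounding to whole mm."""
    rng = random.Random(seed)
    pupil, ability = [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = true_r * z1 + math.sqrt(1 - true_r ** 2) * rng.gauss(0, 1)
        pupil.append(mean + sd * z1)
        ability.append(z2)
    rounded = [float(round(p)) for p in pupil]  # measurement in whole mm
    return pearson_r(pupil, ability), pearson_r(rounded, ability)

r_raw, r_rounded = simulate_rounding()
# Expected attenuation if rounding error is uniform within each 1 mm bin.
expected_ratio = math.sqrt(0.96 ** 2 / (0.96 ** 2 + 1 / 12))
print(f"raw r = {r_raw:.3f}, rounded r = {r_rounded:.3f}, "
      f"expected ratio = {expected_ratio:.3f}")
```

Under these assumptions the expected attenuation factor is about 0.96, so a true correlation of .30 would shrink to roughly .29, far too little to account for correlations near zero.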
We observed a negative association with age across all models (beta = -0.03
in the full model and in the white subsample). This was also observed by
Tsukahara et al., though their slope was about twice as strong (-0.07). Because
we compared the unstandardized slope of age on pupil size (in mm), the difference
in age distribution between the samples should not matter, so the source of
the difference remains unclear. We included current smoking status as a covariate
because Tsukahara et al. found a strong correlation (-.21) between pupil size and
nicotine use, but we could not establish any relationship with this variable either.
In conclusion, we used large-scale measurements of pupil size and related
the outcome to a large battery of diverse cognitive tests, but were still unable to
establish any meaningful relationship between intelligence and pupil size,
whether or not covariates were adjusted for. We did replicate negative
associations with age.
Interestingly, pupil size and general intelligence, g, showed the same rank order
across race/ethnicity: whites > Hispanics > blacks. This particular
order invites an evolutionary explanation. Populations which
migrated farther from the equator would spend relatively more of their waking time
in darkness or dim light than populations living closer to the
equator, and so would gradually be selected for optimized light intake (see also
Christopher et al., 2013; Pearce & Dunbar, 2012). A test of this climatic
hypothesis is the subject of a forthcoming paper.
We would like to thank the Centers for Disease Control, USA, for collecting
and releasing the Vietnam Experience Study dataset, which continues to offer
much insight for science. Materials and full statistical output from the study can
be found at
Batty, G.D., Shipley, M.J., Mortensen, L.H., Boyle, S.H., Barefoot, J., Grønbæk, M.,
Gale, C.R. & Deary, I.J. (2008). IQ in late adolescence/early adulthood, risk factors in
middle age and later all-cause mortality in men: The Vietnam Experience Study. Journal
of Epidemiology & Community Health 62: 522-531.
Bayroff, A.G. & Fuchs, E.F. (1970). The Armed Services Vocational Aptitude Battery.
U.S. Army Behavior and Systems Research Laboratory.
Centers for Disease Control Vietnam Experience Study (1988a). Health status of
Vietnam veterans I: Psychosocial characteristics. Journal of the American Medical
Association 259: 2701-2707.
Centers for Disease Control Vietnam Experience Study (1988b). Health status of
Vietnam veterans II: Physical health. Journal of the American Medical Association 259:
Centers for Disease Control Vietnam Experience Study (1988c). Health Status of
Vietnam Veterans III: Reproductive outcomes and child health. Journal of the American
Medical Association 259: 2715-2719.
Centers for Disease Control (1989a). Health status of Vietnam veterans, supplement C:
Medical and psychological procedure manuals and forms.
Centers for Disease Control (1989b). Health status of Vietnam veterans, supplement B:
Medical and psychological data quality.
Christopher, M., Scheetz, T.E., Mullins, R.F. & Abràmoff, M.D. (2013). Selection of
phototransduction genes in Homo sapiens. Investigative Ophthalmology & Visual
Science 54.
Elwood, R.W. (1995). The California Verbal Learning Test: Psychometric characteristics
and clinical application. Neuropsychology Review 5(3): 173-201.
Gale, C.R., Deary, I.J., Boyle, S.H., Barefoot, J., Mortensen, L.H. & Batty, G.D. (2008).
Cognitive ability in early adulthood and risk of 5 specific psychiatric disorders in middle
age: The Vietnam Experience Study. Archives of General Psychiatry 65: 1410-1418.
Galton, F. (1869). Hereditary Genius. Macmillan & Co.
Greve, K.W., Stickle, T.R., Love, J.M., Bianchini, K.J. & Stanford, M.S. (2005). Latent
structure of the Wisconsin Card Sorting Test: A confirmatory factor analytic study.
Archives of Clinical Neuropsychology 20: 355-364.
Larsen, L., Hartmann, P. & Nyborg, H. (2008). The stability of general intelligence from
early adulthood to middle-age. Intelligence 36: 29-34.
Lasker, J., Nyborg, H. & Kirkegaard, E.O.W. (in prep.). Spearman’s hypothesis in the
Vietnam Experience Study and National Longitudinal Survey of Youth ’79.
Leckliter, I.N., Matarazzo, J.D. & Silverstein, A.B. (1986). A literature review of factor
analytic studies of the WAIS-R. Journal of Clinical Psychology 42: 332-342.
Liou, S.-W. & Chiu, C.-J. (2001). Myopia and contrast sensitivity function. Current Eye
Research 22: 81-84.
Nyborg, H. & Jensen, A.R. (2000). Black–white differences on various psychometric
tests: Spearman’s hypothesis tested on American armed services veterans. Personality
and Individual Differences 28: 593-599.
Nyborg, H. & Jensen, A.R. (2001). Occupation and income related to psychometric g.
Intelligence 29: 45-55.
Pearce, E. & Dunbar, R. (2012). Latitudinal variation in light levels drives human visual
system size. Biology Letters 8: 90-93.
Phillips, A.C., Batty, G.D., Gale, C.R., Deary, I.J., Osborn, D., MacIntyre, K. & Carroll, D.
(2009). Generalized anxiety disorder, major depressive disorder, and their comorbidity
as predictors of all-cause and cardiovascular mortality: The Vietnam Experience Study.
Psychosomatic Medicine 71(4): 395.
Ruff, R.M. & Parker, S.B. (1993). Gender- and age-specific changes in motor speed and
eye-hand coordination in adults: Normative values for the finger tapping and grooved
pegboard tests. Perceptual and Motor Skills 76(3s): 1219-1230.
Rushton, J.P. & Ankney, C.D. (2009). Whole brain size and general mental ability: A
review. International Journal of Neuroscience 119: 692-732.
Shin, M.-S., Park, S.-Y., Park, S.-R., Seol, S.-H. & Kwon, J.S. (2006). Clinical and
empirical applications of the Rey-Osterrieth Complex Figure Test. Nature Protocols
1(2): 892-899.
Silva, M.A., Donnell, A.J., Kim, M.S. & Vanderploeg, R.D. (2012). Abnormal neurological
exam findings in individuals with mild traumatic brain injury (mTBI) versus psychiatric
and healthy controls. Clinical Neuropsychologist 26: 1102-1116.
Tombaugh, T.N. (2006). A comprehensive review of the Paced Auditory Serial Addition
Test (PASAT). Archives of Clinical Neuropsychology 21: 53-76.
Tsukahara, J.S., Harrison, T.L. & Engle, R.W. (2016). The relationship between
baseline pupil size and intelligence. Cognitive Psychology 91: 109-123.
Vanderploeg, R.D., Curtiss, G. & Belanger, H.G. (2005). Long-term neuropsychological
outcomes following mild traumatic brain injury. Journal of the International
Neuropsychological Society 11: 228-236.
Vanderploeg, R.D., Curtiss, G., Luis, C.A. & Salazar, A.M. (2007). Long-term morbidities
following self-reported mild traumatic brain injury. Journal of Clinical and Experimental
Neuropsychology 29: 585-598.
Witt, J.C. (1986). Review of the Wide Range Achievement Test-Revised. Journal of
Psychoeducational Assessment 4: 87-90.
Excerpt from test manual for visual examination
The following text is quoted from the medical examination manual for the
visual testing (Health Status of Vietnam Veterans: Supplement C, Medical and
Psychological Procedure Manuals and Forms, p. 331ff). A photo of an identical
machine is reproduced below.
N. Vision Testing
1. Introduction
The procedures outlined in this manual are to be used in conjunction with the physical
assessment to detect abnormalities in vision. Those abnormalities to be evaluated include
near vision, far vision and peripheral vision.
2. Equipment
a. Optec 2000
(1) The Optec 2000 is a precision designed stereoscopic instrument for
measuring visual performance and thereby detecting visual problems. The instrument
is semiautomatic with an illuminated control panel. It weighs 13.5 lbs and can be used
on a desk since it requires less than 2 sq. ft. of space. All tests are concealed in a
closed housing allowing only the tester and the subject being tested to view the
targets.
(2) For the operator, all switches are located on one panel within easy reach.
Each switch is illuminated for quick, easy identification. The dial which controls the
slides is located on the side of the instrument.
(3) Some interface features of the Optec 2000 include an advanced light system
which renders a white light, resulting in high contrast images and truer color
reproduction. A built-in baffle assembly isolates the left and right eyes, thus
eliminating unwanted reflective light. By eliminating crossover, true binocular and
monocular tests are guaranteed. The front surface mirror offers a ghost-free image.
Up to 12 test slides can be mounted on a rotatable drum. We will be using only the
near and far vision letter charts in this study.
(4) External features include a forehead trigger which controls illumination inside
the Vision Tester. It will only activate the illumination when the subject maintains
pressure for testing. When forehead pressure is applied to the bar, the green "Ready"
indicator will illuminate and the subject is ready to be tested. The lens system consists
of two lenses. The upper lens is for FAR POINT testing (simulated distance of 20
feet). The lower lens is for NEAR POINT testing (simulated distance of 14 inches).
FAR and NEAR indicator lights indicate how the instrument is set to test, yellow for
FAR and blue for NEAR. The colors will correspond with the FAR/NEAR switch on
the control panel. The test dial is used to change slides in the viewing area. The
numbers on the dial correspond to the numbers on the record form for identifying the
slide test.
5. Procedure
a. Operation of the OPTEC 2000
(1) The unit should be placed on a flat table top.
(2) The power plug should be inserted into a 110-120 V AC power outlet.
(3) To turn the unit power on, press the red switch located on the rear panel to
the "in" position. If the unit is receiving power from the power outlet the switch light
will turn on.
b. Subject Preparation
(1) The participant should be asked if he wears corrective lenses or contact
lenses. If he answers yes, the initial exam of visual acuity for near and far vision
should be performed without corrective or contact lenses. (Participants who wear
contacts are asked not to put in their contacts on Medical Day morning, but to wear
glasses and bring their contacts with them. This request is made during the
orientation, the evening of Arrival Day.) The participant should then be re-examined
with the corrective lenses.
(2) The subject should be informed that he should keep both eyes open at all
times and should always look straight ahead.
(3) Prior to administering the test, the subject should be seated in front of the unit
and the unit height adjusted to conform to the subject's height. This is done by
pressing the light grey button located on the unit base and moving the upper portion
of the unit up or down.
(4) For hygienic purposes, new tissue inserts should be in place on the forehead
trigger for each subject tested.
(5) Prior to administering the vision tests, the subject should place his forehead
firmly against the forehead trigger located at the middle upper edge of the unit.
OPTEC 2000
A photo of the machine is reproduced below.
... drawing), delayed recall, verbal ability, and spatial ability (block design). These have been described in detail elsewhere (Kirkegaard & Nyborg, 2020), and the appendix contains a summary of the tests and the factor analysis. As in the other studies, we computed the g factor from the 19 tests and saved it for further analysis. ...
... This text is copied from (Kirkegaard & Nyborg, 2020). ...
Full-text available
Prior research has indicated that one can summarize the variation in psychopathology measures in a single dimension, labeled P by analogy with the g factor of intelligence. Research shows that this P factor has a weak to moderate negative relationship to intelligence. We used data from the Vietnam Experience Study to reexamine the relations between psychopathology assessed with the MMPI (Minnesota Multiphasic Personality Inventory) and intelligence (total n = 4,462: 3,654 whites, 525 blacks, 200 Hispanics, and 83 others). We show that the scoring of the P factor affects the strength of the relationship with intelligence. Specifically, item response theory-based scores correlate more strongly with intelligence than sum-scoring or scale-based scores: r’s = -.35, -.31, and -.25, respectively. We furthermore show that the factor loadings from these analyses show moderately strong Jensen patterns such that items and scales with stronger loadings on the P factor also correlate more negatively with intelligence (r = -.51 for 566 items, -.60 for 14 scales). Finally, we show that training an elastic net model on the item data allows one to predict intelligence with extremely high precision, r = .84. We examined whether these predicted values worked as intended with regards to cross-racial predictive validity, and relations to other variables. We mostly find that they work as intended, but seem slightly less valid for blacks and Hispanics (r’s .85, .83, and .81, for whites, Hispanics, and blacks, respectively).
... The dataset includes data from 19 different cognitive tests. These have been described in detail in several previous papers and include measures of verbal reasoning, arithmetic, spatial ability, psychomotor ability, and memory (Kirkegaard & Nyborg, 2020;Nyborg & Jensen, 2000, 2001. At the follow-up, the mean age was 38 (SD 2.5). ...
... We scored intelligence using exploratory factor analysis of the 19 tests, as done in prior studies using the same dataset (Kirkegaard & Nyborg, 2020;Nyborg & Jensen, 2000, 2001. Before analysis, we imputed the missing data using the IRMI algorithm in the vim package (Templ et al., 2015). ...
Full-text available
A recent study by Dutton et al. (J Relig Health 59:1567–1579., 2020) found that the religiousness-IQ nexus is not on g when comparing different groups with various degrees of religiosity and the non-religious. It suggested, accordingly, that the nexus related to the relationship between specialized analytic abilities on the IQ test and autism traits, with the latter predicting atheism. The study was limited by the fact that it was on group-level data, it used only one measure of religiosity that measure may have been confounded by the social element to church membership and it involved relatively few items via which a Jensen effect could be calculated. Here, we test whether the religiousness-IQ nexus is on g with individual-level data using archival data from the Vietnam Experience Study, in which 4462 US veterans were subjected to detailed psychological tests. We used multiple measures of religiosity—which we factor-analysed to a religion-factor—and a large number of items. We found, contrary to the findings of Dutton et al. (2020), that the IQ differences with regard to whether or not subjects believed in God are indeed a Jensen effect. We also uncovered a number of anomalies, which we explore.
... A separate issue, somewhat beyond the scope of these guidelines, is related to interpreting differences among ethnic groups. Differences in resting pupil diameter have been reported to vary with race/ethnicity (Kirkegaard & Nyborg, 2020), which may be related to differences in iris pigmentation, which might also affect differences in the amplitude of the light reflex (Kardon et al., 2013). When assessing pupillary reactivity as it varies with race or ethnicity during task performance, it is critical to consider contributing cultural factors (van der Wel & van Steenbergen, 2018;Verney et al., 2005). ...
Full-text available
A variety of psychological and physical phenomena elicit variations in the diameter of pupil of the eye. Changes in pupil size are mediated by the relative activation of the sphincter pupillae muscle (decrease pupil diameter) and the dilator pupillae muscle (increase pupil diameter), innervated by the parasympathetic and sympathetic branches, respectively, of the autonomic nervous system. The current guidelines are intended to inform and guide psychophysiological research involving pupil measurement by (1) summarizing important aspects concerning the physiology of the pupil, (2) providing methodological and data‐analytic guidelines and recommendations, and (3) briefly reviewing psychological phenomena that modulate pupillary reactivity. Because of the increased ease and tractability of pupil measurement, the goal of these guidelines is to promote accurate recording, analysis, and reporting of pupillary data in psychophysiological research. This report provides guidelines for publishing pupillary studies in psychophysiology. In addition to reporting criteria, there are general recommendations, and background on physiology and recording techniques for pupillary studies
... The dataset includes data from 18 different cognitive tests. These have been described in detail in several previous papers, and include measures of verbal reasoning, arithmetic, spatial ability, psychomotor ability, and memory (Kirkegaard & Nyborg, 2020; Nyborg & Jensen, 2000, 2001). About 30% of the tests were given at induction into the military about 20 years earlier, and the remaining were given at the follow-up. ...
There are a few scattered reports that uric acid level predicts various forms of academic achievement beyond any associations with intelligence, but all these studies are old and small. Given the potential importance of this relationship for interventions, there is a need for a more recent, larger study. We use archival data from the Vietnam Experience Study, in which 4,454 US veterans were subjected to detailed psychological and physical examinations and blood analysis around age 38. Uric acid was not measured directly, but a well-known clinical manifestation of this is gout, which was measured as a binary diagnosis with a prevalence of 86 out of 4,454 (1.9%). We used regressions to examine the predictive ability of gout for education, occupational status, and income, both alone and with covariates (intelligence, age, race). We find neither main effects nor interaction effects of gout on any outcome measure. Analysis of medical history data suggested the diagnoses were likely reliable. Analysis of the NHANES 2017 dataset, which contains both gout diagnosis and uric acid level measures, however, suggests that our results have low statistical precision. Thus, more large-scale studies are needed to examine this hypothesis.
The Rey–Osterrieth Complex Figure Test (ROCF), which was developed by Rey in 1941 and standardized by Osterrieth in 1944, is a widely used neuropsychological test for the evaluation of visuospatial constructional ability and visual memory. Recently, the ROCF has been a useful tool for measuring executive function that is mediated by the prefrontal lobe. The ROCF consists of three test conditions: Copy, Immediate Recall and Delayed Recall. At the first step, subjects are given the ROCF stimulus card, and then asked to draw the same figure. Subsequently, they are instructed to draw what they remembered. Then, after a delay of 30 min, they are required to draw the same figure once again. The anticipated results vary according to the scoring system used, but commonly include scores related to location, accuracy and organization. Each condition of the ROCF takes 10 min to complete and the overall time of completion is about 30 min.
The regressions of occupational status and income on psychometric g factor scores were examined in large samples of White (W) and Black (B) American armed forces veterans in their late 30s and who are fairly representative of the population of employed W and B males. These results indicate that when Bs and Ws are matched on g scores, there is no evidence of discrimination unfavorable to Bs for job status at any level of g. Nor are Bs with the same g scores as Ws disadvantaged in income when they are above the median level of g in the total sample. In fact, on both variables — job status and income — Ws turn out to be the relatively more disadvantaged group when the level of g is taken into account.
Early cross-sectional studies suggested that cognitive functions begin to decline in young adulthood, whereas the first longitudinal studies suggested that they are mainly stable in adulthood. A number of more contemporary longitudinal studies support the stability hypothesis. However, dropout effects have the consequence that most longitudinal studies end up with relatively few subjects. In the present study we determined absolute as well as differential stability in general intelligence g, and in verbal and arithmetic abilities, longitudinally for 4000+ adult male veterans drawn from the Vietnam Experience Study (VES). The subjects were given five cognitive tests in their early adulthood. Approximately 18 years later, 14 cognitive tests were administered. Two tests, one verbal and one arithmetic, were administered on both occasions. A Principal Axis Factor analysis was conducted separately on the tests from first and second testing in order to extract both a "g_young" and a "g_old" general intelligence factor. g_young was then correlated with g_old to determine the differential stability of g. The absolute scores from the recurrent tests were correlated to determine the differential stability and compared using an ordinary t-test in order to estimate the absolute stability. The differential stability coefficients were: 0.85 for g; 0.79 for arithmetic; and 0.82 for verbal ability. With respect to absolute stability of the specific tests, we found a significant increase in verbal score (mean scores: 107.16, 116.52), but no change in arithmetic score. Problems associated with different concepts of stability, level of analysis and potential practice effects were discussed.
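The two notions of stability in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the function names (`differential_stability`, `paired_t`) and all numeric settings are assumptions made for illustration, and a paired t statistic is used here as one plausible reading of the "ordinary t-test" applied to repeated scores. Differential stability is the correlation between occasions (do people keep their rank order?), while absolute stability concerns whether the mean level changes.

```python
import math
import random


def differential_stability(first, second):
    """Pearson correlation between scores on the two occasions."""
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    ss_a = sum((a - mean_a) ** 2 for a in first)
    ss_b = sum((b - mean_b) ** 2 for b in second)
    return cov / math.sqrt(ss_a * ss_b)


def paired_t(first, second):
    """Paired t statistic for the mean change from first to second occasion."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    return mean_d / (sd_d / math.sqrt(n))


# Synthetic example (hypothetical numbers): a stable underlying ability,
# independent measurement noise on each occasion, and an upward mean shift.
rng = random.Random(0)
ability = [rng.gauss(0, 1) for _ in range(500)]
score_t1 = [100 + 15 * a + rng.gauss(0, 6) for a in ability]
score_t2 = [109 + 15 * a + rng.gauss(0, 6) for a in ability]

r = differential_stability(score_t1, score_t2)
t = paired_t(score_t1, score_t2)
print(f"differential stability r = {r:.2f}")
print(f"paired t for mean change = {t:.1f}")
```

In this toy setup the correlation is high even though the mean rises, which shows why the two kinds of stability have to be reported separately.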
Pupil dilations of the eye are known to correspond to central cognitive processes. However, the relationship between pupil size and individual differences in cognitive ability is not as well studied. A peculiar finding that has cropped up in this research is that those high on cognitive ability have a larger pupil size, even during a passive baseline condition. Yet these findings were incidental and lacked a clear explanation. Therefore, in the present series of studies we systematically investigated whether pupil size during a passive baseline is associated with individual differences in working memory capacity and fluid intelligence. Across three studies we consistently found that baseline pupil size is, in fact, related to cognitive ability. We showed that this relationship could not be explained by differences in mental effort, and that the effect of working memory capacity and fluid intelligence on pupil size persisted even after 23 sessions and taking into account the effect of novelty or familiarity with the environment. We also accounted for potential confounding variables such as age, ethnicity, and drug substances. Lastly, we found that it is fluid intelligence, more so than working memory capacity, which is related to baseline pupil size. In order to provide an explanation and suggestions for future research, we also consider our findings in the context of the underlying neural mechanisms involved.
The Vietnam Experience Study was a multidimensional assessment of the health of Vietnam veterans. From a random sample of enlisted men who entered the US Army from 1965 through 1971, 7924 Vietnam and 7364 non-Vietnam veterans participated in a telephone interview; a random subsample of 2490 Vietnam and 1972 non-Vietnam veterans also underwent a comprehensive medical examination. During the telephone interview, Vietnam veterans reported current and past health problems more frequently than did non-Vietnam veterans, although results of medical examinations showed few current objective differences in physical health between the two groups. The Vietnam veterans had more hearing loss. Also, among a subsample of 571 participants who had semen samples evaluated, Vietnam veterans had lower sperm concentrations and lower mean proportions of morphologically “normal” sperm cells. Despite differences in sperm characteristics, Vietnam and non-Vietnam veterans have fathered similar numbers of children.
To investigate the evidence of recent positive selection in the human phototransduction system at both single nucleotide polymorphism (SNP) and gene level. Methods: SNP genotyping data from the International HapMap Project for European, Eastern Asian, and African populations were used. Differences in haplotype length (extended haplotype homozygosity) and allele frequency (Fst) between these populations were computed for each SNP to measure evidence of recent positive selection. These were aggregated into gene-level metrics of selection and percentile scores for each gene were computed. The level of recent positive selection in phototransduction genes was evaluated and compared to a set of genes previously shown to be under recent selection and a set of highly conserved genes as positive and negative controls, respectively. Results: Six of 20 phototransduction genes evaluated had gene-level selection metrics above the 90th percentile: RGS9, CNGB1, GNB1, PDE6G, GNAT1, and SLC24A1. The selection signal across these genes was found to be of similar magnitude to the positive control genes and much greater than the negative control genes. Previous work has implicated the selected phototransduction genes in retinal adaptation to changing light levels. We hypothesize that new habitats encountered by the European and Asian populations were characterized by more quickly changing light levels and variance in albedo. This conferred a selective advantage on more rapid retinal adjustment to changes in illuminance. Uncovering the underlying genetics of evolutionary adaptations in phototransduction not only allows greater understanding of vision and visual diseases, but also the development of patient-specific diagnostic and intervention strategies.
Argues that the revised edition of the Wide Range Achievement Test (WRAT-R) is not very different from its predecessors and continues to embody some of the worst characteristics and attributes of the testing industry. A major improvement in the standardization sample along with minor changes in format and item content are reported. It is concluded that unless a "quick and dirty" assessment of achievement is desired, there is no reason to use the WRAT-R.
In those with a history of mild traumatic brain injury (mTBI), cognitive and emotional disturbances are often misattributed to that preexisting injury. However, the cause of current symptoms cannot be conclusively determined because symptoms are often nonspecific to etiology and offer virtually no differential diagnostic value in postacute or chronic phases. This population-based study examined whether the presence of abnormalities during neurological examination would distinguish between mTBI (in the chronic phase), healthy controls, and selected psychiatric conditions. Retrospective analysis of data from 4462 community-dwelling Army veterans was conducted. Diagnostically unique groups were compared on examination of cranial nerve function and other neurological signs. Results demonstrated that individuals with mTBI were no more likely than those with a major depressive disorder, generalized anxiety disorder, posttraumatic stress disorder, or somatoform disorder to show any abnormality. Thus, like self-reported cognitive and emotional symptoms, the presence of cranial nerve or other neurological abnormalities offers no differential diagnostic value. Clinical implications and study limitations are presented.
Psychometric data (19 variables) on the cognitive abilities of large samples of American white (W) and black (B) male armed services veterans were factor analyzed to test Spearman's hypothesis that variation in the size of the mean W–B difference on various cognitive tests is directly related to variation in the size of the tests' loadings on the g factor. The hypothesis is strongly borne out by the data. Other factors independent of g showed no significant relationship to W–B differences in this battery of diverse tests.
Ambient light levels influence visual system size in birds and primates. Here, we argue that the same is true for humans. Light levels, in terms of both the amount of light hitting the Earth's surface and day length, decrease with increasing latitude. We demonstrate a significant positive relationship between absolute latitude and human orbital volume, an index of eyeball size. Owing to tight scaling between visual system components, this will translate into enlarged visual cortices at higher latitudes. We also show that visual acuity measured under full-daylight conditions is constant across latitudes, indicating that selection for larger visual systems has mitigated the effect of reduced ambient light levels. This provides, to our knowledge, the first support that light levels drive intraspecific variation in visual system size in the human population.