The Wonderlic Personnel Test: Reliability and Validity in an Academic Setting

Abstract

Based on a total of 290 undergraduates, the split-half reliability of the Wonderlic Personnel Test was .87 and the Pearson correlation between test score and mean grade was .21. Implications are presented for the use of this test in an academic setting.
Psychological Reports, 1989, 65, 161-162. © Psychological Reports 1989
STUART J. McKELVIE
Bishop's University
The Wonderlic Personnel Test (WPT) is a 12-min., 50-item spiral-omnibus test of "problem-solving ability" (Wonderlic, 1983) that is widely used as a screening device in business and industry (Murphy, 1984). However, it may also be viewed as a test of general intelligence (Davou & McKelvie, 1984) since its items are based on the original Otis Test of Mental Ability (Wonderlic, 1983) and since scores correlate fairly well (.56 to .80) with aptitude G (General Learning Ability) of the General Aptitude Test Battery (Wonderlic, 1983) and very highly (.93) with the WAIS Full Scale IQ (Dodrill, 1981). Given that general intelligence tests are often used in academic settings, the test might be useful, particularly since it is available in 16 alternative forms. This note provides data concerning the internal consistency and concurrent validity of the test with undergraduate students.
As part of a larger project, samples of 290 full-time undergraduates in three consecutive years (1987-1989; n = 79, 99, 112) were administered the Wonderlic under standard conditions. The samples were chosen to be representative of the local university population, which was stratified by division (Natural Science, Social Science, Humanities, Business), year of study (three), and sex. Testing was conducted by students in a course on psychological testing. Each student administered the test to about 10 subjects, who were recruited for this part of the project if they permitted the author (not the testers) to view their transcripts.
¹I thank all those who participated and assisted in this study, particularly Mary Latulippe, Barbara McLellan, Susie Shields, Sue Stuart, and Lynn White, who helped to collate the data, and Sandra Gallichon and Kim Passey, who retrieved the transcripts. Reprint requests should be sent to Stuart J. McKelvie, Department of Psychology, Bishop's University, Lennoxville, Quebec, J1M 1Z7 Canada.
Odd-even split-half reliability coefficients were computed and adjusted with the Spearman-Brown formula, to give values of .89, .89, and .83 for the three calendar years, the number for the combined samples being .87. This estimate agrees with the values (.88, .94) provided in the manual (Wonderlic, 1983), showing that the internal consistency of the test is acceptable in both academic and nonacademic settings. To calculate grades on the same basis, only courses taken in the most recent semester were considered, numbers for individual students ranging from three to six, the mode being five. For the three consecutive years, Pearson correlations between test scores and mean grades (concurrent validity) were .23 (p < .05), .17 (p < .10), and .23 (p < .02), respectively; the combined value was .21. Notably, this estimate is identical to the one given in the manual for the correlation of test score with GPA. In addition, the Ms and SDs in the three calendar years were 26.1 and 6.6 (1987), 27.0 and 6.4 (1988), and 26.8 and 5.8 (1989). These means were not significantly different (F(2, 287) = .67, p > .20), and the overall M and SD were 26.7 and 6.2. The mean falls between the test scores in the manual for high school (20.8) and college (29.6) graduates, suggesting that the present overall mean score is consistent with previous findings.
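
Two pieces of arithmetic in the results above can be checked with a short sketch. It assumes the standard two-half Spearman-Brown formula, r_full = 2r_half / (1 + r_half), and combines the yearly n, M, and SD values reported in the text into an overall M and SD. The function names are illustrative and are not taken from the original note, and the item-level data needed to recompute the split-half correlations themselves are not available here.

```python
import math


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2.0 * r_half / (1.0 + r_half)


def half_test_r(r_full: float) -> float:
    """Invert the correction: the half-test correlation implied by r_full."""
    return r_full / (2.0 - r_full)


def pooled_mean_sd(ns, means, sds):
    """Combine per-group n, M, and SD into an overall M and SD.

    Assumes each SD was computed with an n - 1 denominator.
    """
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    overall_sd = math.sqrt((ss_within + ss_between) / (n_total - 1))
    return grand_mean, overall_sd


# Values reported in the text for 1987, 1988, and 1989.
ns = [79, 99, 112]
means = [26.1, 27.0, 26.8]
sds = [6.6, 6.4, 5.8]

overall_m, overall_sd = pooled_mean_sd(ns, means, sds)
implied_half = half_test_r(0.87)

print(f"overall M = {overall_m:.1f}, overall SD = {overall_sd:.1f}")
print(f"half-test r implied by a corrected value of .87: {implied_half:.2f}")
```

Under these assumptions the pooled values round to the reported 26.7 and 6.2, and a corrected reliability of .87 corresponds to an uncorrected odd-even correlation of roughly .77.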
These data show that the Wonderlic is internally consistent for undergraduates and that their mean score falls as expected on the basis of Wonderlic's norms. However, the low validity coefficient reported in the manual and confirmed here suggests that the test has little practical value as a predictor of individual grades, accounting as it does for only 4.4% of the variance. Of course, this value might be slightly depressed by a restriction in range, since only enrolled students were tested. Indeed, the current SDs were slightly lower than the most relevant ones in the manual, which were about 7.0. On the other hand, the test might be useful as a general screening device or even as an addition to other admission data, particularly if the selection ratio at the institution were low (Anastasi, 1988, p. 174). Probably the most useful function for the Wonderlic in the academic setting, however, would be as a research instrument when a quick estimate of general intelligence is needed. So I advise my students.
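
The variance figure and the restriction-of-range remark can be illustrated with a brief sketch. The squared correlation gives the proportion of variance accounted for. The range correction uses the standard formula for direct restriction on the predictor (often called Thorndike's Case II), which is an assumption introduced here: the note itself applies no correction, and selection of enrolled students was not made directly on Wonderlic scores, so the corrected value is only a rough indication. The unrestricted SD of 7.0 is the approximate manual value quoted in the text, and the function names are illustrative.

```python
import math


def variance_explained(r: float) -> float:
    """Proportion of criterion variance accounted for by the predictor."""
    return r * r


def correct_for_range_restriction(r: float, sd_restricted: float,
                                  sd_unrestricted: float) -> float:
    """Correct r for direct range restriction on the predictor.

    u is the ratio of the unrestricted SD to the restricted SD.
    """
    u = sd_unrestricted / sd_restricted
    return r * u / math.sqrt(1.0 + r * r * (u * u - 1.0))


r_observed = 0.21        # combined validity coefficient from the text
sd_sample = 6.2          # overall SD in the present samples
sd_manual = 7.0          # approximate SD from the manual (assumed unrestricted)

pct_variance = 100.0 * variance_explained(r_observed)
r_corrected = correct_for_range_restriction(r_observed, sd_sample, sd_manual)

print(f"variance accounted for: {pct_variance:.1f}%")
print(f"range-corrected r (illustrative): {r_corrected:.2f}")
```

With these inputs the squared correlation reproduces the 4.4% figure, and the corrected coefficient rises only to about .24, which is consistent with the statement that the observed value is slightly depressed.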
REFERENCES
ANASTASI, A. (1988) Psychological testing. (6th ed.) New York: Macmillan.
DAVOU, D., & McKELVIE, S. J. (1984) Relationship between study habits and performance on an intelligence test with limited and unlimited time. Psychological Reports, 54, 367-371.
MURPHY, K. R. (1984) The Wonderlic Personnel Test. In D. J. Keyser & R. C. Sweetland (Eds.), Test critiques. Vol. 1. Pp. 769-775.
WONDERLIC, E. F. (1983) Wonderlic Personnel Test manual. Northfield, IL: E. F. Wonderlic & Assoc.
Accepted July 28, 1989.