Situational Judgement Tests and the Measurement of Personality: Some Answers and More Questions
Timothy A. Judge
The Ohio State University
Vrije Universiteit Brussel
To cite this article:
Judge, T.A., Hofmans, J., & Wille, B. (2017). Situational Judgement Tests and the
Measurement of Personality: Some Answers and More Questions. European Journal of
Personality, 31, 463-464.
Work psychologists have devoted a great deal of attention to studying how personality traits
can best be conceptualized and assessed in “high stakes” contexts such as selection or hiring
decisions. Lievens argues that two selection methods, Situational Judgment Tests (SJTs) and
Assessment Centers (ACs), by standardizing and contextualizing personality measurement,
offer many advantages to personality psychology. In hopes of clarifying this argument, we ask
two fundamental questions: (1) What aspects of personality do SJTs/ACs fail to measure (are
SJTs/ACs deficient); and (2) What do SJTs/ACs actually measure (are SJTs/ACs contaminated)?
Situational Judgement Tests and the Measurement of Personality: Some Answers and More Questions
In his focal article, Filip Lievens made a general point with which we fully agree, and
several more specific points that we argue require further analysis and debate. Lievens’
general argument is that personality and applied psychologists (who study selection decisions)
have much to contribute to each other’s disciplines. As researchers who have immersed
ourselves in both disciplines, we endorse this argument without qualification. Organizational
psychologists have benefitted greatly from the scientific progress in personality psychology
and, like Lievens, we believe personality psychology would benefit from a closer integration of
research into how personality traits are studied in work contexts.
Lievens goes on to argue that one way personality psychologists can learn from work
psychology is through the research on situational judgement tests (SJTs) and assessment centers (ACs).
He argues that these selection methods are worth considering because they: (1) offer more
precision and control than standard methods through standardization of the situations across
participants; and (2) have incremental validity above and beyond the more traditional
instruments. However, to fully appreciate their potential, we maintain that two important
questions need to be answered regarding SJTs and ACs. First, what do we fail to measure by
standardizing the situations? Second, what do we actually measure using these methods?
What Do Situational Judgment Tests (SJTs) and Assessment Centers (ACs) Fail to Measure?
In the target article, it is argued that SJTs and ACs, as compared to experience-sampling
studies, allow for more control in the testing of within-person variability through the
standardization of situations across individuals. Although we fully support the claim that
research on within-person variability can benefit from more information on the different
sources of within-person variability, an important question is whether the solution to this issue
lies in the standardization of the situations. In what follows, we will argue that there are at least
two issues related to the proposed solution.
A first issue is that decades of research on the topic of situation selection has shown
that people do not passively encounter situations, but instead are actively involved in the
selection, modification, and creation of them (Rauthmann, Sherman, Nave, & Funder, 2015).
For example, Bolger and Schilling (1991) have shown that exposure to stressors accounted for
one-third—and reactivity to stressors for two-thirds—of the meaningful variance in the
relationship between Neuroticism and distress in daily life. Moreover, through the choice and
enactment of situations conducive to one’s personality, self-selection processes are believed
to partially explain personality stability (Costa & McCrae, 1980). Because personality plays a key
role in creating and enacting the environments individuals experience, standardizing situations
across individuals poses two challenges. First, because people do not encounter situations
randomly, standardization runs the risk of presenting people with situations that are not
representative of those they typically encounter in everyday life. Second, it reduces personality
to reactivity to (potentially unrepresentative) situations, which means that a significant part of
the meaningful within-person variance is excluded.
What Do Situational Judgment Tests (SJTs) and Assessment Centers (ACs) Actually Measure?
Whereas the previous question concerned construct deficiency (standardizing
situations removes much meaningful variation in personality), one may also question the
degree to which both SJTs and ACs are contaminated measures of personality. With regard to
SJTs, it has been argued that SJT items are often characterized by heterogeneity at the item
level, such that they show correlations with constructs that are not related to each other
(McDaniel, List, & Kepes, 2016). Similarly, a recurring observation in the AC literature has been
that scores on one dimension of an AC exercise correlate highly with scores on other
dimensions of the same exercise (i.e., low discriminant validity), whereas when people are
rated on the same dimension in more than one exercise, there is little correlation among the
dimensional ratings (i.e., low convergent validity). Although taking specific design
recommendations into account can help to boost their construct validity (e.g., Lievens, 2001),
an ongoing concern is that the multidimensional nature of the situations that are presented in
these assessments will always require assessees to respond on the basis of (the complex
interaction of) multiple underlying traits.
Moreover, in addition to assessing multiple traits in the same context (i.e., a single
AC/SJT exercise/item), we can also expect the behavioral reactions evoked in these
assessments to reflect constructs that do not belong to the personality domain. Indeed, one
may argue that AC and SJT scores reflect procedural knowledge (job-relevant skills and
knowledge) and cognitive ability as much as they do personality. For instance, Lievens and Sackett
(2012) argue that SJTs can be “…viewed as measures of procedural knowledge in a specific domain
(e.g., interpersonal skills)” (p. 460). This confounding with non-personality
constructs might actually explain why these measures are shown to offer incremental validity
on top of trait measures (i.e., because they are, in part, not measuring personality at all). These
concerns over contamination complicate the use of SJTs and ACs as “pure” or
unconfounded measures of personality.
Like Lievens, we believe that personnel selection research has offered and continues to
offer valuable insights into personality psychology. As our two questions suggest, however, we
wonder whether the promise of SJTs and ACs—as measures of personality traits—is
counterbalanced by significant concerns about their construct validity. Lievens welcomes
further advancements on this front, and we believe one promising avenue is further
investigation of the degree to which SJTs both mediate and moderate the relationship
between personality traits and work-related behavior. Moreover, these methods might be
valuable because they allow access to new constructs with which personality psychologists are
less familiar. One example is implicit trait policies (ITPs), which tap into procedural knowledge
regarding the effectiveness of different trait levels. We believe that adding these constructs to
the repertoire of personality researchers (or at least making personality psychologists think
more about the criterion side) might help us to move a step closer to the ultimate goal:
explaining why people behave, feel and think the way they do. This is certainly a goal shared by
personality and work psychologists alike.
References
Bolger, N., & Schilling, E. A. (1991). Personality and problems of everyday life: The role of
neuroticism in exposure and reactivity to daily stressors. Journal of Personality, 59, 355–
Costa, P. T., Jr., & McCrae, R. R. (1980). Still stable after all these years: Personality as a key to
some issues in adulthood and old age. In P. B. Baltes & O. G. Brim, Jr. (Eds.), Life span
development and behavior (Vol. 3, pp. 65–102). New York: Academic Press.
Lievens, F. (2001). Assessors and use of assessment center dimensions: A fresh look at a
troubling issue. Journal of Organizational Behavior, 65, 1–19.
Lievens, F., & Sackett, P. R. (2012). The validity of interpersonal skills assessment via situational
judgment tests for predicting academic success and job performance. Journal of Applied
Psychology, 97, 460–468.
McDaniel, M. A., List, S. K., & Kepes, S. (2016). The “Hot Mess” of situational judgment test
construct validity and other issues. Industrial and Organizational Psychology:
Perspectives on Science and Practice, 9, 47–51.
Rauthmann, J. F., Sherman, R. A., Nave, C. S., & Funder, D. C. (2015). Personality-driven situation
experience, contact, and construal: How people's personality traits predict
characteristics of their situations in daily life. Journal of Research in Personality, 55, 98–