
Situational Judgement Tests and Personality Measurement: Some Answers and More Questions



Work psychologists have devoted considerable attention to studying how personality traits can best be conceptualized and assessed in ‘high-stakes’ contexts such as selection or hiring decisions. Lievens argued that two selection methods, Situational Judgement Tests and Assessment Centre exercises, by standardizing and contextualizing personality measurement, offer many advantages to personality psychology. In hopes of clarifying this argument, we ask two fundamental questions: (1) What aspects of personality do these methods fail to measure (are they deficient) and (2) What do they actually measure (are they contaminated)?
Timothy A. Judge
The Ohio State University
Joeri Hofmans
Vrije Universiteit Brussel
Bart Wille
Universiteit Antwerpen
To cite this article:
Judge, T.A., Hofmans, J., & Wille, B. (2017). Situational Judgement Tests and the
Measurement of Personality: Some Answers and More Questions. European Journal of
Personality, 31, 463-464.
In his focal article, Filip Lievens made a general point with which we have full agreement,
and several more specific points which we argue require further analysis and debate. Lievens’
general argument is that personality and applied psychologists (who study selection decisions)
have much to contribute to each other's disciplines. As researchers who have immersed
ourselves in both disciplines, we endorse this argument without qualification. Organizational
psychologists have benefitted greatly from the scientific progress in personality psychology
and, like Lievens, we believe personality psychology would benefit from a closer integration of
research into how personality traits are studied in work contexts.
Lievens goes on to argue that one way personality psychologists can learn from work
psychology is through the research on situational judgement tests (SJTs) and assessment centers (ACs).
He argues that these selection methods are worth considering because they: (1) offer more
precision and control than standard methods through standardization of the situations across
participants; and (2) have incremental validity above and beyond the more traditional
instruments. However, to fully appreciate their potential, we maintain that two important
questions need to be answered regarding SJTs and ACs. First, what do we fail to measure by
standardizing the situations? Second, what do we actually measure using these methods?
What Do Situational Judgment Tests (SJTs) and Assessment Centers (ACs) Fail to Measure?
In the target article, it is argued that SJTs and ACs, as compared to experience-sampling
studies, allow for more control in the testing of within-person variability through the
standardization of situations across individuals. Although we fully support the claim that
research on within-person variability can benefit from more information on the different
sources of within-person variability, an important question is whether the solution to this issue
lies in the standardization of the situations. In what follows, we will argue that there are at least
two issues related to the proposed solution.
A first issue is that decades of research on situation selection have shown
that people do not passively encounter situations, but instead are actively involved in the
selection, modification, and creation of them (Rauthmann, Sherman, Nave, & Funder, 2015).
For example, Bolger and Schilling (1991) have shown that exposure to stressors accounted for
one-third—and reactivity to stressors for two-thirds—of the meaningful variance in the
relationship between Neuroticism and distress in daily life. Moreover, through the choice and
enactment of situations conducive to one’s personality, self-selection processes are believed
to partially explain personality stability (Costa & McCrae, 1980). Because personality plays a key
role in creating and enacting environments individuals experience, standardizing situations
across individuals poses several challenges. First, because people do not encounter situations
randomly, standardization runs the risk of presenting people with situations that are not
representative of those they typically encounter in everyday life. Second, it reduces personality
to reactivity to (potentially unrepresentative) situations, which means that a significant part of
the meaningful within-person variance is excluded.
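The one-third and two-thirds shares cited above are shares of the variance that exposure and reactivity explain together, not of all variance in distress. The sketch below works through that arithmetic. Bolger and Schilling (1991) report that the two mechanisms jointly explained over 40% of the distress difference, with reactivity accounting for twice as much as exposure; the value 0.40 is used here only as an illustrative lower bound, and the function name is hypothetical.

```python
def split_explained(explained_share, reactivity_to_exposure_ratio):
    """Split a jointly explained share into exposure and reactivity parts.

    explained_share: proportion explained by both mechanisms together
        (0.40 below is an illustrative assumption, not an exact estimate).
    reactivity_to_exposure_ratio: how many times larger the reactivity
        part is than the exposure part.
    """
    exposure = explained_share / (1 + reactivity_to_exposure_ratio)
    reactivity = explained_share - exposure
    return exposure, reactivity


# Assumed inputs: 40% explained in total, reactivity twice as large as exposure.
exposure, reactivity = split_explained(0.40, 2.0)

# Shares relative to the explained portion (one-third and two-thirds).
exposure_fraction = exposure / (exposure + reactivity)
reactivity_fraction = reactivity / (exposure + reactivity)

print(f"exposure: {exposure:.3f} of total, {exposure_fraction:.2f} of explained")
print(f"reactivity: {reactivity:.3f} of total, {reactivity_fraction:.2f} of explained")
```

Under these assumptions, a fully standardized assessment captures only the reactivity component; the exposure component, which reflects situation selection, cannot appear when every respondent faces the same situations.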
What Do Situational Judgment Tests (SJTs) and Assessment Centers (ACs) Actually Measure?
Whereas the previous question concerned construct deficiency (standardizing
situations removes much meaningful variation in personality), one may also question the
degree to which both SJTs and ACs are contaminated measures of personality. With regard to
SJTs, it has been argued that SJT items are often characterized by heterogeneity at the item
level, such that they show correlations with constructs that are not related to each other
(McDaniel, List, & Kepes, 2016). Similarly, a recurring observation in the AC literature has been
that scores on one dimension of an AC exercise correlate highly with scores on other
dimensions of the same exercise (i.e., low discriminant validity), whereas when people are
rated on the same dimension in more than one exercise, there is little correlation among the
dimensional ratings (i.e., low convergent validity). Although taking specific design
recommendations into account can help to boost their construct validity (e.g., Lievens, 2001),
an ongoing concern is that the multidimensional nature of the situations that are presented in
these assessments will always require assessees to respond on the basis of (the complex
interaction of) multiple underlying traits.
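The AC pattern described above (high correlations between different dimensions within one exercise, low correlations for the same dimension across exercises) can be made concrete with a small multitrait-multimethod sketch. The ratings below are invented toy numbers chosen to reproduce the pattern, not empirical data, and the exercise labels ("A", "B") and dimension labels ("d1", "d2") are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean(values):
    return sum(values) / len(values)


# Toy ratings for six candidates, keyed by (exercise, dimension).
# These numbers are made up purely for illustration.
ratings = {
    ("A", "d1"): [1, 2, 3, 4, 5, 6],
    ("A", "d2"): [2, 1, 3, 5, 4, 6],
    ("B", "d1"): [4, 1, 6, 2, 5, 3],
    ("B", "d2"): [4, 2, 6, 1, 5, 3],
}

# Different dimensions, same exercise: should be LOW for discriminant validity.
same_exercise = mean([
    pearson(ratings[("A", "d1")], ratings[("A", "d2")]),
    pearson(ratings[("B", "d1")], ratings[("B", "d2")]),
])

# Same dimension, different exercises: should be HIGH for convergent validity.
same_dimension = mean([
    pearson(ratings[("A", "d1")], ratings[("B", "d1")]),
    pearson(ratings[("A", "d2")], ratings[("B", "d2")]),
])

print(f"different dimensions, same exercise: r = {same_exercise:.2f}")
print(f"same dimension, different exercises: r = {same_dimension:.2f}")
```

In this toy data set the within-exercise correlation is high and the cross-exercise correlation is near zero, which is the reverse of what dimension-based construct validity would require.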
Moreover, in addition to assessing multiple traits in the same context (i.e., a single
AC/SJT exercise/item), we can also expect the behavioral reactions evoked in these
assessments to reflect constructs that do not belong to the personality domain. Indeed, one
may argue that AC and SJT scores reflect procedural knowledge (job-relevant skills and
knowledge) and cognitive ability as much as they do personality. For instance, Lievens and Sackett
(2012) argue SJTs can be “…viewed as measures of procedural knowledge in a specific domain
(e.g., interpersonal skills)” (p. 460). Moreover, this confounding with non-personality
constructs might actually explain why these measures are shown to offer incremental validity
on top of trait measures (i.e., because they are, in part, not measuring personality at all). These
concerns over contamination complicate the use of SJTs and ACs as “pure” or
unconfounded measures of personality.
Like Lievens, we believe that personnel selection research has offered and continues to
offer valuable insights into personality psychology. As our two questions suggest, however, we
wonder whether the promise of SJTs and ACs—as measures of personality traits—is
counterbalanced by significant concerns about their construct validity. Lievens welcomes
further advancements on this front, and we believe one promising avenue is further
investigation of the degree to which SJTs both mediate and moderate the relationship
between personality traits and work-related behavior. Moreover, these methods might be
valuable because they allow access to new constructs with which personality psychologists are
less familiar. One example is implicit trait policies (ITPs), which tap into procedural knowledge
regarding the effectiveness of different trait levels. We believe that adding these constructs to
the repertoire of personality researchers (or at least making personality psychologists think
more about the criterion side) might help us to move a step closer to the ultimate goal:
explaining why people behave, feel and think the way they do. This is certainly a goal shared by
personality and work psychologists alike.
Bolger, N., & Schilling, E. A. (1991). Personality and problems of everyday life: The role of
neuroticism in exposure and reactivity to daily stressors. Journal of Personality, 59, 355
Costa, P.T., Jr., & McCrae, R. R. (1980). Still stable after all these years: Personality as a key to
some issues in adulthood and old age. In P. B. Baltes & O. G. Brim, Jr. (Eds.), Life span
development and behavior (Vol. 3, pp. 65–102). New York: Academic Press.
Lievens, F. (2001). Assessors and use of assessment center dimensions: A fresh look at a
troubling issue. Journal of Organizational Behavior, 65, 1–19.
Lievens, F., & Sackett, P. R. (2012). The validity of interpersonal skills assessment via situational
judgment tests for predicting academic success and job performance. Journal of Applied
Psychology, 97, 460–468.
McDaniel, M. A., List, S. K., & Kepes, S. (2016). The “Hot Mess” of situational judgment test
construct validity and other issues. Industrial and Organizational Psychology:
Perspectives on Science and Practice, 9, 47–51.
Rauthmann, J. F., Sherman, R. A., Nave, C. S., & Funder, D. C. (2015). Personality-driven situation
experience, contact, and construal: How people's personality traits predict
characteristics of their situations in daily life. Journal of Research in Personality, 55, 98