Article

A Meta‐Analytic Investigation of Job Applicant Faking on Personality Measures

Authors: Birkeland, Manson, Kisamore, Brannick, & Smith

Abstract

This study investigates the extent to which job applicants fake their responses on personality tests. Thirty-three studies that compared job applicant and non-applicant personality scale scores were meta-analyzed. Across all job types, applicants scored significantly higher than non-applicants on extraversion (d = .11), emotional stability (d = .44), conscientiousness (d = .45), and openness (d = .13). For certain jobs (e.g., sales), however, the rank ordering of mean differences changed substantially, suggesting that job applicants distort responses on personality dimensions that are viewed as particularly job relevant. Smaller mean differences were found in this study than those reported by Viswesvaran and Ones (Educational and Psychological Measurement, 59(2), 197–210), who compared scores for induced “fake-good” vs. honest response conditions. Also, direct Big Five measures produced substantially larger differences than did indirect Big Five measures.
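To make the d metric used throughout this page concrete, here is a minimal Python sketch of how a standardized mean difference (Cohen's d with a pooled standard deviation) is computed from applicant and non-applicant scale scores; all numbers are illustrative placeholders, not data from the meta-analysis.
```python
import numpy as np

def cohens_d(applicants: np.ndarray, non_applicants: np.ndarray) -> float:
    """Standardized mean difference using a pooled standard deviation."""
    n1, n2 = len(applicants), len(non_applicants)
    pooled_var = ((n1 - 1) * applicants.var(ddof=1) +
                  (n2 - 1) * non_applicants.var(ddof=1)) / (n1 + n2 - 2)
    return float((applicants.mean() - non_applicants.mean()) / np.sqrt(pooled_var))

# Illustrative conscientiousness scale scores (1-5 Likert means), not study data.
rng = np.random.default_rng(0)
applicants = rng.normal(3.90, 0.50, 300)      # elevated applicant mean
non_applicants = rng.normal(3.65, 0.55, 300)
print(round(cohens_d(applicants, non_applicants), 2))  # ~0.4-0.5 by construction
```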

... Obtaining accurate assessments of the extent to which a personnel selection context triggers a systematic change in response sets among applicants requires both low-stakes and high-stakes scores from the same applicants (Tett & Simonet, 2011). However, because it is extremely challenging to collect such data (see Donovan et al., 2014, for an excellent summary of these challenges), much of the faking research has relied on experimental faking studies (e.g., MacCann, 2013) or on comparisons of data collected from applicants with data collected from non-applicants (e.g., Birkeland et al., 2006). Viswesvaran and Ones (1999) meta-analyzed studies that compared honest and faked personality scores under 'fake-good' instructions and found that Likert-based measures of all Big Five personality domains are quite easily fakable. ...
... Research in which applicant scores are compared to those of non-applicants (e.g., employees or research participants) shows smaller effect sizes and more nuanced findings. In a meta-analysis, Birkeland et al. (2006) observed small to moderate group differences on the Big Five domains (d = 0.11-0.45). Furthermore, Anglim et al. (2017), comparing job applicants to age- and gender-matched non-applicants, found differences ranging from d = 0.09 (openness to experience) to 1.06 (agreeableness) on the HEXACO domains. ...
... First, our findings seem to suggest that faking might be less prevalent than often feared. Although the position of firefighter is highly coveted, the applicants in our sample, on average, faked much less than participants in experimental fake-good studies (e.g., MacCann, 2013) and somewhat less than applicants in some other field studies (e.g., Arthur et al., 2010; Birkeland et al., 2006; Ellingson et al., 2007). The difference from instructed-faking experiments is to be expected: the applicants in this study were not explicitly instructed to manage impressions, so not every applicant may have felt the need to fake. ...
Article
Full-text available
This study investigated whether faking behavior on a personality inventory can be predicted by two indicators of the ability to fake (cognitive ability and the ability to identify criteria; ATIC) and two indicators of the motivation to fake (perceived faking norms and honesty–humility). Firefighter applicants first completed a personality inventory under high-stakes conditions and, three months later, under low-stakes conditions (n = 128). Analyses revealed very little faking behavior on average. Cognitive ability and ATIC were both negatively related to personality score elevation, but only cognitive ability exhibited a statistically significant association. Neither perceived faking norms nor honesty–humility was significantly related to personality score elevation, and only perceived competition was positively related to overclaiming (a proxy of faking).
... With the widespread use of personality assessment in employee selection, some practitioners are concerned about the effect of applicant faking on the fairness and accuracy of these assessments (Hough & Oswald, 2008; Morgeson et al., 2007a, 2007b; Ones et al., 2007; Robie et al., 2021; Tett & Christiansen, 2007). Research shows that people can and do alter their responses when completing personality assessments in high-stakes settings (e.g., Anglim et al., 2017; Anglim, Bozic, et al., 2018; Birkeland et al., 2006; Griffith et al., 2007; Morgeson et al., 2007b). Job applicants show elevated means on scales measuring traits such as Conscientiousness and Extraversion (Anglim et al., 2017; Birkeland et al., 2006; Jeong et al., 2017), and scale SDs tend to decline as responses become more compressed around a perceived ideal (Anglim et al., 2017; Hooper, 2007; Salgado, 2016). ...
... Research shows that people can and do alter their responses when completing personality assessments in high-stakes settings (e.g., Anglim et al., 2017; Anglim, Bozic, et al., 2018; Birkeland et al., 2006; Griffith et al., 2007; Morgeson et al., 2007b). Job applicants show elevated means on scales measuring traits such as Conscientiousness and Extraversion (Anglim et al., 2017; Birkeland et al., 2006; Jeong et al., 2017), and scale SDs tend to decline as responses become more compressed around a perceived ideal (Anglim et al., 2017; Hooper, 2007; Salgado, 2016). Nonetheless, the extent to which this response distortion reduces validity remains an active topic of research, with some researchers suggesting that it is a serious problem (e.g., Rothstein & Goffin, 2006) and others that it is not (Hogan et al., 1996; Hogan et al., 2007; Hough et al., 1990; Ones & Viswesvaran, 1998; Ones et al., 2007). ...
... A large body of research shows that people respond in a more socially desirable way in high-stakes settings, and this affects a range of test properties. First, meta-analyses and large sample studies show that applicant scale means are higher in the socially desirable direction than non-applicant scale means, in both real-world applicants (e.g., Anglim et al., 2017; Birkeland et al., 2006; Hu & Connelly, 2021; Jeong et al., 2017) and lab studies using simulated application and instructed faking designs (Birkeland et al., 2006; Hooper, 2007; Viswesvaran & Ones, 1999). Second, repeated-measures designs comparing honest and applicant responses show that respondents vary in the degree to which they distort their responses in high-stakes settings (Griffith et al., 2011; McFarland & Ryan, 2000). ...
Article
Full-text available
This study examined the effect of job applicant faking on the validity of personality assessments, including self-other correlations, criterion validity, and cognitive ability correlates. By using a large sample, multiple other-raters, a repeated-measures design, and a realistic simulated job application, it sought to provide the most precise estimates to date of the effect of the applicant context on self-other correlations, as well as the influence of cognitive ability on faking. Undergraduate psychology students (n = 584) completed a measure of Big Five personality (i.e., International Personality Item Pool NEO) in both a low-stakes and a simulated job applicant context. Participants completed measures of intelligence (i.e., International Cognitive Ability Resource) and personality-relevant objective criteria (e.g., university grades), and had an average of three other raters rate their personality (n = 1,831). Responses to the Big Five scales were more socially desirable in the applicant context (average d = 0.58), with notable decreases in reported Neuroticism and increases in Conscientiousness, Agreeableness, and Extraversion. Average self-other correlations declined by 24%, from .59 in the low-stakes context to .45 in the applicant context. Cognitive ability was positively correlated with magnitude of faking. In the applicant context, criterion validities declined minimally. Results suggest response distortion by job applicants results in modest reductions in the accuracy and criterion validity of personality assessments.
Keywords: Big Five, faking, job applicant, other-ratings, personality, response distortion, social desirability
Practitioner points:
• Five hundred and eighty-four participants completed a personality measure in both low-stakes and applicant contexts
• Participants also had three others rate them on the same personality measure
• Responses were more socially desirable in the applicant context
• Self-other correlations decreased by 24% in the applicant context
• Criterion validity decreased slightly in the applicant context
• Cognitive ability was correlated with magnitude of faking
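The 24% decline in self-other correlations reported above can be illustrated with a short sketch: correlate self-ratings with averaged other-ratings in each context and compare. The data below are simulated under assumed parameters, not the study's.
```python
import numpy as np

def self_other_r(self_scores: np.ndarray, other_ratings: np.ndarray) -> float:
    """Correlate self-ratings with the mean rating across multiple other-raters."""
    return float(np.corrcoef(self_scores, other_ratings.mean(axis=1))[0, 1])

# Simulated data: 100 targets, 3 other-raters each (illustrative, not study data).
rng = np.random.default_rng(1)
trait = rng.normal(0, 1, 100)
others = trait[:, None] + rng.normal(0, 0.8, (100, 3))
self_low = trait + rng.normal(0, 0.7, 100)               # low-stakes self-report
self_high = 0.6 * trait + 0.8 + rng.normal(0, 0.7, 100)  # distorted applicant report
r_low, r_high = self_other_r(self_low, others), self_other_r(self_high, others)
print(f"r_low = {r_low:.2f}, r_high = {r_high:.2f}, decline = {(r_low - r_high) / r_low:.0%}")
```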
... However, due to the characteristics of this type of answer, SS questionnaires have frequently been criticised for their potential susceptibility to faking behavior (e.g., Christiansen et al., 2005; Griffith & McDaniel, 2006; McFarland & Ryan, 2000; Rosse et al., 1998; Viswesvaran & Ones, 1999). The meta-analytic research of Viswesvaran and Ones (1999), Birkeland et al. (2006), and Salgado (2016) pointed out that individuals can distort their scores on SS instruments if they are motivated to fake. ...
... Due to the adverse characteristics of this behavior, there is considerable concern to understand the potential negative effects that faking could have on SS personality questionnaires (see, for instance, Birkeland et al., 2006; Salgado, 2016; Viswesvaran & Ones, 1999). Specifically, research has focused on the effects of faking behavior on (a) the scores, (b) the reliability, (c) the validity, and (d) the ranking of candidates (hiring decisions). ...
... The effects on the mean of scores are among the most studied. The meta-analyses of Viswesvaran and Ones (1999), Birkeland et al. (2006), Hooper (2007), and Salgado (2016) have shown that faking causes an increase in the scores of personality measures and that this effect is greatest for conscientiousness and emotional stability in all cases. Likewise, meta-analytic evidence showed that faking also reduces the magnitudes of standard deviations of scores obtained with SS questionnaires (Salgado, 2016; Viswesvaran & Ones, 1999). ...
Article
Full-text available
Research has shown that faking behavior affects the factor structure of single-stimulus (SS) personality measures. However, no published research has analyzed the effects of this phenomenon on the factor structure of forced-choice (FC) personality inventories. This study examines the effects of faking, induced in a laboratory setting, on the construct validity of a quasi-ipsative FC personality inventory based on the Five-Factor Model. It also examines the moderating effect of the type of experimental design (between-subject and within-subject design) on the factor analyses. The results showed that (a) the data fit a five-factor structure in both conditions (honest and faking) and in both experimental designs; (b) model fit indices are also good or excellent in all cases; and (c) Burt-Tucker congruence coefficients between corresponding factors across the conditions analyzed are very high. These findings provide evidence that the quasi-ipsative FC format is a robust instrument that controls the effects of faking on factor structure. Finally, we discuss theoretical and practical implications of these findings for personnel selection and assessment.
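As a rough illustration of the congruence coefficients the abstract refers to, the sketch below computes Tucker's congruence coefficient between two hypothetical factor-loading vectors; the loadings are invented, and the ≥ .95 reading is a common rule of thumb rather than the article's own criterion.
```python
import numpy as np

def congruence(x: np.ndarray, y: np.ndarray) -> float:
    """Tucker's congruence coefficient between two factor-loading vectors."""
    return float(x @ y / np.sqrt((x @ x) * (y @ y)))

# Hypothetical loadings of six items on one factor, honest vs. faking condition.
honest = np.array([0.72, 0.65, 0.58, 0.70, 0.61, 0.55])
faking = np.array([0.68, 0.60, 0.62, 0.66, 0.57, 0.50])
print(round(congruence(honest, faking), 3))  # values >= .95 are commonly read as equivalence
```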
... Previous studies have revealed that faking can diminish the validity of measurements (e.g., Tett & Christiansen, 2007), significantly muddle the rank order of participants (e.g., Birkeland, Manson, Kisamore, Brannick, & Smith, 2006; Griffith, Chmielowski, & Yoshita, 2007; Peterson, Griffith, & Converse, 2009) and, in the worst-case scenario, influence the choice of suitable applicants (e.g., Morgeson, 2004). Due to these research findings and to the wide range of existing methods for combatting faking, Burns and Christiansen (2011) stated that to date, no method of avoiding applicant faking has proven sufficient. ...
... Indeed, these tests are among the most faked methods of assessment (McFarland & Ryan, 2000). Moreover, integrity tests are faked by participants in both anonymous (Marcus, 2006) and simulated selection settings (Gerber-Braun, 2010; Jackson, Wroblewski, & Ashton, 2000) as well as by applicants in personnel selection procedures (Birkeland et al., 2006). ...
... to d = .45 (Birkeland et al., 2006). Moreover, another meta-analysis revealed that faking increases scores on integrity tests by about one-half to one standard deviation, depending on the kind of integrity test used (d = .59 ...
Thesis
This dissertation focuses on the construct and criterion validity of integrity tests and aims to enhance both. To accomplish this goal, three approaches were adopted: First, an overview and systematic comparison of integrity tests was conducted with reference to the construction and application of the tests. Second, the nomological network of integrity tests was expanded with reference to honesty-humility and organizational citizenship behavior at their factor and facet level. Third, two promising methods to reduce faking on integrity tests were tested: the double rating method (Hui, 2001) and the indirect questioning technique. In line with previous research, the results of the overview and comparison of integrity measures confirmed that integrity tests are multidimensional and heterogeneous. A clear definition of integrity is urgently needed. The personality trait of honesty-humility and its facets of fairness and modesty revealed the most significant relationships to integrity. Moreover, organizational citizenship behavior and its facets of altruism, conscientiousness, and sportsmanship were found to significantly relate to integrity. Furthermore, integrity tests were able not only to predict organizational citizenship behavior but also to incrementally predict job performance and organizational citizenship behavior beyond the factor and facet level of the personality traits of conscientiousness and honesty-humility. In contrast to the indirect questioning technique, the double rating method, which includes an other-rating and a self-rating, was shown to significantly reduce faking on integrity tests in an anonymous survey setting. This dissertation makes an important contribution by better explaining the construct and nomological network of integrity, providing a more detailed view of integrity tests and their protection against faking, and expanding the predictive and incremental validity of these tests. The implications for future research and practice are further discussed.
... A substantial literature has investigated the effects of faking on noncognitive measures, including personality inventories, biodata, and integrity tests (Allen et al., 2004; Alliger & Dwight, 2000; Becker & Colquitt, 1992; Birkeland et al., 2006; Buehl et al., 2019; Dalen et al., 2001; Graham et al., 2002; Levashina & Campion, 2007; McFarland et al., 2002; Van Iddekinge et al., 2005; Viswesvaran & Ones, 1999). Two strategies for investigating faking on these measures predominate. ...
... With regard to field studies, Birkeland et al. (2006) found that across job types, applicants score significantly higher than non-applicants on the Big Five dimensions of extraversion, emotional stability, conscientiousness, and openness (ds range from 0.11 to 0.45). However, a meta-analysis by Edens and Arthur (2000) suggests that the magnitude of response distortion on self-report measures may be lower in field settings (d = 0.30 in their meta-analysis) than in lab studies (d = 0.73). ...
... Studies have demonstrated that job applicants score higher than incumbents on personality measures, supporting the thesis that faking occurs on these measures in high-stakes contexts (Birkeland et al., 2006; Edens & Arthur, 2000). However, there has been some variability in studies investigating the susceptibility of SJTs to faking in quasi-experimental studies of applicants and incumbents (e.g., Reynolds et al., 1999; Schmidt & Wolfe, 2003; Weekley et al., 2003). ...
Article
Full-text available
Susceptibility to faking is a key issue in the operational use of SJTs. We administered an SJT with both knowledge-based (“should do”) and behavioral (“would do”) response instructions in a low-stakes developmental context to 946 current medical residents and in a high-stakes selection context to 275 applicants to medical residency programs. Results indicated that (a) controlling for instruction condition, SJT scores were higher in the selection context and (b) controlling for context, SJT scores were lower in the behavioral instruction condition. However, instruction condition moderated the effect of faking on SJTs in these contexts, such that differences between SJT scores in the “would do” and “should do” instruction conditions were greater in the developmental versus selection context. Implications are discussed.
... An extensive number of studies have examined the capacity of personality measures to predict several occupational and academic outcomes since they are a widely used assessment procedure in organizational and educational settings [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21][22]. However, most of this research has been carried out using traditional single-stimulus (SS) personality measures, which are more susceptible to the potential adverse effects of faking (response distortion) [23][24][25][26]. ...
... Usually, this type of test presents a yes/no, true/false, or Likert scale answer format. For this reason, some authors have noted that this format shows a potential susceptibility to answer distortion [37][38][39][40][41]. Specifically, the meta-analytical findings of Birkeland et al. [23], Salgado [42], and Viswesvaran and Ones [26] have indicated that SS personality inventories can be deliberately distorted by individuals if they are motivated to fake. ...
... Faking behavior has been defined as a tendency of individuals to respond in a manner that offers a portrayal of themselves that favors their evaluation [36,40,43,44]. Faking is thus an intentional distortion of responses to selection instruments, especially to personality inventories [23,42,45,46]. Consequently, this phenomenon is a serious problem in applied settings when important hiring decisions are made using SS personality measures. ...
Article
Full-text available
Faking behavior is one of the main problems of personality measures. For this reason, determining the potential effects of faking on personality assessment procedures is relevant. The aim of this study has been to examine the impact of faking, induced in a laboratory setting, on the predictive validity of a quasi-ipsative forced-choice (FC) inventory based on the five-factor model. It also examined whether the magnitude of the predictive validity varied depending on the type of criteria analyzed (self-reported performance ratings and grade point average). The participants were 939 students from the University of Santiago de Compostela. As expected, the results showed that: (1) conscientiousness is the best predictor of performance even under faking response conditions; (2) conscientiousness predicts performance better when it is assessed using rating scales; and (3) reliability and validity were attenuated under faking conditions. Finally, we discuss the implications of these findings for the research and practice of personnel selection.
... For example, classical lab studies of faking involving imagined job application, "fake good," or a prize for the "best response" manipulations reveal that faking is clearly possible, but not the extent to which, why, or when it occurs in practice. By contrast, approaches that compare applicant samples to non-applicant samples can provide realistic estimates of the sizes of faking effects (Birkeland et al., 2006); however, these designs rarely allow for theory testing because they often rely on archival applicant data and because between-subjects designs preclude the direct observation of faking. ...
... Specifically, across the sample, as intended by the stimuli we presented in the high-stakes assessment phase, we observed the greatest score elevations on diligence and perfectionism, with increases from low- to high-stakes of about a third to a half of a standard deviation of the differences. Compared to meta-analytically estimated effect sizes of faking, the effects observed here were smaller than those observed in classical "directed faking" lab studies (dz = 0.89 for conscientiousness; Viswesvaran & Ones, 1999), similar to those observed in applicant-non-applicant comparisons (ds = 0.45 for conscientiousness; Birkeland et al., 2006), and smaller than those observed in applicant within-person studies (corrected dz = 0.72 for conscientiousness; Hu & Connelly, 2021), though we note that the effect sizes from the latter meta-analysis were highly variable. ...
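The dz and ds values quoted above standardize a mean shift by different denominators: ds uses the pooled between-group SD, while dz uses the SD of within-person difference scores. A minimal sketch with simulated paired scores, applying both formulas to the same data purely to show the different standardizers:
```python
import numpy as np

def d_between(g1: np.ndarray, g2: np.ndarray) -> float:
    """ds: mean difference divided by the pooled standard deviation."""
    n1, n2 = len(g1), len(g2)
    sp = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2))
    return float((g1.mean() - g2.mean()) / sp)

def d_within(high: np.ndarray, low: np.ndarray) -> float:
    """dz: mean of paired differences divided by the SD of those differences."""
    diff = high - low
    return float(diff.mean() / diff.std(ddof=1))

# Simulated paired low-/high-stakes scores for the same respondents (illustrative).
rng = np.random.default_rng(2)
low = rng.normal(3.5, 0.6, 200)
high = low + rng.normal(0.25, 0.4, 200)  # mean elevation with person-level variability
print(f"dz = {d_within(high, low):.2f}, ds = {d_between(high, low):.2f}")
```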
... Thus, it remains an open question as to whether the sizes of the effects observed here will generalize to other settings. Nonetheless, we are reassured by the fact that our observed faking effect sizes were similar to those observed in the Birkeland et al. (2006) meta-analysis. ...
Article
Full-text available
We investigated individual differences in faking in simulated high-stakes personality assessments through the lens of valence-instrumentality-expectancy (VIE) theory, using a novel experimental paradigm. Three hundred ninety-eight participants (MTurk) completed a “low-stakes” HEXACO personality assessment for research purposes. Three months later, we invited all 398 participants to compete for an opportunity to complete a genuine, well-paid, one-off MTurk job, and 201 accepted. After viewing the selection criteria, which described high levels of perfectionism as critical for selection, these participants completed the HEXACO personality assessment as part of their applications (“high-stakes”). All 201 participants were then informed their applications were successful and were invited to complete the performance task, with 189 accepting the offer. The task, which involved checking text data for inconsistencies, captured two objective performance criteria. We observed faking on measures of diligence and perfectionism. We found that perceived job desirability (valence) was the strongest (positive) determinant of individual differences in faking, along with perceived instrumentality and expectancy. Honesty-humility was also associated with faking; however, unexpectedly, the association was positive. When all predictors were combined, only perceived job desirability remained a significant motivational determinant of faking, with cognitive ability also being a positive predictor. We found no evidence that cognitive ability moderated the relations of motivation and faking. To investigate the role of faking on predictive validity, we split the sample into those who had faked to a statistically large extent and those who had not. We found that the validity of high-stakes assessments was higher amongst the group that had faked.
... First, the test-taker must be able to recognize the desired behavior. They must clearly understand what behaviors would be most favorable to those collecting the assessment responses (Birkeland et al., 2006; Goffin & Boyd, 2009; Raymark & Tafero, 2009). Second, the individual needs to understand the evaluative components of the assessment. ...
... We chose to study a sales manager position because other researchers have found that people are generally familiar with sales positions, and familiarity with the tasks and skills required in this role improves the likelihood of faking these behaviors (Birkeland et al., 2006). Results from a meta-analysis that compared sales versus non-sales positions as a moderator showed that individuals were able to distort their personality scores based on their understanding of the personality characteristics relevant to a sales position. ...
... Researchers believed that participants perceived good salespeople as assertive individuals who would be less agreeable (Birkeland et al., 2006). This was supported by another meta-analysis, which also reported that higher extraversion and lower agreeableness were the significant traits that job applicants faked for sales positions (Salgado et al., 2015). ...
Experiment Findings
Full-text available
Abstract
The use of game-based approaches is growing in popularity both in research and practice. However, little research has been done on faking behaviors in game-based assessments (GBAs). Understanding faking in GBAs is relevant as organizations continue to develop and integrate GBAs for selection purposes. This study examines the relationships of faked GBA scores with honest and faked scores from a self-report measure of personality. We collected measures of personality using the Five Factor Model, evaluating four traits relevant to a sales manager position (i.e., high conscientiousness, high extraversion, low neuroticism, and low agreeableness). From our group of participants, we evaluated the degree to which participants faked the self-report measures (i.e., faking extent). We used this measure to identify individuals who faked well (i.e., correctly distorted scores across personality subfactors). These good fakers were compared to poor fakers, with results demonstrating significantly improved faked self-report scores but not faked GBA personality scores. This provides preliminary evidence that good fakers can generally manipulate faked scores in the desirable way on self-report measures but may have experienced more difficulty manipulating their scores on the GBA measures used in this particular study. Our findings may be relevant for researchers and practitioners seeking to use GBAs in situations where test-takers may have an incentive to fake (e.g., recruitment and hiring practices). Our results also contribute to a much-needed research area exploring the various uses and functions of GBAs when compared to traditional measures.
Keywords: game-based assessments, self-report, faking extent, personality assessment, Big Five
... Shadowing empirical support for personality-job performance relationships (e.g., Hogan & Holland, 2003; Hurtz & Donovan, 2000; Tett et al., 1999) has been research on the susceptibility of self-report scales to deliberate response distortion or faking. Dozens of studies show that people can fake on personality tests and actually do fake when the stakes are high, as in selection settings (Birkeland et al., 2006; Levashina et al., 2014). An equally important but more fundamental question is: how much does faking matter? ...
... Second, job applicants are assumed to be motivated, to varying degrees, to present an overall favorable impression when completing a self-report personality test. A motivation effect is clearly evident in meta-analyses showing heightened test score means (on positively valued traits) in applicants relative to incumbents (e.g., Jeong et al., 2017; Salgado, 2016), especially on job-relevant scales (Birkeland et al., 2006). The meaning and importance of the motivational shift are near the heart of the FIG/FIB debate, but the shift itself is not in question. ...
Article
Full-text available
The unitarian understanding of construct validity holds that deliberate response distortion in completing self-report personality tests (i.e., faking) threatens trait-based inferences drawn from test scores. This “faking-is-bad” (FIB) perspective is being challenged by an emerging “faking-is-good” (FIG) position that condones or favors faking and its underlying attributes (e.g., social skill, ATIC) to the degree they contribute to predictor–criterion correlations and are job relevant. Based on the unitarian model of validity and relevant empirical evidence, we argue the FIG perspective is psychometrically flawed and counterproductive to personality-based selection targeting trait-based fit. Carrying forward both positions leads to variously dark futures for self-report personality tests as selection tools. Projections under FIG, we suggest, are particularly serious. FIB offers a more optimistic future but only to the degree faking can be mitigated. Evidence suggesting increasing applicant faking rates and other alarming trends makes the FIB versus FIG debate a timely if not urgent matter.
... These studies have generally found that applicants describe themselves in more socially desirable ways than do incumbents, with meta-analytic d's between .13 and .52 across the Big Five traits (Birkeland, Manson, Kisamore, Brannick, & Smith, 2006). However, such between-groups designs assume that applicant and incumbent samples have the same 'true' means on the personality traits themselves and that differences in means are attributable to response distortion. ...
... This assumption may be untenable, as many between-groups studies have not even matched applicants and incumbents on jobs; those that have produce more conservative estimates of mean differences (.10 < d's < .31; Birkeland et al., 2006). In this regard, within-subjects studies (wherein a single sample completes the personality measure in both incumbent and applicant settings) are more informative by ensuring that applicants' and incumbents' underlying true scores are equal. ...
... The decision to use PIM is often rational and multidetermined, and influenced both by context features and by person characteristics (Rogers, 2018; Ziegler et al., 2012). Individuals modify responses differentially to present as desirable for specific contexts (Birkeland et al., 2006), and underreporting is more likely to be found in settings where a positive image and no symptoms of psychopathology are likely valued, such as in child custody litigations and personnel selection (Baer & Miller, 2002). When personality measures are used to make important decisions in the organizational context, this represents a high-stakes assessment, with an incentive for distortion (Ellingson et al., 2007). ...
... On the personality psychopathology scales, the organizational sample shows a markedly low mean on NEGE, which may be because this scale captures more clinical symptomatology with high face validity, leading participants with a strong underreporting attitude to deny such symptomatology entirely. Moreover, MMPI-2 NEGE is the psychopathological counterpart of the normative Big Five personality dimension of Neuroticism, and meta-analytic research has found that job applicants inflate their scores to a much larger degree on the emotional stability and conscientiousness dimensions, suggesting that respondents view these constructs as particularly desirable to employers (Birkeland et al., 2006). ...
Article
This study uses several MMPI-2 validity scales, the context of the sample data collection, and personality dimensions to assess global underreporting and two underreporting subtypes: defensiveness and social desirability. This study uses a differential prevalence group design to compare organizational (N = 344), community (N = 339), and clinical (N = 347) samples. Composite indexes of global underreporting, defensiveness, and social desirability were tested. As hypothesized, the high-stakes organizational sample showed stronger global and specific underreporting than the other two samples. The proposed composite indexes performed better than the individual scales in detecting global underreporting, defensiveness, and social desirability. Additionally, considering the organizational and community samples, a path analysis found that the context of assessment (i.e., high-stakes vs. not high-stakes) had a stronger effect than respondent personality on underreporting. In sum, the proposed composite indexes, which demand a joint elevation of validity scales, seem to be powerful criteria that identify only individuals whose underreporting levels represent Positive Impression Management. This is especially relevant in high-stakes assessment contexts.
... socioanalytical theory, Johnson & Hogan, 2006, see also Cable & Kay, 2012). This helps explain empirical findings that mean desirability effects are significantly weaker in the field than in the laboratory (Birkeland et al., 2006; Viswesvaran & Ones, 1999). ...
... Furthermore, our approach of empirically separating motivation and ability components in those responses is a unique feature of our method. In probably the most widely used situationally induced "faking score" (the difference in scale scores between honest and faking conditions after all items are summed, e.g., Birkeland et al., 2006; Cook, 2009), those components are inextricably confounded. Interpreting mean shifts as self-presentation is also based on the implicit assumption that lay test takers understand how a specific item is keyed to improve one's score. ...
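A minimal sketch of the difference-based "faking score" described above, with the motivation/ability confound the excerpt criticizes flagged in a comment; the item data are simulated, and the function is a generic illustration rather than any specific study's scoring.
```python
import numpy as np

def faking_score(applicant_items: np.ndarray, honest_items: np.ndarray) -> np.ndarray:
    """Per-person faking score: summed scale score under high stakes minus
    the honest-condition scale score. Because items are summed first, the
    motivation to look good and the ability to spot each item's desirable
    pole are confounded in the result, as the excerpt above points out."""
    return applicant_items.sum(axis=1) - honest_items.sum(axis=1)

# Simulated 10-item scale (1-5 response options), 150 respondents, both conditions.
rng = np.random.default_rng(3)
honest = rng.integers(1, 6, (150, 10)).astype(float)
applicant = np.clip(honest + rng.integers(0, 2, (150, 10)), 1, 5)
print(faking_score(applicant, honest)[:5])
```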
Article
Self-presentation in a selection setting has largely been viewed as deviant and detrimental to validity, often simplified with the label “faking behaviour”. Yet, applicants may also express meaningful skills and motivation when presenting themselves. In this paper, we present an empirical test of a theory of self-presentation that takes this position. By simulating a complete selection process, from choosing a position to final decision-making about job offers, we test several key assumptions the model makes. When motivation was operationalized as willingness to deviate from one's true self-image, the findings provide partial support for the proposed antecedents of initial motivation, for motivational changes during the selection process, for the hypothesis that greater discrepancy between true self-image and perceived expectations lowers the motivation to self-present, and for the expected effects of analytical self-presentation skills. Hardly any support was found for the propositions when motivation was operationalized as willingness to adapt to the perceived employer’s ideals, or for the proposed antecedents of analytical skills.
... In addition to concerns about unfairly discriminating against applicants with particular ideological perspectives (Baron & Jost, 2019; van de Werfhorst, 2020), most psychometric assessments of personal values rely on self-report by applicants and are vulnerable to response distortion. Indeed, research has revealed that job applicants provide more socially desirable responses on many other types of assessments, including personality questionnaires (Birkeland et al., 2006; Cao & Drasgow, 2019; Schmit & Ryan, 1993; Ziegler et al., 2011) and interviews (Melchers et al., 2020), yet the impact of the employee selection context on the assessment of values remains unclear. This study aims to contribute to this important debate about the role of values assessment in personnel selection by investigating the ways in which job applicants respond to personal values assessments and the effect this has on the psychometric properties of such assessments. ...
... We observed moderate to large differences in socially desirable responding on Schwartz basic values in the employee selection context. The magnitude of these differences in values is broadly similar to those seen in the personality assessment domain (Anglim, Bozic, et al., 2018; Anglim et al., 2017; Birkeland et al., 2006). This ranged from much lower levels of endorsement of power to moderately larger levels of endorsement of universalism, conformity, and security, and moderately lower levels of self-direction, as well as slightly higher levels of tradition, and slightly lower levels of stimulation, hedonism, and achievement. ...
Article
Full-text available
Some scholars suggest that organizations could improve their hiring decisions by measuring the personal values of job applicants, arguing that values provide insights into applicants’ cultural fit, retention prospects, and performance outcomes. However, others have expressed concerns about response distortion and faking. The current study provides the first large-scale investigation of the effect of the job applicant context on the psychometric structure and scale means of a self-reported values measure. Participants comprised 7,884 job applicants (41% male; age M = 43.32, SD = 10.76) and a country-, age-, and gender-matched comparison sample of 1,806 non-applicants (41% male; age M = 44.72, SD = 10.97), along with a small repeated-measures, cross-context sample. Respondents completed the 57-item Portrait Values Questionnaire (PVQ) measuring Schwartz’ universal personal values. Compared to matched non-applicants, applicants reported valuing power and self-direction considerably less, and conformity and universalism considerably more. Applicants also reported valuing security, tradition, and benevolence more than non-applicants, and reported valuing stimulation, hedonism, and achievement less than non-applicants. Despite applicants appearing to embellish the degree to which their values aligned with being responsible and considerate workers, invariance testing suggested that the underlying structure of values assessment is largely preserved in job applicant contexts.
... As a result of these studies, it is now widely accepted that people can intentionally distort their responses (Byle & Holtgraves, 2008; Grubb & McDaniel, 2007; Ziegler et al., 2007) and applicants do fake responses in selection settings (Arthur et al., 2010; Fell & König, 2016; Griffith & Converse, 2012; Peterson et al., 2011; Roulin & Krings, 2020). Additionally, applicants differ in their ability and motivation to fake responses (Ellingson & McFarland, 2011; Griffith et al., 2011; Marcus, 2009; McFarland & Ryan, 2000; Pavlov et al., 2019; Roulin et al., 2016), and faked responses generally have a negative impact on the psychometric properties, decision making (e.g., rank order), and overall utility of personality tests (Birkeland et al., 2006; Donovan et al., 2014; Hough & Dilchert, 2017; Lee et al., 2017; Mueller-Hanson et al., 2003; Zickar et al., 2004). ...
... Earlier researchers on personality faking have concentrated mainly on the mean differences in scale scores across personality characteristics to assess faking effects (e.g., Birkeland et al., 2006; Byle & Holtgraves, 2008; Griffith et al., 2007; Hough et al., 1990; Lee et al., 2017; Stark et al., 2001; Viswesvaran & Ones, 1999). Our study is the first to disentangle trait-based response processes (i.e., indifference, direction, and intensity) from a single latent trait and to investigate the decision-making process of Likert-type personality items under honest and motivated testing conditions. ...
Article
Full-text available
The item response tree (IR-tree) model is increasingly used in the field of personality assessments. The IR-tree model allows researchers to examine the cognitive decision-making process using a tree structure and evaluate conceptual ideas in accounting for individual differences in the response process. Recent research has shown the feasibility of applying IR-tree models to personality data; however, these studies have been exclusively focused on an honest or incumbent sample rather than a motivated sample in a high-stakes situation. The goal of our research is to elucidate the internal mechanism behind how respondents in different testing situations (i.e., honest and motivated test conditions) experience decision-making processes through the three-process IR-tree model (Böckenholt, 2012). Our findings generally corroborate the response mechanism of the direction–extremity–midpoint sequence in both honest and motivated test settings. Additionally, samples in motivated test settings generally exhibit stronger directional and extreme response preferences but weaker preferences of midpoint selection than samples in unmotivated test settings. Furthermore, for actual job applicants, social desirability had a substantial effect on all directional, extreme, and midpoint response preferences. Our findings will aid researchers and practitioners in developing a nuanced understanding of the decision-making process of test-takers in motivated test environments. Furthermore, this research will help researchers and practitioners develop more fake-resistant personality assessments.
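A small sketch of the pseudo-item coding that underlies the three-process IR-tree model discussed above, assuming a 5-point Likert item; the midpoint/direction/extremity mapping follows the usual decomposition in this literature, not code from the article.
```python
def three_process_pseudo_items(response: int):
    """Map one 5-point Likert response (1-5) to (midpoint, direction, extremity)
    pseudo-items in the spirit of the three-process IR-tree decomposition.
    None means that branch is not reached for the given response."""
    if response == 3:
        return (1, None, None)            # indifference: midpoint chosen
    direction = 1 if response > 3 else 0  # agree vs. disagree side
    extremity = 1 if response in (1, 5) else 0
    return (0, direction, extremity)

for r in range(1, 6):
    print(r, three_process_pseudo_items(r))
```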
... Empirical research provides ample evidence that high-stakes responses fail to comply with measurement models developed for low-stakes settings (e.g., Birkeland et al., 2006; Schmit & Ryan, 1993). Psychometrically, faking behavior can manifest itself in multiple ways: ...
... 1) Skewed distributions of item and scale scores. Distributions of scores for desirable/undesirable attributes are negatively/positively skewed and often show ceiling/floor effects, respectively (Birkeland et al., 2006). Distributions of scores for ambivalent attributes can even be bimodal (Kuncel & Borneman, 2007). ...
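The distortion signatures listed above (skew toward the desirable pole, ceiling effects) are straightforward to screen for; the sketch below uses simulated applicant scores, and the function name and output format are illustrative choices.
```python
import numpy as np
from scipy.stats import skew

def distortion_flags(scores: np.ndarray, scale_max: float) -> dict:
    """Report the high-stakes signatures described above: skew toward the
    desirable pole and the share of respondents at the scale ceiling."""
    return {
        "skewness": round(float(skew(scores)), 2),    # negative = piled near the top
        "ceiling_share": round(float(np.mean(scores >= scale_max)), 2),
    }

# Simulated applicant scores on a desirable 1-5 trait (illustrative only).
rng = np.random.default_rng(4)
scores = np.clip(rng.normal(4.4, 0.6, 500), 1, 5)
print(distortion_flags(scores, scale_max=5.0))
```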
Article
Full-text available
In high-stakes assessments of personality and similar attributes, test takers may engage in impression management (aka faking). This article proposes to consider the responses of every test taker as a potential mixture of "real" (or retrieved) answers to questions and "ideal" answers intended to create a desired impression, with each type of response characterized by its own distribution and factor structure. Depending on the particular mix of response types in the test taker's profile, grades of membership in the "real" and "ideal" profiles are defined. This approach overcomes the limitation of existing psychometric models that assume faking behavior to be consistent across test items. To estimate the proposed faking-as-grade-of-membership (F-GoM) model, two-level factor mixture analysis is used, with two latent classes at the response (within) level, allowing grade of membership in "real" and "ideal" profiles, each underpinned by its own factor structure, at the person (between) level. For collected data, units of analysis can be item or scale scores, with the latter enabling analysis of questionnaires with many measured scales. The performance of the F-GoM model is evaluated in a simulation study and compared against existing methods for statistical control of faking in an empirical application using archival recruitment data, which supported the validity of the latent factors and classes assumed by the model using multiple control variables. The proposed approach is particularly useful for high-stakes assessment data and can be implemented with standard software packages.
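A purely generative toy version of the grade-of-membership idea above: each observed response mixes a "real" and an "ideal" answer according to a person-level membership weight. This simulates the assumed data-generating story only; it is not the two-level factor mixture estimation the article uses, and all parameters are invented.
```python
import numpy as np

# Toy data-generating sketch for faking-as-grade-of-membership (illustrative only).
rng = np.random.default_rng(5)
n_persons, n_items = 200, 8
g = rng.beta(2, 5, n_persons)                       # grade of membership in the "ideal" profile
real = rng.normal(3.2, 0.7, (n_persons, n_items))   # retrieved self-descriptions
ideal = rng.normal(4.6, 0.3, (n_persons, n_items))  # desired-impression answers
use_ideal = rng.random((n_persons, n_items)) < g[:, None]  # item-level mixing
observed = np.clip(np.where(use_ideal, ideal, real), 1, 5)
print(f"observed mean = {observed.mean():.2f} vs. real mean = {real.mean():.2f}")  # elevation from mixing
```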
... However, the results of studies that have explored the impact of the construct being faked have been less clear than the results of studies on other faking conditions. For example, Steffens (2004) demonstrated more faking on extraversion than on conscientiousness in IATs and self-reports, whereas Birkeland et al. (2006), who investigated only self-reports, demonstrated more faking on conscientiousness than on extraversion. ...
Article
Research has shown that even experts cannot detect faking above chance, but recent studies have suggested that machine learning may help in this endeavor. However, faking differs between faking conditions, previous efforts have not taken these differences into account, and faking indices have yet to be integrated into such approaches. We reanalyzed seven data sets (N = 1,039) with various faking conditions (high and low scores, different constructs, naïve and informed faking, faking with and without practice, different measures [self-reports vs. implicit association tests; IATs]). We investigated the extent to which and how machine learning classifiers could detect faking under these conditions and compared different input data (response patterns, scores, faking indices) and different classifiers (logistic regression, random forest, XGBoost). We also explored the features that classifiers used for detection. Our results show that machine learning has the potential to detect faking, but detection success varies between conditions from chance levels to 100%. There were differences in detection (e.g., detecting low-score faking was better than detecting high-score faking). For self-reports, response patterns and scores were comparable with regard to faking detection, whereas for IATs, faking indices and response patterns were superior to scores. Logistic regression and random forest worked about equally well and outperformed XGBoost. In most cases, classifiers used more than one feature (faking occurred over different pathways), and the features varied in their relevance. Our research supports the assumption of different faking processes and explains why detecting faking is a complex endeavor.
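A minimal sketch of the kind of classifier comparison described above, using scikit-learn's logistic regression and random forest on simulated honest versus faked response patterns; the data generation and any resulting accuracies are illustrative, not a reanalysis of the article's seven data sets.
```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Simulated protocols: honest responses vs. elevated, compressed faked responses.
rng = np.random.default_rng(6)
honest = np.clip(rng.normal(3.0, 0.9, (200, 20)), 1, 5)
faked = np.clip(rng.normal(3.8, 0.5, (200, 20)), 1, 5)
X = np.vstack([honest, faked])
y = np.repeat([0, 1], 200)  # 0 = honest, 1 = faked

for clf in (LogisticRegression(max_iter=1000), RandomForestClassifier(n_estimators=200)):
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(type(clf).__name__, round(acc, 2))
```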
... One reason is that applicants have an inherent incentive to present themselves as better than they are (Bangerter et al., 2012). This might lead to misrepresentation or even fraud in résumés (e.g., Henle et al., 2019), to faking in personality inventories (e.g., Birkeland et al., 2006) and situational judgment tests (e.g., Peeters & Lievens, 2005), and to impression management in interviews (e.g., Ingold et al., 2015) and Assessment Centres (ACs; e.g., McFarland et al., 2003). A second reason is that many personnel selection methods require human decisions, making them prone to human biases. ...
Chapter
Full-text available
Machine learning (ML) approaches, a subfield of artificial intelligence (AI), promise advancements in the field of personnel selection. This chapter introduces ML approaches to personnel selection practitioners and researchers in a non-technical way. We review the empirical research to date, specifically research that has looked at the potentials of ML approaches (in particular the increased prediction power) as well as the challenges and disadvantages of such approaches. We explain that the assumption of bias-free ML approaches is unwarranted, and that there might be negative reactions among applicants and users. We close this chapter by providing suggestions for highly needed research to demonstrate the validity of ML approaches to selection, to analyse the human-AI interface, and to more closely examine the reactions of applicants, users, and further stakeholders.
... Although the induced faking design has the methodological advantage of controlling for selection bias, it can be criticized as lacking external validity (Birkeland et al., 2006) and not representing the motivation of real job applicants. Another limitation due to the design of Study 1 is that the target jobs were chosen to clearly represent the corresponding RIASEC domains, which may make it easier for respondents to identify and inflate their scores on the matched domain. ...
Article
Full-text available
With the increasing popularity of non-cognitive inventories in personnel selection, organizations typically wish to be able to tell when a job applicant purposefully manufactures a favorable impression. Past faking research has primarily focused on how to reduce faking via instrument design, warnings, and statistical corrections for faking. This paper took a new approach by examining the effects of faking (experimentally manipulated and contextually driven) on response processes. We modified a recently introduced item response theory tree modeling procedure, the three-process model (Böckenholt, 2013), to identify faking in two studies. Study 1 examined self-reported vocational interest assessment responses using an induced faking experimental design. Study 2 examined self-reported personality assessment responses when some people were in a high-stakes situation (i.e., selection). Across the two studies, individuals instructed or expected to fake were found to engage in more extreme responding. By identifying the underlying differences between fakers and honest respondents, the new approach improves our understanding of faking. Percentage cut-offs based on extreme responding produced a faker classification precision of 85% on average.
... Under the lying condition, interviewers indeed spotted participants who showed various deception cues. However, explicitly asking participants to lie may overestimate faking results (cf. Birkeland et al., 2006). Studies observing the more nuanced and subtle deceptive IM happening during regular selection interviews (Roulin & Powell, 2018; Schneider et al., 2015) yielded less consistent results for behavioral cues or CBCA. ...
Article
Full-text available
Impression management (IM), especially deceptive IM (faking), is a cause for concern in selection interviews. The current study combines findings on lie detection with signaling theory to address how candidates’ deceptive versus honest IM shows in verbal deception cues, which then relate to interview ratings of candidates’ interview performance. After completing a structured interview rated by two trained interviewers, 182 candidates reported their deceptive and honest IM. Verbal deception cues (plausibility, verbal uncertainty) were coded from video recordings. Results supported the hypotheses: Deceptive IM directly raised interviewer ratings (intended positive signal) but lowered the responses’ plausibility and enhanced verbal uncertainties (unintended negative signals). Honest IM raised responses’ plausibility. Plausibility related positively to interviewer ratings (receiver reaction), thus accounting for a negative indirect effect of deceptive IM and a positive indirect effect of honest IM on interviewer ratings. This study contributes to theory and practice regarding faking detection in employment interviews.
... However, a candidate will not necessarily choose the "right answer" even if they can spot it. It is hypothesized that whether a candidate will choose the "right answer" over the real answer depends on their character, the size of the difference in social desirability of the items, and the stakes of the assessment (e.g., Birkeland et al., 2006). It follows that the threshold at which socially desirable responding becomes a problem could vary depending on the assessment setting and purpose, with high-stakes assessments demanding stricter social desirability balancing, while low-stakes assessments can use more lenient criteria. ...
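One simple way to operationalize the social desirability balancing discussed above is to pair items whose desirability ratings fall within a chosen threshold of each other; the greedy routine below is an illustrative sketch under that assumption, not the matching procedure used in the article.
```python
import numpy as np

def pair_items_by_desirability(desirability: np.ndarray, threshold: float):
    """Greedily pair adjacent items (after sorting by rated desirability) whose
    ratings differ by less than `threshold`; unpaired items are returned
    separately. A stricter (smaller) threshold mirrors the stricter balancing
    suggested above for high-stakes use."""
    order = list(np.argsort(desirability))
    pairs, leftover = [], []
    while len(order) >= 2:
        a, b = order[0], order[1]
        if abs(desirability[a] - desirability[b]) < threshold:
            pairs.append((int(a), int(b)))
            order = order[2:]
        else:
            leftover.append(int(a))
            order = order[1:]
    leftover.extend(int(i) for i in order)
    return pairs, leftover

# Hypothetical desirability ratings (1-7) for eight candidate items.
ratings = np.array([5.1, 5.2, 3.9, 6.4, 6.5, 4.0, 5.8, 5.9])
print(pair_items_by_desirability(ratings, threshold=0.3))
```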
Article
Several forced-choice (FC) computerized adaptive tests (CATs) have emerged in the field of organizational psychology, all of them employing ideal-point items. However, despite the fact that most items developed historically follow dominance response models, research on FC CAT using dominance items is limited, heavily dominated by simulations, and lacking in empirical deployment. This empirical study trialed an FC CAT with dominance items described by the Thurstonian Item Response Theory model with research participants. This study investigated important practical issues such as the implications of adaptive item selection and social desirability balancing criteria for score distributions, measurement accuracy, and participant perceptions. Moreover, nonadaptive but optimal tests of similar design were trialed alongside the CATs to provide a baseline for comparison, helping to quantify the return on investment when converting an otherwise-optimized static assessment into an adaptive one. Although the benefit of adaptive item selection in improving measurement precision was confirmed, results also indicated that at shorter test lengths CAT had no notable advantage compared with optimal static tests. Taking a holistic view incorporating both psychometric and operational considerations, implications for the design and deployment of FC assessments in research and practice are discussed.
... In recent years, researchers have paid increasing attention to the problem of faking in psychological measurement. Much of the research has highlighted that faking on psychological measures can be a serious issue, because people do fake (e.g., Birkeland et al., 2006; Viswesvaran & Ones, 1999), and faking has an influence on scale means (e.g., Rosse et al., 1998; Stark et al., 2001; Viswesvaran & Ones, 1999), rank orders (e.g., Christiansen et al., 1994; Rosse et al., 1998), and the validity of test scores (e.g., Bäckström et al., 2009). Furthermore, measures that were considered to be supposedly immune to faking have turned out to be fakeable (e.g., Röhner et al., 2011; Röhner & Ewers, 2016; Röhner & Lai, 2020). ...
Article
Full-text available
Faking detection is an ongoing challenge in psychological assessment. A notable approach for detecting fakers involves the inspection of response latencies and is based on the congruence model of faking. According to this model, respondents who fake good will provide favorable responses (i.e., congruent answers) faster than they provide unfavorable (i.e., incongruent) responses. Although the model has been validated in various experimental faking studies, to date, research supporting the congruence model has focused on scales with large numbers of items. Furthermore, in this previous research, fakers have usually been warned that faking could be detected. In view of the trend to use increasingly shorter scales in assessment, it becomes important to investigate whether the congruence model also applies to self-report measures with small numbers of items. In addition, it is unclear whether warning participants about faking detection is necessary for a successful application of the congruence model. To address these issues, we reanalyzed data sets of two studies that investigated faking good and faking bad on extraversion (n = 255) and need for cognition (n = 146) scales. Reanalyses demonstrated that having only a few items per scale and not warning participants represent a challenge for the congruence model. The congruence model of faking was only partly confirmed under such conditions. Although faking good on extraversion was associated with the expected longer latencies for incongruent answers, all other conditions remained nonsignificant. Thus, properties of the measurement and properties of the procedure affect the successful application of the congruence model.
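The congruence model's central prediction reduces to a latency contrast that is straightforward to compute: for fake-good respondents, favorable (congruent) answers should come faster than unfavorable (incongruent) ones. Here is a minimal sketch on simulated latencies; the sample size echoes one of the reanalyzed studies, but all data and the latency gap are invented.

```python
# Congruence-model check: compare mean response latencies for congruent
# (favorable) vs. incongruent (unfavorable) answers under fake-good
# instructions, using a paired t-test over respondents. Data simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_persons, n_items = 255, 12  # few items per scale, as discussed above

# Simulated latencies (ms): congruent answers drawn slightly faster.
lat_congruent = rng.normal(1500, 250, size=(n_persons, n_items))
lat_incongruent = rng.normal(1650, 250, size=(n_persons, n_items))

person_congr = lat_congruent.mean(axis=1)
person_incongr = lat_incongruent.mean(axis=1)

t, p = stats.ttest_rel(person_incongr, person_congr)
print(f"mean incongruent - congruent = "
      f"{(person_incongr - person_congr).mean():.0f} ms, t = {t:.2f}, p = {p:.4f}")
```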
... Organizations and managers utilize them to make hiring decisions, form teams, and assess performance (Moscoso & Salgado, 2004). However, there have been concerns about the efficacy of personality tests, as well as fears that people could manipulate responses to fake certain personality traits (Birkeland et al., 2006). There are also legal limits on who can take them: U.S. courts have ruled that employer mandates for personality tests may discriminate against certain persons with disabilities. Organizations can also be limited in their access to certain populations outside the organization. ...
Article
Full-text available
This article examines Crystal Knows, a company that generates automated personality profiles through an algorithm and sells access to their database. These algorithms are the result of a long line of research into computational and predictive algorithms that track social media practices and uses them to infer individual characteristics and make psychometric assessments. Although it is now computationally possible, these algorithms are not widely known or understood by the general public. Little is known about how people would respond to them, particularly when they do not even know their online activities are being assessed by the algorithm. This study examines how people construct “snap” folk theories about the ways personality algorithms operate as well as how they react when shown their outputs. Through qualitative interviews ( n = 37) with people after being presented with their own profile, this study identifies a series of folk theories that people came up with to explain the personality algorithm across four dimensions (data source, scope, collection process, and outputs). In addition, this study examined how those folk theories contributed to certain reactions, fears, and justifications people had about the algorithm. This study builds on our theoretical understanding of folk theory literature as well as certain limitations of algorithmic transparency/sovereignty when these types of inferential and predictive algorithms get coupled with people’s hopes and fears about employment, hiring, and promotion.
... If satisficing is extreme, however, this would constitute CR since respondents are no longer exerting the sufficient level of effort needed for optimal survey responding. CR is also distinct from other response sets such as faking (e.g., Birkeland, Manson, Kisamore, Brannick, & Smith, 2006; Komar, Brown, Komar, & Robie, 2008) and response styles more generally (e.g., Grau et al., 2019; Van Vaerenbergh & Thomas, 2013; Weijters, Schillewaert, & Geuens, 2008). For instance, while faking is similar to CR in that it distorts participants' true standing on a construct, it is effortful in that participants are attempting to denote a particular level of the construct being assessed. ...
Thesis
Surveys are one of the most popular, useful, and efficient methods for gathering data within both academic and organizational settings. Despite the many benefits afforded by surveys, research shows that a nontrivial number of people engage in careless responding (CR) such that their responses to surveys are not effortful, accurate, or valid. This is problematic because the failure to account for CR can distort research findings, result in false theoretical conclusions, and lead to precarious workplace decisions. With surveys, it is common to model responses with a latent variable framework and use fit indices to draw conclusions about how well the model represents the data. Research shows that CR can be associated with poor fit or good fit, or can be unrelated to fit. To better understand how CR affects fit, the primary goal of this study was to examine the consequences of CR on the fit of latent variable models using a comprehensive, realistic, and rigorous simulation paradigm. A secondary goal was to better elucidate the nature of CR and specify how CR behavior manifests within survey responses. In Study 1, participants' survey response patterns were experimentally shaped. In Study 2, these results were used as a basis for the primary simulation. A total of 144 conditions (which varied the sample size, number of items, CR prevalence, CR severity, and CR type), two latent variable models (item response theory and confirmatory factor analysis), and six model fit indices were examined. Overall, the results of this study show that CR is frequently associated with deteriorations in model fit. These effects are, however, highly nuanced, variable, and contingent on many factors. Moreover, this study demonstrates that good fit is not necessarily indicative of careful responding nor is poor fit always emblematic of CR. Rather, model fit and CR/response validity are distinct issues that must be separately addressed. These findings can be leveraged by researchers to develop more accurate theories and practitioners to better manage survey data that is laden with CR.
... This directed faking methodology is necessary to allow researchers to be confident a particular person is indeed faking (i.e., faking behavior is experimentally controlled), but likely exaggerates the behavior. In high-stakes assessment and selection scenarios, faking may be more subtle and strategic, depending on a person's motivation and attitudes, among numerous other individual and contextual factors (Birkeland et al., 2006; Roulin et al., 2016). Our classification approach may well successfully identify such extreme faking behavior in practice, but further research is needed to test whether this methodology is as successful at detecting faking under realistic job application responding scenarios. ...
Preprint
Forced choice (FC) personality measures are increasingly popular in research and applied contexts. To date, however, no method for detecting faking behavior on this format has been both proposed and empirically tested. We introduce a new methodology for faking detection on FC measures, based on the assumption that individuals engaging in faking try to approximate the ideal response on each block of items. Individuals' responses are scored relative to the ideal using a model for rank-order data not previously applied to FC measures (the Generalized Mallows Model). Scores are then used as predictors of faking in a regularized logistic regression. In Study 1, we test our approach using cross-validation, and contrast generic and job-specific ideal responses. Study 2 replicates our methodology on two measures matched and mismatched on item desirability. We achieved between 80 and 92% balanced accuracy in detecting instructed faking, and predicted probabilities of faking correlated with self-reported faking behavior. We discuss how this approach, driven by trying to capture the faking process, differs methodologically and theoretically from existing faking detection paradigms, as well as measure- and context-specific factors impacting accuracy.
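The core of the detection idea can be sketched without the full Generalized Mallows Model machinery: score each block by the Kendall tau distance between a respondent's rank order and an assumed ideal order, then feed those distances to a regularized logistic regression. The sketch below is a simplified stand-in, not the authors' implementation; the ideal orders, data, and faking behavior are all invented.

```python
# Simplified stand-in for the preprint's approach: per-block Kendall tau
# distance to an assumed "ideal" (most desirable) rank order, used as
# features in an L2-regularized logistic regression predicting instructed
# faking. All data and ideal orders are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def kendall_distance(ranking, ideal):
    """Number of item pairs ordered differently from the ideal ranking."""
    pos = {item: i for i, item in enumerate(ideal)}
    r = [pos[item] for item in ranking]
    return sum(1 for i in range(len(r)) for j in range(i + 1, len(r))
               if r[i] > r[j])

rng = np.random.default_rng(1)
n, n_blocks, block_size = 400, 20, 4
ideal = list(range(block_size))          # assumed ideal order per block

y = rng.integers(0, 2, size=n)           # 1 = instructed faker (known labels)
X = np.empty((n, n_blocks))
for p in range(n):
    for b in range(n_blocks):
        ranking = list(rng.permutation(block_size))
        if y[p] == 1 and rng.random() < 0.7:
            ranking = ideal[:]           # fakers often reproduce the ideal
        X[p, b] = kendall_distance(ranking, ideal)

clf = LogisticRegression(penalty="l2", C=1.0).fit(X, y)
print("mean predicted P(faking) for fakers:",
      clf.predict_proba(X)[y == 1, 1].mean().round(2))
```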
... Most assessments are currently conducted with rating scale questionnaires. Faking seems to be quite prevalent on rating scales, resulting in increases of .1 to .6 SD in trait estimates (Birkeland et al., 2006; Viswesvaran & Ones, 1999) in real or simulated high-stakes situations. ...
Article
Full-text available
The multidimensional forced-choice (MFC) format has been proposed to reduce faking because items within blocks can be matched on desirability. However, the desirability of individual items might not transfer to the item blocks. The aim of this paper is to propose a mixture item response theory model for faking in the MFC format that allows the fakability of MFC blocks to be estimated, termed the Faking Mixture model. Given current computing capabilities, within-subject data from both high- and low-stakes contexts are needed to estimate the model. A simulation showed good parameter recovery under various conditions. An empirical validation showed that matching was necessary but not sufficient to create an MFC questionnaire that can reduce faking. The Faking Mixture model can be used to reduce fakability during test construction.
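Estimating the Faking Mixture model itself requires mixture-IRT machinery, but the within-subject quantity it formalizes can be screened descriptively: how often respondents switch toward a block's most desirable option from the low-stakes to the high-stakes administration. The sketch below is a rough proxy for, not an implementation of, the model; the data and desirability keys are invented.

```python
# Descriptive proxy for block fakability: net share of respondents who
# switch toward the most desirable option from the low-stakes to the
# high-stakes administration of the same forced-choice block.
# Within-subject data are simulated; desirability keys are invented.
import numpy as np

rng = np.random.default_rng(7)
n, n_blocks = 300, 15
desirable = rng.integers(0, 3, size=n_blocks)   # most desirable option per block

low = rng.integers(0, 3, size=(n, n_blocks))    # low-stakes choices
high = low.copy()                               # high-stakes choices
fakable = np.arange(n_blocks) < 5               # first 5 blocks poorly matched
for b in np.where(fakable)[0]:
    switch = rng.random(n) < 0.35               # some respondents fake
    high[switch, b] = desirable[b]

toward = (high == desirable) & (low != desirable)
away = (low == desirable) & (high != desirable)
fakability = (toward.mean(axis=0) - away.mean(axis=0)).round(2)
print("net switching toward desirable option per block:", fakability)
```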
... Faking is a frequent phenomenon and common issue in psychological testing (Hall and Hall 2011). For example, in job selection contexts, participants tend to answer in a way that makes them appear more conscientious and emotionally stable than they actually are (Birkeland et al. 2006; Viswesvaran and Ones 1999). Similarly, in clinical assessment, malingering (faking symptoms) is prevalent (Hall and Hall 2011). ...
Article
Full-text available
Socio-emotional abilities have been proposed as an extension to models of intelligence, but earlier measurement approaches have either not fulfilled criteria of ability measurement or have covered only predominantly receptive abilities. We argue that faking ability—the ability to adjust responses on questionnaires to present oneself in a desired manner—is a socio-emotional ability that can broaden our understanding of these abilities and intelligence in general. To test this theory, we developed new instruments to measure the ability to fake bad (malingering) and administered them jointly with established tests of faking good ability in a general sample of n = 134. Participants also completed multiple tests of emotion perception along with tests of emotion expression posing, pain expression regulation, and working memory capacity. We found that individual differences in faking ability tests are best explained by a general factor that had a large correlation with receptive socio-emotional abilities and had a zero to medium-sized correlation with different productive socio-emotional abilities. All correlations remained small after controlling for shared variance with general mental ability, as indicated by tests of working memory capacity. We conclude that faking ability is indeed correlated meaningfully with other socio-emotional abilities and discuss the implications for intelligence research and applied ability assessment.
... It is well known that human factors are associated with a substantial number of military aviation accidents (de Hoyos, 2019). To avoid such catastrophic events, continuous enhancement of pilot selection processes using state-of-the-art scientific research is needed, since traditionally used self-report paper-and-pencil personality measures are highly prone to self-report bias (Birkeland et al., 2006). More objective and reliable selection processes can also minimize drop-out rates among pilot candidates in the later and more expensive stages of the training process. ...
... This finding is interesting given that the interview was for a business manager position. Meta-analytic evidence suggests that applicants typically fake to appear more agreeable for nonmanagerial positions much more than for managerial positions (Birkeland et al., 2006). This study suggests that applicants may benefit from displaying higher levels of agreeableness (and H-H) for managerial positions as well. ...
Article
Full-text available
Deceptive impression management (i.e., faking) may alter interviewers’ perceptions of applicants’ qualifications, and consequently, decrease the predictive validity of the job interview. In examining faking antecedents, research has given little attention to situational variables. Using a between-subjects experiment, this research addressed that gap by examining whether organizational culture impacted both the extent to which applicants faked and the manner in which they faked during a job interview. Analyses of variance revealed that organizational culture did not affect the extent to which applicants faked. However, when taking into account applicants’ perceptions of the ideal candidate, organizational culture was found to indirectly impact the manner in which applicants faked their personality (agreeableness and honesty-humility). Overall, the findings suggest that applicants may be able to fake their personality traits during job interviews to increase their person-organization fit. Full text: https://scholarworks.bgsu.edu/pad/vol7/iss1/8/
... The study of faking on personality tests has a long history in selection (Birkeland et al., 2006;Viswesvaran & Ones, 1999). Faking involves effort to "present misleading and deceptive information about [one's] personality, interests, experiences, past behaviors, and attitudes" (Kuncel & Borneman, 2007, p. 221). ...
Article
Full-text available
New assessments were developed measuring conscientiousness and openness by gamifying two existing Likert‐type personality measures by adding a story, a measure redesign process called storification. Both the original and storified versions of the measures were administered to 426 people to compare the measures' convergence, incremental prediction of performance, magnitude of faking effects, and reactions using a counter‐balanced within‐subjects fake‐good research design. Moderate convergence of latent traits was observed between the original and storified measures for both conscientiousness (β = .45) and openness (β = .72). Convergence at the item level was generally poor (−.02 < r < .40 with mean convergence = 0.17). The storified measures predicted performance more poorly than the original measures on which they were based. When instructed to fake, the conscientiousness measure showed improved resistance to faking, but the openness measure did not. Reactions (e.g., face validity perceptions, predictive validity perceptions) were more positive to the storified measures in terms of procedural justice and general fairness perceptions. Overall, the present attempt at storification was more successful at creating alternative measures of traits rather than parallel ones, suggesting storification may be better considered a distinct scale development strategy instead of a redesign technique that maintains the psychometric profile of an original measure. Success differed between the two measures, suggesting further research on story design is needed.
... Yet, comparing the detected range of response distortion to previous findings on faking in personality inventories, the effect sizes of the current study seem to be larger than previous findings. For example, Birkeland, Manson, Kisamore, Brannick, and Smith (2006), as well as Viswesvaran and Ones (1999), claim that the extent of response distortion in personality inventories may be up to one standard deviation. Specifically, conscientiousness scores, as well as neuroticism scores, could be augmented by nearly an entire standard deviation; scores for extraversion, openness, and agreeableness tended to be augmented by around half a standard deviation (corresponding to η² = .20, ...
Article
Full-text available
Background: This study examines people's ability to fake their reported health behavior and explores the magnitude of such response distortion concerning faking of preventive health behavior and health risk behavior. As health behavior is a sensitive topic, people usually prefer privacy about it or they wish to create a better image of themselves (Fekken et al., 2012; Levy et al., 2018). Nevertheless, health behavior is often assessed by self-report questionnaires that are prone to faking. Therefore, it is important to examine the possible impact of such faking. Methods: To replicate the findings and test their robustness, two study designs were implemented. In the within-subjects design, 142 participants repeatedly answered a health behavior questionnaire with an instruction to answer honestly, fake good, and fake bad. In the between-subjects design, 128 participants were randomly assigned to one of three groups that filled out the health behavior questionnaire with only one of the three instructions. Results: Both studies showed that successful faking of self-reported preventive and health risk behavior was possible. The magnitude of such faking effects was very large in the within-subjects design and somewhat smaller in the between-subjects design. Conclusion: Even though each design has its inherent merits and problems, caution is indicated regarding faking effects.
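The two designs in this study correspond to two different effect-size computations, which partly explains why within-subjects estimates tend to run larger. Here is a minimal sketch contrasting a within-subjects d (paired scores, standardized by the honest-condition SD) with a pooled-SD between-subjects d; the sample sizes echo the study but the data and shift sizes are invented.

```python
# Cohen's d computed two ways, mirroring the two designs above: a
# between-subjects d with a pooled SD, and a within-subjects d based on
# each person's honest vs. fake-good scores (standardized by the honest
# SD so the two metrics stay comparable). All data are simulated.
import numpy as np

rng = np.random.default_rng(3)

def d_between(g1, g2):
    sp = np.sqrt(((len(g1) - 1) * g1.var(ddof=1) +
                  (len(g2) - 1) * g2.var(ddof=1)) / (len(g1) + len(g2) - 2))
    return (g1.mean() - g2.mean()) / sp

def d_within(fake, honest):
    return (fake - honest).mean() / honest.std(ddof=1)

# Within-subjects: the same 142 people answer honestly, then fake good.
honest = rng.normal(3.0, 0.6, 142)
fake = honest + rng.normal(0.9, 0.4, 142)     # large shift, as reported
print("within-subjects d:", round(d_within(fake, honest), 2))

# Between-subjects: independent honest and fake-good groups.
g_honest = rng.normal(3.0, 0.6, 43)
g_fake = rng.normal(3.5, 0.7, 43)             # smaller apparent shift
print("between-subjects d:", round(d_between(g_fake, g_honest), 2))
```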
... e.g., only board members) to produce even more differentiated validation findings. Since the socially desirable responding that arises in selection contexts (Birkeland, Manson, Kisamore, Brannick & Smith, 2006) does not threaten the validity of personality tests (Ones, Viswesvaran & Reiss, 1996), the administration context was not examined separately here and the robustness of the results was assumed. Future research could nevertheless use the subsamples to examine the effects of the administration context specifically for the BIP. ...
Article
Full-text available
The Bochumer Inventar zur berufsbezogenen Persönlichkeitsbeschreibung (BIP; Bochum Inventory of Job-Related Personality Description) assesses job-related personality traits and can explain various measures of subjective and objective career success in linear regressions. To provide additional evidence of criterion validity, the present work uses cluster-analytic and classification methods. Using k-means cluster analyses, typical personality structures can be identified: individuals characterized by flexibility and a motivation to shape their environment show a meaningful association with higher pay, whereas those marked by emotional instability and low assertiveness often earn low pay. Classical and newer classification methods (logistic regressions and random forests, respectively) achieve substantial hit rates in identifying employees as specialists or managers. The results qualify as medium to large effects and thus provide evidence of the relevance of personality for career success. The article is available open access at https://econtent.hogrefe.com/doi/10.1026/0932-4089/a000377.
... Despite the predictive value of Honesty-Humility for organizations, methods used to measure this trait in a selection context are limited and have received significantly less attention compared to other personality traits. In non-selection settings, such as laboratory experiments where the stakes are low, Honesty-Humility can be accurately assessed using self-report personality measures (Birkeland et al., 2006). However, in high-stakes selection settings, self-reports can present several issues, including candidates' ability to present themselves in an ideal light, affecting who is hired irrespective of their ability to perform on the job (Robie et al., 2007; Winkelspecht et al., 2006), and negative candidate perceptions of the selection system (Morgeson et al., 2007). ...
Article
Honesty-Humility is a valuable predictor in personnel selection; however, problems with self-report measures create a need for new tools to judge this trait. Therefore, this research examines the interview as an alternative for assessing Honesty-Humility and how to improve judgments of Honesty-Humility in the interview. Using trait activation theory, we examined the impact of interview question type on Honesty-Humility judgment accuracy. We hypothesized that general personality-tailored questions and probes would increase the accuracy of Honesty-Humility judgments. Nine hundred thirty-three Amazon Mechanical Turk workers watched and rated five interviews. Results showed that general questions with probes and specific questions without probes led to the best Honesty-Humility judgments. These findings support the realistic accuracy model and provide implications for Honesty-Humility-based interviews.
... It also has important applied implications for high-stakes testing contexts, such as employee selection. Job applicants can and do distort their responses to personality assessments (Anglim et al., 2017, 2018; Birkeland et al., 2006; Rothstein & Goffin, 2006). As such, practitioners are keen to identify strategies that reduce the impact of socially desirable responding bias, such as subtle items (Worthington & Schlottmann, 1986), forced-choice formats (Bartram, 2007), and warnings (McFarland, 2003). ...
Article
Full-text available
Researchers and practitioners have long been concerned about detrimental effects of socially desirable responding on the structure and criterion validity of personality assessments. The current research examined the effect of reducing evaluative item content of a Big Five personality assessment on test structure and criterion validity. We developed a new public domain measure of the Big Five called the Less Evaluative Five Factor Inventory (LEFFI), adapted from the standard 50-item IPIP NEO, and intended to be less evaluative. Participants (n = 3164) then completed standard (IPIP) and neutralized (LEFFI) measures of personality. Criteria were also collected, including academic grades, age, sex, smoking, alcohol consumption, exercise, protesting, religious worship, music preferences, dental hygiene, blood donation, other-rated communication styles, other-rated HEXACO personality, and cognitive ability (ICAR). Evaluativeness of items was reduced in the neutralized measure. Cronbach's alpha and test-retest reliability were maintained. Correlations between the Big Five were reduced in the neutralized measure, and criterion validity was similar or slightly reduced. The large sample size and use of objective criteria extend past research. The study also contributes to debates about whether the general factor of personality and agreement with socially desirable content reflect substance or bias.
... Faking is defined as "conscious distortions of answers to the interview questions in order to obtain a better score on the interview and/or create favorable perceptions" (Levashina & Campion, 2007, p. 1639). Faking is one of the most widely discussed issues among test takers (Goffin & Boyd, 2009) and thus has been widely examined (Birkeland et al., 2006). Melchers et al.'s (2020) review of the existing literature on applicant faking in selection interviews found that most applicants fake at least to some degree. ...
Article
It is well‐established that selection decisions can be improved using multiple non‐redundant assessments. Two such assessments are cognitive ability and conscientiousness. Though meta‐analytic findings demonstrate little or no relationship between cognitive ability and conscientiousness, faking research suggests the two variables are related when test‐takers are motivated to fake. We extend this logic by showing that incremental validity for conscientiousness declines when respondents fake, due to enhanced collinearity between conscientiousness and cognitive ability. Three studies, employing within‐subjects designs and utilizing three different faking conditions, reveal a consistent increase in collinearity between conscientiousness and cognitive ability when respondents are motivated to fake, leading to reduced incremental validity of conscientiousness beyond cognitive ability in predicting performance. Implications for selection systems are discussed.
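The collinearity argument can be demonstrated in a few lines: if faking makes measured conscientiousness partly a function of cognitive ability, the incremental R² of conscientiousness beyond cognitive ability must shrink. Below is a minimal simulation sketch; the effect sizes and the faking-contamination model are invented, not taken from the three studies.

```python
# Illustration of the collinearity argument: when faking makes measured
# conscientiousness partly a function of cognitive ability, the
# incremental R^2 of conscientiousness beyond cognitive ability shrinks.
# All data and coefficients are simulated/invented.
import numpy as np

rng = np.random.default_rng(11)
n = 5000
gma = rng.normal(size=n)
consc_true = rng.normal(size=n)                # honest: r(GMA, C) ~ 0
perf = 0.5 * gma + 0.3 * consc_true + rng.normal(scale=0.8, size=n)

def r2(X, y):
    X = np.column_stack([np.ones(len(y)), X])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - resid.var() / y.var()

# Faked condition: measured C is inflated more by the cognitively able.
consc_faked = 0.6 * consc_true + 0.6 * gma + rng.normal(scale=0.5, size=n)

for label, c in [("honest", consc_true), ("faked", consc_faked)]:
    inc = r2(np.column_stack([gma, c]), perf) - r2(gma, perf)
    print(f"{label}: r(GMA, C) = {np.corrcoef(gma, c)[0, 1]:.2f}, "
          f"incremental R^2 of C = {inc:.3f}")
```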
Book
Full-text available
Psychometric tests are instruments of great importance both for the development of psychological theory and for solving practical problems in selection, assessment, and diagnosis. Given the importance of tests at both the scientific and professional level, it is essential that users of these instruments know the limits and scope of these techniques. This manual can be divided into two sections. The first develops the conceptual foundations of psychometric theory while maintaining an applied perspective on these concepts. Despite the book's introductory focus, we have tried not to lose rigor in covering topics related to psychometric theory and standards. The second section of the book focuses on a review of instruments useful for psychological assessment in different areas of professional practice. It thus analyzes psychometric instruments in wide use in educational, clinical, and organizational settings. With this brief manual, the reader is expected to acquire the knowledge and competencies needed to understand the utility and limitations of psychological tests, the skills to select a test and judge the quality of published tests, and the ability to administer a test, interpret it, and communicate the results appropriately, thereby enabling an ethical and responsible use of psychometric tests.
Article
Three recent meta‐analyses have found that interests, regardless of scoring method (e.g., summative or congruence), are valid predictors of performance in employment contexts. As these inventories gain popularity as a prehire assessment, it is important to understand whether interest assessments are susceptible to faking (i.e., applicant response distortion) similar to how personality assessments are. Thus, the purpose of this study was to investigate whether interest inventories are susceptible to faking and whether faking can be detected and prevented using methods similar to those commonly used for faking of personality assessments. Measures of forced‐choice and single‐stimulus interests, personality, general mental ability (GMA), and impression management (IM) were collected from 236 participants across honest and faked instructions. Findings suggest that individuals were capable of faking interest inventories, but surprisingly, the effect did not vary greatly as a function of format or scoring method. Further, GMA was not strongly related to faking interests, whereas faking of personality scales was. In efforts to detect interest faking, IM scores outperformed covariance indices in distinguishing between honest and faked scores. Taken together, it appears as though interest items are transparent in terms of their job relevance, which makes them easy to fake. Though more research is needed, these results suggest that researchers and practitioners should exercise caution when using interest assessments in pre‐employment contexts.
Article
Full-text available
This study presents a comprehensive meta-analysis on the faking resistance of forced-choice (FC) inventories. The results showed that (1) FC inventories show resistance to faking behavior; (2) the magnitude of faking is higher in experimental contexts than in real-life selection processes, suggesting that the effects of faking may be, in part, a laboratory phenomenon; and (3) quasi-ipsative FC inventories are more resistant to faking than the other FC formats. Smaller effect sizes were found for conscientiousness when the quasi-ipsative format was used (δ = 0.49 vs. δ = 1.27 for ipsative formats). Also, the effect sizes were smaller for the applicant samples than for the experimental samples. Finally, the contributions and practical implications of these findings are discussed.
Article
Residency programs should use a systematic method of recruitment that begins with defining unique desired candidate attributes. Commonly sought-after characteristics may be delineated via the residency application. Scores from standardized examinations taken in medical school predict academic success, and may correlate to overall performance. Strong letters of recommendation and a personal history of prior success outside the medical field both forecast success in residency. Interviews are crucial to determining fit within a program, and remain a valid measure of an applicant's ability to prosper in a particular program, even with many interviews being completed in the virtual realm.
Article
Background: Researchers have used within-subjects designs to assess personality faking in real-world contexts. However, no research is available to (a) characterize the typical finding from these studies and (b) examine variability across study results. Aims: The current study was aimed at filling these gaps by meta-analyzing actual applicants' responses to personality measurements in high-stakes contexts versus low-stakes contexts reported in within-subjects studies. Materials & Methods: This meta-analysis examined 20 within-subjects applicant–honest studies (where individuals completed an assessment once as applicants and again in a low-stakes setting). Results: We found that applicants had moderately higher (more socially desirable) means, slightly reduced variability, and stronger rank-order consistency in high-stakes settings. The assessment order moderated the findings; studies with a high-to-low order (where the high-stakes setting was introduced first) showed a stronger faking effect (demonstrated by higher means and weaker rank-order consistencies) than those in a low-to-high order. Discussion and Conclusion: These findings are consistent with expectations that, relative to low-stakes situations, individuals tend to exaggerate, in a positive direction, their personality descriptions as job applicants. In addition, assessment order matters when understanding the magnitudes of faking effects.
Article
Full-text available
Objective and method: This meta-analysis reports the most comprehensive assessment to date of the strength of the relationships between the Big Five personality traits and academic performance by synthesizing 267 independent samples (N = 413,074) in 228 unique studies. It also examined the incremental validity of personality traits above and beyond cognitive ability in predicting academic performance. Results: The combined effect of cognitive ability and personality traits explained 27.8% of the variance in academic performance. Cognitive ability was the most important predictor with a relative importance of 64%. Conscientiousness emerged as a strong and robust predictor of performance, even when controlling for cognitive ability, and accounted for 28% of the explained variance in academic performance. A significant moderating effect of education level was observed. The relationship of academic performance with openness, extraversion, and agreeableness demonstrated significantly larger effect sizes at the elementary/middle school level compared to the subsequent levels. Openness, despite its weak overall relative importance, was found to be an important determinant of student performance in the early years of school. Conclusion: These findings reaffirm the critical role of personality traits in explaining academic performance through the most comprehensive assessment yet of these relationships.
Article
Although the use of ideal point item response theory (IRT) models for organizational research has increased over the last decade, the assessment of construct dimensionality of ideal point scales has been overlooked in previous research. In this study, we developed and evaluated dimensionality assessment methods for an ideal point IRT model under the Bayesian framework. We applied the posterior predictive model checking (PPMC) approach to the most widely used ideal point IRT model, the generalized graded unfolding model (GGUM). We conducted a Monte Carlo simulation to compare the performance of item pair discrepancy statistics and to evaluate the Type I error and power rates of the methods. The simulation results indicated that the Bayesian dimensionality detection method controlled Type I errors reasonably well across the conditions. In addition, the proposed method showed better performance than existing methods, yielding acceptable power when 20% of the items were generated from the secondary dimension. Organizational implications and limitations of the study are further discussed.
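The PPMC logic described above generalizes beyond the GGUM: for each posterior draw, simulate a replicated dataset, compute an item-pair discrepancy statistic for observed and replicated data, and summarize the comparisons as posterior predictive p-values. Below is a minimal skeleton of that loop; a simple 2PL simulator stands in for the GGUM and the posterior draws are mocked rather than sampled, purely to keep the sketch short and runnable.

```python
# Generic posterior predictive model checking (PPMC) skeleton for an IRT
# model, using an item-pair discrepancy (observed vs. replicated item-pair
# correlations). A 2PL simulator stands in for the GGUM here; posterior
# draws are mocked (in practice they would come from MCMC output).
import numpy as np

rng = np.random.default_rng(5)
n_persons, n_items, n_draws = 500, 6, 200

def simulate_2pl(theta, a, b):
    """Stand-in item response simulator (replace with the GGUM in practice)."""
    p = 1 / (1 + np.exp(-a * (theta[:, None] - b)))
    return (rng.random(p.shape) < p).astype(int)

def pair_corr(data):
    """Discrepancy statistic: correlations for every item pair."""
    c = np.corrcoef(data.T)
    iu = np.triu_indices(data.shape[1], k=1)
    return c[iu]

a_true, b_true = rng.uniform(0.8, 2.0, n_items), rng.normal(0, 1, n_items)
observed = simulate_2pl(rng.normal(size=n_persons), a_true, b_true)
obs_stat = pair_corr(observed)

# Mock posterior draws jittered around the generating parameters.
exceed = np.zeros_like(obs_stat)
for _ in range(n_draws):
    theta_d = rng.normal(size=n_persons)
    a_d = a_true + rng.normal(0, 0.05, n_items)
    b_d = b_true + rng.normal(0, 0.05, n_items)
    exceed += (pair_corr(simulate_2pl(theta_d, a_d, b_d)) >= obs_stat)

ppp = exceed / n_draws  # posterior predictive p-value per item pair
print("extreme pairs (PPP < .05 or > .95):", np.sum((ppp < .05) | (ppp > .95)))
```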
Article
Researchers and practitioners in postsecondary and workplace settings recognize the value of noncognitive constructs in predicting academic and vocational success but also perceive that many students or employees are lacking in these areas. In turn, there is increased interest in interventions designed to enhance these constructs. We provide an empirically informed theory of change (ToC) that describes the inputs, mechanisms, and outputs of noncognitive construct interventions (NCIs). The components that inform this ToC include specific relevant constructs that are amenable to intervention, intervention content and mechanisms of change, methodological considerations, moderators of program efficacy, recommendations for evaluating NCIs, and suggested outcomes. In turn, NCIs should provide benefits to individuals, institutions, and society at large and also advance our scientific understanding of this important phenomenon.
Article
Agreeableness impacts people and real-world outcomes. In the most comprehensive quantitative review to date, we summarize results from 142 meta-analyses reporting effects for 275 variables, which represent N > 1.9 million participants from k > 3,900 studies. Arranging variables by their content and type, we use an organizational framework of 16 conceptual categories that presents a detailed account of Agreeableness' external relations. Overall, the trait has effects in a desirable direction for 93% of variables (grand mean ρM = .16). We also review lower order trait evidence for 42 variables from 20 meta-analyses. Using these empirical findings, in tandem with existing theory, we synthesize eight general themes that describe Agreeableness' characteristic functioning across variables: self-transcendence, contentment, relational investment, teamworking, work investment, lower results emphasis, social norm orientation, and social integration. We conclude by discussing potential boundary conditions of findings, contributions and limitations of our review, and future research directions.
Article
Full-text available
Prior meta-analyses demonstrate that people can intentionally distort Big Five personality scores when instructed. As yet, there is no equivalent meta-analysis addressing instructed faking on the dark triad (narcissism, Machiavellianism, and psychopathy). Therefore, we review mean score changes to the dark triad domains and facets under instructed faking. Due to insufficient k for meta-analysis, narcissism and Machiavellianism were systematically reviewed alongside psychopathy. The systematic review revealed inconsistent findings for narcissism and Machiavellianism with several effects in the opposite direction than expected. The psychopathy meta-analysis showed that: (a) scores were significantly lower under fake good compared to answer honestly instructions (d = −0.40); and (b) scores were significantly higher under fake bad compared to answer honestly instructions (d = 1.88). Subgroup analyses revealed significant score decreases under fake good instructions for both primary (d = −0.56) and secondary psychopathy (d = −0.96), and a significant score increase under fake bad instructions for primary (d = 1.69) and secondary psychopathy (d = 1.50). We conclude that dark triad measures are fakeable to a similar extent as the Big Five, and discuss the relevance of our findings for dark triad assessment in several applied contexts.
Article
The ubiquity and consequences of job performance evaluations necessitate accurate responding. This paper describes two studies designed to develop (Study 1) and provide initial validation (Study 2) for a new measure specifically designed to assist in this context: the Occupational Performance Assessment–Response Distortion (OPerA-RD) scale. This 20-item scale is contextualized to the workplace and was developed by identifying items that could detect over- and under-reporting of job performance by self- or other-report in four independent faking samples. Initial validation of the OPerA-RD was supported by expected differences between within-group faking and control conditions in subsequent samples, specifically over- and under-reporting of job performance by self- or other-reports. Implications for research and applied settings are discussed.
Article
Our objective was to compare individuals’ ability to intentionally make a positive impression when responding to a Five-Factor Model personality measure under adjective vs. statement and forced choice vs. Likert conditions. Participants were 1,798 high school students who were randomly assigned to either a condition receiving normal instructions or instructions to make a positive impression. We compared the groups’ scores and validity estimates under the various conditions. Although impression management occurred on all item types, participants could more easily manipulate their responses to Likert items vs. forced choice items, and statements vs. adjectives. Item type made little difference in terms of convergent and discriminant validity and criterion-related validity for all outcomes but one, ACT scores, which suggests cognitive ability plays a role in impression management ability.
Article
Full-text available
Recently, 2 separate yet related criticisms have been levied against the adequacy of the five-factor model (or Big Five) as a descriptive taxonomy of job applicant personality: frame of reference effects (M. J. Schmit & A. M. Ryan, 1993) and socially desirable responding (A. F. Snell & M. A. McDaniel, 1998). Of interest, although both criticisms suggest that the five-factor model is inadequate, the frame of reference effects criticism suggests that the factor structure should be more complex, whereas socially desirable responding suggests that it should be less complex in job applicant contexts. The current research reports the results of a new study demonstrating the adequacy of the five-factor model as a descriptor of job applicant, job incumbent, and student personality. Implications for personality assessment and concurrent validation designs using personality measures are also discussed.
Article
Full-text available
The impact of positive response distortion (PRD) upon attitude test scores is examined in job applicant settings. Using data from three empirical studies, several issues are examined, including job applicant and incumbent base rates, impact on validity, and effects on hiring decisions under single-test and compensatory scoring models.
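The mechanism by which positive response distortion affects hiring decisions is easy to illustrate under top-down selection: inflated scores over-represent distorters among those hired. Here is a minimal sketch; the base rate, inflation size, and selection ratio are invented for illustration rather than taken from the three studies.

```python
# Simple simulation of how positive response distortion (PRD) changes who
# is hired under top-down selection: a minority of applicants inflate
# their scores, and their share among hires exceeds their base rate.
# Base rate, inflation size, and selection ratio are all illustrative.
import numpy as np

rng = np.random.default_rng(9)
n_applicants, selection_ratio, base_rate, boost = 1000, 0.10, 0.25, 0.8

true_score = rng.normal(size=n_applicants)
faker = rng.random(n_applicants) < base_rate
observed = true_score + boost * faker           # fakers gain ~0.8 SD

n_hired = int(selection_ratio * n_applicants)
hired = np.argsort(observed)[::-1][:n_hired]    # top-down selection
print(f"fakers are {base_rate:.0%} of applicants "
      f"but {faker[hired].mean():.0%} of hires")
```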
Article
Full-text available
Personality measures with items that ask respondents to characterize themselves across a range of situations are increasingly used for personnel selection purposes. Research conducted in a laboratory setting has found that personality items may have different psychometric characteristics depending on the degree to which that range is widened or narrowed (i.e., degree of contextualization). This study is an attempt to study the psychometric impact of contextualization in a large field sample (N = 1,078). Respondents were given either a contextualized (at work) or noncontextualized (in general) version of the six facets of the conscientiousness factor of the NEO PI-R. Analyses were conducted at the facet and item levels. Results were mixed but indicated that error variances tended to be slightly lower for the work-specific instrument in comparison to the noncontextualized instrument. Implications for personality inventory development, validation, and use are discussed.
Article
Full-text available
Incumbents are often used in the development and validation of a wide variety of personnel selection instruments, including noncognitive instruments such as personality tests. However, the degree to which assumed motivational factors impact the measurement equivalence and validity of tests developed using incumbents has not been adequately addressed. This study addressed this issue by examining the measurement equivalence of 6 personality scales between a group applying for jobs as sales managers in a large retail organization (N = 999) and a group of sales managers currently employed in that organization (N = 796). A graded item response theory model (Samejima, 1969) was fit to the personality scales in each group. Results indicated that moderately large differences existed in personality scale scores (approximately 1/2 standard deviation units) but only one of the six scales contained any items that evidenced differential item functioning and no scales evidenced differential test functioning. In addition, person-level analyses showed no apparent differences across groups in aberrant responding. The results suggest that personality measures used for selection retain similar psychometric properties to those used in incumbent validation studies.
Article
Full-text available
The authors examined whether individuals can fake their responses to a personality inventory if instructed to do so. Between-subjects and within-subject designs were meta-analyzed separately. Across 51 studies, fakability did not vary by personality dimension; all the Big Five factors were equally fakable. Faking produced the largest distortions in social desirability scales. Instructions to fake good produced lower effect sizes compared with instructions to fake bad. Comparing meta-analytic results from within-subjects and between-subjects designs, we conclude, based on statistical and methodological considerations, that within-subjects designs produce more accurate estimates. Between-subjects designs may distort estimates due to Subject × Treatment interactions and low statistical power.
Article
Full-text available
This study examined the effectiveness of warnings in reducing faking on noncognitive selection measures. A review of the relatively sparse literature indicated that warnings tend to have a small impact on responses (d = 0.23), with warned applicants receiving lower predictor scores than unwarned applicants. However, the effect of warnings on predictor scores was found to differ according to the type of warning used. In light of this, an experimental study was conducted to assess the following: (a) the overall effectiveness of warnings in reducing faking, and (b) the differential effects of three types of warnings on faking. The results indicated that responding was affected by a warning stating both that faking could be detected and what the potential consequences of faking would be.
Book
Full-text available
Meta-analysis is arguably the most important methodological innovation in the social and behavioral sciences in the last 25 years. Developed to offer researchers an informative account of which methods are most useful in integrating research findings across studies, this book will enable the reader to apply, as well as understand, meta-analytic methods. Rather than taking an encyclopedic approach, the authors have focused on carefully developing those techniques that are most applicable to social science research, and have given a general conceptual description of more complex and rarely-used techniques. Fully revised and updated, Methods of Meta-Analysis, Second Edition is the most comprehensive text on meta-analysis available today. New to the Second Edition:
* An evaluation of fixed versus random effects models for meta-analysis
* New methods for correcting for indirect range restriction in meta-analysis
* New developments in corrections for measurement error
* A discussion of a new Windows-based program package for applying the meta-analysis methods presented in the book
* A presentation of the theories of data underlying different approaches to meta-analysis
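The book's bare-bones psychometric procedure can be illustrated compactly: a sample-size-weighted mean correlation, the observed variance of correlations, and the portion of that variance expected from sampling error alone. Here is a minimal sketch in the Hunter–Schmidt style; the study correlations and sample sizes are invented inputs.

```python
# Bare-bones psychometric meta-analysis in the Hunter-Schmidt style:
# sample-size-weighted mean r, observed variance of r, and the share of
# that variance attributable to sampling error. Study inputs invented.
import numpy as np

r = np.array([0.21, 0.35, 0.18, 0.28, 0.40, 0.25])   # study correlations
n = np.array([120, 340, 95, 210, 150, 400])          # study sample sizes

r_bar = np.sum(n * r) / np.sum(n)                    # weighted mean r
var_obs = np.sum(n * (r - r_bar) ** 2) / np.sum(n)   # observed variance
var_err = (1 - r_bar ** 2) ** 2 / (n.mean() - 1)     # expected sampling error
var_rho = max(var_obs - var_err, 0.0)                # residual (true) variance

print(f"mean r = {r_bar:.3f}, observed var = {var_obs:.4f}, "
      f"sampling-error var = {var_err:.4f}, residual var = {var_rho:.4f}")
print(f"% variance due to sampling error = {100 * var_err / var_obs:.0f}%")
```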
Article
Full-text available
A review of criterion-related validities of personality constructs indicated that 6 constructs are useful predictors of important job-related criteria. An inventory was developed to measure the 6 constructs. In addition, 4 response validity scales were developed to measure accuracy of self-description. These scales were administered in 3 contexts: a concurrent criterion-related validity study, a faking experiment, and an applicant setting. Sample sizes were 9,188, 245, and 125, respectively. Results showed that (a) validities were in the .20s (uncorrected for unreliability or restriction in range) against targeted criterion constructs, (b) respondents successfully distorted their self-descriptions when instructed to do so, (c) response validity scales were responsive to different types of distortion, (d) applicants' responses did not reflect evidence of distortion, and (e) validities remained stable regardless of possible distortion by respondents in either unusually positive or negative directions.
Article
Full-text available
Response distortion (RD), or faking, among job applicants completing personality inventories has been a concern for selection specialists. In a field study using the NEO Personality Inventory, Revised, the authors show that RD is significantly greater among job applicants than among job incumbents, that there are significant individual differences in RD, and that RD among job applicants can have a significant effect on who is hired. These results are discussed in the context of recent studies suggesting that RD has little effect on the predictive validity of personality inventories. The authors conclude that future research, rather than focusing on predictive validity, should focus instead on the effect of RD on construct validity and hiring decisions.
Article
Full-text available
Increased use of personality inventories in employee selection has led to concerns regarding factors that influence the validity of such measures. A series of studies was conducted to examine the influence of frame of reference on responses to a personality inventory. Study 1 involved both within-subject and between-groups designs to assess the effects of testing situation (general instructions vs. applicant instructions) and item type (work specific vs. noncontextual) on responses to the NEO Five-Factor Inventory (P. T. Costa & R. R. McCrae, 1989). Results indicated that a work-related testing context and work-related items led to more positive responses. A second study found differences in the validity of a measure of conscientiousness, depending on the frame of reference of respondents. Specifically, context-specific items were found to have greater validity. Implications for personnel selection are discussed.
Article
Full-text available
J. Millham and L. I. Jacobson's (1978) 2-factor model of socially desirable responding based on denial and attribution components is reviewed and disputed. A 2nd model distinguishing self-deception and impression management components is reviewed and shown to be related to early factor-analytic work on desirability scales. Two studies, with 511 undergraduates, were conducted to test the model. A factor analysis of commonly used desirability scales (e.g., Lie scale of the MMPI, Marlowe-Crowne Social Desirability Scale) revealed that the 2 major factors were best interpreted as Self-Deception and Impression Management. A 2nd study employed confirmatory factor analysis to show that the attribution/denial model does not fit the data as well as the self-deception/impression management model. A 3rd study, with 100 Ss, compared scores on desirability scales under anonymous and public conditions. Results show that those scales that had loaded highest on the Impression Management factor showed the greatest mean increase from anonymous to public conditions. It is recommended that impression management, but not self-deception, be controlled in self-reports of personality.
Article
Full-text available
Applied psychologists have long been interested in the relationship between applicant personality and employment interview ratings. Analysis of data from two studies, one using a situational interview and one using a behavioral interview, suggests that the correlations of structured interview ratings with self-report measures of personality factors are generally rather low. Further, a small meta-analysis integrates these two studies and the limited previous literature to arrive at a similar conclusion - there is relatively little relationship between structured interviews and self-reported personality factors.
Article
Full-text available
This study compares the criterion validity of the Big Five personality dimensions when assessed using Five-Factor Model (FFM)-based inventories and non-FFM-based inventories. A large database consisting of American as well as European validity studies was meta-analysed. The results showed that for conscientiousness and emotional stability, the FFM-based inventories had greater criterion validity than the non-FFM-based inventories. Conscientiousness showed an operational validity of .28 (N = 19,460, 90% CV = .07) for FFM-based inventories and .18 (N = 5,874, 90% CV = -.04) for non-FFM inventories. Emotional stability showed an operational validity of .16 (N = 10,786, 90% CV = .04) versus .05 (N = 4,541, 90% CV = -.05) for FFM and non-FFM-based inventories, respectively. No relevant differences emerged for extraversion, openness, and agreeableness. From a practical point of view, these findings suggest that practitioners should use inventories based on the FFM in order to make personnel selection decisions.
Article
Full-text available
The stability and replicability of the Five-Factor model of personality across samples and testing purposes remain a significant issue in personnel selection and assessment. The present study explores the stability of a new Greek Big Five personality measure (TPQue) across different samples in order to explore the suitability of the measure in personnel selection and assessment. The factor structure of the measure across three samples (students, employees, and job applicants) is examined. The results of exploratory and confirmatory factor analyses show that the five-factor structure remains intact for the students’, the applicants’ and the employees’ samples – contrary to previous studies – with all the sub-scales of the personality measure (TPQue) loading on the intended factors. Furthermore, congruence coefficients between the samples justify the stability of the model in the working settings.
Article
Full-text available
The purpose of this study was to investigate conflicting findings in previous research on personality and job performance. Meta-analysis was used to (a) assess the overall validity of personality measures as predictors of job performance, (b) investigate the moderating effects of several study characteristics on personality scale validity, and (c) appraise the predictability of job performance as a function of eight distinct categories of personality content, including the “Big Five” personality factors. Based on review of 494 studies, usable results were identified for 97 independent samples (total N= 13,521). Consistent with predictions, studies using confirmatory research strategies produced a corrected mean personality scale validity (.29) that was more than twice as high as that based on studies adopting exploratory strategies (.12). An even higher mean validity (.38) was obtained based on studies using job analysis explicitly in the selection of personality measures. Validities were also found to be higher in longer tenured samples and in published articles versus dissertations. Corrected mean validities for the “Big Five” factors ranged from .16 for Extroversion to .33 for Agreeableness. Weaknesses in the reporting of validation study characteristics are noted, and recommendations for future research in this area are provided. Contrary to conclusions of certain past reviews, the present findings provide some grounds for optimism concerning the use of personality measures in employee selection.
Article
There are 2 families of statistical procedures in meta-analysis: fixed- and random-effects procedures. They were developed for somewhat different inference goals: making inferences about the effect parameters in the studies that have been observed versus making inferences about the distribution of effect parameters in a population of studies from a random sample of studies. The authors evaluate the performance of confidence intervals and hypothesis tests when each type of statistical procedure is used for each type of inference and confirm that each procedure is best for making the kind of inference for which it was designed. Conditionally random-effects procedures (a hybrid type) are shown to have properties in between those of fixed- and random-effects procedures.
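The difference between the two families of procedures shows up directly in the weights: fixed-effects weights use only within-study variance, while random-effects weights add a between-study variance estimate (DerSimonian–Laird tau² is one common choice, used here). Below is a minimal sketch with invented effect sizes and variances.

```python
# Fixed- vs. random-effects meta-analytic means for the same studies:
# fixed effects weight by inverse within-study variance only; random
# effects add a between-study variance (DerSimonian-Laird tau^2) to each
# study's variance before inverting. Inputs are invented for illustration.
import numpy as np

d = np.array([0.10, 0.45, 0.30, 0.60, 0.20])   # study effect sizes
v = np.array([0.02, 0.05, 0.03, 0.08, 0.02])   # within-study variances

w_fixed = 1 / v
m_fixed = np.sum(w_fixed * d) / np.sum(w_fixed)

q = np.sum(w_fixed * (d - m_fixed) ** 2)       # heterogeneity statistic Q
c = np.sum(w_fixed) - np.sum(w_fixed ** 2) / np.sum(w_fixed)
tau2 = max((q - (len(d) - 1)) / c, 0.0)        # DerSimonian-Laird tau^2

w_rand = 1 / (v + tau2)
m_rand = np.sum(w_rand * d) / np.sum(w_rand)

print(f"fixed-effects mean d = {m_fixed:.3f}, "
      f"random-effects mean d = {m_rand:.3f}, tau^2 = {tau2:.4f}")
```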
Article
A review of criterion-related validities of personality constructs indicated that six constructs are useful predictors of important job-related criteria. An inventory was developed to measure the 6 constructs. In addition, 4 response validity scales were developed to measure accuracy of self-description. These scales were administered in three contexts: a concurrent criterion-related validity study, a faking experiment, and an applicant setting. Sample sizes were 9,188; 245; and 125, respectively. Results showed that (a) validities were in the .20s (uncorrected for unreliability or restriction in range) against targeted criterion constructs, (b) respondents successfully distorted their self-descriptions when instructed to do so, (c) response validity scales were responsive to different types of distortion, (d) applicants' responses did not reflect evidence of distortion, and (e) validities remained stable regardless of possible distortion by respondents in either unusually positive or negative directions.
Article
This study investigated the relation of the "Big Five" personality dimensions (Extraversion, Emotional Stability, Agreeableness, Conscientiousness, and Openness to Experience) to three job performance criteria (job proficiency, training proficiency, and personnel data) for five occupational groups (professionals, police, managers, sales, and skilled/semi-skilled). Results indicated that one dimension of personality, Conscientiousness, showed consistent relations with all job performance criteria for all occupational groups. For the remaining personality dimensions, the estimated true score correlations varied by occupational group and criterion type. Extraversion was a valid predictor for two occupations involving social interaction, managers and sales (across criterion types). Also, both Openness to Experience and Extraversion were valid predictors of the training proficiency criterion (across occupations). Other personality dimensions were also found to be valid predictors for some occupations and some criterion types, but the magnitude of the estimated true score correlations was small (ρ < .10). Overall, the results illustrate the benefits of using the 5-factor model of personality to accumulate and communicate empirical findings. The findings have numerous implications for research and practice in personnel psychology, especially in the subfields of personnel selection, training and development, and performance appraisal.
Article
People often rely on reasoning processes whose purpose is to enhance the logical appeal of their behavioral choices. These reasoning processes will be referred to as justification mechanisms. People favor only certain types of behaviors and develop justification mechanisms to support them because these behaviors allow them to express underlying dispositions. People with different dispositions are prone to develop different justification mechanisms. Reasoning that varies among individuals due to the use of these different justification mechanisms is described as conditional. A new system for measuring dispositional tendencies, or personality, was based on conditional reasoning. This system was applied to develop measures of achievement motivation and aggression. Initial tests suggested that the measurement system is valid. An additional study examined relationships between conditional reasoning and both self-report and projective measurements of the motives to achieve and to avoid failure.
Article
Two studies investigated relations between supervisors' evaluations of contextual performance and personality characteristics in jobs where opportunities for advancement were either absent or present. The first study examined performance in entry-level jobs where advancement, in general, was precluded; employees (N = 214) completed the Hogan Personality Inventory (HPI) as applicants and subsequently were rated by their supervisors for contextual performance. Results indicated that conscientiousness - measured by HPI Prudence scores - was significantly related to ratings of Work Dedication and Interpersonal Facilitation, which are dimensions of contextual performance. The results were corroborated in an independent sample. In the second study, employees (N = 288) in jobs with opportunities for advancement completed the HPI and their supervisors provided ratings for contextual performance. Results indicated that ambition/surgency - measured by HPI Ambition scores - predicted contextual performance. These results also were confirmed in a second sample. Relations between personality and contextual performance are explained by the motives of cooperation - getting along - and status - getting ahead. When there are no opportunities for advancement, employees perform contextual acts because they are conscientious; however, when there are opportunities for advancement, employees engage in contextual acts because they are ambitious.
Article
We evaluated the effects of faking on mean scores and correlations with self-reported counterproductive behavior of integrity-related personality items administered in single-stimulus and forced-choice formats. In laboratory studies, we found that respondents instructed to respond as if applying for a job scored higher than when given standard or "straight-take" instructions. The size of the mean shift was nearly a full standard deviation for the single-stimulus integrity measure, but less than one third of a standard deviation for the same items presented in a forced-choice format. The correlation between the personality questionnaire administered in the single-stimulus condition and self-reported workplace delinquency was much lower in the job applicant condition than in the straight-take condition, whereas the same items administered in the forced-choice condition maintained their substantial correlations with workplace delinquency.
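The mean shifts reported here are standardized mean differences (d), the effect-size metric used throughout this literature. A minimal Python sketch of the pooled-SD computation, with simulated scores (the parameter values are invented):

import numpy as np

def cohens_d(x, y):
    # Standardized mean difference using a pooled standard deviation.
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return float((x.mean() - y.mean()) / np.sqrt(pooled_var))

rng = np.random.default_rng(0)
honest = rng.normal(3.2, 0.6, 200)  # hypothetical straight-take scores
faked  = rng.normal(3.8, 0.6, 200)  # hypothetical applicant-instruction scores
print(f"d = {cohens_d(faked, honest):.2f}")  # about 1.0, like the single-stimulus shift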
Article
Two rational, a priori strategies for dealing with intentional distortion of self-descriptions were developed and evaluated according to their (a) impact on criterion-related validity, (b) effect on scale score means for the total group as well as for women and minorities, and (c) impact on who specifically is hired. One strategy involves "correcting" an individual's content scale scores based on the individual's score on an Unlikely Virtues (UV) scale. A second strategy involves removing people from the applicant pool because their scores on a UV scale suggest they are presenting themselves in an overly favorable way. Incumbent and applicant data from three large studies were used to evaluate the two strategies. The data suggest that (a) neither strategy affects criterion-related validities, (b) both strategies produce applicant mean scores for content scales that are closer to incumbent mean scores, (c) men, women, Whites, and minorities are not differentially affected, and (d) both strategies result in a subset of people not being hired who would otherwise have been hired. If one's goal is to reduce the impact of intentional distortion on hiring decisions, both strategies appear reasonably effective.
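The two strategies lend themselves to a short illustration: a regression-style correction that removes the part of a content-scale score predictable from the UV scale, and a screening rule that drops applicants above a UV cutoff. All numbers and the 5% cutoff below are hypothetical, and the regression form is an assumption, not necessarily the study's exact correction:

import numpy as np

rng = np.random.default_rng(1)
uv = rng.normal(50, 10, 500)                 # hypothetical UV scale scores
content = 0.3 * uv + rng.normal(0, 8, 500)   # content scale partly inflated with UV

# Strategy 1: residualize the content scale on UV (regression-style correction).
slope = np.cov(uv, content)[0, 1] / np.var(uv, ddof=1)
corrected = content - slope * (uv - uv.mean())

# Strategy 2: remove applicants whose UV score exceeds a cutoff (here, the top 5%).
cutoff = np.quantile(uv, 0.95)
retained = content[uv <= cutoff]

print(f"corrected-score correlation with UV: {np.corrcoef(corrected, uv)[0, 1]:.3f}")  # ~0
print(f"applicants screened out: {int(np.sum(uv > cutoff))}")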
Article
One of the perennial problems which faces any human being trying to evaluate another is the fact that behaviour changes according to the situation in which the subject of enquiry finds himself. People who are being observed tend to perform differently from the way in which they behave when they are unaware that their activities are under investigation.
Article
The author discusses a five-factor model of personality, compares it with other personality systems, and analyzes the correlates of its dimensions as well as associated methodological problems.
Article
Applied the Gordon Personal Inventory and the Gordon Personal Profile to investigate response distortion of forced-choice personality inventories and its implications for employee selection. An experimental group of female job applicants (n = 29) was told that their scores would be used in the selection decision, while the control group (n = 30) was first hired and then requested to take the inventories. No significant mean scale differences appeared between groups. Certain scales were significantly predictive of turnover in the experimental group but not in the control group; however, this relationship was not significantly moderated by the instructional set provided.
Article
A group of 81 college students were administered the Gordon Personal Profile, first under directions to simulate applying for industrial employment, and then in a simulated guidance situation. A total score difference "not of great practical significance," equivalent to an increase of about 8 percentile points, was found in favor of a "better" score for the industrial situation. "Present results support the contention that the Gordon Personal Profile '… probably is less subject to "faking" than inventory-type instruments.'"
Article
Recent personnel selection studies have focused on the 5-factor model of personality. However, the stability of this factor structure in job applicant populations has not been determined. Conceptual and empirical evidence has suggested that similar factor structures should not be assumed across testing situations that have different purposes or consequences. A study was conducted that used confirmatory factor analysis to examine the fit of the 5-factor model to NEO Five-Factor Inventory (P. T. Costa and R. R. McCrae, 1989) test data from student and applicant samples. The 5-factor structure fit the student data but did not fit the applicant data. The existence of an ideal-employee factor in the applicant sample is suggested. The findings are discussed in terms of both construct validity issues and the use of the Big Five in personnel selection.
Article
Response bias continues to be the most frequently cited criticism of personality testing for personnel selection. The authors meta-analyzed the social desirability literature, examining whether social desirability functions as a predictor for a variety of criteria, as a suppressor, or as a mediator. Social desirability scales were found not to predict school success, task performance, counterproductive behaviors, and job performance. Correlations with the Big Five personality dimensions, cognitive ability, and years of education are presented along with empirical evidence that (a) social desirability is not as pervasive a problem as has been anticipated by industrial-organizational psychologists, (b) social desirability is in fact related to real individual differences in emotional stability and conscientiousness, and (c) social desirability does not function as a predictor, as a practically useful suppressor, or as a mediator variable for the criterion of job performance. Removing the effects of social desirability from the Big Five dimensions of personality leaves the criterion-related validity of personality constructs for predicting job performance intact.
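The abstract's central claim, that partialling social desirability out of personality scores leaves criterion-related validity intact, can be illustrated with a partial correlation. A minimal Python sketch on simulated data (all parameter values invented; not the authors' dataset):

import numpy as np

def partial_corr(x, y, z):
    # Correlation between x and y with z partialled out of both.
    rxy = np.corrcoef(x, y)[0, 1]
    rxz = np.corrcoef(x, z)[0, 1]
    ryz = np.corrcoef(y, z)[0, 1]
    return (rxy - rxz * ryz) / np.sqrt((1 - rxz**2) * (1 - ryz**2))

rng = np.random.default_rng(2)
sd_scale = rng.normal(0, 1, 1000)                            # social desirability score
conscientiousness = 0.4 * sd_scale + rng.normal(0, 1, 1000)  # partly SD-related trait
performance = 0.3 * conscientiousness + rng.normal(0, 1, 1000)

# The two values come out similar, illustrating the "validity intact" finding.
print(f"zero-order r = {np.corrcoef(conscientiousness, performance)[0, 1]:.3f}")
print(f"SD partialled out: {partial_corr(conscientiousness, performance, sd_scale):.3f}")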
Article
This question was investigated by comparing test performance of a group of 45 Juvenile Bureau patrolmen and a group of 70 applicants for assignment as Juvenile Bureau patrolmen. The ACE, Cardall's Test of Practical Judgment, the Kuder Preference Record, the Guilford-Martin Inventory of Factors GAMIN, and Guilford's Inventory of Factors STDCR were the tests used. Generalizations are made concerning attempts at and success in faking on self-inventory tests.
Article
400 male omnibus conductor job applicants were given a 2-part personality measure (emotional maladjustment and sociability), 100 under each of 4 conditions: paper-and-pencil administration before selection; paper-and-pencil administration after being notified of selection; and a box-and-card administration under each of the same 2 selection circumstances. The selection circumstances significantly affected the distribution of scores on the emotional maladjustment scale, but not on the sociability scale. Method of administration did not affect the score distributions.
Article
This study investigated possible faking of the Edwards Personal Preference Schedule (EPPS) in an industrial selection situation. EPPS scores for 97 Retail sales applicants and 66 Industrial sales applicants (all later hired) were compared to the scores of 69 Retail salesmen and 49 Industrial salesmen (all tested on the job). Results showed that Retail applicants tended to score significantly higher on the Orderliness, Intraception, and Dominance scales and lower on the Heterosexuality scale than Retail salesmen. No significant differences were found, however, between Industrial applicants and Industrial salesmen. This suggests that persons more oriented toward selling in terms of interests and personality (i.e., Retail sales applicants) are more likely to distort answers to the EPPS.