Article

A Meta‐Analytic Investigation of Job Applicant Faking on Personality Measures


Abstract

This study investigates the extent to which job applicants fake their responses on personality tests. Thirty-three studies that compared job applicant and non-applicant personality scale scores were meta-analyzed. Across all job types, applicants scored significantly higher than non-applicants on extraversion (d = .11), emotional stability (d = .44), conscientiousness (d = .45), and openness (d = .13). For certain jobs (e.g., sales), however, the rank ordering of mean differences changed substantially, suggesting that job applicants distort responses on personality dimensions that are viewed as particularly job relevant. Smaller mean differences were found in this study than those reported by Viswesvaran and Ones (Educational and Psychological Measurement, 59(2), 197–210), who compared scores for induced “fake-good” vs. honest response conditions. Also, direct Big Five measures produced substantially larger differences than did indirect Big Five measures.
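
The effect sizes above are standardized mean differences (Cohen's d), i.e., the applicant/non-applicant mean difference divided by a pooled standard deviation. A minimal sketch of that computation, using invented illustrative scores rather than data from the meta-analysis:

```python
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Standardized mean difference (Cohen's d) using a pooled SD."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_sd = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)) ** 0.5
    return (mean(group1) - mean(group2)) / pooled_sd

# Invented illustrative scale scores (NOT data from the meta-analysis):
applicants     = [4.2, 4.5, 4.0, 4.6, 4.3, 4.4]
non_applicants = [3.9, 4.1, 3.8, 4.2, 4.0, 3.7]
d = cohens_d(applicants, non_applicants)  # positive d = applicants score higher
```

A positive d thus indicates score elevation among applicants, the pattern the meta-analysis reports for all Big Five dimensions.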


... actually applied, whether there are faking behaviors in SRPSs that are so far not considered, whether there are combinations of faking behaviors in SRPSs, or whether faking behaviors are generalizable (e.g., with respect to faking direction, Bensch et al. 2019; or the to-be-faked construct, Birkeland et al. 2006). ...
... Fourth, faking has been demonstrated to depend on faking conditions (e.g., the to-be-faked measure, the to-be-faked constructs, and the faking directions; Birkeland et al. 2006; Röhner, Thoss, and Schütz 2022) that form a certain faking context. Thus, our study's focus on a certain faking context also implies that the generalizability of our taxonomy is limited to this context and, therefore, represents a building block in the development of an exhaustive taxonomy. ...
... On the one hand, employing more than two constructs could increase cognitive load, potentially making faking more difficult. On the other hand, faking varies with respect to the to-be-faked construct (e.g., Birkeland et al. 2006). Therefore, both issues may promote additional faking behaviors and strategies or affect their frequencies. ...
Article
Faking in self-report personality scales (SRPSs) is not sufficiently understood. This limits its detection and prevention. Here, we introduce a taxonomy of faking behaviors that constitute faking strategies in SRPSs, reflecting the stages (comprehension, retrieval, judgment, and response) of the general response process model (GRPM). We reanalyzed data from two studies investigating the faking of high and low scores on Extraversion (E) and Need for Cognition (NFC) scales (Data Set 1; N = 305) or on an E scale (Data Set 2; N = 251). Participants were asked to explain exactly what they did to fake, and their responses (N = 533) were examined via a qualitative content analysis. The resulting taxonomy included 22 global and 13 specific behaviors that (in combination) constitute faking strategies in SRPSs. We organized the behaviors into four clusters along the stages of the GRPM. The behaviors held irrespective of the construct (E or NFC), and with two exceptions, also irrespective of the data set (Data Sets 1 or 2). Eight exceptions concerning faking direction (high or low) indicate direction-specific differences in faking behaviors. Respondents reported using not only different faking behaviors (e.g., role-playing, behaviors to avoid being detected) but also multiple combinations thereof. The suggested taxonomy is necessarily limited to the specified context, and, thus, additional faking behaviors are possible. To fully understand faking, further research in other contexts should be conducted to complement the taxonomy. Still, the complexity shown here explains why adequate detection and prevention of faking in SRPSs is so challenging.
... Previous research indicates that sales applicants may specifically inflate their ratings on ascendancy (i.e., the desire for governing or controlling influence; Bass, 1957), orderliness, intraception, and dominance (Kirchner, 1962). In their meta-analysis on preemployment faking and personality traits, Birkeland et al. (2006) found that sales applicants were more likely to inflate their extraversion scores compared to applicants in other industries. Thus, it is important to consider applicant industry when examining faking behavior to contextualize the findings. ...
... First, meta-analytic research has found that Extraversion is an important predictor of sales performance (Vinchur et al., 1998). Second, applicant-versus-honest responses to Extraversion provide a robust indicator of faking for sales jobs, such that sales applicants tend to inflate their Extraversion scores more dramatically than those applying for nonsales positions (Birkeland et al., 2006). Consistent with past research on faking in retail sales positions, our first hypothesis was: ...
... RICs were computed using Extraversion scores as rationalized in the introduction. More specifically, because Extraversion tends to be most robustly distorted by sales applicants (Birkeland et al., 2006), it should provide the most conceptually pure measure of faking ability in our simulated pre-employment testing scenario relative to the other major personality dimensions. Using other traits that may be less prone to faking in this context (e.g., Openness) may increase the likelihood that variance in the RICs is more strongly attributable to random fluctuations in responding and/or response biases that are conceptually unrelated to faking ability (see Paulhus, 1991 for examples). ...
Article
Full-text available
Applicants’ faking of personality test responses is a major concern in pre-employment testing. The prevailing research has found faking to be positively related to reviled employee attributes, but some recent research has unveiled positive relations between faking and revered attributes. We examined the relationships of a gold standard measure of faking ability with employee attributes among a sample of working individuals (N = 347). Within our simulated pre-employment testing scenario, faking ability had negative relationships with organizationally reviled attributes and positive relationships with organizationally revered attributes. Our results suggest that within a retail sales context, applicants with the ability to fake their responses to pre-employment personality tests may have an increased likelihood of possessing sought-after traits and thus could be valued by employers.
... While heeding this appropriateness may lead to more complex models, we argue that predictiveness is a desirable model trait worth the price of less parsimonious models and scoring methods (Dubova et al., 2024). Second, there are marked differences in responses to personality tests depending on whether responses were collected in low-stakes contexts, such as honest-response laboratory settings, versus high-stakes contexts, such as selecting for employment, promotions, or university acceptance (Anglim et al., 2017; Birkeland et al., 2006; Griffith et al., 2007; Hendy et al., 2021; Krammer et al., 2017; Schermer, Holden, et al., 2019; Viswesvaran & Ones, 1999). Consequently, scrutinizing different scoring options for personality scales should include addressing how different contexts influence personality responses. ...
... Meta-analysis has shown that mean scores on Big-Five traits tend to increase in faking-good versus honest contexts (Viswesvaran & Ones, 1999), and in applicant versus honest contexts. The exceptions to this trend are agreeableness scores, which, although still impacted by applicant contexts, change in a less consistent direction than the other Big Five traits (Birkeland et al., 2006; Hu & Connelly, 2021). It seems likely that Big-Five-irrelevant variance comprises these context-related shifts in scores, and, as discussed in the next section, investigations capitalizing on bifactor modeling may hold the key to a more granular specification of this variance (e.g., Anglim et al., 2017; Biderman et al., 2019; Hendy et al., 2021; Schermer, Holden, et al., 2019). ...
... As such, the underlying models are the most parsimonious models discussed due to their strong restrictions (e.g., all items weighted by unity). Accordingly, it is not surprising that manifest mean scores have the longest-standing tradition of use in research (see meta-analyses: Birkeland et al., 2006; Hu & Connelly, 2021; Viswesvaran & Ones, 1999). ...
Article
Full-text available
Based on a large (N = 612) longitudinal sample in a teacher education program, we compared how three methods of personality scoring—manifest mean scores, correlated-factors model scores, and bifactor model scores—predict academic achievement assessed by grade point averages. Furthermore, we compared predictiveness across honest responses, applicants' responses, and responses collected under laboratory faking-good instructions. To this end, a real-life selection setting was part of our study (i.e., applicants to initial teacher education, selected, among other things, on their personality). We found the expected pattern in manifest mean scores (honest responses were the lowest, applicants' responses higher, and faking-good responses highest) and could demonstrate that applicant faking does not reduce personality assessment's predictiveness. Overall, correlated-factors model scoring increased the predictiveness of honest and applicants' responses, and scoring via the bifactor model even more so. No method of scoring could retrieve the predictiveness in the faking-good response condition. Regarding the practical application within selection processes, bifactor model scores only slightly outperformed mean scores, and this only occurred in the case of small selection ratios. Nevertheless, we showed that there is criterion-related and systematic variance within applicants' personality scores above and beyond their personality traits that can be extracted when modeled with bifactor models.
... Conversely, non-applicants usually respond more honestly to low-stakes tests. As a result, applicant scores on personality tests have been higher compared to non-applicants in experimental studies (Birkeland et al., 2006; Hough, 1998; Rosse et al., 1998), and also compared to respondents instructed to answer honestly or to those given no instructions (Viswesvaran & Ones, 1999; Zickar & Drasgow, 1996). Induced-faking studies that compared a group of high-faking-inclination individuals to a group of non-motivated-faking individuals generally reported higher score inflation than those using an applicant–incumbent design (Birkeland et al., 2006; Viswesvaran & Ones, 1999). ...
... Respondents with faking instructions tend to please researchers and exaggerate their faking levels (Cao & Drasgow, 2019). ...
... Conscientiousness and neuroticism are part of the four antecedents in an integrative model of faking behavior; that is, the predictors of individual differences in motivation and ability to distort responses (Mueller-Hanson et al., 2006). In addition, these two factors have exhibited the largest effect sizes on score inflation in both laboratory studies and actual selection contexts, although all five factors were affected by faking (Birkeland et al., 2006; Schripsema et al., 2016; Viswesvaran & Ones, 1999). For example, Birkeland et al. (2006) found that job applicants scored significantly higher than non-applicants on conscientiousness (d = .45) ...
Article
Faking behavior has been examined by various methods and procedures, but without consistent conclusions. This study explores the measurement of faking based on traditional social desirability scales. Under both instructed faking-good and honest contexts, response distortion was assessed with measures of the Big Five personality traits and the Marlowe-Crowne Social Desirability Scale. Classification tests and diagnostic analyses were conducted. The majority of suspected fakers flagged by the detection scale overlapped little with those identified by the amount of faking. The diagnostic accuracy and regression model estimates did not support the use of social desirability scales for detecting faking. These findings indicate that social desirability scales are not capable of precisely capturing faking-related changes, although they indicate, to a certain extent, the level of faking. Interpreting these scales as indicators of faking should be approached with caution. Using multiple social desirability scales along with modern model-based methods is recommended.
... Faking can have numerous adverse effects on the psychometric properties of a test (Ziegler et al., 2011): For instance, faking leads to substantially elevated scores on scales that measure desirable traits (e.g., Birkeland et al., 2006; Viswesvaran & Ones, 1999). One might argue that this would not be a problem if condition-specific standardization samples were used and if all test-takers elevated their scores by an equal amount. ...
... Regarding substantive traits, we expected that potential mean differences between conditions would be less pronounced when accounting for faking compared to ignoring faking. In line with typical findings in high-stakes assessments (Birkeland et al., 2006; Viswesvaran & Ones, 1999), there were substantial mean differences between conditions when ignoring faking, such that mean person parameters of all Big Five were significantly higher in the high-stakes condition than in the low-stakes condition (see Figures 3c to 3g). ...
... Moreover, adding the faking dimension to the model reduced mean differences in Big Five person parameters between the high-stakes and low-stakes condition. As is typical of application contexts, mean Big Five estimates were substantially higher among test-takers taking the test as part of an application than among job incumbents taking the test for validation purposes (Birkeland et al., 2006; Viswesvaran & Ones, 1999). These mean differences were fairly constant across models with and without response styles. ...
Preprint
Full-text available
Self-report personality tests used in high-stakes assessments hold the risk that test-takers engage in faking. In this article, we demonstrate an extension of the multidimensional nominal response model (MNRM) to account for the response bias of faking. The MNRM is a flexible item response theory (IRT) model that allows modeling response biases whose effect patterns vary between items. In an empirical demonstration, we modeled responses from N = 3046 job applicants taking a personality test under high-stakes conditions. We thereby specified item-specific effect patterns of faking by setting scoring weights of faking to appropriate values that we collected in a pilot study. Results indicated that modeling faking increased model fit over and above response styles and improved divergent validity, while the faking dimension exhibited relations to several covariates. Furthermore, we validated the model in a sample of job incumbents taking the test under low-stakes conditions and found evidence that the model can effectively capture faking and adjust estimates of substantive personality traits for faking. We end the article with a discussion of implications for psychological assessment in organizational contexts.
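
The core idea of the MNRM approach described above is that each response category of an item carries scoring weights on every latent dimension, including a faking dimension, and category probabilities follow a softmax over the weighted trait values. The sketch below shows the generic multidimensional nominal response model form only; the scoring weights, slopes, and category structure are invented for illustration and are not the authors' specification:

```python
import math

def mnrm_probabilities(theta, slopes, scoring_weights, intercepts):
    """Category probabilities under a generic multidimensional NRM.

    theta:               latent trait values, one per dimension
                         (e.g., [substantive trait, faking])
    slopes:              discrimination parameter per dimension
    scoring_weights[c]:  per-category weights on each dimension
    intercepts[c]:       per-category intercept
    """
    logits = [
        sum(a * s_cd * th for a, s_cd, th in zip(slopes, s_c, theta)) + b_c
        for s_c, b_c in zip(scoring_weights, intercepts)
    ]
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative 4-category item: dimension 0 = substantive trait, dimension 1 =
# faking. The faking weights rise toward the desirable (high) categories, so a
# higher faking trait shifts probability mass toward favorable responses.
weights = [[0, 0], [1, 0], [2, 1], [3, 2]]
probs = mnrm_probabilities([0.5, 1.0], [0.8, 0.12], weights, [0, 0, 0, 0])
```

With this weight pattern, raising only the faking trait value increases the probability of the most desirable category, which is how the model separates faking from the substantive trait.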
... Job applicants can and do display elevated levels of socially desirable responding when completing personality assessments (e.g., Anglim, Bozic, Little, & Lievens, 2018; Anglim, Morse, De Vries, MacCann, & Marty, 2017; Birkeland, Manson, Kisamore, Brannick, & Smith, 2006; Griffith, Chmielowski, & Yoshita, 2007; Hu & Connelly, 2021; Morgeson et al., 2007b). Socially desirable responding means answering personality items such that ratings are more favorable than the individual's "true score". ...
... A large body of research shows that job applicants can and do respond to personality assessments in more socially desirable ways. This has been shown both in studies comparing job applicants with non-applicants (e.g., Anglim et al., 2017; Birkeland et al., 2006; Jeong, Christiansen, Robie, Kung, & Kinney, 2017) and in studies comparing participants role-playing going for a job with participants in an "honest" condition (e.g., Birkeland et al., 2006; Hooper, 2007; Viswesvaran & Ones, 1999). This research indicates that means on traits such as conscientiousness and extraversion are often around a half standard deviation higher in applicants compared to non-applicants, with the size of these effects varying based on contextual factors, individual differences, and scale characteristics. ...
... Second, although FC personality measures have been shown to be more faking resistant, these estimates have also varied from study to study when compared to similarly situated SS measures. Although there have been meta-analyses that examined the fakeability of FC measures (Cao & Drasgow, 2019; Martínez & Salgado, 2021) and SS measures (Birkeland et al., 2006; Viswesvaran & Ones, 1999) independently, these studies are challenging to reconcile. Because these meta-analyses included estimates for each format that were drawn from different settings, additional sources of extraneous variance are introduced. ...
... Applicants improve trait scores when it is possible to easily identify the job-relevant, keyed traits on personality inventories (Birkeland et al., 2006; Hu & Connelly, 2021). Depending on how transparent the item is, this might be a relatively easy task. ...
... for FC scores and .75 for SS scores in paired samples. This finding is generally consistent with past research (e.g., Birkeland et al., 2006; Cao & Drasgow, 2019; Martínez & Salgado, 2021; Viswesvaran et al., 1996). However, within this set of studies and under faking conditions, respondents were still able to score highly on desirable FC traits (e.g., conscientiousness). ...
Article
Full-text available
Forced-choice (FC) personality assessments have shown potential in mitigating the effects of faking. Yet despite increased attention and usage, there exist gaps in understanding the psychometric properties of FC assessments, and particularly when compared to traditional single-stimulus (SS) measures. The present study conducted a series of meta-analyses comparing the psychometric properties of FC and SS assessments after placing them on an equal playing field—by restricting to only studies that examined matched assessments of each format, and thus, avoiding the extraneous confound of using comparisons from different contexts (Sackett, 2021). Matched FC and SS assessments were compared in terms of criterion-related validity and susceptibility to faking in terms of mean shifts and validity attenuation. Additionally, the correlation between FC and SS scores was examined to help establish construct validity evidence. Results showed that matched FC and SS scores exhibit strong correlations with one another (ρ = .69), though correlations weakened when the FC measure was faked (ρ = .59) versus when both measures were taken honestly (ρ = .73). Average scores increased from honest to faked samples for both FC (d = .41) and SS scores (d = .75), though the effect was more pronounced for SS measures and with larger effects for context-desirable traits (FC d = .61, SS d = .99). Criterion-related validity was similar between matched FC and SS measures overall. However, when considering validity in faking contexts, FC scores exhibited greater validity than SS measures. Thus, although FC measures are not completely immune to faking, they exhibit meaningful benefits over SS measures in contexts of faking.
... Research has repeatedly shown that faking can have numerous effects on a test's psychometric properties (Ziegler et al., 2011). For instance, depending on whether desirable (undesirable) traits are measured, faking leads to considerably inflated (deflated) item and scale scores (e.g., Birkeland et al., 2006; Viswesvaran & Ones, 1999). A shift in item and scale scores would not be problematic for the assessment of interindividual differences if the range of possible scores was unlimited and if all test-takers shifted their scores by an equal amount. ...
... It is possible that the strong situational cues in the experimental setting, in which no strong differences in the motivation to adhere to the instructions can be expected, led to restricted variation in the degree of faking between participants (cf. Birkeland et al., 2006; McFarland & Ryan, 2000), implying a relatively low impact of a latent faking dimension. Psychometrically, this is reflected in the estimated slopes of the faking dimension, which were on average notably smaller (0.12) than the estimated slopes of the substantive trait dimensions (0.80). ...
Article
Full-text available
Faking in self-report personality questionnaires describes a deliberate response distortion aimed at presenting oneself in an overly favorable manner. Unless the influence of faking on item responses is taken into account, faking can harm multiple psychometric properties of a test. In the present article, we account for faking using an extension of the multidimensional nominal response model (MNRM), which is an item response theory (IRT) model that offers a flexible framework for modeling different kinds of response biases. Particularly, we investigated under which circumstances the MNRM can adequately adjust substantive trait scores and latent correlations for the influence of faking and examined the role of variation in the way item content is related to social desirability (i.e., item desirability characteristics) in facilitating the modeling of faking and counteracting its detrimental effects. Using a simulation, we found that the inclusion of a faking dimension in the model can overall improve the recovery of substantive trait person parameters and latent correlations between substantive traits, especially when the impact of faking in the data is high. Item desirability characteristics moderated the effect of modeling faking and were themselves associated with different levels of parameter recovery. In an empirical demonstration with N = 1070 test-takers, we also showed that the faking modeling approach in combination with different item desirability characteristics can prove successful in empirical questionnaire data. We end the article with a discussion of implications for psychological assessment.
... Consequently, personality assessments have become highly prevalent in recruitment settings (Kantrowitz et al., 2018). However, researchers have expressed concerns that these self-report personality measures are prone to applicant faking (i.e., deliberate response distortion in socially desirable and job-relevant directions; Birkeland et al., 2006; Griffith et al., 2011), which may be detrimental to the integrity and the validity of the organization's selection processes (Donovan et al., 2014; Rosse et al., 1998; Tett & Simonet, 2011). Given these concerns, faking warnings (hereafter, warnings) were advanced to combat the prevalence and efficacy of faking (Hough et al., 1990; Schrader & Osburn, 1977). ...
... previous meta-analyses on applicant faking (Birkeland et al., 2006; Cao & Drasgow, 2019; Viswesvaran & Ones, 1999), we used a bare-bones approach in the meta-analysis to represent operational effect sizes. ...
Article
Full-text available
Numerous faking warning types have been investigated as interventions that aim to minimize applicant faking in preemployment personality tests. However, studies vary in the types and effectiveness of faking warnings used, personality traits, as well as the use of different recruitment settings and participant samples. In the present study, we advance a theory that classifies faking warning types based on ability, opportunity, and motivation to fake (Tett & Simonet, 2011), which we validated using subject matter expert ratings. Using this framework as a guide, we conducted a random-effects pairwise meta-analysis (k = 34) and a network meta-analysis (k = 36). We used inverse-variance weighting to pool the effect sizes and relied on 80% prediction intervals to evaluate heterogeneity. Overall, faking warnings had a significant, moderate effect in reducing applicant faking (d = 0.31, 95% CI [0.23, 0.39]). Warning types that theoretically targeted ability, motivation, and opportunity to fake (d = 0.36, 95% CI [0.25, 0.47]) were the most effective. Additionally, warnings were least effective in studies using recruitment settings and nonuniversity student samples. However, all effect sizes contained substantial heterogeneity, and all warning types will be ineffective in some contexts. Organizations should be cognizant that warnings alone may not be sufficient to address applicant faking, and future research should explore how their effectiveness varies depending on other contextual factors and applicant characteristics.
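
The abstract above pools effect sizes with inverse-variance weights under a random-effects model and reports 80% prediction intervals. A generic sketch of that pipeline follows, using the DerSimonian-Laird between-study variance estimator and a normal approximation for the prediction interval (the published analysis may differ, e.g., by using a t-based interval); the effect sizes and variances below are invented for illustration:

```python
import math

def random_effects_pool(d, v, pi_coverage=0.80):
    """DerSimonian-Laird random-effects pooling of effect sizes.

    d: list of study effect sizes (e.g., Cohen's d)
    v: list of their sampling variances
    Returns (pooled effect, tau^2, prediction interval).
    """
    k = len(d)
    w = [1.0 / vi for vi in v]                       # fixed-effect weights
    fixed = sum(wi * di for wi, di in zip(w, d)) / sum(w)
    q = sum(wi * (di - fixed) ** 2 for wi, di in zip(w, d))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)               # between-study variance
    w_star = [1.0 / (vi + tau2) for vi in v]         # random-effects weights
    pooled = sum(wi * di for wi, di in zip(w_star, d)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    z = 1.2816 if pi_coverage == 0.80 else 1.96      # normal approximation
    half = z * math.sqrt(tau2 + se ** 2)             # PI is wider than the CI
    return pooled, tau2, (pooled - half, pooled + half)

# Invented study-level effect sizes and sampling variances:
d = [0.45, 0.31, 0.52, 0.10, 0.38]
v = [0.02, 0.03, 0.025, 0.04, 0.015]
pooled, tau2, pi = random_effects_pool(d, v)
```

Because the prediction interval adds tau-squared to the pooled standard error, it conveys the heterogeneity the abstract emphasizes: even with a significant mean effect, the interval can include zero, meaning warnings will be ineffective in some contexts.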
... These distortions may compromise the validity of the conclusions derived from Likert scales (Christiansen et al., 2005), especially in high-stakes situations, such as personnel selection (Birkeland et al., 2006; Christiansen et al., 2005) and performance appraisal (Brown et al., 2017), clinically relevant constructs (Young, 2018), or when the measured traits are particularly undesirable, such as "dark" personality traits (Guenole et al., 2018; Paulhus & Jones, 2014). Response biases can also be problematic when differentiation is particularly important, such as in market research or career advice (Wang et al., 2017). ...
... In situations where participants were faced with real-life incentives to distort (e.g., money or a job), the distortion seems to be somewhat weaker (d = .30 in a meta-analysis by Edens & Arthur, 2000; see also Birkeland et al., 2006; Martínez & Salgado, 2021). ...
Article
Full-text available
This study compares the faking resistance of Likert scales and graded paired comparisons (GPCs) analyzed with Thurstonian IRT models. We analyzed whether GPCs are more resistant to faking than Likert scales by exhibiting lower score inflation and better recovery of applicants’ true (i.e., honest) trait scores. A total of N = 573 participants completed either the Likert or GPC version of a personality questionnaire first honestly and then in an applicant scenario. Results show that participants were able to increase their scores in both the Likert and GPC format, though their score inflation was smaller in the GPC than the Likert format. However, GPCs did not exhibit higher honest–faking correlations than Likert scales; under certain conditions, we even observed negative associations. These results challenge mean score inflation as the dominant paradigm for judging the utility of forced-choice questionnaires in high-stakes situations. Even if forced-choice factor scores are less inflated, their ability to recover true trait standings in high-stakes situations might be lower compared with Likert scales. Moreover, in the GPC format, faking effects correlated almost perfectly with the social desirability differences of the corresponding statements, highlighting the importance of matching statements equal in social desirability when constructing forced-choice questionnaires.
... When applying for a place at university, would you even consider disagreeing with the statement "I am someone who is efficient and gets things done"? Several studies have demonstrated that applicants can and indeed do change their answers in high-stakes settings (e.g., Alliger & Dwight, 2000; Birkeland et al., 2006; Ellingson et al., 2007; Hu & Connelly, 2021; Jeong et al., 2017; Viswesvaran & Ones, 1999). However, the empirical answer to the question of how high-stakes settings affect the criterion-related validity of (the interpretation of) personality questionnaires is somewhat scattered. ...
... This suggests that applicants change their answers in high-stakes selection settings. This is in line with previous studies and meta-analytic findings (e.g., Alliger & Dwight, 2000; Birkeland et al., 2006; Ellingson et al., 2007; Hu & Connelly, 2021; Jeong et al., 2017; Viswesvaran & Ones, 1999) and demonstrates that applicants describe themselves more positively in high-stakes selection settings. This behavior may be interpreted as faking, the failure to respond truthfully, or the deliberate act of presenting a favorable impression. ...
Article
Full-text available
We investigated to what extent Conscientiousness can predict academic performance in a real high-stakes application setting. N = 267 applicants for a place at a university completed a Conscientiousness questionnaire during the selection process and 6 weeks after they commenced their studies. Students’ academic grades were used as criterion variables. The results suggest that the high-stakes setting increases the level of Conscientiousness reported, that not all applicants change their answers to the same extent, and that the high-stakes setting decreases the criterion-related validity, but Conscientiousness remains a useful predictor of academic performance in high-stakes settings.
... Procedural issues of personality assessments. Similar to findings in graduate admissions, researchers who conducted studies in undergraduate and personnel selection show that the major procedural issue appears to be faking (Birkeland et al., 2006; König et al., 2017; Pavlov et al., 2019). The extent of faking depends on the personality dimension under examination, the type of test, the position applied for (Birkeland et al., 2006), and the stakes of the situation (Pavlov et al., 2019). ...
... However, there are approaches where supervisors of students are asked to report on their personality, and while the supervisors also tend to fake when reporting on the personality of their students, the extent of their faking is smaller (König et al., 2017). ...
Article
Full-text available
This review presents the first comprehensive synthesis of available research on selection methods for STEM graduate study admissions. Ten categories of graduate selection methods emerged. Each category was critically appraised against the following evaluative quality principles: predictive validity and reliability, acceptability, procedural issues, and cost-effectiveness. The findings advance the field of graduate selective admissions by (a) detecting selection methods and study success dimensions that are specific for STEM admissions, (b) including research evidence both on cognitive and noncognitive selection methods, and (c) showing the importance of accounting for all four evaluative quality principles in practice. Overall, this synthesis allows admissions committees to choose which selection methods to use and which essential aspects of their implementation to account for.
... Applicants understand the position they are applying for and tend to answer personality assessments in accordance with what they believe the position requires. Job applicants scored higher on socially desirable characteristics than incumbents, even though incumbents might have been expected to score higher, since they already held the position and had themselves gone through the application and selection process (Birkeland, Manson, Kisamore, Brannick, & Smith, 2006). ...
... What if it were possible to assess personality and traits indirectly, so that the applicant was less likely to fake answers? Direct Big Five personality tests yielded larger differences in socially desirable personality traits between applicants and non-applicants than did indirect tests (Birkeland, Manson, Kisamore, Brannick, & Smith, 2006). Picture Story Exercises (PSE), also known as Thematic Apperception Tests (TAT), are useful because the assessor instructs the candidate to write about a picture following a set of directed questions (Gruber & Kreuzpointner, 2013). ...
Book
Full-text available
E6 Excellence: > How to coach and consult individuals and teams by putting your house in order. > How to help individuals to find strengths and the right fit in careers to achieve that next level of success > How to help the team through the stages of team development to smooth out the transition to high performance groups. > How organizations can implement changes through systems analysis and holistic approaches. > How the learning organization is able to constantly improve to increase its competitive advantage. Includes the content of a full-day workshop: > The E6 Excellence system > Theoretical analysis and documented research > Universal expression of emotion and EQ Inside the book: > How the system dictates employee success > Strategies for interviewing and hiring > How situational constructs impact behavior > The structured interview and its power
... Furthermore, job applicant interviews may elicit in some candidates a tendency to exaggerate their emotional stability and to minimize their psychological problems (Birkeland et al., 2006). ...
Book
This book seeks to distinguish empirically-based knowledge from widespread misconceptions in the fields of legal and forensic psychology. Across ten chapters, leading scholars contribute different perspectives on their areas of expertise within the fields of legal and forensic psychology, providing a comprehensive overview of the historical context and defining characteristics of these two disciplines. The first section of the book is dedicated to legal psychology, exploring issues such as pseudoscience in lie detection, the use of polygraphs, and the reliability of eyewitness testimony and memory reports in legal settings. The second focuses on forensic psychology, addressing topics such as the relationship between criminal behavior and psychopathology, symptom validity assessment, risk assessment, and the treatment of forensic patients. As such, this vital book will serve as an excellent starting point for those seeking to educate themselves about these disciplines.
... Li et al., 2024; Viswesvaran & Ones, 1999; Ziegler et al., 2007) and a large proportion of job applicants fake during personnel selection (Donovan et al., 2014; Griffith et al., 2007; Peterson et al., 2011). Faking inflates applicants' mean trait scores (Birkeland et al., 2006), making them more likely to be selected than equally qualified others who do not fake (Rosse et al., 1998). This unfair advantage leads job applicants to consider faking one of the biggest threats to procedural justice in the selection process (Gilliland, 1995). ...
Article
Full-text available
The covariance index method, the idiosyncratic item response method, and the machine learning method are the three primary response-pattern-based (RPB) approaches to detect faking on personality tests. However, less is known about how their performance is affected by different practical factors (e.g., scale length, training sample size, proportion of faking participants) and when they perform optimally. In the present study, we systematically compared the three RPB faking detection methods across different conditions in three empirical-data-based resampling studies. Overall, we found that the machine learning method outperforms the other two RPB faking detection methods in most simulation conditions. It was also found that the faking probabilities produced by all three RPB faking detection methods had moderate to strong positive correlations with true personality scores, suggesting that these RPB faking detection methods are likely to misclassify honest respondents with truly high personality trait scores as fakers. Fortunately, we found that the benefit of removing suspicious fakers still outweighs the consequences of misclassification. Finally, we provided practical guidance to researchers and practitioners to optimally implement the machine learning method and offered step-by-step code.
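The machine-learning approach described above trains a classifier on response patterns from known honest and faking respondents. As an illustration of the basic idea only (the study's actual features and algorithm are not reproduced here), a minimal nearest-centroid rule over item-response vectors can flag patterns that look more like the faking group's typical responses:

```python
import math

def centroid(patterns):
    # Mean response vector of a group of respondents
    n = len(patterns)
    return [sum(col) / n for col in zip(*patterns)]

def classify(pattern, honest_centroid, faking_centroid):
    # Label a new response pattern by the nearer training centroid
    d_honest = math.dist(pattern, honest_centroid)
    d_faking = math.dist(pattern, faking_centroid)
    return "faking" if d_faking < d_honest else "honest"

# Hypothetical 1-5 Likert responses on three items
honest = centroid([[3, 2, 4], [2, 3, 3], [3, 3, 4]])
faking = centroid([[5, 5, 5], [5, 4, 5], [4, 5, 5]])
print(classify([5, 5, 4], honest, faking))  # faking
```

A rule like this also makes the misclassification problem noted in the abstract concrete: an honest respondent whose true trait scores are very high produces a response pattern close to the faking centroid and is flagged anyway.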
... The organizational context is one in which positive impression management, including the underreporting of undesirable personality characteristics or psychological symptoms, is expected due to fear of a negative impact on getting or keeping a job (Picard et al., 2023). A meta-analysis found that individuals applying for a job distort their scores on personality measures to portray themselves in a positive manner (Birkeland et al., 2006). In this vein, research has shown that employees are afraid to disclose mental health problems in the workplace (e.g., Brohan et al., 2012; Stratton et al., 2018), and a study in 35 countries showed that people with major depressive disorder very frequently reported discrimination in the work setting (Brouwers et al., 2016). ...
Article
Full-text available
The clinical-organizational context (where clinical psychology services are provided in individuals' professional setting) has been insufficiently studied in research, namely regarding the influence it may have on the response attitudes of individuals undergoing psychological assessment. Our main goal is to find out whether, when psychological assessment occurs in the workplace context, patients being assessed present specific response biases that may have implications for clinical results and the correlative decisions. Five hundred and ten adult participants grouped in two samples of ambulatory patients, a Clinical-Organizational Sample (COS, n = 238) and a Clinical Sample (CS, n = 272), were assessed with the Minnesota Multiphasic Personality Inventory-2-RF validity and substantive scales. Under-reporting is five times more frequent in the COS, which presents Defensiveness (11%) and Desirability (5%). In the CS, under-reporting is residual and over-reporting is more prevalent than in the COS. Clinical record information of COS participants presenting under- vs. over-reporting also reveals differences concerning their circumstances and type of clinical conditions. Comparing participants with under-reporting in each sample, the COS had lower clinical profiles and tended to present excessively low psychopathology and symptomatology values, suggesting higher defensiveness. Finally, the fact that 33% of the COS present biased response attitudes (i.e., 15% presented under-reporting and 18% presented over-reporting) has implications for both clinical and career decision-making processes. In conclusion, there are relevant differences in response attitude and psychopathology features between outpatients assessed in a traditional clinical setting and in a clinical-organizational one, suggesting the patients' professional context may influence their motivation to disclose psychological symptoms and problems.
... They were able to show that deliberate distortion (faking) occurs on personality tests when these are used for personnel selection. Applicants engage in faking good especially on the Big Five constructs of Emotional Stability and Conscientiousness, that is, they distort their answers to present themselves more favorably (Birkeland et al., 2006). As already in Chap. ...
Thesis
Full-text available
This master's thesis examines the use of psychological tests for personnel selection and development in Switzerland. On the one hand, it investigates frequency of use, type of test, test quality, purpose, target groups, and information sources. On the other hand, the research interest lies in identifying factors that promote or inhibit the use of tests in a company. The research questions were addressed using a sequential mixed-methods design consisting of a preliminary qualitative interview study and a quantitative questionnaire study. The focus is on the quantitative data, which were collected from HR employees using a self-developed questionnaire. The data reveal that personality tests of poor quality in particular, such as the MBTI or DISC, are in use. The scientifically sound BIP is also increasingly applied. According to the study, psychological tests serve above all the purpose of personnel selection, with applicants for leadership or specialist positions being tested most often. According to the quantitative survey, the main sources of information are exchanges with subject-matter experts and with test providers. Test reviews and the professional literature, by contrast, are hardly used, which this thesis discusses as a possible cause of the use of deficient tests. A binary logistic regression showed three statistically relevant associations with test use in companies: the more positive HR employees' personal attitude toward psychological tests, and the fewer challenges and risks regarding test use they perceive, the more likely psychological tests are to be used in a company. In addition, the probability of use also depends on whether HR employees are allowed to contribute their opinion on psychological tests to the decision.
... In job application contexts, respondents have a much stronger incentive to engage in deliberate response distortion to present themselves favorably, because doing so can increase their chances of being hired. In fact, studies have consistently shown that test-takers respond to personality items in a more socially desirable manner in high-stakes testing contexts (e.g., Birkeland et al., 2006). Although respondents in non-applicant contexts may also try to present themselves more favorably, their motivation is likely to be weaker than that of respondents in high-stakes testing situations. ...
Article
We used exploratory structural equation modeling to examine gender-based measurement invariance (MI) in the HEXACO-100 across three samples that varied in terms of age (undergraduate students in Study 1, working adults in Studies 2 and 3) and testing context (research context in Studies 1 and 2, high-stakes selection context in Study 3). Across three studies, we consistently found support for configural and metric invariance but not scalar invariance. However, the effect size measures of non-invariance were generally small. That said, in the Emotionality scale, for the same latent score, females scored higher than males due to measurement non-invariance (between 0.26 and 0.48 standard deviation units). Thus, the observed mean gender differences overestimated the true mean gender differences. The current study provides detailed evidence regarding gender-based MI of HEXACO personality scales. More generally, it provides insight regarding the effect that measurement artifacts can have on understanding psychological gender differences at the latent level.
... Many studies (e.g., Tett et al., 1991) have provided support for the usefulness of personality constructs for personnel selection and placement in industrial and organizational psychology. However, it is not uncommon for individuals taking personality tests to exhibit social desirability bias (Nederhof, 1985) or give faked answers (Birkeland et al., 2006). Suppose candidates are asked to sit a pre-employment test. ...
Article
Full-text available
Multidimensional forced-choice (MFC) items have been found to be useful for reducing response biases in personality assessments. However, conventional scoring methods for MFC items result in ipsative data, hindering wider application of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed, the majority of which are for MFC items with binary responses. However, MFC items with polytomous responses are more informative and have many applications. This paper develops a polytomous Rasch ipsative model (pRIM) that can deal with ipsative data and yield estimates that measure construct differentiation, a latent trait describing the degree to which the personality constructs (e.g., interests) are distinguished from one another. The pRIM and its simpler form are applied to a career interests assessment containing four-category MFC items, and the measures of interest differentiation are used for both intra- and inter-personal comparisons. Simulations are conducted to examine the recovery of the parameters under various conditions. The results show that the parameters of the pRIM can be well recovered, particularly when a complete linking design and a large sample are used. The implications and application of the pRIM in personality assessment using MFC items are discussed.
... There is evidence that applicants are more motivated to fake and engage in more intentional distortion than incumbents when completing personality inventories. Higher mean scores on personality inventories have consistently been observed in applicant samples when compared to current employees (Birkeland et al., 2006;Bott et al., 2007); the opposite pattern is generally observed on job-relevant tests where incumbents typically score higher (Guion, 2011). These mean differences resemble those in faking simulations when comparing scores of those instructed to respond as applicants with those answering honestly (see Burns et al., 2004). ...
Article
Full-text available
Two field studies were conducted to examine how applicant faking impacts the normally linear construct relationships of personality tests using segmented regression and by partitioning samples to evaluate effects on validity across different ranges of test scores. Study 1 investigated validity decay across score ranges of applicants to a state police academy ( N = 442). Personality test scores had nonlinear construct relations in the applicant sample, with scores from the top of the distribution being worse predictors of subsequent performance but more strongly related to social desirability scores; this pattern was not found for the partitioned scores of a cognitive test. Study 2 compared the relationship between personality test scores and job performance ratings of applicants ( n = 97) to those of incumbents ( n = 318) in a customer service job. Departures from linearity were observed in the applicant but not in the incumbent sample. Effects of applicant distortion on the validity of personality tests are especially concerning when validity decay increases toward the top of the distribution of test scores. Observing slope differences across ranges of applicant personality test scores can be an important tool in selection.
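The range-partitioning logic of these field studies can be sketched simply: split applicant test scores at a cutoff and compare the score-performance correlation in each segment. The following is a minimal sketch with made-up data (the studies' actual segmented-regression procedure is more elaborate):

```python
import math

def pearson_r(x, y):
    # Pearson correlation between two equally long sequences
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def validity_by_range(scores, performance, cutoff):
    # Correlate test scores with performance below vs. at/above the cutoff
    low = [(s, p) for s, p in zip(scores, performance) if s < cutoff]
    high = [(s, p) for s, p in zip(scores, performance) if s >= cutoff]
    r_low = pearson_r([s for s, _ in low], [p for _, p in low])
    r_high = pearson_r([s for s, _ in high], [p for _, p in high])
    return r_low, r_high

# Made-up pattern: validity holds in the lower range, decays at the top
scores      = [1, 2, 3, 4, 10, 11, 12, 13]
performance = [1, 2, 3, 4,  7,  5,  8,  6]
r_low, r_high = validity_by_range(scores, performance, cutoff=10)
print(round(r_low, 2), round(r_high, 2))  # 1.0 0.0
```

Under the distortion account described in the abstract, exactly this pattern is expected in applicant samples: the top of the score distribution mixes genuine high scorers with fakers, so validity decays there while remaining intact lower in the range.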
... Individuals may intentionally present themselves more favorably than they really are (distorting to look good) or less favorably than they really are (distorting to look bad, also referred to as malingering; Arthur et al., 2010;Donaldson & Grant-Vallone, 2002;Furnham, 1990;Rogers & Bender, 2018;Rogers et al., 2003). For example, a job applicant may try to enhance their positive qualities to look like a good fit for the job (Birkeland et al., 2006). ...
Article
Research consistently demonstrates that people can distort their responses on self-report personality tests. Informant-reports (where a knowledgeable informant rates a target's personality) can be used as an alternative to self-ratings. However, there has been little research on the extent to which informants can distort their responses on personality tests (or their motives for response distortion). The current study examines the effects of experimentally induced response distortion on self- and informant-reports of the Dark Triad. The participants (N = 834 undergraduates) completed Dark Triad measures in a 2 × 3 between-person design crossing format (self- vs. informant-report [imagined friend]) with instruction condition (answer honestly, look good, or look bad). “Look good” effects were significant for both self-reports (d = −1.22 to 1.42) and informant-reports (d = −1.35 to 0.62). “Look bad” effects were also significant for both self-reports (d = −0.56 to 3.58) and informant-reports (d = −0.55 to 3.70). The Five Factor Machiavellianism Inventory results were opposite to hypotheses, but Dirty Dozen Machiavellianism results were as expected. We conclude that people can distort Dark Triad scores for themselves (self-report) and on behalf of someone else (informant-report). We discuss the relevance of our findings for self- and informant-report assessment in applied contexts.
... This is particularly concerning in selection settings, where applicants are motivated to fake their responses to be hired. The resulting responses could alter the rank order of job applicants and distort the factor structure, reliability, and validity evidence of personality tests, consequently harming the utility of selection systems (e.g., Birkeland et al., 2006;Komar et al., 2008;Zickar et al., 2004). ...
Article
Full-text available
Human resource (HR) practices have been focused on using assessments that are robust to faking and response biases associated with Likert‐type scales. As an alternative, multidimensional forced‐choice (MFC) measures have recently shown advances in reducing faking and response biases while retaining similar levels of validity to Likert‐type measures. Although research evidence supports the effectiveness of MFC measures, fairness issues resulting from gender biases in the use of MFC measures have not yet been investigated in the literature. Given the importance of gender equity in HR development, it is vital that new assessments improve upon known gender biases in the historical use of Likert‐type measures and do not lead to gender discrimination in HR practices. In this vein, our investigation focuses specifically on potential gender biases in the use of MFC measures for HR development. Specifically, our study examines differential test‐taker reactions and differential prediction of self‐assessed leadership ability between genders when using the MFC personality measure. In an experimental study with college students, we found no evidence of gender differences in test‐taker reactions to MFC measures. In a second cross‐sectional study with full‐time employees, we found evidence of intercept differences, such that females were frequently underpredicted when using MFC personality measures to predict self‐assessed leadership ability. Moreover, the pattern of differential prediction using MFC measures was similar to that of Likert‐type measures. Implications for MFC personality measures in applied practice are discussed.
... In recent decades, the term faking has gained currency to refer to the intentional manipulation of responses. Faking is considered the most threatening bias in psychology and social science studies, especially for non-cognitive assessment tools [17][18][19]. Regarding psychosocial risk assessment tools, no studies on the impact of SD have been found. It is therefore extremely important to know the effect of this bias in tests that evaluate work-related stressors, given their relevant contribution to occupational health prevention and their special interest to the prevention technicians who assess psychosocial risks. ...
Article
Background: The self-report test is one of the main psychosocial risk assessment tools. However, this test is susceptible to certain sources of error, including social desirability. Since psychosocial risks are emerging, there are not many studies on their assessment. Objective: The aim of this work is to analyze the impact of social desirability on the short version of the CoPsoQ-Istas21 assessment tool. Method: A total of 563 workers (45.10% women and 54.90% men) participated in this study. The short version of the CoPsoQ-Istas21 questionnaire was used, with four Likert-scale questions as markers, which correspond to the Eysenck Personality Questionnaire Revised (EPQ-r) Lie Scale. The sample was divided into two halves, and both a confirmatory analysis and an exploratory analysis were carried out to determine the factorial structure of the scale and, with it, apply the bias-filtering method. Results: The results indicate that 10% of the scale is biased due to social desirability, and that there are significant differences between the group with bias-cleaned scores and the group with scores without bias control. Conclusions: The effects of social desirability on the scale are verified, so it is concluded that in a psychosocial risk assessment it is not enough to apply a self-report test and interpret its results; it is necessary to minimize the sources of error.
... The percentage of tests falsified in this way has increased from about 14% in the 1960s to as much as 62% in the 21st century. The widespread phenomenon of falsifying selection tests is also demonstrated by Birkeland et al. (2006), Levashina and Campion (2007), and Levashina et al. (2014). ...
Article
Full-text available
Psychological testing – including personality tests – is one of the methods used by contemporary organisations for selection of candidates. This article provides a systematic analysis of arguments concerning the validity of this selection method using the argument mapping technique. The study highlights doubts regarding the validity of assessing a candidate’s potential on the basis of such tests due to the significant potential for result manipulation by the candidate. The primary conclusion drawn from this analysis is that personality tests should only be used as a complementary instrument alongside other selection techniques. Test‑based assessment methods should be used optionally, while adhering to appropriate standards for conducting such tests. The study also suggests a shift away from self‑report tests and entrusting their execution and interpretation to individuals with relevant qualifications.
... The scores of respondents in the second group were on average 0.6 standard deviations (σ) higher than in the first [Goffin, Christiansen, 2003]. Meta-analytic studies also show that respondents can distort their scores when completing normative questionnaires if they are motivated to create a positive image of themselves [Birkeland et al., 2006; Salgado, 2016]. In an organizational context, respondents tend to inflate their scores on characteristics that, in their view, are positively related to performance and success in the desired position [Martínez, Moscoso, Lado, 2021; Anglim et al., 2017]. ...
Article
One significant shortcoming of questionnaires is the distortion of scores on the measured constructs associated with social desirability effects. An even greater threat to the validity of decisions is social desirability in high-stakes evaluation, such as selection for a position. Moreover, the issue of the relationship between different components of social desirability and the most frequently measured personal constructs remains debatable. Using the author's normative questionnaire of universal competencies, an approach is considered for adjusting the final scores on the measured constructs with newly developed scales of egoistic and moralistic social desirability. Also discussed is the prospect of using statement wordings that are neutral with respect to social desirability or that express the most positive degree of the measured indicators. The empirical basis of this study is data gathered within a pilot conducted in the spring of 2022, during which data were obtained from 579 respondents on 49 measured competencies. The analysis was aimed at assessing the quality of the developed social desirability scales, and each of the universal competency scales was modeled with a social desirability scale included. The data were analyzed within the framework of structural equation modeling: confirmatory factor analysis (CFA) using bifactor models for each of the measured competencies. According to the results of this study, using the egoistic social desirability scale as a measure for adjusting factor scores for the competencies shows generally satisfactory psychometric statistics, but the relatively large measurement error is a concern. The paper discusses the advantages and disadvantages of this approach and other practices most often used to reduce social desirability effects in academic and business settings.
... To capture this, future studies could implement a fake-good condition in which interviewees are explicitly asked to engage in as much deceptive IM as possible. However, there is also evidence both from interviews (e.g., Bill et al., 2020, Study 3) and from other selection instruments such as personality tests (e.g., Birkeland et al., 2006) that most applicants do or would not fake as much as would be possible. Thus, results from fake-good studies might reveal the maximum possible rating improvement even though this improvement might be larger compared to what applicants are willing to do in a real application setting. ...
Article
Full-text available
Research on whether interviewees can improve their interview ratings through impression management (IM) relative to an honest condition has focused on highly structured interviews, whereas traditional interviews have received little attention. This study therefore aimed to determine how prone traditional interviews are to IM effects compared to highly structured ones. To that end, we conducted simulated selection interviews using a 2 × 2 within-subjects design. All participants went through a condition in which they were asked to present themselves as honestly as possible and a condition in which they were instructed to act like an applicant. Additionally, each interview contained eight traditional and eight structured questions. The differences in the use of self-reported honest and deceptive IM between the honest and applicant conditions were comparable for both interview types. Furthermore, interview ratings were better in the applicant condition than in the honest condition, and importantly, this improvement was larger for the traditional interview part than for the structured interview part. Even though the larger performance improvement was
... Such paradigms examine the extent to which test-takers can distort their responses in a socially desirable way to create a more favorable impression (faking good; Arthur et al., 2010; Donaldson & Grant-Vallone, 2002; Furnham, 1990; Rogers et al., 2003). For example, a job applicant may try to enhance their positive qualities to obtain a job (Birkeland et al., 2006). There is evidence from instructed faking studies of self-ratings that people do "fake good" on EI rating scales (Day & Carroll, 2008; Grubb & McDaniel, 2007; Hartman & Grubb, 2011; Whitman et al., 2008). ...
Article
Full-text available
Research demonstrates that people can fake on self-rated emotional intelligence scales. As yet, no studies have investigated whether informants (where a knowledgeable informant rates a target’s emotional intelligence) can also fake on emotional intelligence inventories. This study compares mean score differences for a simulated job selection versus a standard instructed set for both self-ratings and informant-ratings on the Trait Emotional Intelligence Questionnaire—Short Form (TEIQue-SF). In a 2 × 2 between-person design, participants (N = 81 community volunteers, 151 university students) completed the TEIQue-SF as either self-report or informant-report in one of two instruction conditions (answer honestly, job simulation). Both self-reports (d = 1.47) and informant-reports (d = 1.56) were significantly higher for job simulation than “answer honestly” instructions, indicating substantial faking. We conclude that people can fake emotional intelligence for both themselves (self-report) and on behalf of someone else (informant-report). We discuss the relevance of our findings for self- and informant-report assessment in applied contexts.
... Job applicants are incentivized to engage in impression management behaviors that emphasize their strengths and minimize negatives (Knouse, Giacalone, & Pollard, 1988;McFarland, Yun, Harold, Viera, & Moore, 2005). Applicants may even embellish or lie to best fit the expectations of the organization (Birkeland, Manson, Kisamore, Brannick, & Smith, 2006;Levashina & Campion, 2007). ...
Chapter
In this chapter, we elaborated on the two main forms of distorted symptom reporting: symptom over- and underreporting. In certain situations, individuals might exaggerate their symptoms. For instance, defendants facing legal issues may overstate their mental health problems. This type of behavior is often described with terms such as “faking bad,” “feigning,” and “malingering,” though these should not be used interchangeably with symptom overreporting, as each term carries distinct connotations. Conversely, in other scenarios, individuals may underreport their symptoms. For example, a parent involved in a custody dispute might minimize genuine psychological issues. Symptom underreporting differs from “faking good” or “superlative self-presentation”. Symptom validity tests serve as valuable tools for clinicians and experts to identify distorted symptom reporting in patients, defendants, or plaintiffs. There are numerous symptom validity tests, each varying in diagnostic efficacy, often quantified by Likelihood Ratios. Deviant scores on these tests suggest either over- or underreporting, particularly when the Likelihood Ratio is high. Relying solely on subjective clinical impressions to detect over- or underreporting is not advisable, as it may result in missed cases and misclassification of individuals with genuine issues (i.e., false positives). Therefore, incorporating symptom validity tests into assessments, especially when results can be discussed with the individuals being tested, proves beneficial in correcting distorted symptom presentations. However, these tests alone do not provide sufficient information regarding the motivation underlying symptom over- or underreporting; thus, additional contextual information is necessary.
Article
In graded paired comparisons (GPCs), two items are compared using a multipoint rating scale. GPCs are expected to reduce faking compared with Likert-type scales and to produce more reliable, less ipsative trait scores than traditional binary forced-choice formats. To investigate the statistical properties of GPCs, we simulated 960 conditions in which we varied six independent factors and additionally implemented conditions with algorithmically optimized item combinations. Using Thurstonian IRT models, good reliabilities and low ipsativity of trait score estimates were achieved for questionnaires with 50% unequally keyed item pairs or equally keyed item pairs with an optimized combination of loadings. However, in conditions with 20% unequally keyed item pairs and equally keyed conditions without optimization, reliabilities were lower with evidence of ipsativity. Overall, more response categories led to higher reliabilities and nearly fully normative trait scores. In an empirical example, we demonstrate the identified mechanisms under both honest and faking conditions and study the effects of social desirability matching on reliability. In sum, our studies inform about the psychometric properties of GPCs under different conditions and make specific recommendations for improving these properties.
Article
Personality assessments are commonly used in hiring, but concerns about faking have raised doubts about their effectiveness. Qualitative reviews show mixed and inconsistent impacts of faking on criterion‐related validity. To address this, a series of meta‐analyses were conducted using matched samples of honest and motivated respondents (i.e., instructed to fake, or applicants). In 80 paired samples, the average difference in validity coefficients between honest and motivated samples across five‐factor model traits ranged from 0.05 to 0.08 (largest for conscientiousness and emotional stability), with the validity ratio ranging from 64% to 72%. Validity was attenuated when candidates faked regardless of sample type, trait relevance, or the importance of impression management, though variation existed across criterion types. Both real applicant samples ( k = 25) and instructed response conditions ( k = 55) showed a reduction in validity across honest and motivated conditions, including when managerial ratings of job performance were the criterion. Thus, faking impacted validity in operational samples. This suggests that practitioners should be cautious about relying on concurrent validation evidence for personality inventories and should expect attenuated validity in operational applicant settings, particularly for conscientiousness and emotional stability scales. That said, it is important to highlight that personality assessments generally maintained useful validity even under motivated conditions.
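The attenuation and validity-ratio statistics reported above are simple functions of the honest- and motivated-condition validity coefficients. A minimal sketch, using made-up coefficients rather than the meta-analytic values from the article:

```python
# Illustrative computation of validity attenuation under faking.
# The honest/motivated coefficients below are hypothetical examples,
# not the meta-analytic values reported in the article.

def validity_ratio(r_motivated: float, r_honest: float) -> float:
    """Proportion of honest-condition validity retained under motivation."""
    return r_motivated / r_honest

pairs = {
    "conscientiousness": (0.25, 0.17),    # (honest r, motivated r) -- hypothetical
    "emotional stability": (0.22, 0.15),  # hypothetical
}
for trait, (r_h, r_m) in pairs.items():
    print(f"{trait}: attenuation = {r_h - r_m:.2f}, "
          f"validity ratio = {validity_ratio(r_m, r_h):.0%}")
```

With these illustrative numbers the attenuation is 0.07–0.08 and the validity ratio about 68%, i.e., roughly two thirds of the honest-condition validity survives under motivated responding.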
Article
(LINK TO THE OPEN-ACCESS PUBLICATION: https://journals.sagepub.com/doi/epdf/10.1177/00131644241307560) Self-report personality tests used in high-stakes assessments hold the risk that test-takers engage in faking. In this article, we demonstrate an extension of the multidimensional nominal response model (MNRM) to account for the response bias of faking. The MNRM is a flexible item response theory (IRT) model that allows modeling response biases whose effect patterns vary between items. In a simulation, we found good parameter recovery of the model accounting for faking under different conditions as well as good performance of model selection criteria. Also, we modeled responses from N = 3046 job applicants taking a personality test under real high-stakes conditions. We thereby specified item-specific effect patterns of faking by setting scoring weights to appropriate values that we collected in a pilot study. Results indicated that modeling faking significantly increased model fit over and above response styles and improved divergent validity, while the faking dimension exhibited relations to several covariates. Additionally, applying the model to a sample of job incumbents taking the test under low-stakes conditions, we found evidence that the model can effectively capture faking and adjust estimates of substantive trait scores for the assumed influence of faking. We end the article with a discussion of implications for psychological measurement in high-stakes assessment contexts.
Chapter
The chapter discusses the relevance of individual differences in personality traits for the study of school leadership, especially with regard to leadership success. Findings from psychological leadership research have shown that, amongst others, personality, cognitive and emotional intelligence, as well as creativity predict leadership outcome variables. The authors investigate how far these traits have been able to predict leadership success across different occupations and also across different situational and methodological conditions. In addition, studies on the relationship of individual trait differences and school principals' effectiveness are discussed. The chapter shows that individual differences research holds potential for educational leadership, but further studies are needed to draw conclusions about the potential cognitive ability, personality traits, emotional intelligence, as well as creativity hold for predicting leadership success of school principals.
Article
Organizations are increasingly employing personality assessments as part of their selection processes due to their predictive value for job‐related outcomes. However, applicant faking can undermine the validity of such measures. This study explored a novel faking prevention method using the dual task paradigm. Respondents in the dual task conditions memorized a series of five or seven digits while attempting to fake their responses on a personality measure. Their results were compared with a no dual task condition in which respondents were also instructed to fake. Our results revealed that faking performance was limited, and criterion‐related validity was improved in the dual task conditions compared with the no dual task condition. The practical implications and future directions for this initial proof of concept are discussed.
Article
Personality testing is a critical component of organizational assessment and selection processes. Despite nearly a century of research recognizing faking as a concern in personality assessment, the impact of order effects on faking has not been thoroughly examined. This study investigates whether the sequence of administering personality and cognitive ability measures affects the extent of faking. Previous research suggests administering personality measures early in the assessment process to mitigate adverse impact; however, models of faking behavior and signaling theory imply that test order could influence faking. In two simulated applicant laboratory studies (Study 1 N = 172, Study 2 N = 174), participants were randomly assigned to complete personality measures either before or after cognitive ability tests. Results indicate that participants who completed personality assessments first exhibited significantly higher levels of faking compared to those who took cognitive ability tests first. These findings suggest that the order of test administration influences faking, potentially due to the expenditure of cognitive resources during cognitive ability assessments. To enhance the integrity of selection procedures, administrators should consider the sequence of test administration to mitigate faking and improve the accuracy of personality assessments. This study also underscores the need for continued exploration of contextual factors influencing faking behavior. Future research should investigate the mechanisms driving these order effects and develop strategies to reduce faking in personality assessments.
Article
What can employers learn from personality tests when applicants have incentives to misrepresent themselves? Using a within‐subject, laboratory experiment, we compare personality measures with and without incentives for misrepresentation. Incentivized personality measures are weakly to moderately correlated with non‐incentivized measures in all treatments. When test‐takers are given a job ad indicating that an extrovert (introvert) is desired, extroversion measures are positively (negatively) correlated with IQ. Among other characteristics, only locus of control appears related to faking on personality measures. Our findings highlight the identification challenges in measuring personality and the potential for correlations between incentivized personality measures and other traits.
Article
The rank two-parameter logistic (Rank-2PL) item response theory models refer to a set of models applying the 2PL model in a sequential ranking process that occurs in forced-choice questionnaires. The multi-unidimensional pairwise preference with 2PL model (MUPP-2PL) is a Rank-2PL model for items with two statements. Focusing on items with three statements, we develop a maximum marginal likelihood estimation with an expectation-maximization algorithm to estimate item parameters and their standard errors. A simulation study is conducted to check parameter recovery, and then the model is applied to a real dataset. Finally, the findings are summarized and discussed, and future research is suggested.
Chapter
Full-text available
This study examines preliminary evidence on consumer attitudes towards the products offered by Malaysia's Private Retirement Scheme (PRS), emphasizing the influence of social, marketing, and personal aspects alongside demographic factors such as government support, age, and income. A survey questionnaire was administered to working Malaysians in the Klang Valley, and 54 respondents were included in the preliminary analysis. The findings demonstrate that (1) social, marketing, and personal aspects have mediating effects on consumer attitudes towards PRS products, and (2) government support, age, and income have moderating effects on PRS products in Malaysia, expanding the structure of the theory of planned behavior. The study's originality lies in its recommendations: the government is advised to promote alternative retirement funding via the PRS, offer additional incentives, and raise public awareness of the importance of having enough money to pay for one's retirement years.
Article
Universities of applied sciences (Hochschulen für angewandte Wissenschaften, HAW) play a decisive role in the German education landscape. They are characterized by their practice-oriented focus and contribute substantially to meeting the growing demand for highly qualified professionals across industries. To fulfill this task, they need committed and competent professors who possess both sound subject-matter expertise and extensive practical experience. At the same time, HAWs themselves are affected by a sometimes dramatic shortage of qualified candidates when filling professorships. Universities and policymakers have by now understood that qualifying talent and recruiting professors can no longer be left to chance, but must be approached in a structured and systematic way. This is exactly where the FH-Personal funding program comes in: it supports universities in actively promoting the HAW professorship as an attractive career option and in sharpening their profile as employers. The aim is to attract outstanding talent from academia and professional practice and to offer them opportunities to develop further in their academic careers. This contributes not only to the quality of teaching but also strengthens research and knowledge transfer in the respective disciplines. Since 2021, the measures developed during a concept phase have been funded at 64 universities within a first project round; 34 further universities joined in the second project round. The universities' concepts and instruments tie into ongoing discourses on better structuring and predictability of academic career paths. Measures and concepts for personnel and organizational development, such as personnel marketing and employer branding, are now being taken up by universities to counter a war for talents.
Transferring these discourses and concepts to the context of universities of applied sciences requires a precise analysis of the implementation conditions to date, which, alongside success stories, may naturally also be shaped by challenges and setbacks. Two years into the federal-state program FH-Personal, this special issue draws a first interim conclusion. We give room to selected experiences and insights from the FH-Personal projects of the first funding round (2021-2027). With the contributions in this issue, we aim to provide an overview of the various facets of the FH-Personal funding program.
Article
The multidimensional forced-choice (MFC) format is an alternative to rating scales in which participants rank items according to how well the items describe them. Currently, little is known about how to detect careless responding in MFC data. The aim of this study was to adapt a number of indices used for rating scales to the MFC format and additionally develop several new indices that are unique to the MFC format. We applied these indices to a data set from an online survey ( N = 1,169) that included a series of personality questionnaires in the MFC format. The correlations among the careless responding indices were somewhat lower than those published for rating scales. Results from a latent profile analysis suggested that the majority of the sample (about 76–84%) did not respond carelessly, although the ones who did were characterized by different levels of careless responding. In a simulation study, we simulated different careless responding patterns and varied the overall proportion of carelessness in the samples. With one exception, the indices worked as intended conceptually. Taken together, the results suggest that careless responding also plays an important role in the MFC format. Recommendations on how it can be addressed are discussed.
Article
Forced-choice questionnaires involve presenting items in blocks and asking respondents to provide a full or partial ranking of the items within each block. To prevent involuntary or voluntary response distortions, blocks are usually formed of items that possess similar levels of desirability. Assembling forced-choice blocks is not a trivial process, because in addition to desirability, both the direction and magnitude of relationships between items and the traits being measured (i.e., factor loadings) need to be carefully considered. Based on simulations and empirical studies using item pairs, we provide recommendations on how to construct item pairs matched by desirability. When all pairs contain items keyed in the same direction, score reliability is improved by maximizing within-block loading differences. Higher reliability is obtained when even a small number of pairs consist of unequally keyed items.
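The block-assembly procedure described above (match items on desirability, then maximize within-block loading differences) can be illustrated with a simple greedy sketch. The item data and the window-based matching heuristic are assumptions for illustration, not the authors' algorithm:

```python
# Greedy sketch of forced-choice pair assembly: sort items by social-
# desirability rating, restrict candidate partners to near neighbors
# (keeping desirability matched), and among those candidates prefer the
# largest absolute factor-loading difference. Item data are hypothetical.

def build_pairs(items, window=3):
    """items: list of (name, desirability, loading) tuples. Returns pairs."""
    pool = sorted(items, key=lambda it: it[1])  # sort by desirability
    pairs = []
    while len(pool) >= 2:
        anchor = pool.pop(0)
        # candidates: the next few items, all close in desirability
        candidates = pool[:window]
        # choose the candidate maximizing the loading difference
        partner = max(candidates, key=lambda it: abs(it[2] - anchor[2]))
        pool.remove(partner)
        pairs.append((anchor[0], partner[0]))
    return pairs

items = [("a", 1.0, 0.3), ("b", 1.1, 0.8), ("c", 1.2, 0.4), ("d", 1.3, 0.9)]
print(build_pairs(items))
```

A production test-assembly procedure would also handle keying direction and trait coverage constraints; this sketch only shows the desirability-then-loading trade-off the abstract describes.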
Article
Full-text available
The effectiveness of forced-choice personality measures in preventing socially desirable responding, such as faking, may depend on how closely personality items comprising each item-block are similar in terms of perceived desirability. Item desirability matching is routinely performed on empirically obtained item desirability ratings and different approaches have been used interchangeably to obtain them. In the current study, we compare item similarity estimates based on sets of item desirability ratings obtained with the two most used instruction sets: the explicit ratings of item desirability and the fake-good induction. We find substantial differences in similarity estimates between the two sets. We show that these differences may play an important role in item desirability matching and fake-resistant forced-choice test construction. We discuss our findings and provide recommendations to both researchers and test developers.
Article
Full-text available
AI, or artificial intelligence, is a technology of creating algorithms and computer systems that mimic human cognitive abilities to perform tasks. Many industries are undergoing revolutions due to the advances and applications of AI technology. The current study explored a burgeoning field—Psychometric AI, which integrates AI methodologies and psychological measurement to not only improve measurement accuracy, efficiency, and effectiveness but also help reduce human bias and increase objectivity in measurement. Specifically, by leveraging unobtrusive eye-tracking sensing techniques and performing 1470 runs with seven different machine-learning classifiers, the current study systematically examined the efficacy of various machine-learning (ML) models in measuring different facets and measures of the emotional intelligence (EI) construct. Our results revealed an average accuracy ranging from 50–90%, largely depending on the percentile used to dichotomize the EI scores. More importantly, our study found that AI algorithms were powerful enough to achieve high accuracy with as little as 2 or 5 seconds of eye-tracking data. The research also explored the effects of EI facets/measures on ML measurement accuracy and identified the eye-tracking features most predictive of EI scores. Both theoretical and practical implications are discussed.
Thesis
Full-text available
The aim of this research was to examine the extent to which candidates distort their answers in selection interviews, and whether the amount of answer distortion is related to preparation for the interview and familiarity with the workplace. The research was conducted on a sample from the student population (N=102). Participants took part in a simulated selection interview for the job of a call centre manager. The interview questions were designed to measure extraversion and honesty/humility. Each participant took part in two conditions, answering part of the questions while presenting themselves as an ideal candidate for the job and answering the other part of the questions honestly. To control for question-order effects, 16 versions of the question order were created using the Latin square method and systematically rotated through the interview. The results showed that participants are capable of successfully distorting their answers in a selection interview, and to a considerable extent. In the answer-distortion condition, participants achieved significantly higher scores on extraversion and honesty/humility compared with the honest-presentation condition. The size of the obtained differences, expressed through eta squared, turned out to be large for all measures. On the other hand, no significant correlation was found between the amount of answer distortion and preparation for the interview. There was also no significant correlation between the amount of answer distortion and familiarity with the job. Recommendations for further research and guidelines for practical work were provided.
Article
Full-text available
Recently, 2 separate yet related criticisms have been levied against the adequacy of the five-factor model (or Big Five) as a descriptive taxonomy of job applicant personality: frame of reference effects (M. J. Schmit & A. M. Ryan, 1993) and socially desirable responding (A. F. Snell & M. A. McDaniel, 1998). Of interest, although both criticisms suggest that the five-factor model is inadequate, the frame of reference effects criticism suggests that the factor structure should be more complex, whereas socially desirable responding suggests that it should be less complex in job applicant contexts. The current research reports the results of a new study demonstrating the adequacy of the five-factor model as a descriptor of job applicant, job incumbent, and student personality. Implications for personality assessment and concurrent validation designs using personality measures are also discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
The impact of positive response distortion (PRD) upon attitude test scores is examined in job applicant settings. Using data from three empirical studies, several issues are examined, including job applicant and incumbent base rates, impact on validity, and effects on hiring decisions under single-test and compensatory scoring models.
Article
Full-text available
Personality measures with items that ask respondents to characterize themselves across a range of situations are increasingly used for personnel selection purposes. Research conducted in a laboratory setting has found that personality items may have different psychometric characteristics depending on the degree to which that range is widened or narrowed (i.e., degree of contextualization). This study is an attempt to study the psychometric impact of contextualization in a large field sample (N = 1,078). Respondents were given either a contextualized (at work) or noncontextualized (in general) version of the six facets of the conscientiousness factor of the NEO PI-R. Analyses were conducted at the facet and item levels. Results were mixed but indicated that error variances tended to be slightly lower for the work-specific instrument in comparison to the noncontextualized instrument. Implications for personality inventory development, validation, and use are discussed.
Article
Full-text available
Incumbents are often used in the development and validation of a wide variety of personnel selection instruments, including noncognitive instruments such as personality tests. However, the degree to which assumed motivational factors impact the measurement equivalence and validity of tests developed using incumbents has not been adequately addressed. This study addressed this issue by examining the measurement equivalence of 6 personality scales between a group applying for jobs as sales managers in a large retail organization (N = 999) and a group of sales managers currently employed in that organization (N = 796). A graded item response theory model (Samejima, 1969) was fit to the personality scales in each group. Results indicated that moderately large differences existed in personality scale scores (approximately 1/2 standard deviation units) but only one of the six scales contained any items that evidenced differential item functioning and no scales evidenced differential test functioning. In addition, person-level analyses showed no apparent differences across groups in aberrant responding. The results suggest that personality measures used for selection retain similar psychometric properties to those used in incumbent validation studies.
Article
Full-text available
The authors examined whether individuals can fake their responses to a personality inventory if instructed to do so. Between-subjects and within-subjects designs were meta-analyzed separately. Across 51 studies, fakability did not vary by personality dimension; all the Big Five factors were equally fakable. Faking produced the largest distortions in social desirability scales. Instructions to fake good produced lower effect sizes compared with instructions to fake bad. Comparing meta-analytic results from within-subjects and between-subjects designs, we conclude, based on statistical and methodological considerations, that within-subjects designs produce more accurate estimates. Between-subjects designs may distort estimates due to Subject × Treatment interactions and low statistical power.
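The between- vs. within-subjects contrast above comes down to how the standardized mean difference is computed in each design. A minimal sketch with fabricated scores (one common within-subjects convention standardizes by the SD of difference scores; others use the pretest SD):

```python
# Between- vs within-subjects effect sizes for a faking manipulation.
# Scores below are fabricated for illustration only.
from statistics import mean, stdev
from math import sqrt

def d_between(group_fake, group_honest):
    """Cohen's d with a pooled standard deviation (between-subjects design)."""
    n1, n2 = len(group_fake), len(group_honest)
    s1, s2 = stdev(group_fake), stdev(group_honest)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(group_fake) - mean(group_honest)) / pooled

def d_within(fake_scores, honest_scores):
    """Standardized mean change using the SD of paired difference scores."""
    diffs = [f - h for f, h in zip(fake_scores, honest_scores)]
    return mean(diffs) / stdev(diffs)

fake, honest = [5, 6, 8, 9], [4, 5, 6, 7]
print(f"between d = {d_between(fake, honest):.2f}, "
      f"within d = {d_within(fake, honest):.2f}")
```

Because the within-subjects metric removes stable between-person variance, the same raw mean shift can yield a much larger standardized effect, which is one reason the two designs must be meta-analyzed separately.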
Article
Full-text available
This study examined the effectiveness of warnings in reducing faking on noncognitive selection measures. A review of the relatively sparse literature indicated that warnings tend to have a small impact on responses (d = 0.23), with warned applicants receiving lower predictor scores than unwarned applicants. However, the effect of warnings on predictor scores was found to differ according to the type of warning used. In light of this, an experimental study was conducted to assess the following: (a) the overall effectiveness of warnings in reducing faking, and (b) the differential effects of three types of warnings on faking. The results indicated that a warning stating both that faking could be detected and what the potential consequences of faking would be had an impact on responding.
Book
Full-text available
Meta-analysis is arguably the most important methodological innovation in the social and behavioral sciences in the last 25 years. Developed to offer researchers an informative account of which methods are most useful in integrating research findings across studies, this book will enable the reader to apply, as well as understand, meta-analytic methods. Rather than taking an encyclopedic approach, the authors have focused on carefully developing those techniques that are most applicable to social science research, and have given a general conceptual description of more complex and rarely-used techniques. Fully revised and updated, Methods of Meta-Analysis, Second Edition is the most comprehensive text on meta-analysis available today. New to the Second Edition: * An evaluation of fixed versus random effects models for meta-analysis* New methods for correcting for indirect range restriction in meta-analysis* New developments in corrections for measurement error* A discussion of a new Windows-based program package for applying the meta-analysis methods presented in the book* A presentation of the theories of data underlying different approaches to meta-analysis
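The bare-bones core of the psychometric meta-analysis approach the book develops — a sample-size-weighted mean effect and a decomposition of observed variance into sampling error and residual variance — can be sketched in a few lines. This is an illustrative skeleton only, without the book's corrections for unreliability or range restriction:

```python
# Bare-bones psychometric meta-analysis sketch (Hunter-Schmidt style):
# sample-size-weighted mean correlation plus the expected sampling-error
# variance. Input values are illustrative; no artifact corrections applied.

def bare_bones(rs, ns):
    """rs: study correlations; ns: study sample sizes."""
    n_total = sum(ns)
    r_bar = sum(r * n for r, n in zip(rs, ns)) / n_total
    var_obs = sum(n * (r - r_bar) ** 2 for r, n in zip(rs, ns)) / n_total
    # Expected sampling-error variance of r, using the average sample size
    var_err = (1 - r_bar**2) ** 2 / (n_total / len(rs) - 1)
    # Residual ("true") variance: what sampling error cannot explain
    var_rho = max(var_obs - var_err, 0.0)
    return r_bar, var_obs, var_err, var_rho

r_bar, var_obs, var_err, var_rho = bare_bones([0.20, 0.30], [100, 100])
print(f"mean r = {r_bar:.2f}, observed var = {var_obs:.4f}, "
      f"residual var = {var_rho:.4f}")
```

When the residual variance is near zero, observed variability across studies is attributable to sampling error alone, which is the book's criterion for treating an effect as homogeneous before searching for moderators.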
Article
Full-text available
A review of criterion-related validities of personality constructs indicated that 6 constructs are useful predictors of important job-related criteria. An inventory was developed to measure the 6 constructs. In addition, 4 response validity scales were developed to measure accuracy of self-description. These scales were administered in 3 contexts: a concurrent criterion-related validity study, a faking experiment, and an applicant setting. Sample sizes were 9,188, 245, and 125, respectively. Results showed that (a) validities were in the .20s (uncorrected for unreliability or restriction in range) against targeted criterion constructs, (b) respondents successfully distorted their self-descriptions when instructed to do so, (c) response validity scales were responsive to different types of distortion, (d) applicants' responses did not reflect evidence of distortion, and (e) validities remained stable regardless of possible distortion by respondents in either unusually positive or negative directions. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Applied the Gordon Personal Inventory and the Gordon Personal Profile to investigate response distortion of forced-choice personality inventories and its implications for employee selection. An experimental group of female job applicants (n = 29) was told that their scores would be used in the selection decision, while the control group (n = 30) was first hired and then requested to take the inventories. No significant mean scale differences appeared between groups. Certain scales were significantly predictive of turnover in the experimental group but not in the control group; however, this relationship was not significantly moderated by the instructional set provided. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
A group of 81 college students was administered the Gordon Personal Profile, first under directions to simulate applying for industrial employment, and then in a simulated guidance situation. A total score difference "not of great practical significance" equivalent to an increase of about 8 percentile points was found, in favor of a "better" score for the industrial situation. "Present results support the contention that the Gordon Personal Profile '… probably is less subject to "faking" than inventory-type instruments.' " (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Recent personnel selection studies have focused on the 5-factor model of personality. However, the stability of this factor structure in job applicant populations has not been determined. Conceptual and empirical evidence has suggested that similar factor structures should not be assumed across testing situations that have different purposes or consequences. A study was conducted that used confirmatory factor analysis to examine the fit of the 5-factor model to NEO Five-Factor Inventory (P. T. Costa and R. R. McCrae, 1989) test data from student and applicant samples. The 5-factor structure fit the student data but did not fit the applicant data. The existence of an ideal-employee factor in the applicant sample is suggested. The findings are discussed in terms of both construct validity issues and the use of the Big Five in personnel selection. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Response bias continues to be the most frequently cited criticism of personality testing for personnel selection. The authors meta-analyzed the social desirability literature, examining whether social desirability functions as a predictor for a variety of criteria, as a suppressor, or as a mediator. Social desirability scales were found not to predict school success, task performance, counterproductive behaviors, and job performance. Correlations with the Big Five personality dimensions, cognitive ability, and years of education are presented along with empirical evidence that (a) social desirability is not as pervasive a problem as has been anticipated by industrial-organizational psychologists, (b) social desirability is in fact related to real individual differences in emotional stability and conscientiousness, and (c) social desirability does not function as a predictor, as a practically useful suppressor, or as a mediator variable for the criterion of job performance. Removing the effects of social desirability from the Big Five dimensions of personality leaves the criterion-related validity of personality constructs for predicting job performance intact. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Response distortion (RD), or faking, among job applicants completing personality inventories has been a concern for selection specialists. In a field study using the NEO Personality Inventory, Revised, the authors show that RD is significantly greater among job applicants than among job incumbents, that there are significant individual differences in RD, and that RD among job applicants can have a significant effect on who is hired. These results are discussed in the context of recent studies suggesting that RD has little effect on the predictive validity of personality inventories. The authors conclude that future research, rather than focusing on predictive validity, should focus instead on the effect of RD on construct validity and hiring decisions. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
Increased use of personality inventories in employee selection has led to concerns regarding factors that influence the validity of such measures. A series of studies was conducted to examine the influence of frame of reference on responses to a personality inventory. Study 1 involved both within-subject and between-groups designs to assess the effects of testing situation (general instructions vs. applicant instructions) and item type (work specific vs. noncontextual) on responses to the NEO Five-Factor Inventory (P. T. Costa & R. R. McCrae, 1989). Results indicated that a work-related testing context and work-related items led to more positive responses. A second study found differences in the validity of a measure of conscientiousness, depending on the frame of reference of respondents. Specifically, context-specific items were found to have greater validity. Implications for personnel selection are discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Full-text available
There are 2 families of statistical procedures in meta-analysis: fixed- and random-effects procedures. They were developed for somewhat different inference goals: making inferences about the effect parameters in the studies that have been observed versus making inferences about the distribution of effect parameters in a population of studies from a random sample of studies. The authors evaluate the performance of confidence intervals and hypothesis tests when each type of statistical procedure is used for each type of inference and confirm that each procedure is best for making the kind of inference for which it was designed. Conditionally random-effects procedures (a hybrid type) are shown to have properties in between those of fixed- and random-effects procedures.
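The fixed- versus random-effects distinction described in this abstract can be sketched in code. The following is a minimal illustration, not taken from the study itself: it pools study effect sizes by inverse-variance weighting and, for the random-effects case, uses the DerSimonian-Laird estimator of between-study variance. The function name and example values are hypothetical.

```python
import math

def pool_effects(effects, variances, model="fixed"):
    """Pool study effect sizes via inverse-variance weighting.

    Fixed-effects weights are 1/v_i. For random effects, the
    DerSimonian-Laird estimate of between-study variance (tau^2)
    is added to each v_i before weighting.
    """
    w = [1.0 / v for v in variances]
    if model == "random":
        # DerSimonian-Laird tau^2 from Cochran's Q statistic
        mean_fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
        q = sum(wi * (e - mean_fixed) ** 2 for wi, e in zip(w, effects))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(effects) - 1)) / c)
        w = [1.0 / (v + tau2) for v in variances]
    estimate = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return estimate, se
```

When heterogeneity is present (tau^2 > 0), the random-effects standard error is larger than the fixed-effects one, which is one reason the two families support different inferences.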
Article
Full-text available
400 male omnibus conductor job applicants were given a 2-part personality measure (emotional maladjustment and sociability), 100 each under one of the following 4 conditions: before selection, paper-and-pencil administration; after being notified of selection, paper-and-pencil administration; a box-and-card administration under each of the 2 selection circumstances. The selection circumstances significantly affected the distribution of scores on the emotional maladjustment scale, but not on the sociability scale. Method of administration did not affect the score distributions.
Article
Full-text available
This study investigated possible faking of the Edwards Personal Preference Schedule in an industrial selection situation. EPPS scores for 97 Retail sales applicants and 66 Industrial sales applicants (all later hired) were compared with those of 69 Retail salesmen and 49 Industrial salesmen (all tested on the job). Results showed that Retail applicants tended to score significantly higher on the Orderliness, Intraception, and Dominance scales and lower on the Heterosexuality scale than Retail salesmen. No significant differences were found, however, between Industrial applicants and Industrial salesmen. This suggests that persons more oriented toward selling in terms of interests and personality (i.e., Retail sales applicants) are more likely to distort answers to the EPPS.
Article
Full-text available
The Gordon Personal Profile was administered to junior and senior high school students for vocational guidance purposes. Three months later it was readministered as an employment test to students applying for jobs; those not seeking jobs took the test again as a guidance test. "Individuals did not change their profile patterns substantially from a guidance situation to an employment situation, and mean increases for the group were found to be moderate."
Article
Full-text available
The Gordon Personal Profile was given to 265 sales employees of a food distributor and to 471 employment applicants. Applicants scored significantly higher than employees on all scales. "Applicants practically never earn a minus value on any response while employees often do. Applicants never indicate as most like themselves some derogatory alternative in a tetrad. Greater range in response among applicants can be obtained in several ways. For example, four complimentary statements can be used in a tetrad of more subtle items."
Article
Full-text available
J. Millham and L. I. Jacobson's (1978) 2-factor model of socially desirable responding based on denial and attribution components is reviewed and disputed. A 2nd model distinguishing self-deception and impression management components is reviewed and shown to be related to early factor-analytic work on desirability scales. Two studies, with 511 undergraduates, were conducted to test the model. A factor analysis of commonly used desirability scales (e.g., Lie scale of the MMPI, Marlowe-Crowne Social Desirability Scale) revealed that the 2 major factors were best interpreted as Self-Deception and Impression Management. A 2nd study employed confirmatory factor analysis to show that the attribution/denial model does not fit the data as well as the self-deception/impression management model. A 3rd study, with 100 Ss, compared scores on desirability scales under anonymous and public conditions. Results show that those scales that had loaded highest on the Impression Management factor showed the greatest mean increase from anonymous to public conditions. It is recommended that impression management, but not self-deception, be controlled in self-reports of personality.
Article
Full-text available
Applied psychologists have long been interested in the relationship between applicant personality and employment interview ratings. Analysis of data from two studies, one using a situational interview and one using a behavioral interview, suggests that the correlations of structured interview ratings with self-report measures of personality factors are generally rather low. Further, a small meta-analysis integrates these two studies and the limited previous literature to arrive at a similar conclusion - there is relatively little relationship between structured interviews and self-reported personality factors.
Article
Full-text available
This study compares the criterion validity of the Big Five personality dimensions when assessed using Five-Factor Model (FFM)-based inventories and non-FFM-based inventories. A large database consisting of American as well as European validity studies was meta-analysed. The results showed that for conscientiousness and emotional stability, the FFM-based inventories had greater criterion validity than the non-FFM-based inventories. Conscientiousness showed an operational validity of .28 (N = 19,460, 90% CV = .07) for FFM-based inventories and .18 (N = 5,874, 90% CV = -.04) for non-FFM-based inventories. Emotional stability showed an operational validity of .16 (N = 10,786, 90% CV = .04) versus .05 (N = 4,541, 90% CV = -.05) for FFM-based and non-FFM-based inventories, respectively. No relevant differences emerged for extraversion, openness, and agreeableness. From a practical point of view, these findings suggest that practitioners should use inventories based on the FFM in order to make personnel selection decisions.
Article
Full-text available
The stability and replicability of the Five-Factor model of personality across samples and testing purposes remain a significant issue in personnel selection and assessment. The present study explores the stability of a new Greek Big Five personality measure (TPQue) across different samples in order to explore the suitability of the measure in personnel selection and assessment. The factor structure of the measure across three samples (students, employees, and job applicants) is examined. The results of exploratory and confirmatory factor analyses show that the five-factor structure remains intact for the students’, the applicants’ and the employees’ samples – contrary to previous studies – with all the sub-scales of the personality measure (TPQue) loading on the intended factors. Furthermore, congruence coefficients between the samples justify the stability of the model in the working settings.
Article
Full-text available
The purpose of this study was to investigate conflicting findings in previous research on personality and job performance. Meta-analysis was used to (a) assess the overall validity of personality measures as predictors of job performance, (b) investigate the moderating effects of several study characteristics on personality scale validity, and (c) appraise the predictability of job performance as a function of eight distinct categories of personality content, including the “Big Five” personality factors. Based on review of 494 studies, usable results were identified for 97 independent samples (total N= 13,521). Consistent with predictions, studies using confirmatory research strategies produced a corrected mean personality scale validity (.29) that was more than twice as high as that based on studies adopting exploratory strategies (.12). An even higher mean validity (.38) was obtained based on studies using job analysis explicitly in the selection of personality measures. Validities were also found to be higher in longer tenured samples and in published articles versus dissertations. Corrected mean validities for the “Big Five” factors ranged from .16 for Extroversion to .33 for Agreeableness. Weaknesses in the reporting of validation study characteristics are noted, and recommendations for future research in this area are provided. Contrary to conclusions of certain past reviews, the present findings provide some grounds for optimism concerning the use of personality measures in employee selection.
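The "corrected" validities reported in abstracts like this one adjust observed correlations for statistical artifacts. As a rough, generic sketch (a Hunter-Schmidt-style correction, not the authors' exact procedure; the function name and values are hypothetical), an observed validity can be disattenuated for unreliability and then adjusted for direct range restriction:

```python
import math

def corrected_validity(r_obs, ryy=1.0, rxx=1.0, u=1.0):
    """Correct an observed validity coefficient for criterion
    unreliability (ryy), predictor unreliability (rxx), and direct
    range restriction (u = restricted SD / unrestricted SD).
    Illustrative only; defaults apply no correction.
    """
    # Disattenuate for measurement error in predictor and criterion
    r = r_obs / math.sqrt(rxx * ryy)
    # Thorndike Case II correction for direct range restriction
    r = (r / u) / math.sqrt(1 + r ** 2 * (1 / u ** 2 - 1))
    return r
```

For example, an observed validity of .18 corrected only for a criterion reliability of .52 rises to about .25, which illustrates why corrected mean validities in such meta-analyses exceed the raw correlations.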
Article
This article reviews traditional approaches for the psychometric analysis of responses to personality inventories, including classical test theory item analysis, exploratory factor analysis, and item response theory. These methods, which can be called "dominance" models, work well for items assessing moderately positive or negative trait levels, but are unable to describe adequately items representing intermediate (or average) trait levels. This necessitates a shift to an alternative family of psychometric models, known as ideal point models, which stipulate that the likelihood of endorsement increases as respondents' trait levels get closer to an item's location. The article describes an ideal point model for personality measures using single statements as items, reanalyzes data to show how the change of modeling framework improves fit, and discusses the pairwise preference format for use in personality assessment. It also considers two illustrative ideal point models for unidimensional and multidimensional pairwise preferences and shows that, after correcting for unreliability, correlations of personality traits assessed with single statements, unidimensional pairs, and multidimensional pairs are very close to unity.
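The contrast this abstract draws between dominance and ideal point models comes down to the shape of the item response function. A minimal sketch (assumed illustrative forms, not the article's exact models; real ideal point models such as the GGUM are more elaborate):

```python
import math

def dominance_prob(theta, b, a=1.0):
    """Dominance (2PL-style) model: endorsement probability rises
    monotonically with trait level theta relative to item location b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ideal_point_prob(theta, delta, scale=1.0):
    """Simplified ideal point model: endorsement probability peaks
    when theta matches the item location delta and falls off with
    squared distance (a Gaussian-shaped response function)."""
    return math.exp(-((theta - delta) ** 2) / (2 * scale ** 2))
```

Under the dominance model, a high-trait respondent always endorses a moderate item more than a low-trait respondent; under the ideal point model, a respondent far above an intermediate item's location may reject it, which is why dominance models fit intermediate items poorly.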
Article
A review of criterion-related validities of personality constructs indicated that six constructs are useful predictors of important job-related criteria. An inventory was developed to measure the six constructs. In addition, four response validity scales were developed to measure accuracy of self-description. These scales were administered in three contexts: a concurrent criterion-related validity study, a faking experiment, and an applicant setting. Sample sizes were 9,188, 245, and 125, respectively. Results showed that (a) validities were in the .20s (uncorrected for unreliability or restriction in range) against targeted criterion constructs, (b) respondents successfully distorted their self-descriptions when instructed to do so, (c) response validity scales were responsive to different types of distortion, (d) applicants' responses did not reflect evidence of distortion, and (e) validities remained stable regardless of possible distortion by respondents in either unusually positive or negative directions.
Article
This study investigated the relation of the "Big Five" personality dimensions (Extraversion, Emotional Stability, Agreeableness, Conscientiousness, and Openness to Experience) to three job performance criteria (job proficiency, training proficiency, and personnel data) for five occupational groups (professionals, police, managers, sales, and skilled/semi-skilled). Results indicated that one dimension of personality, Conscientiousness, showed consistent relations with all job performance criteria for all occupational groups. For the remaining personality dimensions, the estimated true score correlations varied by occupational group and criterion type. Extraversion was a valid predictor for two occupations involving social interaction, managers and sales (across criterion types). Also, both Openness to Experience and Extraversion were valid predictors of the training proficiency criterion (across occupations). Other personality dimensions were also found to be valid predictors for some occupations and some criterion types, but the magnitude of the estimated true score correlations was small (p < .10). Overall, the results illustrate the benefits of using the 5-factor model of personality to accumulate and communicate empirical findings. The findings have numerous implications for research and practice in personnel psychology, especially in the subfields of personnel selection, training and development, and performance appraisal.
Article
People often rely on reasoning processes whose purpose is to enhance the logical appeal of their behavioral choices. These reasoning processes will be referred to as justification mechanisms. People favor only certain types of behaviors and develop justification mechanisms to support them because these behaviors allow them to express underlying dispositions. People with different dispositions are prone to develop different justification mechanisms. Reasoning that varies among individuals due to the use of these different justification mechanisms is described as conditional. A new system for measuring dispositional tendencies, or personality, was based on conditional reasoning. This system was applied to develop measures of achievement motivation and aggression. Initial tests suggested that the measurement system is valid. An additional study examined relationships between conditional reasoning and both self-report and projective measurements of the motives to achieve and to avoid failure.
Article
Two studies investigated relations between supervisors' evaluations of contextual performance and personality characteristics in jobs where opportunities for advancement were either absent or present. The first study examined performance in entry-level jobs where advancement, in general, was precluded; employees (N = 214) completed the Hogan Personality Inventory (HPI) as applicants and subsequently were rated by their supervisors for contextual performance. Results indicated that conscientiousness - measured by HPI Prudence scores - was significantly related to ratings of Work Dedication and Interpersonal Facilitation, which are dimensions of contextual performance. The results were corroborated in an independent sample. In the second study, employees (N = 288) in jobs with opportunities for advancement completed the HPI and their supervisors provided ratings for contextual performance. Results indicated that ambition/surgency - measured by HPI Ambition scores - predicted contextual performance. These results also were confirmed in a second sample. Relations between personality and contextual performance are explained by the motives of cooperation - getting along - and status - getting ahead. When there are no opportunities for advancement, employees perform contextual acts because they are conscientious; however, when there are opportunities for advancement, employees engage in contextual acts because they are ambitious.
Article
We evaluated the effects of faking on mean scores and correlations with self-reported counterproductive behavior of integrity-related personality items administered in single-stimulus and forced-choice formats. In laboratory studies, we found that respondents instructed to respond as if applying for a job scored higher than when given standard or "straight-take" instructions. The size of the mean shift was nearly a full standard deviation for the single-stimulus integrity measure, but less than one third of a standard deviation for the same items presented in a forced-choice format. The correlation between the personality questionnaire administered in the single-stimulus condition and self-reported workplace delinquency was much lower in the job applicant condition than in the straight-take condition, whereas the same items administered in the forced-choice condition maintained their substantial correlations with workplace delinquency.
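Mean shifts expressed in standard deviation units, as in this abstract and in the d values of the main article, are standardized mean differences. A minimal sketch of the usual pooled-SD formulation (Cohen's d; the sample values below are hypothetical):

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference (Cohen's d) between two groups,
    using the pooled within-group standard deviation."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)
```

For instance, an applicant group scoring 105 (SD 10) against a comparison group scoring 100 (SD 10) yields d = 0.5, a shift of half a standard deviation.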
Article
Two rational, a priori strategies for dealing with intentional distortion of self-descriptions were developed and evaluated according to their (a) impact on criterion-related validity, (b) effect on scale score means for the total group as well as women and minorities, and (c) impact on who specifically is hired. One strategy involves "correcting" an individual's content scale scores based on the individual's score on an Unlikely Virtues (UV) scale. A second strategy involves removing people from the applicant pool because their scores on an UV scale suggest they are presenting themselves in an overly favorable way. Incumbent and applicant data from three large studies were used to evaluate the two strategies. The data suggest that (a) neither strategy affects criterion-related validities, (b) both strategies produce applicant mean scores for content scales that are closer to incumbent mean scores, (c) men, women, Whites, and minorities are not differentially affected, and (d) both strategies result in a subset of people who are not hired who would otherwise have been hired. If one's goal is to reduce the impact of intentional distortion on hiring decisions, both strategies appear reasonably effective.
Article
One of the perennial problems which faces any human being trying to evaluate another is the fact that behaviour changes according to the situation in which the subject of enquiry finds himself. People who are being observed tend to perform differently from the way in which they behave when they are unaware that their activities are under investigation.
Article
The author discusses a five-factor model of personality, comparing it with other personality systems, analyzing the correlates of its dimensions, and addressing the associated methodological issues.
Article
This question was investigated by comparing test performance of a group of 45 Juvenile Bureau patrolmen and a group of 70 applicants for assignment as Juvenile Bureau patrolmen. The ACE, Cardall's Test of Practical Judgment, the Kuder Preference Record, the Guilford-Martin Inventory of Factors GAMIN, and Guilford's Inventory of Factors STDCR were the tests used. Generalizations are made concerning attempts at and success in faking on self-inventory tests.
Article
Several studies, simulated and actual, have found that job applicants return higher scores on distortion scales in personality questionnaires than similar people who are not undergoing selection; this has been attributed to motivation. This study examines the test results of more than 700 male and female subjects in a variety of real-life selection programmes. It concludes that distortion is related to the stress of the circumstances in which testing takes place, and suggests a modified interpretation of the role of lie scales.
Article
Although the use of personality tests for personnel selection has gained increasing acceptance, researchers have raised concerns that job applicants may distort their responses to inflate their scores. In the present meta-analysis, we examined the effects of the two dimensions of social desirability, impression management and self-deception, on the criterion validity of personality constructs using the balanced inventory of desirable responding (BIDR). The results indicate that impression management and self-deception did not create spurious effects on the relationship between personality measures and performance, nor did they function as performance predictors. Moreover, removing the influence of impression management or self-deception from personality measures did not substantially attenuate the criterion validity of personality variables. Implications of the results and directions for future research are also discussed.
Article
To reduce faking on personality tests, applicants may be warned that a social desirability scale is embedded in the test. Although this procedure has been shown to substantially reduce faking, there are no data addressing how such a warning may influence applicant reactions toward the selection procedure or the relationships among personality constructs. Using an organizational justice framework, this study examines the effect of warning on procedural justice perceptions. Additionally, it explores the extent to which warning changes the relationships among personality variables, socially desirable responding, and organizational justice variables. The results suggest that warning did not negatively affect test-taker reactions. However, the relationships among the justice measures, the personality variables, and socially desirable responding differed across the warned and unwarned groups. The organizational justice model fit best, and there was less multicollinearity among the personality variables, in the warned condition compared with the unwarned condition. Thus, providing a warning appears to have positive consequences when using personality measures.