A key finding in personnel selection is the positive correlation between conscientiousness and job performance. Evidence predominantly stems from concurrent validation studies with incumbent samples but is readily generalized to predictive settings with job applicants. This is problematic because the extent to which faking and changes in personality affect the measurement likely varies across samples and study designs. Therefore, we meta‐analytically investigated the relation between conscientiousness and job performance, examining the moderating effects of sample type (incumbent vs. applicant) and validation design (concurrent vs. predictive). The overall correlation of conscientiousness and job performance was in line with previous meta‐analyses ($\bar{r} = .17$, $k = 102$, $n = 23{,}305$). In our analyses, the correlation did not differ across validation designs (concurrent: $\bar{r} = .18$, $k = 78$, $n = 19{,}132$; predictive: $\bar{r} = .15$, $k = 24$, $n = 4173$), sample types (incumbents: $\bar{r} = .18$, $k = 92$, $n = 20{,}808$; applicants: $\bar{r} = .14$, $k = 10$, $n = 2497$), or their interaction. Critically, however, our review revealed that only a small minority of studies (~12%) were conducted with real applicants in predictive designs. Thus, only a small fraction of the research is conducted under realistic conditions. Therefore, it remains an open question whether self‐report measures of conscientiousness retain their predictive validity in applied settings that entail faked responses. We conclude with a call for more multivariate research on the validity of selection procedures in predictive settings with actual applicants.

Research on the predictive validity of conscientiousness is almost exclusively conducted with incumbents and with criterion data gathered at the same time as test data.
Such studies likely underestimate the detrimental effects of faking and personality change across time. Predictive studies with real applicant samples are scarce. Self‐report personality measures should be used with caution if faking is expected. In general, a stronger emphasis on incremental validity rather than on individual predictors is desirable.