Judgment and Decision Making, Vol. 8, No. 5, September 2013, pp. 512–520
Belief in the unstructured interview: The persistence of an illusion
Jason Dana, Robyn Dawes, Nathanial Peterson

Author note: Dawes was thankful for support from NSF Grant SES-0136259. His unfortunate death in 2010 prevented him from seeing the present version of this article.
Affiliations: Yale University, 135 Prospect St., New Haven, CT 06511 (email: jason.dana@yale.edu); Carnegie Mellon University.
Copyright: © 2013. The authors license this article under the terms of the Creative Commons Attribution 3.0 License.
Abstract
Unstructured interviews are a ubiquitous tool for making screening decisions despite a vast literature suggesting that
they have little validity. We sought to establish reasons why people might persist in the illusion that unstructured inter-
views are valid and what features about them actually lead to poor predictive accuracy. In three studies, we investigated
the propensity for “sensemaking” (the ability for interviewers to make sense of virtually anything the interviewee says) and “dilution” (the tendency for available but non-diagnostic information to weaken the predictive value of quality information). In Study 1, participants predicted two fellow students’ semester GPAs from valid background information
like prior GPA and, for one of them, an unstructured interview. In one condition, the interview was essentially nonsense
in that the interviewee was actually answering questions using a random response system. Consistent with sensemak-
ing, participants formed interview impressions just as confidently after getting random responses as they did after real
responses. Consistent with dilution, interviews actually led participants to make worse predictions. Study 2 showed that
watching a random interview, rather than personally conducting it, did little to mitigate sensemaking. Study 3 showed
that participants believe unstructured interviews will help accuracy, so much so that they would rather have random in-
terviews than no interview. People form confident impressions even when interviews are defined to be invalid, like our random
interview, and these impressions can interfere with the use of valid information. Our simple recommendation for those
making screening decisions is not to use them.
Keywords: unstructured interview, random interview, clinical judgment, actuarial judgment.
1 Introduction
In 1979, an act of legislature suddenly forced the Univer-
sity of Texas Medical School at Houston to admit 50 more
applicants late in the admissions season. The additional
applicants were initially rejected for admission, based
largely on impressions from unstructured interviews in
which each interviewer could ask different questions of
different applicants in whatever way he or she saw fit.
Apparently, the expense of having faculty interview ev-
ery applicant was wasted: at the conclusion of medical
training and one postgraduate year, there were no mean-
ingful differences between the initially rejected and ini-
tially accepted groups of students in terms of attrition,
academic performance, clinical performance, or honors
earned (DeVaul et al., 1987). Several large-scale field
studies have provided similar examples of the embarrass-
ingly poor validity of unstructured interviews for screen-
ing decisions (e.g., Bloom & Brundage, 1947; Milstein,
Wilkinson, Burrow, & Kessen, 1981; Carroll, Wiener,
Coates, Galegher, & Alibrio, 1982). More systematic reviews in the area of employment decisions likewise show
that unstructured interviews are poor predictors of job
performance, with structured interviews faring somewhat
better (Wiesner & Cronshaw, 1988; Wright, Lichtenfels,
& Pursell, 1989; Huffcutt & Arthur, 1994; McDaniel,
Whetzel, Schmidt, & Maurer, 1994).
Despite the evidence, unstructured interviews remain a
ubiquitous and even predominant tool for many screening
decisions. Studies of human resource executives suggest
that they believe more in the validity of unstructured in-
terviews than other screening methods, even when they
are aware that the evidence suggests that structured as-
sessment is superior (Highhouse, 2008). Academics,
though not professional interviewers, may decide to ac-
cept graduate students or hire faculty based on an infor-
mal 20 minute chat, countermanding substantial aggre-
gated and/or statisticized data comparing the candidate to
others (test scores and GPAs in the case of students, C.V.’s
in the case of faculty). Recently, Wake Forest Univer-
sity stopped requiring standardized tests for undergradu-
ate admissions, moving to a system in which every appli-
cant is eligible for an unstructured interview that figures
into the admissions decision in a “holistic”, non-numeric
manner (Highhouse & Kostek, 2013).
Since Meehl’s (1954) seminal book on the clinical-
statistical controversy for making predictions, a large
literature studying human vs. statistical prediction has
shown that the statistical method is nearly always equal
or superior to clinical judgment (see Grove, Zald, Lebow,
Snitz, & Nelson, 2000). Further, many common objec-
tions to the interpretation of this evidence have been thor-
oughly discussed and refuted (see, e.g., Dawes, Faust, &
Meehl, 1989; Grove & Meehl, 1996). We do not seek to
rehash these debates or provide merely another example
of clinical judges failing to outperform a statistical rule.
Rather, we seek to establish some reasons why people
might persist in the illusion that unstructured interviews
are valid and why they can harm predictions. Specifically,
we explore how the inevitably noisy signals in an inter-
view dilute the decision maker’s potential use of valid in-
formation and how interviewers can form falsely coherent
impressions from virtually anything the interviewee says
or does.
Extending the literature in these ways is important for
at least two reasons. First, it is reasonable to think that,
while unstructured interviews are not particularly predic-
tive, they will not hurt accuracy. At least, we are aware
of no prior evidence that unstructured interviews decrease
accuracy, e.g., by way of studying the same decision mak-
ers with and without access to interviews. This point be-
comes important because the issue is often raised that
the interview conveys benefits beyond its predictive va-
lidity. For example, candidates who go through an in-
terview process may have an increased sense of commit-
ment and more likelihood to accept an offer. Thus, if
one can get a benefit from conducting unstructured in-
terviews and the interviews do not make one’s judgment
worse, it would seem riskless to use them for certain pur-
poses. Indeed, this point was made without rebuttal in a
discussion of the Wake Forest decision on the Society for
Judgment and Decision Making’s mailing list (2008–9,
see the “search” at http://www.sjdm.org). We will pro-
vide evidence that exposure to unstructured interviews
can indeed harm judgment.
Second, many have the feeling that the unstructured in-
terview is the best way to uncover important information
that is special to a candidate. Particularly, candidates may
possess some personality traits at the extremes of the dis-
tribution that might make them ideal or unsuitable. These
unusual cues, akin to what Meehl (1954) called “broken
legs”, could immediately remove a candidate from con-
sideration or catapult a candidate ahead of others. In-
deed, even a structured interview might not help for this
purpose if the interviewer does not have in mind what
this broken leg factor would be ahead of time. The logic
of broken leg cues was addressed by Meehl, who pointed
out that, if people were actually good at spotting broken
legs that statistical rules miss, then they would be more
accurate than statistical rules. This argument, however,
has an important flaw in that it assumes that all errors and
successes are equally important. If an interviewer is espe-
cially concerned with making some kinds of errors, such
as missed broken legs, unstructured interviews could, in
theory, be highly valuable in avoiding the most important
errors, which may outweigh making a few more errors on
the more mundane cases.
Basic psychological research, however, gives us rea-
son to believe that unstructured interviews can harm judg-
ment and reason to doubt that interviewers will be suffi-
ciently adept at spotting special information, and not false
alarms, about a candidate.
1.1 Can interviews hurt?
Access to an interview could hurt predictive accuracy be-
cause exposure to non-diagnostic information is known to
dilute valuable information. Unstructured interviews ex-
pose interviewers to so many casual observations about
the interviewee that have little or unknown diagnostic-
ity that interviewers cannot help but get more informa-
tion than they can use and thus, they must ignore some
cues. Research on the “dilution effect” (e.g., Nisbett,
Zukier, & Lemley, 1981; Zukier, 1982; Peters & Roth-
bart, 2000) shows that rather than just being ignored, ex-
traneous information reduces reliance on good informa-
tion. It is perhaps no coincidence that the stimuli for
the earliest dilution effect studies, which included am-
ple material judged non-diagnostic by study participants,
came from interview snippets (Nisbett, Zukier, & Lem-
ley, 1981). Because making good social judgments of-
ten requires ignoring information and relying on simple
rules, cognitive traits that might normally be construed
as positive, such as complexity of thought and need for
cognition, can actually be detrimental to accuracy (Rus-
cio, 2000). In the presence of quality cues, most of the
interview could serve as a distraction.
1.2 Can interviewers reliably extract spe-
cial information from unstructured in-
terviews?
Although too much irrelevant information dilutes the pre-
diction process, it can also lead to unwarranted confi-
dence due to sensemaking. People seek to impose or-
der on events, so much so that they often see patterns in
random sequences (Gilovich, 1991). As such, even the
noisiest interview data are readily translated into a good
story (see Dawes, 2001, Chapter 7) about the intervie-
wee. Just as one can, post hoc, fit a “significant” sta-
tistical model to pure noise, interviewers have too many
degrees of freedom to build a coherent story of intervie-
wees’ responses. If the interviewee gives a response that
is inconsistent with the interviewer’s impression, the in-
terviewer can dynamically reformulate that impression,
perhaps asking follow up questions until hearing a set of
responses that confirm an impression. Without structure,
interviewers may not ask questions intended to disconfirm these impressions, because people are inclined to
seek information that confirms their hypotheses or avoid
what might disconfirm them (Devine, Hirt, & Gehrke
1990; Sanbonmatsu, Posavac, Kardes, & Mantel, 1998).
As an example, someone known to one of the authors
was given a panel interview for a potential job. Arriving 5
minutes early, she was immediately called into the room,
where the interview went quite well and she was
offered the job on the spot. During the postmortem dis-
cussion, one of the interviewers was impressed by how
well she composed herself after showing up 25 minutes
late to the interview! Apparently, she had been misin-
formed that the time of the interview was 30 minutes after
the hour, rather than on the hour as the panel expected,
and she remained composed because she did not know
she was late. Interestingly, nothing that the panel asked
effectively tested this impression that the candidate was
unusually composed under (what they believed were) the
circumstances—they never learned that she thought she
was early. Further, there are many other less flattering im-
pressions than “composed” that could also have explained
a lack of concern over being 25 minutes late, including
flippant or arrogant.
The ability to sensemake combined with the tendency
for biased testing allows unstructured interviewers to feel
they understand an interviewee almost regardless of the
information they receive. Unfortunately, a feeling of un-
derstanding, while reassuring and confidence-inspiring,
is neither sufficient nor necessary for making accurate as-
sessments (Trout, 2002). Further, there is empirical evi-
dence that confidence and accuracy are often poorly re-
lated in interpersonal prediction contexts (Dunning, Grif-
fin, Milojkovic, & Ross, 1990; Swann & Gill, 1997) and
confidence has been shown to increase with information
even in situations where accuracy does not (e.g., Ander-
sson, Edman, & Ekman, 2005; Hall, Ariss, & Todorov,
2007). We suggest that people can feel confident in
the validity of unstructured interview impressions even
if they are worthless.
We experimentally tested the roles of dilution and
sensemaking in the context of using unstructured inter-
views to predict social outcomes. Study participants
predicted the semester GPAs of other students based
on biographical information including GPA prior to the
semester in question and in some cases, an unstructured
interview. In some conditions, the interviews were non-
sense for the task at hand because the interviewee secretly
used a random responding system to answer questions,
literally providing random answers to questions that were
independent of the interviewee’s natural response. Con-
sistent with dilution, participants’ GPA predictions were
more accurate without the unstructured interview and less
accurate than had they simply predicted that semester
GPA would be equal to prior GPA, a strong cue that they were given before making predictions. Consistent with sensemaking, participants who unknowingly conducted random interviews were just as likely to indicate in post-interview surveys that they got good information as those who conducted accurate interviews. This is the first evidence we know of that unstructured interviews can be worse than invalid; they can actually decrease accuracy. Yet, while interviews were harmful in this context, even our nonsense interviews promoted a feeling of confidence in the interview impression.

Table 1: Interviewees’ prior and obtained GPAs.

Study 1 interviewee:   1     2     3     4     5     6     7
Prior GPA:            3.32  3.28  3.24  3.23  2.95  2.84  2.81
Obtained GPA:         3.80  3.08  3.71  3.34  2.68  2.69  3.35

Study 2 interviewee:   1     2     3     4     5     6     7     8
Prior GPA:            3.69  3.38  3.29  3.29  3.23  3.05  2.83  2.65
Obtained GPA:         3.83  3.80  4.00  2.83  2.65  3.59  3.00  3.31
2 Study 1
To explore whether interviews could dilute judgments
and make them worse, we had student participants predict
the semester grade point average (GPA) of two other stu-
dents, one prediction with biographical information (de-
scribed below) and an interview, the other with just bio-
graphical information. To explore whether interviewers
sensemake, we developed a random responding system
that the interviewees could use during the interview to
see whether it would perturb predictive accuracy or sub-
jective confidence in interview impressions.
2.1 Method
2.1.1 Interviewers and interviewees
Interviewers were 76 undergraduate students at Carnegie
Mellon University who were recruited through campus
advertising and paid for their participation. We employed
five Carnegie Mellon undergraduates (two female) as per-
manent interviewees. The interviewees ranged in age
from 18 to 22, and represented multiple races, majors,
and class standings. Two of the interviewees worked for
two semesters, creating a total of 7 different semester
GPAs to be predicted. Their prior cumulative GPAs and
GPAs for the semester to be predicted are listed in Table
1.
2.1.2 Procedures
Participants were introduced to a randomly assigned in-
terviewee and asked to conduct a 20 minute interview
with the goal of predicting the interviewee’s GPA for a
given semester. An experimenter remained in the room
during the interview to track time and answer any ques-
tions about the task. Prior to interviewing, participants
were told the interviewee’s age, major, class standing,
and course schedule for the semester to be predicted. Par-
ticipants were offered a break 10 minutes into the inter-
view, during which they could formulate more questions
to ask.
After the interview, the interviewee was excused and
participants made their GPA predictions, which were to
be kept confidential from the interviewee. Before mak-
ing their predictions, participants were given the intervie-
wee’s cumulative GPA prior to the target semester and in-
formed that prior cumulative GPA by itself was the best
statistical model for predicting GPAs at this institution
(Lewis-Rice, 1989). After the GPA prediction, partici-
pants answered a brief questionnaire (Table 3) probing
whether they got to know the interviewee and whether
the interview provided useful information. Finally, 68
participants predicted the semester GPA for another tar-
get whom they did not interview using only the target’s
background information and cumulative GPA prior to the
semester in question.
2.1.3 Interview conditions
The structure of the interview varied according to the par-
ticipant’s random assignment to one of three conditions.
In the accurate condition (n= 25), participants could ask
only closed-ended questions, i.e., “yes or no” or “this or
that” questions. Interviewees answered these questions
accurately. The random condition (n= 26) was similar
except that after the midway break, the interviewee se-
cretly responded on a pseudo-random basis. Interviewees
noted the first letter in the last two words of each question
and classified them as category 1 (letters A through M) or
category 2 (N through Z). If both letters belonged to the
same category, the interviewee answered yes (or took the
first option of a “this” or “that” question) and otherwise
answered no. This system tends to equalize the frequency
of yes and no answers as follows: Call the proportion of
words that the interviewer samples from category 1. A
yes answer occurs if both of the last 2 words are category
1, which occurs with probability p2, or both category 2,
which occurs with probability (1 p)2. The total proba-
bility of a yes response, p2+ (1 p)2, is always closer to
.5 than pitself. By employing random response, whether
an interviewee’s response did or did not match the in-
terviewer’s expectations or confirm the interviewer’s im-
pression was simply a matter of chance.
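For concreteness, the responding rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the materials given to interviewees; the question text and function names are made up, and the second function simply evaluates the p² + (1 − p)² expression.

```python
# Sketch of the pseudo-random responding rule (illustrative only).
def random_response(question: str) -> str:
    """Answer a closed-ended question from the first letters of its last two words."""
    words = [w for w in question.rstrip("?!. ").split() if w[0].isalpha()]
    last_two = words[-2:]
    # Category 1 = first letter A-M, category 2 = first letter N-Z.
    categories = [1 if w[0].upper() <= "M" else 2 for w in last_two]
    # Same category -> "yes" (or the first option of a "this or that" question).
    return "yes" if categories[0] == categories[1] else "no"

def yes_probability(p: float) -> float:
    """P(yes) when a proportion p of question words fall in category 1."""
    return p ** 2 + (1 - p) ** 2

print(random_response("Do you usually study at the library or at home?"))
for p in (0.3, 0.5, 0.7, 0.9):
    print(p, yes_probability(p))  # 0.58, 0.5, 0.58, 0.82: pulled toward .5
```

Because the rule depends only on the wording of the question, answers are unrelated to anything about the interviewee, which is what makes the interview nonsense for the prediction task.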
A lack of a significant difference in accuracy or sur-
vey answers between the accurate and random conditions
might not reflect sensemaking on the part of the inter-
viewer, but rather the deficient quality of all closed-ended
interviews. That is, if closed-ended interviews are too low
in quality for this task, any differences between random
and accurate interviews might be muted. In that case,
we would expect predictions to be better and ratings to
be higher if participants could ask questions and demand
answers in any way they wanted. To rule out this expla-
nation, we also conducted a natural condition (n= 25)
in which no closed-ended constraint was placed on the
interviewer’s questions.
2.2 Results
The validity (correlation with actual outcomes) of GPA
predictions following interviews (r=.31) was indeed
significantly lower than the validity of using prior cumu-
lative GPA alone (r = .65; t(73) = 3.77, p < .05, d = .43; Hotelling’s method for dependent r with Williams correction), information participants had when making
their predictions. As dilution predicted, our unstructured
interview did not prove helpful in light of an already
strong cue of prior GPA. While worse than using prior
cumulative GPA, a validity of .31 compares favorably
to that of unstructured employment interviews for pre-
dicting job performance (Campion, Palmer, & Campion,
1997). Comparing the success of our interviewers with
employment screeners is not totally appropriate—GPAs
could be easier to predict because GPA is more reliable
than measures of job performance or because job screen-
ers do not have information as valid as prior GPA when
making decisions. Still, the validities our interviewers
were able to obtain provide at least some evidence that
they were not completely deficient at the task.
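As a rough sketch, the comparison of two dependent validities can be computed with one standard formulation of the Hotelling–Williams t. This is not the authors' analysis script, and the between-predictor correlation r12 below (between interview-based predictions and prior GPA) is an assumed placeholder, since the paper reports that correlation only by condition; the output therefore only approximates the reported t(73) = 3.77.

```python
# Sketch of Williams's t for comparing two correlations that share a variable
# (here, interview predictions vs. prior GPA as predictors of actual GPA).
from math import sqrt

def williams_t(r13: float, r23: float, r12: float, n: int):
    """Test H0: rho13 = rho23 for correlations measured on the same n cases."""
    det_r = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    r_bar = (r13 + r23) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det_r + r_bar**2 * (1 - r12) ** 3
    return (r13 - r23) * sqrt((n - 1) * (1 + r12) / denom), n - 3

# r12 = .50 is an assumed value, roughly in line with the per-condition
# prediction/prior-GPA correlations reported below.
t, df = williams_t(r13=0.31, r23=0.65, r12=0.50, n=76)
print(f"t({df}) = {t:.2f}")  # about -3.8; magnitude near the reported 3.77
```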
Some of our interviewees were initially concerned that
the random interview would break down and be revealed
to be nonsense. No such problems occurred and random
interviews proceeded much as interviews in our other
conditions. Further, random responding did not make
interviewers less accurate: Only the validity in the ran-
dom condition (r=.42) was significantly different from
zero, while validities in the accurate (r=.20) and natu-
ral (r=.29) conditions were not, though validities from
these 3 conditions did not differ significantly from each
other. A plausible concern is that random condition par-
ticipants might have relied more on prior GPA because
the interview was bad, thus inflating accuracy in the ran-
dom condition because prior GPA was a strong predic-
tor. This was not the case; GPA predictions were no
more correlated with prior GPAs in the random condi-
tion (r=.54) than in the accurate (r=.53) or natural
condition (r=.67).
Table 2: Regression analyses of the accuracy of GPA predictions. (Dependent variable: predicted GPA.)

                               Study 1  Study 2    All    Study 1  Study 2    All
Predictors                       (1)      (2)      (3)      (4)      (5)      (6)
Actual GPA                      0.51∗∗   0.19     0.09     0.21    22.67    21.07
                               (0.08)   (0.09)   (0.016)  (1.72)  (17.42)  (19.63)
Access to interview             0.99∗∗   0.08     0.54
                               (0.35)   (0.42)   (0.22)
Actual GPA × interview          0.28     0.02     0.15
                               (0.11)   (0.12)   (0.07)
Q1                                                         0.37     0.73∗∗   0.42
                                                          (0.56)   (0.27)   (0.25)
Q2                                                         0.11     0.91∗∗   0.64
                                                          (0.48)   (0.28)   (0.25)
Actual GPA × Q1                                            0.13     0.19     0.11
                                                          (0.17)   (0.08)   (0.07)
Actual GPA × Q2                                            0.03     0.26∗∗   0.18
                                                          (0.15)   (0.08)   (0.07)
Target dummies                   No       No      Yes¹      No       No     Yes¹
Clustering at subject level      Yes      Yes     Yes       No       No      No

∗ p < 0.05, ∗∗ p < 0.01.
1. Two interviewees obtained the same GPA when samples are combined, thus dummies representing each interviewee are included.
2. Eight participants who did not make a prediction without an interview are excluded.
Although participants judged the interview to be some-
what informative, GPA predictions were actually less
accurate with interviews (r=.31) than without them
(r=.61). Because these correlations involved different
judgments by the same participant, we tested the differ-
ence using regression with participant random effects, re-
gressing GPA predictions on actual GPA, a dummy = 1 if
an interview was conducted, and the interview×GPA in-
teraction. The results in column 1 of Table 2 indicate that
the interaction term was negative and significant, mean-
ing that predictions were indeed significantly less corre-
lated with outcomes when an interview was performed.
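A minimal sketch of this accuracy regression is shown below, with simulated data standing in for the real predictions. The column names, simulation values, and use of statsmodels are our own; the text describes participant random effects, and a random intercept per participant is one way to implement that (the table notes indicate the reported models were also clustered at the subject level).

```python
# Sketch of the prediction-accuracy regression: predicted GPA regressed on
# actual GPA, an interview dummy, and their interaction, with a participant
# random intercept. Data are simulated stand-ins, not the study data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 76                                      # interviewers, as in Study 1
participant = np.repeat(np.arange(n), 2)    # each makes two predictions:
interview = np.tile([1, 0], n)              # one with, one without an interview
actual = rng.normal(3.2, 0.4, size=2 * n)   # toy actual semester GPAs
# Toy predictions that track outcomes more weakly after an interview.
predicted = 1.6 + (0.55 - 0.25 * interview) * actual + rng.normal(0, 0.3, 2 * n)

df = pd.DataFrame({"participant": participant, "interview": interview,
                   "actual_gpa": actual, "predicted_gpa": predicted})

# A negative actual_gpa:interview coefficient means predictions were less
# strongly related to outcomes when an interview was conducted.
model = smf.mixedlm("predicted_gpa ~ actual_gpa * interview", df,
                    groups=df["participant"])
print(model.fit().summary())
```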
Table 3 shows that the mean agreement with the state-
ments “I am able to infer a lot about this person given the
amount of time we spent together” (accurate = 2.72, nat-
ural = 2.80, random = 2.85) and “From the interview,
I got information that was valuable in making a GPA
prediction” (accurate = 3.00, natural = 3.12, random
= 3.31) was similar across all conditions, with no sig-
nificant differences emerging (F(2,73) =.233 and 1.714,
respectively). While comparisons of accuracy and sub-
jective impressions yielded null results between random
and truthful interviews, in both cases the direction was
“wrong”—prediction accuracy and impressions of usefulness trended higher for random interviews. Agreement with these questions did not significantly modulate accuracy. Column 4 of Table 2 reports the results of regressing GPA predictions on obtained GPA, questions 1 and 2, and their interactions. Neither question interacted significantly with GPA.

Table 3: Study 1 post-prediction questionnaire. Mean agreement with statements on a 4-point Likert scale (1 = disagree, 4 = agree), with standard errors in parentheses.

                                                  Accurate     Random      Natural
I am able to infer a lot about this person
given the amount of time we spent together.      2.72 (.68)  2.85 (.47)  2.80 (.58)

From the interview, I got information that
was valuable in making a GPA prediction.         3.00 (.65)  3.31 (.55)  3.12 (.60)
2.3 Discussion
Consistent with sensemaking, a random interview did not
perturb either GPA predictions or subjective impressions
about the quality of the interview or the extent to which
they got to know the interviewee. Consistent with di-
lution, a single, strong cue—past GPA—predicted bet-
ter than participants themselves, even though they had
this information. Further supporting dilution, participants
made better predictions without an interview than with
one. While participants generally agreed that they got
useful information from interviews, interviews signifi-
cantly impaired accuracy in this environment.
Perhaps one reason that participants felt interviews
were useful and made sense of them even when they were
random is that they conducted them. The person conduct-
ing the interview controls the questions, which could be
important to at least the sensemaking part of our results.
If participants merely watched the interviews, rather than
conducting them, would they be less prone to either or
both effects? By having participants watch pre-recorded
interviews, we could also directly assess whether they can
tell random from accurate by informing them of the pos-
sibility that the interview they watched was random and
asking them to guess which type they saw.
3 Study 2
Rather than conducting the interview themselves, partic-
ipants in Study 2 watched a pre-recorded interview that
another student had conducted. Because this procedure
did not allow participants to ask their own questions, they
could be less prone to confirming their own theories of the
interviewee and thus less prone to sensemaking. If so, we
might expect participants to be able to discern random
from accurate interviews.
3.1 Method
3.1.1 Participants and interviewees
Participants were 64 undergraduate students at Carnegie
Mellon University who were recruited through cam-
pus advertising and paid for their participation. Eight
Carnegie Mellon undergraduates (5 female) participated
as interviewees and consented to having two interview
sessions recorded (one random, one accurate) as stimuli
for the study. Interviewees ranged in age from 19 to 21,
and again represented multiple races, majors, and class
standings. Table 1 lists their prior and obtained GPAs.
3.1.2 Procedures
Procedures were the same as in Study 1, with the fol-
lowing exceptions. Prior to conducting the experimental
sessions, we video-recorded 16 interviews (one accurate and one random for each interviewee; natural interviews were not used), conducted similarly to Study 1 except that the random interview now used random responding throughout rather than only after the break. Participants were randomly assigned to watch one of the 16 interviews via a computer interface and predict the interviewee’s GPA for a given semester. Each interview was randomly assigned to four different participants. The post-interview question wording was amended slightly (Table 4) to reference the interview that was watched, and the Likert-type scale now ranged from 1 to 5 and included a “neither agree nor disagree” point. After the post-interview questionnaire, participants were informed that their interview was randomly drawn from a pool containing half random interviews and asked to guess whether it was random or accurate.

Table 4: Study 2 mean Likert responses (5 = strongly agree) to post-experimental questions, by condition (standard errors in parentheses).

                                                 Accurate      Random
I am able to infer a lot about this person
given the interview I just watched.             3.47 (0.92)  3.47 (1.08)

From the interview I just watched, I got
information that was valuable in making a
GPA prediction.                                 3.66 (0.94)  3.75 (0.98)
3.2 Results
GPA predictions were about equally correlated with ac-
tual GPAs as they were in Study 1 (r=.28). For
this sample of interviewees, however, prior GPA was
not as predictive of semester GPAs as it was in Study
1 (r=.37) and was not significantly more accurate
than participant predictions. Thus, this form of dilution
was not present in Study 2. Though the procedure in
Study 2 is somewhat different, it is informative to com-
bine results with Study 1 to see if, overall, dilution is
present, especially considering that the sample of inter-
viewees on which this result somewhat depends is small.
Combining both studies, prior GPA alone predicts sig-
nificantly better than our participants do with interviews
(t(137) = 2.59, p < .05, d = .44).
Even though participants did not control the course
of the interview in Study 2, subjective impressions were
again unperturbed by random responding. Table 4 shows
that mean agreement with the statements “I am able to
infer a lot about this person given the interview I just
watched” (accurate = 3.47, random = 3.47) and “From watching the interview, I got information that was valuable in making a GPA prediction” (accurate = 3.66, random = 3.75) was again similar across conditions, with agreement in the random condition again being equal or higher. As in Study 1, GPA predictions relied on prior GPA about the same for random (r = .58) and accurate (r = .55) interviews.

Table 5: Frequency of accurate/random guesses by interview type.

                 Random  Accurate  Total
Guess Random       13        3       16
Guess Accurate     19       29       48
Total              32       32       64
We again tested for interactions between answers to the
post-experimental questionnaire and predictive accuracy.
As can be seen in column 5 of Table 2, both interactions
were significant in Study 2. Interestingly, the coefficients
on each question and on each interaction had opposite
signs, such that feeling one is able to infer a lot about the
interviewee negatively impacted accuracy, while feeling
one had gotten good information from the interview posi-
tively impacted accuracy. When studies 1 and 2 are com-
bined, shown in column 6 of Table 2, only the interaction
between the valuable information question and accuracy
remained significant. This result seems somewhat para-
doxical: Access to interviews overall decreased accuracy,
but given that a participant had access to an interview,
greater agreement that one had gotten valuable informa-
tion from the interview increased accuracy. This result
raises the question of whether our finding of poorer accu-
racy following interviews is driven by a subset of partic-
ipants who did not feel they got useful information from
the interview, but used it anyway. The effects are not so
simple, however. For example, looking at only those par-
ticipants who agreed with the valuable information ques-
tion (answers of 4 or 5), the validity of predictions was
only .29. Thus, there is no simple main effect such that
those who felt they got valuable information from the in-
terview were more accurate.
Table 5 tabulates participants’ judgments of whether
they saw an accurate or random interview across in-
terview type. Participants correctly classified 66% of
the interviews, significantly better than chance (χ²(1) = 8.33, p < .01). This result, however, was largely driven
by the participants judging all interviews to be accurate:
accurate interviews were nearly always judged to be ac-
curate (29/32), and more than half of random interviews
were judged accurate (19/32). Indeed, the tendency to
judge all interviews accurate was significantly stronger
than the tendency to be correct (McNemar’s test, χ²(1) = 11.63, p < .001). Thus, while participants have some skill in distinguishing accurate from random interviews, they
also see most interviews as probably being accurate, in-
dicating some degree of sensemaking. Whether partic-
ipants were accurate in this judgment, whether partici-
pants judged their interview to be accurate, and whether
participants correctly judged their interviews to be accu-
rate all did not interact with accuracy of GPA predictions
(all p > .30).
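The two tests on Table 5 can be reproduced directly from the cell counts. The sketch below uses scipy; whether the original analyses applied continuity corrections is not stated, but the uncorrected statistics match the reported values.

```python
# Reproducing the Table 5 analyses from the reported cell counts.
import numpy as np
from scipy.stats import chi2, chi2_contingency

#                random  accurate   (actual interview type)
table = np.array([[13,      3],     # guessed "random"
                  [19,     29]])    # guessed "accurate"

# Overall classification accuracy: (13 + 29) / 64, about 66%.
print("accuracy:", table.trace() / table.sum())

# Association between guess and interview type (reported chi2(1) = 8.33).
stat, p, dof, _ = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {stat:.2f}, p = {p:.4f}")

# McNemar's test for the bias toward guessing "accurate": compare the two
# discordant cells, random-judged-accurate (19) vs. accurate-judged-random (3).
b, c = table[1, 0], table[0, 1]
mcnemar = (b - c) ** 2 / (b + c)            # about 11.6, near the reported 11.63
print(f"McNemar chi2(1) = {mcnemar:.2f}, p = {chi2.sf(mcnemar, 1):.4f}")
```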
Although participants who merely watched interviews
were still prone to sensemaking, their predictions were
not more accurate without an interview, inconsistent with
dilution (see column 2 of Table 2). Dilution did not
hold for these participants, however, largely because no-
interview predictions, which were not handled differently
in this study, were much less accurate (r=.26) than in
Study 1, while predictions following all interviews were
about as accurate as in Study 1 (r=.28). Of course,
while interviews did not make predictions worse, they
also did not make them significantly better. Our two stud-
ies thus fail to indicate any incremental validity from in-
terviews, and Study 1 suggests a decrement in validity. At
best, one can say that watching an interview did not hurt
but conducting one did. While the procedures are differ-
ent across the studies, it is again informative to combine
the data and repeat our test of predictive validity with
and without an interview. Column 3 of Table 2 shows
that the interview×GPA interaction is negative and sig-
nificant; thus, interviews are overall negatively associated
with accuracy.
3.3 Discussion
Watching interviews did little to mitigate sensemaking;
participants’ predictive accuracy and subjective impres-
sions were similar after watching random and accurate
interviews, and they were more likely to see interviews
as accurate whether they were or not. One objection to
our interpretation of Studies 1 and 2 is the presence of
experimental demand to use interviews. Because we took
the trouble of having participants conduct or watch inter-
views for the majority of the study’s duration, it is not
unreasonable to assume that participants felt they should
use the interview, regardless of their feelings about its va-
lidity. Of course, such implicit demands are also present
in real-world settings in which one is forced to conduct an
interview for screening purposes. Still, one may wonder
whether participants believed that interviews aided accu-
racy, a question we explore in Study 3.
Table 6: Dominance matrix in which cell frequencies are the number of participants who ranked the column method better than the row method.

               Natural  Accurate  Random  No interview  Total
Natural           –        36       12        13          61
Accurate         128       –        28        22         178
Random           153      136       –         68         357
No interview     152      142       96        –          390
4 Study 3
4.1 Method
One hundred sixty-nine Carnegie Mellon University stu-
dents completed this task as part of a longer session. Par-
ticipants were given descriptions of the methods and con-
ditions used in Study 1 (except that the random condition
was full random as in Study 2), including the information
that participants were given prior GPA and then asked to
predict a student’s GPA from a given semester. Partici-
pants in Study 3 were then asked to rank the interview
types (including no interview) in terms of which they would
like to have to make their predictions as accurately as
possible. That is, they were essentially asked about the
incremental validity of each type of interview.
4.2 Results
The modal accuracy rankings for first through last place
were natural interview first, followed by accurate, ran-
dom, and no interview, respectively, making the predic-
tion type for which participants were the most accurate
in Study 1 the least favored. This ranking was also the
single most common, chosen by 57 (33%) of our par-
ticipants. No participant ranked the natural condition
last, while 56% of participants ranked no interview last.
The dominance matrix in Table 6 depicts all aggregate
pairwise preferences by reporting how many participants
ranked the interview type in the column over the type in
the row. Even random interviews, which by definition
contain misleading information, were preferred to no in-
terview by 96 participants (57%). By ranking the random
interview ahead of no interview, a simple majority of our
participants showed that they did not anticipate a dilution
effect: Apparently, they believed that random interviews
contained some useful information that all of the useless
information would not drown out. Thus, while interviews
do not help predict one’s GPA, and may be harmful, our
participants believe that any interview is better than no
interview, even in the presence of excellent biographical
information like prior GPA.
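For illustration, a dominance matrix of this kind can be tallied from individual rankings as in the short sketch below; the two rankings shown are made up, not the study data.

```python
# Sketch of tallying a Table 6-style dominance matrix from rankings
# (best method listed first). Rankings here are illustrative only.
from itertools import combinations
import pandas as pd

methods = ["Natural", "Accurate", "Random", "No interview"]
rankings = [
    ["Natural", "Accurate", "Random", "No interview"],   # the modal ranking
    ["Accurate", "Natural", "No interview", "Random"],
]

dominance = pd.DataFrame(0, index=methods, columns=methods)
for ranking in rankings:
    for better, worse in combinations(ranking, 2):   # 'better' precedes 'worse'
        # Cell [row = worse, column = better]: column method ranked above row.
        dominance.loc[worse, better] += 1

dominance["Total"] = dominance.sum(axis=1)   # row totals, as in Table 6
print(dominance)
```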
5 Discussion
We set out to examine whether unstructured interviews
could harm predictive accuracy and whether interview-
ers would believe they garnered useful information from
the interview regardless of its quality. Consistent with
dilution, Study 1 showed that participants were better
at predicting other students’ GPAs when they were not
given access to an unstructured interview in addition to
background information. Further, participants predicted
worse than if they had used prior GPA alone, information
they were given before making their predictions. Con-
sistent with sensemaking, participants were just as able
to make coherent impressions when the interviewee re-
sponded randomly, both in terms of the accuracy of their
predictions and their confidence in their subjective im-
pressions. Study 2 showed that even when watching
rather than conducting an interview, participants were
still somewhat prone to sensemaking. Finally, Study 3
showed that participants believe that interviews will help
in this context, so much so that they rate random inter-
views as being more helpful than no interview, which
was, in fact, the best way to make predictions in Study
1 and as good as other methods in Study 2.
Our findings suggest a rethinking of the meaning of
interview validity. The validity of predictions made by
interviewers or by numerically incorporating interviews
into a model is uninformative unless it can be directly
compared to predictions made by the same methods with-
out an interview. On its face, the validity of our par-
ticipants’ predictions following unstructured interviews
looks respectable (r=.31 in Study 1 and r=.28 in
Study 2), yet these same participants were able to pre-
dict better when they did not have access to an inter-
view, and could have predicted better still if they just used
prior GPA. It may be the case that for many screening
decisions, there are one or two cues that are very impor-
tant and could be garnered from nearly any interview (or
without one), and that these cues predict better by them-
selves than the clinical judges who have access to them.
The substantial literature on interviews for employment
screening, which already indicates that unstructured in-
terviews are not particularly good, may thus even be over-
stating the validity of unstructured interviews. Our evi-
dence is experimental and compares the same judges with
and without access to an interview. To our knowledge,
there is little prior evidence of this kind.
In addition to the vast evidence suggesting that un-
structured interviews do not provide incremental validity,
we provide direct evidence that they can harm accuracy.
Because of dilution, this finding should be especially ap-
plicable when interviewers already have valid biograph-
ical information at their disposal and try to use the un-
structured interview to augment it. Because of sensemak-
ing, interviewers are likely to feel they are getting use-
ful information from unstructured interviews, even when
they are useless. Because of both of these powerful cog-
nitive biases, interviewers probably over-value unstruc-
tured interviews. Our simple recommendation for those
who make screening decisions is not to use them.
References
Andersson, P., Edman, J., & Ekman, M. (2005). Predict-
ing the world cup 2002 in soccer: Performance and
confidence of experts and non-experts. International
Journal of Forecasting, 21, 565–576.
Bloom, R. F., & Brundage, E. G. (1947). Predictions of
success in elementary school for enlisted personnel. In
D. B. Stuit (Ed.), Personnel research and test develop-
ment in the Naval Bureau of Personnel, pp. 233–261.
Princeton: Princeton University Press.
Campion, M. A., Palmer, D. K., & Campion, J. E. (1997).
A review of structure in the selection interview. Per-
sonnel Psychology, 50, 655–702.
Carroll, J. S., Wiener, R. L., Coates, D., Galegher, J., &
Alibrio, J. J. (1982). Evaluation, diagnosis, and predic-
tion in parole decision making. Law & Society Review,
17, 199–228.
Dawes, R. M. (2001). Everyday irrationality: How pseu-
doscientists, lunatics, and the rest of us systematically
fail to think rationally. Boulder, Colorado: Westview
Press.
DeVaul, R., Jervey, F., Chappell, J., Caver, P., Short, B.,
& O’Keefe, S. (1987). Medical school performance
of initially rejected students. Journal of the American
Medical Association, 257, 47–51.
Devine, P., Hirt, E., & Gehrke, E. (1990). Diagnostic
and confirmation strategies in trait hypothesis testing.
Journal of Personality and Social Psychology, 58, 952–
963.
Garfinkel, H. (1967). Common sense knowledge of social
structures: The documentary method of interpretation
in lay and professional fact finding. In H. Garfinkel
(Ed.), Studies in ethnomethodology, pp. 76–103. En-
glewood Cliffs, NJ: Prentice-Hall.
Gilovich, T. (1991). How we know what isn’t so: The
fallibility of human reason in everyday life. New York:
Free Press.
Hall, C. C., Ariss, L., & Todorov, A. (2007). The illusion
of knowledge: When more information reduces accu-
racy and increases confidence. Organizational Behav-
ior and Human Decision Processes, 103, 277–290.
Highhouse, S. (2008). Stubborn reliance on intuition and
subjectivity in employee selection. Industrial and Or-
ganizational Psychology: Perspectives on Science and
Practice, 1, 333–342.
Highhouse, S., & Kostek, J. A. (2013). Holistic assess-
ment for selection and placement. In K. F. Geisinger,
B. A. Bracken, J. F. Carlson, J-I. C. Hansen, N. R. Kun-
cel, S. P. Reise, & M. C. Rodriguez (Eds.), APA Hand-
book of Testing and Assessment in Psychology. Vol.
1: Test theory and testing and assessment in industrial
and organizational psychology, pp. 565–577. Wash-
ington, DC: American Psychological Association.
Lewis-Rice, M. (1989). Marketing post-secondary edu-
cational programs with implications of higher educa-
tion administration. Doctoral dissertation, School of
Urban and Public Affairs, Carnegie Mellon University.
Meehl, P. E. (1954). Clinical versus statistical predic-
tion: A theoretical analysis and review of the evidence.
Minneapolis: University of Minnesota Press.
Milstein, R. M., Wilkinson, L., Burrow, G. N., & Kessen,
W. (1981). Admission decisions and performance dur-
ing medical school. Journal of Medical Education, 56,
77–82.
Nisbett, R., Zukier, H., & Lemley, R. (1981). The di-
lution effect: Nondiagnostic information weakens the
implications of diagnostic information. Cognitive Psy-
chology, 13, 248–277.
Ruscio, J. (2000). The role of complex thought in clinical
prediction: Social accountability and the need for cog-
nition. Journal of Consulting and Clinical Psychology,
68, 145–154.
Sanbonmatsu, D., Posavac, S., Kardes, F., & Mantel, S.
(1998). Selective hypothesis testing. Psychonomic
Bulletin & Review, 5, 197–220.
Swann, W. B., & Gill, M. J. (1997). Confidence and accu-
racy in person perception: Do we know what we think
we know about our relationship partners? Journal of
Personality and Social Psychology, 73, 747–757.
Trout, J. D. (2002). Scientific explanation and the sense
of understanding. Philosophy of Science, 69, 212–233.
... A simple rule was discussed in Murphy (2019): avoid using predictors (i.e., give them a zero weight instead of a unit weight) that correlate more strongly with the other predictors than with the criterion. Moreover, this advice holds when decisions are made holistically as well, since adding such information could "dilute" the most predictive information (Dana et al., 2013). ...
... Importantly, research shows that overriding a statistical prediction because a certain specific case is believed to be an exception to the rule is a bad idea: people are not very good at correctly identifying these exceptions (Guay & Parent, 2018;Dietvorst et al., 2018;Dawes, 1979). This conclusion can be logically derived from the findings that statistical prediction outperforms holistic prediction; if people were good at identifying exceptions, holistic procedures would outperform mechanical procedures (see Dana et al., 2013 for a similar remark). ...
... However plausible this may sound, this is not true in general and can encourage problematic decision-making. For example, information from unstructured interviews when combined with valid grades can lower predictive validity compared to using grades alone, but at the same time increase the feeling of a valid decision (e.g., Dana et al., 2013). ...
Chapter
Full-text available
When it comes to decision-making based on psychological and educational assessments, there is compelling evidence that statistical judgment is superior to holistic judgment. Yet, implementing this finding in practice has proven to be difficult for both academic and professional psychologists. Knowledge transfer from research findings to practitioners and other stakeholders in psychological assessment is a necessary condition to close this gap. To obtain insight into how academic specialists in psychological testing disseminate knowledge about research findings in this area, we investigated how textbooks on testing and guidelines on test use report on, or do not to report on, decision-making in psychological and educational assessment. Second, we discuss some commonly encountered misunderstandings, and third we argue for a broader and more in-depth dissemination of research findings on this topic in textbooks and test standards; to this end we provide some suggestions.
... Ein weiterer, nicht eingängiger Befund ist, dass Entscheidungsträgerinnen und -träger schlechtere Leistungsvorhersagen treffen, wenn ihnen neben validen Informationen auch noch weniger valide Informationen vorliegen. Dana et al. (2013) (Dana et al., 2013;Huse, 1962;Niessen et al., 2022;Nisbett et al., 1981). ...
... Ein weiterer, nicht eingängiger Befund ist, dass Entscheidungsträgerinnen und -träger schlechtere Leistungsvorhersagen treffen, wenn ihnen neben validen Informationen auch noch weniger valide Informationen vorliegen. Dana et al. (2013) (Dana et al., 2013;Huse, 1962;Niessen et al., 2022;Nisbett et al., 1981). ...
Article
Full-text available
When hiring employees, a main goal of many organizations is to make valid predictions of future job performance. “What are valid selection methods for predicting future job performance” is therefore a central question for scientists and practitioners. This question is answered in depth by Schmidt and Hunter (1998) and Sackett et al. (2022). In most selection procedures, multiple selection methods are used, often with the intention to increase predictive validity compared to using a single selection method. Therefore, Schmidt and Hunter (1998) also discussed to what extent selection methods show incremental validity over and above the use of general mental ability, when information from selection methods is combined mechanically using optimal regression weights. However, in practice, information is rarely combined using optimal regression weights. Therefore, we discuss how different weighting schemes affect the overall validity of a selection procedure, and why more information does not always result in better predictions. However, in practice, information is rarely combined mechanically at all, but is most often combined holistically ‘in the mind’, resulting in less valid predictions than mechanical combination. Therefore, we provide practical recommendations on how to combine information mechanically, such that valid job performance predictions are made without losing the acceptance of decision makers and other stakeholders.
... The personal (One-on-One) interviews were conducted using an unstructured interview guide validated through expert validity and pre-testing on two Asante goldsmiths who were not part of the original sample. Personal (One-on-One) interviews using an unstructured interview guide assisted the researchers in gaining very rich and in-depth information in a cultural context due to the social cues gotten (Opdenakker, 2006;Dana, Dawes & Peterson, 2013) from the Asante goldsmiths on the traditional goldplating . ...
Article
Full-text available
Various techniques are used by jewellers in Ghana in depositing a film of gold on surfaces of jewellery items. Although traditional goldplating has and continues to chalk a high level of excellence in jewellery making in Ghana, little documentation has been done on it. While traditional goldplating has been practiced for decades in Ghana, the introduction of electroplating into jewellery in Ghana is downplaying its relevance. Therefore, the purpose of the study was to find out how indigenous Asante goldplating technique is done in Ghana. The study adopted the use of an art-based research design under the qualitative research approach where personal interviews, photographs, and participatory observation were used for collecting qualitative data from 19 purposively sampled Asante’s goldsmiths at Manhyia and Ayeduase in Kumasi, Ghana using expert sampling. The findings of the study have shown that traditional gold plating is an aesthetically pleasing, low cost and efficient technique used by the Asante goldsmiths that has not lost its worth. The study contends that skills and knowledge in traditional goldplating should be passed on from goldsmiths to jewellers and other apprentices who are interested in learning the craft. This would help preserve and promote this rich cultural craft for posterity
... An unstructured interview is a type of interview in which the questions or the order in which they are asked are not predetermined; instead, the interviewer asks whatever they find relevant to gather more information on the research topic [48]. It is very flexible and organized, much like everyday conversation, and fosters an open environment where new topics and ideas can flow [49]. ...
Article
Full-text available
The circular economy (CE), as an antidote to the ubiquitous and dominant global economic concept characterized by the uncontrolled exploitation of natural resources and the flow of materials from producers to users to landfills, has become inevitable. The application of circular business models is especially needed in the building sector, as one of the main consumers of natural resources and energy, considerable polluters, and substantial producers of waste. Since architects are important participants in the process of designing and building structures, it is clear that circular principles should be incorporated into architectural design (AD) as well. This paper deals with the analysis of the degree of application of circular principles in AD in Serbia and the challenges and difficulties that architects face in this endeavor. The methods used in the research included an unstructured interview on the basic principles of CE, a case study of selected housing renovation projects in Niš, Serbia (as an illustration of the principles that deal with extending the life of buildings in the domestic environment), and a survey on the degree and importance of the application of the CE principles in AD among architects in Serbia. The case study results and survey results led to the outline of guidelines for future AD in accordance with CE principles and recommendations for creating a working environment for the architects that is more circular oriented.
... However, despite the selection teams' apparent confidence in their selection approach, little evidence was presented to assess the quality of their selection outcomes. Although most people believe they can accurately assess personal characteristics through interviews (Dana et al., 2013), extensive research in organizational psychology shows that interviewers tend to make unreliable judgments and are influenced in their decision making by biased actions based on race, age, and appearance (Cook, 2016). Even though most teacher selection models are based on a sophisticated framework of teaching standards that cover multiple domains of practice and competency (Klassen & Kim, 2019), they may not use selection methods that accurately test based on domains and constructs. ...
Article
This study examines the admissions criteria used by teacher education programs in seven countries, including the England, Canada, Oman, Australia, Finland, Singapore and Malaysia. The study compares the use of three main criteria for admission: academic qualifications, nonacademic factors, or a combination of the two. The result found that there was significant variation in the admissions criteria used across the countries. Some countries placed a greater emphasis on academic qualifications, while others placed more weight on non-academic factors such as personal qualities during the interviews and assessment test. The study also found that there were differences in the types of non-academic factors considered with some countries placing a greater emphasis on literacy skills, social skills, communication skills and other skills relevant. Overall, the study highlights the importance of considering the academic, non-academic and other factors that influence admissions criteria for teacher education programs. Academic qualification is the dominant selection approach used globally in the teacher education program.
... Indeed, our results show that the presence of oddball questions negatively affected organizational attraction after the positive effects on style and innovation were controlled. Furthermore, interviewers may inadvertently incorporate applicants' responses to oddball questions in their hiring decisions, which may result in less valid decisions because irrelevant information is being considered (Dana et al., 2013). Therefore, we recommend that oddball questions be reserved for later parts of recruitment and preferably after a hiring offer has been made. ...
Article
Full-text available
Oddball interview questions have gained both popular and academic traction in recent years. Regardless of the intentions behind these questions, job seekers will form judgments about the employer based on its selection tactics. This paper examined the effect of oddball interview questions on organizational personality perceptions and subsequent attraction to the organization. In a time-lagged online experiment, we found organizations that asked oddball interview questions (vs. traditional interview questions) were perceived as more innovative and stylistic, which had a positive indirect effect on organizational attraction. Despite the positive effect of oddball interview questions on these organizational personality perceptions, oddball interview questions did not improve participants' overall attraction to the organization. The effect was not dependent on the job seekers’ personalities. Practitioners aimed to improve recruitment success by asking unorthodox interview questions should look elsewhere.
... This challenge concerns researchers in various areas, particularly when moving from semi-structured interviews to structured deductions. The analysis reported in this article demonstrates such a transition from scattered sentences derived from semi-structured interviews to carefully defined articulations comprising a desired new vision for the State of Israel, based on the visionary section of the DOI. ...
Article
Full-text available
The core of this article is the search for an ‘organizing principle’ for the State of Israel in the present era. To approach this goal, the study conducted 40 interviews with well-known and influential Israeli individuals and attempted to identify common ground in Israeli society. Analysis of the interviews reveals a broad common denominator among interviewees with a wide variety of opinions, one that is closely linked with the major principles of the Declaration of Independence of the State of Israel.
Article
How can employers facilitate economic mobility for workers, particularly workers of color or those without a college degree? The authors integrate a fragmented literature to assess how employers’ practices affect workers’ economic security and mobility. The article first identifies three pathways linking employers’ practices to mobility: improving material job quality, increasing access to better jobs for historically marginalized workers, and promoting sustainability of employment. The authors provide a critical assessment of the research literature on recruitment and hiring practices; pay and wages; promotion practices; scheduling; leaves; diversity, equity, and inclusion initiatives; and work systems as these practices relate to economic mobility. They then identify strategic questions and feasible designs for enhancing future research on these questions in order to guide policy and management practice.
Preprint
Full-text available
We present the development and validation of a self-report instrument on Cognitions and Emotions about Child Sexual Abuse (CECSA). Three subscales, consisting of 23 items in total, were developed in a sample of 801 humanities students by means of exploratory factor analysis and Ant Colony Optimization, an automated item selection strategy used to simultaneously optimize model fit, reliability, and predictive validity. The "Naïve Confidence" subscale reflects overestimating one's ability to recognize abused children and overestimating the accuracy of children's abuse reports, the "Emotional Reactivity" subscale measures the intensity of one's emotional reactions towards the topic of child sexual abuse (CSA), and the "Justice System Distrust" subscale covers distrusting the justice system's ability to prosecute CSA cases. The CECSA showed adequate model fit and good internal consistencies. Bivariate correlations with other self-report measures demonstrated convergent validity. Importantly, all three subscales predicted biased evaluations towards the abuse hypothesis in scenarios of children displaying unspecific behavioral problems. This indicates predictive validity of the CECSA as an instrument measuring vulnerability for interviewer bias. The CECSA can be used to assess individual training needs of professionals who conduct interviews or conversations with children about abuse suspicions and may help to develop and evaluate interviewer trainings.
Article
The role of diagnostic and confirmation strategies in trait hypothesis testing is examined. The present studies integrate theoretical and empirical work on qualitative differences among traits with the hypothesis-testing literature. Ss tested trait hypotheses from 2 hierarchically restrictive trait dimensions: introversion-extraversion and honesty-dishonesty. In Study 1, Ss generated questions to test trait hypotheses, and diagnosticity was theoretically defined (e.g., questions associated with nonrestrictive ends of trait dimensions). In Study 2, Ss selected questions from an experimenter-provided list in which diagnosticity was empirically defined. In Study 3, Ss chose between 2 equally diagnostic questions. In each of the studies, Ss showed a primary preference for diagnostic information and a secondary preference for confirmatory information. Ss' preference for diagnostic information suggests that they prefer to ask the most informative questions. The explanation for the confirmation bias is less obvious, and possible reasons for this effect are discussed.
Article
Scientists and laypeople alike use the sense of understanding that an explanation conveys as a cue to good or correct explanation. Although the occurrence of this sense or feeling of understanding is neither necessary nor sufficient for good explanation, it does drive judgments of the plausibility and, ultimately, the acceptability, of an explanation. This paper presents evidence that the sense of understanding is in part the routine consequence of two well-documented biases in cognitive psychology: overconfidence and hindsight. In light of the prevalence of counterfeit understanding in the history of science, I argue that many forms of cognitive achievement do not involve a sense of understanding, and that only the truth or accuracy of an explanation makes the sense of understanding a valid cue to genuine understanding.
Article
At the University of Texas Medical School at Houston we had a unique opportunity to examine performance through the medical curriculum and one year of postgraduate training of 50 students initially rejected for medical school. Each had been interviewed by the same Admissions Committee, which earlier had selected 150 students through the traditional process. In contrasting the initially accepted and initially rejected groups, academic and demographic variables accounted for only 28% of group difference. The 72% of group difference not accounted for by the variables examined was presumed to relate to Admissions Committee preference. In attrition and in both preclinical and clinical performance through medical school and one year of postgraduate training, there were no meaningful differences between the groups. The observations suggest that the traditional interview process probably does not enhance the ability to predict performance of medical school applicants.(JAMA 1987;257:47-51)
Article
Discretionary legal decisions have become a recent focus of theory development and policy-oriented applied research. We investigated parole release decision making in Pennsylvania from both orientations. Analyses of post-hearing questionnaires and case files from 1,035 actual parole decisions revealed that the Parole Board considers institutional behavior and predictions of future risk and rehabilitation in the decision to release on parole. Predictions seem also to be based on diagnostic judgments identifying causes of crime such as personal dispositions, drugs, alcohol, money, and environment. A one-year follow-up of 838 released parolees showed that predictions were virtually unrelated to known post-release outcomes. An actuarial prediction device was developed that is more predictive than subjective judgments. The use of decision guidelines to structure discretion is discussed, as well as the utilization of our research in guideline development by Pennsylvania.
Article
This monograph is an expansion of lectures given in the years 1947-1950 to graduate colloquia at the universities of Chicago, Iowa, and Wisconsin, and of a lecture series delivered to staff and trainees at the Veterans Administration Mental Hygiene Clinic at Ft. Snelling, Minnesota. Perhaps a general remark in clarification of my own position is in order. Students in my class in clinical psychology have often reacted to the lectures on this topic as to a projective technique, complaining that I was biased either for or against statistics (or the clinician), depending mainly on where the student himself stood! This I have, of course, found very reassuring. One clinical student suggested that I tally the pro-con ratio for the list of honorific and derogatory adjectives in Chapter 1 (page 4), and the reader will discover that this unedited sample of my verbal behavior puts my bias squarely at the midline. The style and sequence of the paper reflect my own ambivalence and real puzzlement, and I have deliberately left the document in this discursive form to retain the flavor of the mental conflict that besets most of us who do clinical work but try to be scientists. I have read and heard too many rapid-fire, once-over-lightly "resolutions" of this controversy to aim at contributing another such. The thing is just not that simple. I was therefore not surprised to discover that the same sections which one reader finds obvious and over-elaborated, another singles out as especially useful for his particular difficulties. My thesis in a nutshell: "There is no convincing reason to assume that explicitly formalized mathematical rules and the clinician's creativity are equally suited for any given kind of task, or that their comparative effectiveness is the same for different tasks. Current clinical practice should be much more critically examined with this in mind than it has been." (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Defines irrationality as adhering to beliefs that are inherently self-contradictory, not just incorrect, self-defeating, or the basis of poor decisions. Such beliefs are unfortunately common. Two examples are given: the belief that child sexual abuse can be diagnosed by observing symptoms typically resulting from such abuse, rather than symptoms that differentiate between abused and non-abused children; and the belief that a physical or personal disaster can be understood by studying it alone in depth rather than by comparing the situation in which it occurred to similar situations where nothing bad happened. This book first demonstrates how such irrationality results from ignoring obvious comparisons. Such neglect is traced to associational and story-based thinking, while true rational judgment requires comparative thinking. Strong emotion—or even insanity—is one reason for making automatic associations without comparison, but as the author demonstrates, a lot of everyday judgment, unsupported professional claims, and even social policy are based on the same kind of irrationality. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
Virtually every previous review has concluded that structuring the selection interview improves its psychometric properties. This paper reviews the research literature in order to describe and evaluate the many ways interviews can be structured. Fifteen components of structure are identified that may enhance either the content of the interview or the evaluation process in the interview. Each component is explained in terms of its various operationalizations in the literature. Then, each component is critiqued in terms of its impact on numerous forms of reliability, validity, and user reactions. Finally, recommendations for research and practice are presented. It is concluded that interviews can be easily enhanced by using some of the many possible components of structure, and the improvement of this popular selection procedure should be a high priority for future research and practice.