132 Weathers et al.
© 2001 WILEY-LISS, INC.
DEPRESSION AND ANXIETY 13:132–156 (2001)
Review Article
CLINICIAN-ADMINISTERED PTSD SCALE: A REVIEW OF
THE FIRST TEN YEARS OF RESEARCH
Frank W. Weathers, Ph.D.,1* Terence M. Keane, Ph.D.,2 and Jonathan R.T. Davidson, M.D.3
The Clinician-Administered PTSD Scale (CAPS) is a structured interview for
assessing posttraumatic stress disorder (PTSD) diagnostic status and symptom
severity. In the 10 years since it was developed, the CAPS has become a stan-
dard criterion measure in the field of traumatic stress and has now been used
in more than 200 studies. In this paper, we first trace the history of the
CAPS and provide an update on recent developments. Then we review the
empirical literature, summarizing and evaluating the findings regarding
the psychometric properties of the CAPS. The research evidence indicates
that the CAPS has excellent reliability, yielding consistent scores across
items, raters, and testing occasions. There is also strong evidence of valid-
ity: The CAPS has excellent convergent and discriminant validity, diag-
nostic utility, and sensitivity to clinical change. Finally, we address several
concerns about the CAPS and offer recommendations for optimizing the
CAPS for various clinical research applications. Depression and Anxiety
13:132–156, 2001 © 2001 Wiley-Liss, Inc.
Key words: posttraumatic stress disorder; structured interview; diagnosis;
assessment; reliability; validity
1Auburn University, Auburn, Alabama
2Boston Veterans Affairs Medical Center and Boston Univer-
sity School of Medicine, Boston, Massachusetts
3Duke University Medical Center, Durham, North Carolina
*Correspondence to: Dr. Frank W. Weathers, Department of Psy-
chology, 226 Thach Hall, Auburn University, Auburn, AL 36849-
5214. E-mail: weathfw@auburn.edu
Received for publication 1 May 2000; Accepted 23 October 2000
INTRODUCTION
Since its development in 1990 at the National Center
for Posttraumatic Stress Disorder (PTSD), the
Clinician-Administered PTSD Scale [CAPS; Blake et al.,
1990] has become one of the most widely used struc-
tured interviews for diagnosing and measuring the se-
verity of PTSD. Initially validated on combat veterans,
the CAPS has now been used successfully in a wide va-
riety of trauma populations, including victims of rape,
crime, motor vehicle accidents, incest, the Holocaust,
torture, and cancer. It has served as the primary diag-
nostic or outcome measure in more than 200 empirical
studies on PTSD and has been translated into at least
ten languages. In addition, a child and adolescent ver-
sion of the CAPS has been developed and is now un-
dergoing field testing and psychometric evaluation.
Originally based on the PTSD criteria in the DSM-III-
R, the CAPS has been revised several times in response
to user feedback and changes in the PTSD diagnostic
criteria, with the most significant revision occurring af-
ter the publication of the DSM-IV in 1994.
The present paper is an update on the CAPS and a
critical review of the first 10 years of CAPS-related
research. It was prompted by the increasing popularity
of the CAPS, the rapid accumulation of empirical evi-
dence that supports its use, and the need to inform cur-
rent and potential CAPS users about the latest revisions
and recommendations for administration and scoring.
This paper consists of three sections. First, we provide
a brief overview of the CAPS, describing the rationale
for its development, its key features, and its evolution
through an extensive revision for DSM-IV, as well as a
description of other minor modifications. Second, we
review the published literature on the CAPS, focusing
in particular on psychometric studies of the CAPS and
on pharmacological and psychosocial treatment studies
that employed the CAPS as an outcome measure.
Third, we discuss the implications of the findings and
offer recommendations for using the CAPS in a range
of research and clinical applications.
This paper was not intended as an in-depth critique
of the methodology or conceptual implications of the
studies we reviewed, nor did we seek to reach any gen-
eral conclusions about the current status of PTSD re-
search. Rather, our main purpose was simply to identify
studies that have used the CAPS and summarize the
empirical findings that bear directly on its psychomet-
ric properties and utility for assessing PTSD. Finally,
since the child and adolescent version is still undergo-
ing validation, we focus here only on research examin-
ing the adult CAPS.
OVERVIEW OF THE CAPS
In developing the CAPS, the primary goal was to
create a comprehensive, psychometrically sound inter-
view-based rating scale that would be widely accepted
as a standard criterion measure of PTSD. In this sense
it was intended to serve a role in the field of traumatic
stress analogous to that of the ubiquitous Hamilton
Depression Rating Scale [HAM-D; Hamilton, 1960]
in the field of depression. The CAPS was designed
with a number of features intended to improve exist-
ing PTSD interviews and enhance the reliability and
validity of PTSD assessment [see Blake et al., 1995,
for a full discussion and a comparison of the CAPS
with other PTSD interviews]. First, the CAPS can be
used either as a dichotomous (present/absent) diag-
nostic measure or as a continuous measure of PTSD
symptom severity. Second, the CAPS assesses both the
frequency and intensity of individual PTSD symptoms
on separate five-point (0–4) rating scales, and these rat-
ings can be summed to create a nine-point (0–8) sever-
ity score for each symptom. This permits considerable
flexibility in scoring: CAPS users can focus on the fre-
quency, intensity, or severity ratings for individual
PTSD symptoms, for the three PTSD symptom clus-
ters (re-experiencing, avoidance and numbing, and
hyperarousal), and for the PTSD syndrome as a whole.
Third, the CAPS promotes uniform administration
and scoring through carefully phrased prompt ques-
tions and explicit rating scale anchors with clear be-
havioral referents. Initial prompt questions explicitly
target each symptom, and follow-up prompts help in-
terviewers clarify the inquiry as needed, anticipating
typical points of ambiguity or confusion regarding the
PTSD criteria. These features enhance standardiza-
tion across interviewers and ensure comparability of
scores across diverse settings, raters, and trauma popu-
lations. Fourth, the CAPS provides complete coverage
of the PTSD syndrome. The original version of the
CAPS included 17 items assessing the DSM-III-R
symptoms of PTSD, 8 items assessing associated fea-
tures (e.g., guilt, hopelessness, memory impairment),
and 5 items assessing response validity, global severity,
global improvement, and social and occupational im-
pairment. As described below, the current version of the
CAPS assesses all DSM-IV diagnostic criteria for PTSD,
including Criterion A (exposure to a traumatic event),
Criteria B–D (core symptom clusters of re-experiencing,
numbing and avoidance, and hyperarousal), Criterion E
(chronology), and Criterion F (functional impairment),
as well as the associated symptoms of guilt and dissocia-
tion. Finally, the CAPS assesses current and lifetime
PTSD symptom status. The prompts for lifetime di-
agnosis help the interviewer establish explicitly that
any endorsed symptoms occurred as a syndrome
within the same one-month period.
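
The scoring flexibility described above can be illustrated with a minimal sketch. It assumes the 17 core symptoms are ordered as in DSM-IV (5 re-experiencing, 7 avoidance and numbing, 5 hyperarousal items); that split, along with the names `ItemRating`, `cluster_severity`, and `total_severity`, is hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Index ranges for the 17 core symptoms, ASSUMING the DSM-IV ordering
# (5 re-experiencing, 7 avoidance/numbing, 5 hyperarousal items).
CLUSTERS = {
    "re-experiencing": range(0, 5),
    "avoidance_numbing": range(5, 12),
    "hyperarousal": range(12, 17),
}


@dataclass(frozen=True)
class ItemRating:
    """Frequency and intensity ratings for one symptom, each on a 0-4 scale."""
    frequency: int
    intensity: int

    def __post_init__(self) -> None:
        for value in (self.frequency, self.intensity):
            if not 0 <= value <= 4:
                raise ValueError("frequency and intensity must be between 0 and 4")

    @property
    def severity(self) -> int:
        """Nine-point (0-8) severity score: frequency plus intensity."""
        return self.frequency + self.intensity


def cluster_severity(items: list[ItemRating], cluster: str) -> int:
    """Sum of item severity scores within one symptom cluster."""
    return sum(items[i].severity for i in CLUSTERS[cluster])


def total_severity(items: list[ItemRating]) -> int:
    """Sum of item severity scores over the 17 core symptoms."""
    if len(items) != 17:
        raise ValueError("expected ratings for the 17 core symptoms")
    return sum(item.severity for item in items)
```

The same ratings can thus be read at the item, cluster, or syndrome level without re-administering the interview.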
Initially it was decided that two parallel versions of
the CAPS were needed in order to address two dis-
tinct assessment needs. The CAPS-1, or current and
lifetime diagnostic version, was designed to assess
PTSD symptom severity and diagnostic status over
the past month, or for the worst month since the
trauma. The CAPS-2, or 1-week symptom status ver-
sion, was designed to measure PTSD symptom sever-
ity over the past week and was intended primarily for
repeated assessment over relatively brief time intervals
in pharmacological research. Apart from the different
time frames assessed, the main difference between the
CAPS-1 and CAPS-2 is that for the ten CAPS items
where symptom frequency is rated in terms of a count
(i.e., how often) as opposed to a percentage (i.e., how
much of the time), the rating scale anchors on the
CAPS-2 were based on a 1-week time frame, whereas
for the CAPS-1, they were based on a 1-month time
frame. The distinction between these two original ver-
sions of the CAPS led to some confusion in the field,
such that the CAPS-2 was thought by some to be a
revised version of the CAPS. In response to this con-
fusion, as part of the DSM-IV revision, the CAPS-1
was renamed the CAPS-DX (i.e., CAPS-Diagnostic
version), and the CAPS-2 was renamed the CAPS-SX
(i.e., CAPS-Symptom Status version). As discussed be-
low, these two versions were recently combined into a
single instrument now simply known as the CAPS.
Following the publication of the DSM-IV in 1994,
the CAPS was revised, both to bring it up to date with
changes in the PTSD criteria and to incorporate user
feedback accumulated since its release in 1990. The
overarching goal for the revision was to ensure back-
ward compatibility with the original CAPS. This was
accomplished by retaining the basic structure, most of
the prompt questions, and the values and stems for the
rating scale anchors. The revision included four major
modifications and a number of relatively minor ones.
Major modifications included the following.
1. Adding a brief protocol for assessing Criterion A
(exposure to a traumatic event). This consists of a
17-item self-report checklist of potentially trau-
matic events and follow-up questions to help the
interviewer determine if a stressful event satisfies
both parts of the DSM-IV definition of a trau-
matic event (i.e., the event involves life threat, se-
rious injury, or threat to physical integrity; and
the person responds with intense fear, helpless-
ness, or horror).
2. Rewording some of the descriptors for the intensity
rating scale anchors. This was done to achieve a
consistent focus across items on the three key di-
mensions of intensity (duration, subjective distress,
and functional impairment), to achieve roughly
equal gradations of intensity between each of the
rating scale values, and to provide examples appli-
cable to a range of trauma populations.
3. Adding a three-point rating scale (“definite,”
“probable,” and “unlikely”) that requires inter-
viewers to determine if a reported symptom is at-
tributable to a specific traumatic event. This scale
only applies to the last 9 of the 17 symptoms of
PTSD (emotional numbing and hyperarousal)
because the first 8 symptoms (re-experiencing,
effortful avoidance, and amnesia) are all inher-
ently trauma-linked.
4. Replacing six of the eight original associated fea-
tures. The two items assessing guilt were re-
tained, but the other items were felt to be either
too population-specific (e.g., homicidality and
disillusionment with authority) or too broad or
complex to be assessed with a single item (e.g.,
sadness and depression). Also, feedback indicated
that they were not routinely administered in most
settings. They were replaced with three items
that assess the dissociative symptoms of acute
stress disorder: reduction in awareness, derealiza-
tion, and depersonalization. The addition of
these items meant that the CAPS could be used
to assess acute stress disorder, either currently, if
administered within 1 month of the trauma, or
retrospectively.
The minor modifications included a) reordering the
items to correspond to the order of the DSM-IV diag-
nostic criteria; b) adding items to fully assess Criterion E
(duration requirement) and Criterion F (subjective dis-
tress and functional impairment requirement); c) renam-
ing the CAPS-1 and CAPS-2, as described earlier; d)
improving the formatting and typeface conventions; e)
eliminating the “at its/their worst” convention for the in-
tensity prompts; f) eliminating the phrase “without being
exposed to something that reminded you of the event”
from the frequency prompt for the first item assessing
intrusive recollections; and g) adding an instruction to
the interviewer to specify the basis of any QV (question-
able validity) ratings.
Completing this discussion on the development of
the CAPS are two significant, quite recent develop-
ments. One development is the decision to eliminate
the “two CAPS” system (i.e., the distinction between
the CAPS-1 or CAPS-DX and the CAPS-2 or CAPS-
SX) and create a single CAPS scale that can be used to
assess PTSD symptoms over the past week, past
month, or worst month since the trauma. As noted
earlier, the CAPS-2 or CAPS-SX was designed to
monitor changes in symptom status over a 1-week
time frame, and it appears to work well for this pur-
pose, demonstrating excellent psychometric properties
[Nagy et al., 1999]. The problem, however, is that for
the ten CAPS items where symptom frequency is
measured as “how often” versus “how much of the
time” (i.e., as the number of occurrences rather than
as a percentage of time) the CAPS-SX and CAPS-DX
had different values because of the different time
frame (i.e., for the past week time frame on the CAPS-
SX 0=never, 1=once, 2=two or three times, 3=four or
five times, and 4=daily or almost every day, but for the
past month time frame on the CAPS-DX 0=never,
1=once or twice, 2=once or twice a week, 3=several
times a week, and 4=daily or almost every day).
This means that scores on the two versions were not
directly comparable, with the CAPS-SX tending to
yield lower scores when the reported frequency is in
the 3–5 times a week range. As a result, investigators
who wanted to use the CAPS to establish a PTSD di-
agnosis as an inclusion criterion, but were interested
in weekly assessment intervals over the course of the
study, needed to administer a CAPS-DX in the initial
evaluation, then administer a CAPS-SX at baseline,
mid-treatment, and post-treatment, and then a CAPS-
DX at long-term follow-up if they wished to assess
end-point diagnostic status. In general this was a
workable scheme, but it proved to be needlessly cumbersome.
Therefore, on the recommendation of the CAPS Ad-
visory Group for the National Center for PTSD, the
CAPS-DX and CAPS-SX were combined into a single
version, which is now simply known as the CAPS.
This was accomplished by two minor modifications to
the CAPS-DX. First, the word “week” was provided as
an alternative to “month” in the prompt questions for
frequency [e.g., “How often have you had these
memories in the past month (week)?”]. Second, for
each item a space was provided to record frequency
and intensity ratings for “past week,” in addition to
“past month” and “lifetime.” When the new combined
version of the CAPS is used to assess 1-week symptom
status, frequency ratings for the ten items for which
frequency is rated as a count are scored as 0=never,
2=once or twice a week, 3=several times a week, and
4=daily or almost every day, skipping the value 1=once
or twice (a month). Thus, the combined CAPS is ap-
propriate for assessing 1-month or 1-week intervals
and yields comparable scores from either application.
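
The difference between the two sets of weekly frequency anchors can be made concrete with a short sketch. It maps a count of occurrences in the past week onto the CAPS-SX anchors and onto the combined-CAPS anchors quoted above. The numeric cut points for "several times a week" (3 to 5) and "daily or almost every day" (6 or more) are assumptions made for illustration, as are the function names.

```python
def sx_weekly_rating(count: int) -> int:
    """Frequency rating on the CAPS-SX past-week anchors.

    0=never, 1=once, 2=two or three times, 3=four or five times,
    4=daily or almost every day (ASSUMED here to mean 6 or more).
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if count == 1:
        return 1
    if count <= 3:
        return 2
    if count <= 5:
        return 3
    return 4


def combined_weekly_rating(count: int) -> int:
    """Frequency rating on the combined CAPS when used for the past week.

    0=never, 2=once or twice a week, 3=several times a week (ASSUMED 3-5),
    4=daily or almost every day (ASSUMED 6 or more). The value 1
    (once or twice a month) is skipped for a 1-week time frame.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    if count <= 2:
        return 2
    if count <= 5:
        return 3
    return 4
```

Under these assumed cut points, a symptom reported three times in the past week is rated 2 on the CAPS-SX anchors but 3 on the combined CAPS, which illustrates why scores from the two original versions were not directly comparable.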
The second development involved new options for
interpreting CAPS scores. First, nine scoring rules for
deriving a PTSD diagnosis have been developed and
compared on their psychometric properties and utility
for different assessment tasks [Weathers et al., 1999].
It should be emphasized that although several of these
rules appear to be quite useful, more research is
needed before firm recommendations can be made. A
number of other rules are possible and may prove to
have greater utility for some applications. Second, five
rationally derived severity score ranges for interpret-
ing CAPS total severity scores have been proposed
and are currently being evaluated. These categories
are 0–19=asymptomatic/few symptoms, 20–39=mild
PTSD/subthreshold, 40–59=moderate PTSD/threshold,
60–79=severe PTSD symptomatology, and ≥80=
extreme PTSD symptomatology. Finally, a rationally
derived 15-point change in CAPS total severity score
has been proposed as a marker of clinically significant
change. Again, it should be emphasized that these se-
verity score ranges and the 15-point marker are pre-
liminary, and unlike the scoring rules have not been
empirically evaluated, but they offer some guidance to
clinicians and investigators who use the CAPS to mea-
sure change.
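
As a rough illustration of how the proposed ranges and the 15-point marker might be applied, consider the following sketch. It assumes that total severity is summed over the 17 core symptoms (maximum 17 × 8 = 136), that a score of exactly 80 belongs to the top range, and that the 15-point marker applies to change in either direction; the function names are hypothetical.

```python
# Proposed severity score ranges for the CAPS total severity score.
# The upper bound of 136 ASSUMES 17 items scored 0-8 each, and a score
# of exactly 80 is ASSUMED to fall in the top range.
SEVERITY_RANGES = [
    (0, 19, "asymptomatic/few symptoms"),
    (20, 39, "mild PTSD/subthreshold"),
    (40, 59, "moderate PTSD/threshold"),
    (60, 79, "severe PTSD symptomatology"),
    (80, 136, "extreme PTSD symptomatology"),
]


def severity_category(total_score: int) -> str:
    """Return the proposed descriptive label for a total severity score."""
    for low, high, label in SEVERITY_RANGES:
        if low <= total_score <= high:
            return label
    raise ValueError("total severity score must be between 0 and 136")


def clinically_significant_change(pre: int, post: int, threshold: int = 15) -> bool:
    """True if total severity changed by at least `threshold` points.

    Change in either direction counts (an ASSUMPTION of this sketch).
    """
    return abs(pre - post) >= threshold
```

For example, a drop from 70 to 55 would move a respondent from the severe to the moderate range and would meet the 15-point marker, whereas a drop from 70 to 56 would not.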
In summary, the format and the procedures for ad-
ministering and scoring the CAPS have evolved in the
10 years since it was first developed. However, the
changes can be characterized as refinements rather
than major revisions, and the goal of backward com-
patibility of the latest CAPS with the original version
appears to have been accomplished [Weathers et al.,
1999]. The CAPS now provides a range of options re-
garding administration and scoring. Interviewers can
administer only the 17 core symptoms, all DSM-IV
criteria (A-F), or add the associated symptoms. Cur-
rent symptom status can be assessed for the past week
or past month, and lifetime status can be assessed for
the worst month since the trauma. By administering
the 17 core symptoms plus the 3 dissociative items the
CAPS can also be used to assess acute stress disorder.
In terms of scoring options, the CAPS can be used to
derive a PTSD diagnosis by using one or more of the
available scoring rules, or a continuous severity score
for each item, for the three symptom clusters or for
the entire syndrome. Total severity scores summed
over the 17 core symptoms can be interpreted with re-
spect to the five proposed severity score ranges, from
asymptomatic to extreme, and a 15-point change in
CAPS scores can be used to indicate clinically signifi-
cant change.
REVIEW OF THE CAPS-RELATED
LITERATURE
LITERATURE SEARCH AND SELECTION
OF STUDIES
We developed an initial list of studies to be included
by searching the phrase “Clinician-Administered
PTSD Scale” in the “Instruments” index of the PI-
LOTS database. PILOTS is the most comprehensive
database for the field of traumatic stress, containing
virtually every relevant citation in journals and book
chapters. This search, conducted in October 1999,
yielded 241 citations. We excluded book chapters, re-
view papers, dissertations, letters to the editor, an ar-
ticle on the child and adolescent version of the CAPS,
and several studies in which CAPS-related data were
included, but not in a form suitable for our purpose.
This narrowed the list to a total of 210 studies deemed
eligible for potential inclusion in our review.
For the purposes of this review, we divided the eli-
gible studies into three categories: a) psychometric
studies, which provided direct evidence of the reliabil-
ity and validity of the CAPS; b) pharmacotherapy and
psychotherapy studies, which provided evidence of the
sensitivity of the CAPS to clinical change; and c) case-
control studies, which provided additional validity evi-
dence based on conceptually meaningful differences
between individuals diagnosed with and without PTSD
using the CAPS. In the following sections, we summa-
rize all of the available studies in the first two categories
since there was a manageable number of them and they
provided the richest information regarding the utility of
the CAPS. However, due to space constraints, we limit
our discussion of studies in the third category to several
representative examples, since these were more nu-
merous and provided more limited validity evidence.
As noted earlier, the purpose of this review was to
examine all available research addressing the psychomet-
ric characteristics of the CAPS and its usefulness as a
standard criterion measure of PTSD. Accordingly, we
placed few restrictions in selecting the studies to be in-
cluded, realizing that the final set of studies would vary
widely in their quality of design and interpretability of
results. We felt that a consistent pattern of positive re-
sults across a large number of studies would provide un-
ambiguous support for the CAPS, and that if the studies
varied in quality, it would make an even stronger case
with regard to the generalizability of the findings. In the
process of evaluating a psychological assessment instru-
ment, each study, regardless of how well-designed and
executed it is, only contributes one piece of evidence and
can never be considered definitive. Conclusive answers
can be reached only by considering the accumulation of
several different types of evidence across different trauma
populations, settings, and research designs. In the next
section, we briefly review some fundamental psycho-
metric concepts in order to provide a conceptual
framework for organizing and evaluating the evidence
regarding the effectiveness of the CAPS.
PSYCHOMETRIC CONSIDERATIONS
Psychological assessment instruments are evaluated
with respect to two important characteristics: reliabil-
ity and validity. Reliability refers to the consistency of
test scores over repeated observations. Three com-
monly reported types of reliability include internal
consistency, test-retest reliability, and interrater reli-
ability, each of which addresses a different potential
source of error in test scores. Internal consistency re-
fers to consistency over different items on a test. Re-
quiring only a single administration of a test, it is
usually indexed by coefficient alpha (Cronbach’s al-
pha), which ranges from 0.00 to 1.00, with higher val-
ues reflecting a greater degree of intercorrelation
among the items. Item-scale total correlations, which
reflect how well each item correlates with the remain-
ing items, are another useful source of information
about internal consistency. Test-retest reliability refers
to consistency of test scores over repeated administra-
tions. It is estimated by administering a test twice and
calculating the correlation between the two scores.
Interrater reliability refers to consistency of test scores
over different raters. It is estimated by having two or
more raters evaluate and score responses and then cal-
culating either a correlation (only two raters) or
intraclass correlation (more than two raters) on the
scores. When an instrument is used to obtain a
dichotomous score, as in the case of a present/absent
diagnostic decision, interrater or test-retest reliability
is estimated by calculating a kappa coefficient, a
chance-corrected measure of agreement.
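
Two of the indices just described, coefficient alpha and kappa, can be computed directly from their standard textbook formulas. The sketch below uses only the Python standard library; the function names and the data layout (rows as respondents, columns as items) are choices made here for illustration and are not drawn from any of the studies reviewed.

```python
from collections import Counter
from statistics import variance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Coefficient alpha; rows are respondents, columns are items.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def cohen_kappa(rater_a: list, rater_b: list) -> float:
    """Chance-corrected agreement between two raters on categorical ratings.

    kappa = (observed agreement - expected agreement) / (1 - expected agreement)
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must rate the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)
```

With present/absent diagnoses coded 1 and 0, `cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])` reflects 75% raw agreement against 50% agreement expected by chance.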
Two different research designs are typically em-
ployed to evaluate the reliability of a structured inter-
view such as the CAPS. In a simple interrater design,
two or more raters independently rate the same inter-
view. One rater administers and scores the interviews as
usual, while additional raters either observe the inter-
view live or, if more convenient, observe an audiotape
or videotape of the interview. Since the information
available to the raters is identical, the only potential
source of error is inconsistency in scoring among raters.
In a test-retest design, two independent raters adminis-
ter and score the interview on separate occasions. This
is a more stringent test of reliability because it involves
inconsistency in scoring plus two additional potential
sources of error: inconsistency in how raters ask the
questions and inconsistency in respondents’ answers.
Although we follow common practice in referring to it
as test-retest reliability, the reliability estimate this de-
sign yields is more precisely known as a coefficient of
stability and interrater equivalence because it involves
both occasions and raters as potential sources of error.
An important consideration for the test-retest design
is the interval between interviews. If the interval is too
brief, respondents’ answers in the second interview
may be influenced by their memory of their answers in
the first interview. If the interval is too long, genuine
change in clinical status may occur, meaning that in-
consistencies in responses are legitimate and not a
source of error. In the assessment of PTSD, an inter-
val of a few days to a week is probably reasonable for
most applications.
Although reliability clearly is a desirable character-
istic of an assessment instrument, a more important
concern is validity, which refers to the extent to which
evidence exists to support the various inferences, in-
terpretations, conclusions, or decisions that will be
made on the basis of a test. Traditionally, three types
of validity have been identified. The first type is con-
tent validity, which refers to evidence that items on a
test adequately reflect the construct being assessed.
The second type is criterion-related validity, which re-
fers to evidence that the test can predict some variable
or criterion of interest. The criterion may be mea-
sured either at the same time the test is administered
(concurrent validity) or at some point after the test
(predictive validity). The third type is construct valid-
ity, which refers to evidence that the test measures the
construct of interest and not other constructs. This
can be demonstrated, for example, by showing that the
test correlates strongly with other measures of the
same construct (convergent validity) but not with
measures of other constructs (discriminant validity).
However, this traditional approach to validity has
recently been superseded by the latest revision of the
Standards for Educational and Psychological Testing
[APA, 1999], which maintains the following.
“[Different] sources of evidence may illuminate dif-
ferent aspects of validity, but they do not represent
distinct types of validity. Validity is a unitary concept.
It is the degree to which all the accumulated evidence
supports the intended interpretation of test scores for
the proposed purpose. Like the 1985 Standards, this
edition refers to types of validity evidence, rather than
distinct types of validity.” (p. 11)
Thus, the new Standards argues for an integrative
approach to validity, emphasizing a confluence of va-
lidity evidence from different sources, and its updated
scheme for categorizing validity evidence represents a
marked departure from previous editions. Categories
include a) evidence based on test content; b) evidence
based on response processes, which focuses on respon-
dents’ behavior during the test process; c) evidence
based on internal structure, which focuses on relation-
ships among test items and components; d) evidence
based on relations to other variables, which includes
convergent and discriminant evidence, criterion-re-
lated evidence, and the generalization of validity to
new testing situations; and e) evidence based on con-
sequences of testing, which focuses on both the in-
tended and unintended outcomes of test use.
The new Standards also emphasizes that the process
of validation applies not to tests themselves but rather
to any specific interpretations that will be made on the
basis of test scores. Therefore, stating that a test is
valid begs the question: Valid for what purpose? To
address this question specifically with regard to the
CAPS, two main uses of the CAPS have been pro-
posed. One is to establish a dichotomous PTSD diag-
nosis and the other is to provide a continuous measure
of PTSD symptom severity. Thus, the two main inter-
pretations of CAPS scores that should be the focus of
validation are the following.
1. CAPS scores reflect severity of PTSD symptoms,
for individual symptoms, symptom clusters, or
the syndrome as a whole.
2. CAPS diagnoses reflect the presence or absence
of PTSD.
One source of validity evidence that applies to these
inferences is content-based evidence. This refers to
the extent to which the content of a test corresponds
to the construct being assessed. In this regard, the
CAPS was written and revised by a team of experts in
traumatic stress at the various branches of the Na-
tional Center for PTSD. It was based directly on the
diagnostic criteria for PTSD in the DSM-III-R, and
now DSM-IV, and represents these criteria faithfully.
As noted earlier, the major revision of the CAPS that
followed the publication of the DSM-IV not only re-
flected changes in the PTSD criteria but also took
into account formal and informal feedback from a
broad cross-section of CAPS users in other clinical re-
search settings. Although difficult to quantify, there is
clearly a consensus among those familiar with the
CAPS that the content of the CAPS corresponds
veridically to the construct of PTSD.
A second source of validity evidence has to do with
the internal structure of the CAPS. As currently con-
ceptualized in the DSM-IV criteria, PTSD is a multi-
faceted syndrome that consists of three closely related
but distinct symptom clusters: re-experiencing, avoidance
and numbing, and hyperarousal. If PTSD is a syndrome,
then there should be a reasonably high degree of
correlation among all of the symptoms. If there are dis-
tinct but overlapping symptom clusters, then the items
within the clusters should correlate more strongly with
each other than they do with symptoms in other clus-
ters. These relationships would be reflected in alpha
coefficients and item-total correlations. Factor analysis,
especially confirmatory factor analysis, in which com-
peting hypotheses about the nature of PTSD can be
directly compared, is another means of evaluating the
internal structure of the CAPS.
A third, and particularly important, source of valid-
ity evidence involves the relationship between the
CAPS and other variables. As conceptualized in the
latest Standards, this source of evidence includes what
used to be referred to as construct and criterion-re-
lated validity, and encompasses a broad range of evi-
dence that the CAPS corresponds in theoretically
meaningful ways with measures of other constructs.
Relevant findings might include a) convergent evi-
dence, showing relatively strong correlations between
the CAPS and other measures of PTSD; b) discrimi-
nant evidence, showing relatively weak correlations
between the CAPS and measures of different con-
structs; c) evidence of test-criterion relationships,
showing the correspondence between the CAPS and a
criterion such as a PTSD diagnosis or an indicator of
clinically significant improvement in PTSD symptom
severity; d) evidence that groups formed on the basis
of the CAPS differ as hypothesized on some character-
istic or behavior; and e) evidence that PTSD preva-
lence, severity, or symptom profile based on the CAPS
vary as hypothesized in different groups.
PSYCHOMETRIC STUDIES
In this section, we describe the results of studies
that emphasized the psychometric properties of the
CAPS, including studies in which the CAPS was ei-
ther the primary instrument being investigated or was
included as a validational measure for another PTSD
instrument. First, we summarize studies that examined
reliability and convergent and discriminant validity.
Then we summarize studies that address two other
psychometric issues: the factor structure of the CAPS
and the utility of various scoring rules for converting
CAPS frequency and intensity scores into a dichoto-
mous PTSD diagnosis. In reviewing these studies, we
found that investigators often neglected to specify the
version of the CAPS they administered and the scor-
ing rule they used to determine a PTSD diagnosis.
The version could usually be readily inferred, and
with few exceptions was the CAPS-1 or CAPS-DX,
the current and lifetime diagnostic version. In our dis-
cussion of the studies in this section, then, “CAPS” re-
fers to the CAPS-1 or CAPS-DX, and “CAPS-2” is
used explicitly to refer to the weekly symptom-rating
version. Unless explicitly stated, however, the scoring
rule could not be determined. For the purposes of this
review we assumed, unless stated otherwise, that in-
vestigators used the original scoring rule, whereby a
frequency of “1” or higher and an intensity of “2” or
higher for a given CAPS item indicated symptom en-
dorsement.
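The original scoring rule described above can be written out as a short sketch. The function name and the example ratings are hypothetical, and we assume the 0 to 4 rating range for CAPS frequency and intensity items; this illustrates the rule as stated here, not code from any of the reviewed studies.

```python
def endorsed_f1_i2(frequency: int, intensity: int) -> bool:
    """Original rule: a symptom counts as endorsed when
    frequency is 1 or higher and intensity is 2 or higher."""
    return frequency >= 1 and intensity >= 2


# Hypothetical (frequency, intensity) ratings for three CAPS items.
example_ratings = [(1, 2), (3, 1), (0, 0)]
for freq, inten in example_ratings:
    print(freq, inten, endorsed_f1_i2(freq, inten))
```

Under this rule a frequent but mild symptom, such as the second example (frequency 3, intensity 1), is not counted as endorsed.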
Reliability, convergent and discriminant validity,
and diagnostic utility. The CAPS has been the pri-
mary focus of several psychometric investigations.
Blake et al. [1990] reported the first psychometric data
on the CAPS. In a pilot study they administered the
CAPS, the Combat Exposure Scale [CES; Keane et
al., 1989], the Mississippi Scale for Combat-Related
PTSD [Mississippi Scale; Keane et al., 1988], and the
Keane PTSD Scale of the MMPI [PK scale; Keane et
al., 1984] to 25 male combat veterans. To determine
interrater reliability for the CAPS, a second rater ob-
served and independently rated seven interviews. Ex-
cellent agreement was found between the two raters,
with reliability coefficients for frequency and intensity
scores across the three symptom clusters (re-experi-
encing, numbing and avoidance, and hyperarousal)
ranging from .92 to .99. The raters also demonstrated
perfect diagnostic agreement for the seven partici-
pants, five of whom had a positive diagnosis. Internal
consistency for the three PTSD symptom clusters was
high, with alpha coefficients ranging from .73 to .85.
Regarding convergent
validity, the CAPS correlated strongly with the Missis-
sippi Scale (.70) and the PK scale (.84). It also corre-
lated .42 with the CES, a moderate correlation that is
typical for correlations between measures of trauma
exposure and measures of PTSD.
Hovens et al. [1994] examined the psychometric
properties of the CAPS in a Dutch sample, employing
translations of the CAPS and other PTSD measures.
Participants were 76 Dutch trauma survivors (51
males, 25 females), including combat veterans, resis-
tance veterans, and concentration camp survivors. Par-
ticipants were first diagnosed with or without PTSD,
using DSM-III-R criteria, on the basis of an unstruc-
tured clinical interview. They were then administered
the CAPS, the Mississippi Scale, the PK scale, and the
IES. Interrater reliability on the CAPS was evaluated
through simultaneous ratings of nine interviews by
two independent clinicians. Diagnostic agreement was
perfect for these nine participants. Furthermore, reli-
ability coefficients for frequency and intensity scores
for individual items were strong, ranging from .59 to
1.00 for frequency, with a mean of .92, and .52 to 1.00
for intensity, with a mean of .86. At the symptom clus-
ter level, reliability coefficients ranged from .92 to
1.00 for frequency and .92 to .98 for intensity. Re-
garding internal consistency, Hovens et al. [1994]
found alphas of .63 for re-experiencing, .78 for avoid-
ance and numbing, .79 for hyperarousal, and .89 for
all 17 core PTSD symptoms. No rationale was given
for the decision to report internal consistency for in-
tensity scores but not for frequency or severity (fre-
quency + intensity) scores.
By using the clinical interview as the criterion,
Hovens et al. [1994] found that a CAPS-based PTSD
diagnosis had 74% sensitivity, 84% specificity, and
79% efficiency, and a kappa of .58. Because these fig-
ures were lower than expected, they examined discrep-
ancies between the clinical interview and the CAPS.
They concluded that in the clinical interview clini-
cians primarily emphasized re-experiencing symptoms
in making a PTSD diagnosis, failing to give sufficient
attention to the other two symptom clusters, particu-
larly avoidance and numbing. They further found that
many of the participants with discrepant diagnoses
were only mildly symptomatic and thus more diagnos-
tically ambiguous. As evidence of convergent validity,
the total CAPS score correlated .73 with the Missis-
sippi Scale, .74 with the PK scale, and .62 with the
IES total score. Finally, with the exception of amnesia,
the prevalence of each of the 17 core PTSD symptoms
on the CAPS was significantly greater in participants
with PTSD than in those without PTSD, indicating
robust discrimination between the two groups.
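The diagnostic utility statistics reported throughout this section (sensitivity, specificity, efficiency, and kappa) can all be derived from a 2 × 2 table crossing the CAPS diagnosis with the criterion diagnosis. The following is a minimal sketch using the standard definitions; the function name and the cell counts are hypothetical and are not taken from Hovens et al. [1994] or any other study reviewed here.

```python
def diagnostic_utility(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Diagnostic utility of a test against a criterion.

    tp: test positive, criterion positive   fp: test positive, criterion negative
    fn: test negative, criterion positive   tn: test negative, criterion negative
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    efficiency = (tp + tn) / n  # overall proportion of correct classifications
    # Agreement expected by chance, from the marginal proportions.
    chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (efficiency - chance) / (1 - chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "efficiency": efficiency,
        "kappa": kappa,
    }


# Hypothetical counts for 100 participants.
stats = diagnostic_utility(tp=40, fp=10, fn=10, tn=40)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Because kappa corrects efficiency for chance agreement, it is always lower than efficiency when agreement is imperfect, which is why the kappas in this section are consistently lower than the corresponding efficiency figures.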
As part of an effort to develop and evaluate a com-
puter-administered version of the CAPS, Neal et al.
[1994] administered both the computerized and the
original interview versions of the CAPS to 40 military
personnel (36 males and 4 females) with mixed trauma
exposure, including combat, non-combat-related as-
saults, accidents, and disasters, and childhood physical
and sexual abuse. To evaluate the reliability of the
CAPS interview, ten participants were interviewed
twice by independent clinicians, resulting in perfect
diagnostic agreement. Treating the CAPS interview as
the criterion, the computerized version had 95% sen-
sitivity and 95% specificity, with a kappa of .90. Al-
though the interval between the two versions was not
specified, they appear to have been administered in a
single session, which could have inflated this high
level of agreement. An initial finding of a high corre-
lation (.96) between total frequency and total intensity
scores on both the interview and computerized ver-
sions of the CAPS led Neal et al. [1994] to use inten-
sity scores alone as a continuous measure of severity in
all further analyses. Internal consistency of intensity
scores was high for both versions, with an alpha of .90
and a median item-total correlation of .77 for the in-
terview version, and an alpha of .92 and a median
item-total correlation of .70 for the computerized ver-
sion. In addition, intensity scores on the two versions
were strongly correlated, ranging from .55 to .92 for
individual items and from .87 to .92 for the three
symptom clusters. The correlation for total intensity
score between the two versions was .95.
Hyer et al. [1996] investigated the utility of the
CAPS for assessing older combat veterans. Participants
were 125 male World War II and Korean combat veter-
ans. They were administered a computer-assisted version
of the SCID (SCID-DTREE), including the PTSD
module, as well as the CAPS, by two clinicians. They
also completed the Mississippi Scale, the IES, and the
CES. To assure the comparability of the SCID-DTREE
and the SCID, 25 participants were administered the
SCID in a separate testing session by an independent cli-
nician. In this subsample there was perfect agreement as
to PTSD diagnostic status, not only between the SCID-
DTREE and the SCID, but between the CAPS and
the SCID. In the full sample, against a PTSD diagno-
sis based on the SCID-DTREE, the CAPS had 90%
sensitivity, 95% specificity, and 93% efficiency, and a
kappa of .75. The CAPS also demonstrated high inter-
nal consistency, with alphas of .88 for re-experiencing,
.87 for avoidance and numbing, .88 for hyperarousal,
and .95 for all 17 core items. CAPS diagnosis was cor-
related .81 with the IES, .61 with the Mississippi
Scale, and .26 with the CES. The relatively low corre-
lation with the CES is likely attributable in part to a
restricted range on the CES, since most participants
had moderate to heavy combat exposure.
As part of a large prospective study on the effects of
trauma, Shalev et al. [1997] employed signal detection
methodology to determine whether the CAPS or any
of several questionnaire measures of PTSD, dissocia-
tion, and anxiety administered at 1 week or 1 month
post-trauma could predict PTSD diagnostic status at 4
months post-trauma. Participants included 207 (98
male and 109 female) victims of civilian trauma re-
cruited from the emergency room of a hospital. In
most cases, the traumatic event involved a motor ve-
hicle accident. Within a week of their trauma, partici-
pants completed the IES, the State form of the State
Trait Anxiety Inventory [STAI; Spielberger et al.,
1970], and the Peritraumatic Dissociative Experiences
Questionnaire [PDEQ; Marmar et al., 1997]. Assessments
at 1 month and 4 months post-trauma added
the CAPS and the civilian version of the Mississippi
Scale to this battery. They found that all of the
questionnaires administered at either 1 week or 1
month post-trauma were predictive of PTSD diag-
nostic status at 4 months, but that none of the ques-
tionnaires differed significantly in terms of accuracy
of prediction. In contrast, the CAPS at 1 month
post-trauma, used as a continuous measure, was
significantly better than all of the questionnaires in
predicting a 4-month diagnostic status that was also
based on the CAPS. Although Shalev et al. [1997]
did not identify an optimal cutoff score for CAPS
total severity, they did provide diagnostic utility
data for a range of selected cutoff scores. These data
indicate that a CAPS score of 40 yielded 93% sensi-
tivity and 80% specificity.
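To illustrate how diagnostic utility shifts across cutoff scores, the sketch below computes sensitivity and specificity for a continuous score at several thresholds. The scores, diagnoses, and function name are hypothetical, and we assume that a score at or above the cutoff counts as a positive test; the data of Shalev et al. [1997] are not reproduced here.

```python
def utility_at_cutoff(scores, has_ptsd, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is a positive test."""
    tp = sum(1 for s, d in zip(scores, has_ptsd) if s >= cutoff and d)
    fn = sum(1 for s, d in zip(scores, has_ptsd) if s < cutoff and d)
    tn = sum(1 for s, d in zip(scores, has_ptsd) if s < cutoff and not d)
    fp = sum(1 for s, d in zip(scores, has_ptsd) if s >= cutoff and not d)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical total severity scores and later diagnostic status.
scores = [20, 35, 45, 50, 62, 70, 85, 90]
has_ptsd = [False, False, False, True, False, True, True, True]

for cutoff in (40, 50, 65):
    sens, spec = utility_at_cutoff(scores, has_ptsd, cutoff)
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In these made-up data, raising the cutoff trades sensitivity for specificity, which is the pattern that makes a range of cutoffs more informative than a single one.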
To determine the prevalence of PTSD in veterans
with spinal cord injuries, Radnitz et al. [1995] admin-
istered the CAPS and the SCID PTSD module to 126
male veterans receiving medical care for spinal cord
injuries in inpatient and outpatient settings. Current
and lifetime diagnostic status was assessed on both the
CAPS and the SCID. To determine diagnostic status
on the CAPS, Radnitz et al. [1995] used a variant of
the original scoring rule (i.e., frequency ≥ 1, intensity
≥ 2), whereby either the frequency or intensity of an
item had to be “2” or higher and the other dimension
had to be a “1” or higher. As described below, this
scoring rule was referred to by Blanchard et al. [1995a,b]
as the “Rule of 3.” Although Radnitz et al. [1995] did
not provide kappas or other diagnostic utility statistics
except efficiency, we were able to calculate these from
data provided in the tables. Treating the SCID as the
criterion, for current diagnosis the CAPS had 83%
sensitivity, 94% specificity, 93% efficiency, and a
kappa of .73. For lifetime diagnosis the CAPS had
84% sensitivity, 90% specificity, 88% efficiency, and a
kappa of .74. Although not explicitly stated, it appears
that both interviews were administered by the same
research assistant in the same session. Both of these
factors, i.e., the lack of a time interval between inter-
views and the lack of an independent rater, could have
inflated the correlation between the CAPS and the
SCID. Finally, CAPS total severity scores appeared to
strongly differentiate between participants with and
without a PTSD diagnosis, although these mean dif-
ferences were not evaluated by statistical test.
Although all of these studies provide valuable infor-
mation, the most comprehensive investigations of the
psychometric properties of the CAPS, based on data
collected at the National Center for PTSD, are de-
scribed in two articles currently submitted for publica-
tion. Weathers et al. [1999a] examined the reliability and
validity of the CAPS-1/CAPS-DX in five samples of
male Vietnam veterans, including 267 veterans from four
different research projects and 571 veterans seen for
clinical services. To evaluate the test-retest reliability
(i.e., stability and rater equivalence) of the CAPS-1, 60
veterans were administered the CAPS twice, at a 2–3 day
interval, by independent clinicians. For the three symp-
tom clusters intraclass correlations ranged from .86 to
.87 for frequency, .86 to .92 for intensity, and .88 to .91
for severity. Across all 17 symptoms intraclass correla-
tions were .93 for total frequency, .95 for total intensity,
and .95 for total severity. Following the revision of the
CAPS for DSM-IV, the same design was implemented
for the CAPS-DX in a smaller sample of 24 veterans.
This study also yielded robust estimates of reliability,
with intraclass correlations of .91 for total frequency, .91
for total intensity, and .92 for total severity. Using the
optimal scoring rule, kappa, indicating test-retest reli-
ability for a CAPS-based PTSD diagnosis, was .89 in the
first sample and 1.00 in the second sample.
Examining internal consistency, Weathers et al.
[1999a], in a combined research sample of 243 veter-
ans, found alphas for the three symptom clusters rang-
ing from .78 to .87 for frequency, .82 to .88 for
intensity, and .82 to .88 for severity. Alphas for all 17
items were .93 for frequency, .94 for intensity, and .94
for severity. In the clinical sample, alphas for the three
symptom clusters ranged from .64 to .73 for fre-
quency, .66 to .76 for intensity, and .69 to .78 for se-
verity. Alphas for all 17 items were .85 for frequency,
.86 for intensity, and .87 for severity. The lower al-
phas in the clinical sample were likely due in part to a
restricted range in CAPS scores, since most veterans
referred for clinical services at the National Center re-
port moderate to severe PTSD symptoms; they may
also be due to a much larger and more diverse pool of
clinicians, relative to the small number of well-cali-
brated clinicians who administered the CAPS to the
research samples. Nonetheless, these scores provide
excellent evidence supporting the CAPS as used in a
clinical setting.
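For readers less familiar with coefficient alpha, the following sketch computes it from a respondents-by-items matrix using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function name and the example data are hypothetical and do not come from any of the samples described above.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha.

    item_scores: one row per respondent, each row a list of k item scores.
    """
    k = len(item_scores[0])
    columns = list(zip(*item_scores))               # one tuple per item
    item_variances = [variance(col) for col in columns]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical severity scores for four respondents on three items.
rows = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [6, 5, 6],
]
print(round(cronbach_alpha(rows), 2))
```

Because alpha depends on the variance of total scores, a sample with a narrow range of scores will tend to yield a lower alpha even when the items behave identically, which is consistent with the restricted-range explanation offered above.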
Weathers et al. [1999a] also reported validity evi-
dence for the CAPS, focusing primarily on convergent
and discriminant validity evidence and the diagnostic
utility of the CAPS against a PTSD diagnosis based
on the SCID. In the first research sample of 123 vet-
erans, the CAPS total severity score correlated .53
with the CES, .91 with the Mississippi Scale, .77 with
the PK scale, .89 with the number of PTSD symptoms
endorsed on the SCID, and .94 with the PTSD
Checklist [PCL; Weathers et al., 1993], a 17-item self-
report measure of PTSD. CAPS total severity corre-
lated somewhat less strongly, but still robustly, with
measures of depression (.61 to .75) and anxiety (.66 to
.76), findings that were expected given the substantial
overlap between PTSD, depression, and anxiety.
Much weaker correlations were observed between
CAPS total severity and measures of antisocial person-
ality (.14 to .33), a disorder conceptually distinct from
PTSD. In an effort to bring these convergent and dis-
criminant correlations into sharper relief, Weathers et
al. [1999a] then calculated partial correlations, con-
trolling first for nonspecific distress and symptom ex-
aggeration by using the F scale of the MMPI-2, then
for nonspecific distress again using the Global Sever-
ity Index (GSI) of the Symptom Checklist-90-Revised
[SCL-90-R; Derogatis, 1983]. After controlling for
the F scale, the CAPS demonstrated strong partial
correlations with measures of PTSD, including the
Mississippi Scale (.83), the PCL (.89), and the number
of PTSD symptoms on the SCID (.82). As predicted,
however, partial correlations between the CAPS and
measures of depression (.37 to .53) and anxiety (.37 to
.55) were markedly lower, and those between the
CAPS and measures of antisocial personality were es-
sentially zero (–.05 to .02). A similar, but even more
striking pattern was found after controlling for the
GSI. Further, these results involving the F scale were
generally replicated in a second research sample.
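The logic of controlling for a third variable can be illustrated with the standard first-order partial correlation formula. The sketch below is only an illustration with hypothetical input values; the function name is ours, and the exact computational procedure used by Weathers et al. [1999a] is not described in this section.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: x and y correlate .75, and each correlates .60
# with a measure of nonspecific distress (z).
print(round(partial_correlation(0.75, 0.60, 0.60), 3))  # 0.609
```

When both measures share variance with the control variable, the partial correlation falls below the zero-order correlation, and it falls furthest for measures whose association with the CAPS is carried mainly by nonspecific distress.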
Finally, again focusing on the sample of 123 partici-
pants, Weathers et al. [1999a] reported the diagnostic
utility of three CAPS scoring rules for predicting a
SCID-based PTSD diagnosis. The original, rationally
derived scoring rule (frequency ≥ 1, intensity ≥ 2, or
F1/I2) had 91% sensitivity, 71% specificity, and 82%
efficiency, with a kappa of .63. These figures reveal the
F1/I2 rule to be relatively lenient, with excellent sensi-
tivity but only moderate specificity, suggesting that it
tends to somewhat overdiagnose PTSD relative to the
SCID. The two other rules were empirically derived on
this sample. The second rule, which assigns a positive
diagnosis if the CAPS total severity score is 65 or
greater (TSEV65), had 82% sensitivity, 91% specificity,
and 86% efficiency, with a kappa of .72. Although the
higher kappa indicates a better correspondence with the
SCID than the F1/I2 rule has, the TSEV65 rule ap-
pears to be relatively stringent, tending to somewhat
underdiagnose PTSD relative to the SCID. The third
rule, derived by empirically calibrating each CAPS
symptom with the analogous SCID symptom (SX-
CAL), had the closest correspondence to the SCID,
with 91% sensitivity, 84% specificity, and 88% effi-
ciency, with a kappa of .75. Although these results re-
quire cross-validation, the SXCAL rule appears to be
the optimally efficient rule, and therefore the best
choice for differential diagnosis.
In the second article based on National Center data,
Nagy et al. [1999] described the only comprehensive
investigation of the CAPS that focused specifically on
the CAPS-2. To evaluate interrater reliability, Nagy et
al. administered the CAPS-2 to 30 (29 male, 1 female)
inpatients and outpatients in treatment for PTSD, all
but two of whom were combat veterans. Interviews
were videotaped and scored by three additional raters,
resulting in four ratings for each participant. Intraclass
correlations ranged from .76 to .99 for the 17 core
PTSD symptoms, and from .92 to .97 for the three
symptom clusters, with values of .98 for total frequency,
.96 for total intensity, and .98 for total severity. Internal
consistency and convergent and discriminant evidence
were examined in two additional samples of male com-
bat veterans: 20 veterans enrolled in a pharmacologic
trial and 37 veterans in an inpatient PTSD treatment
program. All participants were administered the CAPS and
the IES. In addition, the 20 participants in the drug
trial were administered the Hamilton scales for depres-
sion and anxiety (HAM-D and HAM-A), and the 37 in-
patients completed the BDI and the Beck Anxiety
Inventory [BAI; Beck et al., 1988].
In the combined sample alphas were .25 for re-experiencing,
.69 for avoidance and numbing, .70 for
hyperarousal, and .79 for all 17 items. In the com-
bined sample, the CAPS correlated .37 with the IES.
For the participants in the drug trial the CAPS corre-
lated .34 with the HAM-D and .36 with the HAM-A.
In the inpatient sample the CAPS correlated .67 with
the BDI and .51 with the BAI. Taken together, these
results are generally in line with results from studies
involving the CAPS-1. However, the alpha for the re-
experiencing cluster and the correlation of the CAPS
with the IES were lower than those found previously.
It is unclear whether these findings are sample-specific
and reflect some idiosyncrasies of the particular par-
ticipants or settings in the study or whether they are
attributable to some aspect of the CAPS-2.
Although not designed primarily as psychometric
investigations of the CAPS per se, other investigations
have nonetheless provided additional evidence of its
reliability and validity. Hovens et al. [1994] used the
CAPS as a criterion measure in the evaluation of a
new self-report measure of PTSD, the Self-Rating In-
ventory for Posttraumatic Stress Disorder (SIP). The
SIP consists of 51 items, 22 assessing DSM-III-R
PTSD symptoms and 29 measuring other trauma-re-
lated sequelae, particularly those associated with the
proposed diagnostic category of disorders of extreme
stress not otherwise specified (DESNOS). This study
included two samples: the same 76 participants used in
their previous study on the CAPS plus 59 (22 male
and 37 female) psychiatric outpatients. Although the
psychiatric outpatients were not selected on the basis
of a known trauma history, 18 of them reported expo-
sure to various types of civilian trauma, including
sexual and physical assault, traumatic loss of a loved
one, and motor vehicle accidents. Combining all par-
ticipants with a trauma history across the two samples,
Hovens et al. [1994] found that the CAPS correlated
.73 with total SIP score, .75 with the DSM-III-R
items on the SIP, .70 with the civilian version of the
Mississippi Scale, .72 with the PK scale, and .61 with
the IES. Correlations for the DSM-III-R symptom
clusters between the CAPS and the SIP were .54 for
re-experiencing, .69 for avoidance and numbing, and
.71 for hyperarousal.
Two studies by Neal and colleagues also provide
convergent validity evidence. First, Neal et al. [1994]
assessed 70 (59 male and 11 female) military personnel
with mixed military and civilian trauma exposure, pre-
sumably similar to, or overlapping with, the sample
they evaluated for their study on the computerized
CAPS described earlier. They examined the correla-
tions of two CAPS variables, total intensity and num-
ber of symptoms endorsed, with the PK scale, the IES,
and the GSI of the SCL-90. Although no rationale
was offered for why they used intensity rather than se-
verity scores, presumably this was because of the high
degree of correlation between frequency and intensity
scores they found in their previous study on the com-
puterized CAPS. Similar patterns of correlations were
found for both CAPS variables. Total CAPS intensity
correlated .85 with the PK scale, .78 with the IES, and
.77 with the SCL, whereas the number of CAPS
symptoms correlated .84 with the PK scale, .81 with
the IES, and .74 with the SCL-90. The strong corre-
lations with the PK scale and the IES offer convergent
evidence, but the nearly as strong correlations with
the SCL-90 failed to provide strong discriminant evi-
dence. However, given the high rates of comorbidity
found in PTSD and given that the SCL-90 primarily
reflects nonspecific distress, the SCL-90 is not an op-
timal measure for discriminant evidence. Second, Neal
et al. [1995] administered the CAPS, the IES, the PK
scale, and the Mississippi scale to 30 (29 male and 1
female) World War II prisoners of war. In this study
the CAPS correlated .63 with the IES, .71 with the
PK scale, and .81 with the Mississippi Scale, again
providing convergent evidence for the CAPS as a
measure of PTSD.
Two studies by Blanchard and colleagues provide evi-
dence regarding interrater reliability and convergent
validity. Blanchard et al. [1995b] employed the CAPS as
the primary diagnostic measure of PTSD in a study of
male and female motor vehicle accident victims. All
CAPS interviews were audiotaped and an independent
rater re-scored 15 randomly selected interviews. Inter-
rater reliability for individual items ranged from .82 to
.99, with a mean of .98, and kappa for a PTSD diagno-
sis was .81. Blanchard et al. [1996a,b] also used the
CAPS as the criterion measure in a psychometric evalu-
ation of the PCL. Participants were 27 (3 male and 24
female) motor vehicle accident victims and 13 female
sexual assault victims. Interrater reliability for the
CAPS, based on 19 audiotaped and independently re-
scored interviews, was again quite strong. Coefficients
for individual items ranged from .84 to .99,
with a mean of .94, and kappa for a
PTSD diagnosis was .84. Correlations between the
PCL and the CAPS supplied convergent evidence.
Correlations between PCL items and corresponding
CAPS items ranged from .39 to .79, with all but three
correlations above .60 and seven correlations above
.70. In addition, the correlation between the total
scores on the PCL and the CAPS was .93.
Finally, two studies utilized the CAPS in the valida-
tion of the Davidson Trauma Scale [DTS; Davidson et
al., 1997], a 17-item self-report measure of DSM-IV
PTSD symptoms. Like the CAPS, the DTS assesses
PTSD symptoms on two dimensions: frequency, which
corresponds to the frequency dimension on the CAPS,
and severity, which corresponds to the intensity dimen-
sion of the CAPS. The DTS assesses symptoms over
the previous week. To obtain convergent evidence for
the DTS, Zlotnick et al. [1996] administered the DTS
and the CAPS to 50 female sexual abuse survivors.
They found correlations of .72 between DTS total fre-
quency and CAPS total frequency and .57 between
DTS total severity and CAPS total intensity. For total
DTS and total CAPS scores for each of the three symp-
tom clusters, they found correlations of .70 for re-ex-
periencing, .53 for avoidance, and .73 for hyperarousal.
As part of a comprehensive psychometric investigation
of the DTS, Davidson et al. [1997] administered the
DTS and the CAPS to a mixed sample of 102 female
sexual assault victims and male combat veterans, find-
ing a correlation of .78 between total scores on the
DTS and CAPS.
CAPS factor structure. The final two issues we
will discuss address the factor structure of the CAPS
and the development and evaluation of various scoring
rules for deriving a CAPS-based PTSD diagnosis.
Two studies have examined the factor structure of the
CAPS using confirmatory factor analysis. Buckley et
al. [1998] tested a single hypothesized factor structure
consisting of two factors: a) Intrusion and Avoidance
and b) Hyperarousal and Numbing. Although these
factors cut across the three DSM-III-R and DSM-IV
symptom clusters of PTSD, there is theoretical and
empirical justification for this two-factor structure. In
fact, Buckley et al. [1998] sought to replicate a previ-
ous study by Taylor et al. [1998], in which this struc-
ture was derived in an exploratory factor analysis.
Analyzing CAPS scores from a combined sample of
217 male and female motor vehicle accident victims,
Buckley et al. [1998] found support for the hypoth-
esized two-factor structure across several indices of
model fit.
In a more comprehensive analysis, King et al. [1998]
conducted a confirmatory factor analysis of CAPS
scores in 524 male combat veterans seen for clinical ser-
vices at the National Center for PTSD in Boston. In
this study King et al. [1998] tested four competing
models, three of which involved dividing Criterion C
(numbing and avoidance) into two distinct factors of
effortful avoidance (criteria C1 and C2) and emotional
numbing (criteria C3–C7). The first model was a four-
factor, first-order solution consisting of four correlated
primary factors: re-experiencing, effortful avoidance,
emotional numbing, and hyperarousal. The second
model, which was similar to the one Buckley et al.
[1998] evaluated, was a two-factor, higher-order solu-
tion, with one factor comprising re-experiencing and
effortful avoidance and the other comprising emotional
numbing and hyperarousal. The third model was a
single-factor, higher-order solution that hypothesized a
single PTSD factor comprising the four symptom clus-
ters. The fourth model was a single-factor, first-order
solution that hypothesized that all 17 symptoms load on
a single PTSD factor. King et al. [1998] found that the
first model provided the best fit to the data, suggesting
that PTSD, as assessed by the CAPS, consists of four
correlated but distinct symptom clusters. This finding
supports the CAPS as a measure of PTSD in that the
internal structure of the CAPS corresponds to the
DSM PTSD symptom clusters, albeit with the addi-
tional, conceptually meaningful distinction between
effortful avoidance and emotional numbing.
CAPS scoring rules. Finally, one of the recent de-
velopments in the CAPS has been the explication and
evaluation of various rules for converting continuous
CAPS scores into a dichotomous PTSD diagnosis.
From the outset it was recognized that the original,
rationally derived F1/I2 rule described earlier was
only an initial working rule that might be replaced by
others once sufficient empirical evidence had accumu-
lated. Over time, a number of new rules have been
proposed and have recently appeared in the literature.
Blanchard et al. [1995a] were the first to compare the
impact of adopting different scoring rules. In an inves-
tigation of 100 (35 male and 65 female) motor vehicle
accident victims they proposed and evaluated three
different scoring rules, all of which involved convert-
ing CAPS frequency and intensity scores into a di-
chotomous score for each symptom, then following
the DSM requirements (one re-experiencing symp-
tom, three numbing and avoidance symptoms, and
two hyperarousal symptoms) to derive a PTSD diag-
nosis. According to the Rule of 2, a symptom is consid-
ered present if the severity score for an item (frequency
+ intensity) is ≥ 2 (i.e., frequency and intensity are both
≥ 1). Similarly, the Rule of 3 requires an item severity
score ≥ 3 (either frequency or intensity is ≥ 2 and the
other is ≥ 1). This is similar to but more inclusive than
the original F1/I2 rule. Last, the Rule of 4 requires an
item severity ≥ 4. Blanchard et al. found that the three
rules yielded markedly different PTSD prevalence esti-
mates, with 44% for the Rule of 2, 39% for the Rule of
3, and 27% for the Rule of 4. Furthermore, they found
that participants who met the Rule of 4 had higher
scores on measures of depression and anxiety, and
greater functional impairment, relative to those who
only met the Rule of 3.
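The three rules and the DSM symptom-count requirement described above can be sketched as follows. All names are hypothetical. We assume the DSM-IV ordering of the 17 core items (five re-experiencing, seven avoidance and numbing, five hyperarousal), and we assume that the Rule of 4, like the Rule of 2 and the Rule of 3, requires both frequency and intensity to be at least 1; the text states this floor explicitly only for the first two rules.

```python
# Assumed DSM-IV layout of the 17 core items: B1-B5, C1-C7, D1-D5.
# Each cluster maps to (item indices, minimum number of symptoms required).
CLUSTERS = {
    "re-experiencing": (range(0, 5), 1),
    "avoidance and numbing": (range(5, 12), 3),
    "hyperarousal": (range(12, 17), 2),
}


def symptom_present(frequency: int, intensity: int, rule_n: int) -> bool:
    """Rule of N: item severity (frequency + intensity) is at least N,
    with both dimensions at least 1 (an assumption for the Rule of 4)."""
    return frequency >= 1 and intensity >= 1 and frequency + intensity >= rule_n


def ptsd_diagnosis(ratings, rule_n: int) -> bool:
    """ratings: 17 (frequency, intensity) pairs in the assumed item order."""
    present = [symptom_present(f, i, rule_n) for f, i in ratings]
    return all(
        sum(present[idx] for idx in items) >= required
        for items, required in CLUSTERS.values()
    )


# Hypothetical participant with six symptoms rated above zero.
ratings = [(0, 0)] * 17
ratings[0] = (2, 1)              # one re-experiencing item, severity 3
for idx in (5, 6, 7):
    ratings[idx] = (1, 2)        # three avoidance/numbing items, severity 3
for idx in (12, 13):
    ratings[idx] = (2, 2)        # two hyperarousal items, severity 4

for n in (2, 3, 4):
    print(f"Rule of {n}:", ptsd_diagnosis(ratings, n))
```

This hypothetical participant meets criteria under the Rule of 2 and the Rule of 3 but not under the Rule of 4, showing how the same ratings can yield different diagnoses and, across a sample, different prevalence estimates.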
More recently, Weathers et al. [1999b] described
and compared nine scoring rules, drawing on data
from the same five samples in the Weathers et al.
[1999a] psychometric article described earlier. Four of
the nine rules were rationally derived, including the
original F1/I2 rule, the Item Severity ≥ 4 (ISEV4)
rule, which is identical to Blanchard’s Rule of 4, and
two rules based on clinicians’ judgments regarding
which frequency/intensity combinations constitute a
symptom. The other five rules were empirically de-
rived, including four rules calibrated in various ways
against the SCID PTSD module, and one rule identi-
fied by Orr [1997], based on a study of physiological
reactivity in female incest survivors. Kappa coeffi-
cients indicating test-retest reliability for the rules
ranged from .72 to .90 in an initial sample of 60 veter-
ans, and from .68 to 1.00 in a follow-up sample of 24
veterans. Kappa coefficients for predicting a PTSD di-
agnosis based on the SCID ranged from .63 to .75. As
in the Blanchard et al. [1995a] study, the nine rules
yielded widely varying prevalence estimates, ranging
from 26% to 49% in a combined research sample of
243 veterans and 47% to 82% in a clinical sample of
571 veterans. The F1/I2 rule was the most lenient in
the clinical sample and second most lenient in the re-
search sample. The two rules based on clinicians’ rat-
ings were the most stringent in both samples. Also,
compared to participants who met criteria only by the
F1/I2 rule, those who met criteria for the most stringent
rule had significantly higher scores on
self-report measures of PTSD, depression, anxiety,
and nonspecific distress.
A third study, by Fleming and Difede [1999], exam-
ined the impact of adopting different scoring rules on
the CAPS-2 in a sample of hospitalized burn patients.
Although they recognized that the CAPS-2 was not
suitable for a diagnosis of PTSD because of the one-
week time frame, they deliberately chose it for their
study because they were interested in acute PTSD
symptoms within the first 2 weeks after the trauma.
Administering the CAPS-2 to 69 (48 male and 21 fe-
male) participants, they compared the effects of adopt-
ing essentially the same scoring rules described by
Blanchard et al. [1995a]. The one exception was that
Fleming and Difede appear to have used the F1/I2
rule rather than the more inclusive Rule of 3 of
Blanchard et al. [1995a]. Compared to the previous
two studies, they found less variability among the dif-
ferent rules in terms of estimated prevalence of
PTSD. The Rule of 3 and the Rule of 4 both yielded a
prevalence of 25%, while the Rule of 2 yielded a
prevalence of 32%. Furthermore, they found no sig-
nificant differences on the IES or self-report measures
of acute stress and nonspecific distress between par-
ticipants who met criteria only by the Rule of 2 and
those who met criteria by the Rule of 3 or the Rule of
4. However, differences were found on all self-report
measures between all participants who met criteria for
PTSD by at least the Rule of 2 and those who did not
meet criteria for PTSD by any of the rules.
Taken together, these three studies of scoring rules
for the CAPS indicate that there are important conse-
quences to adopting a particular rule. Prevalence esti-
mates can vary considerably and participants who
meet criteria by lenient rules may be less symptomatic
and less impaired relative to those who meet criteria
by more stringent rules. Weathers et al. [1999b] dis-
cuss three implications of these findings. First, investi-
gators should always explicitly describe and defend
their choice of a CAPS scoring rule. Second, for many
applications, an efficient and informative strategy
would be to use several scoring rules, ranging from le-
nient to stringent, and compare the different results
obtained. Third, when using different scoring rules is
not feasible, investigators should select scoring rules
that are best suited for the purpose of the study. Le-
nient scoring rules are most appropriate for screening,
when a lower threshold for diagnosis is needed to
avoid false negatives. Stringent rules are most appro-
priate for confirming a diagnosis or creating an unam-
biguous PTSD group for case-control research, when
a higher threshold is needed to avoid false positives.
Moderate rules are most appropriate for differential
diagnosis, when false negatives and false positives are
weighted equally and the goal is to minimize the over-
all number of diagnostic errors.
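The strategy of applying several scoring rules and comparing the results can be sketched in a few lines of Python. The sketch is illustrative only. It assumes that each of the 17 DSM-IV symptoms carries a frequency and an intensity rating on a 0–4 scale, that a symptom counts as present under F1/I2 when frequency is at least 1 and intensity is at least 2, and that the Rule of 3 and the Rule of 4 require a frequency-plus-intensity sum of at least 3 or 4, respectively. These rule definitions, the function names, and the four-person sample are assumptions made for illustration; they are not data from any study reviewed here.

```python
from typing import Callable, Dict, List, Tuple

Rating = Tuple[int, int]  # (frequency, intensity), each rated 0-4

# Assumed symptom-level rules (illustrative definitions, see the lead-in).
RULES: Dict[str, Callable[[int, int], bool]] = {
    "Rule of 3": lambda f, i: f + i >= 3,
    "F1/I2": lambda f, i: f >= 1 and i >= 2,
    "Rule of 4": lambda f, i: f + i >= 4,
}

# Assumed DSM-IV layout: (first item, last item + 1, symptoms required) for the
# reexperiencing, avoidance/numbing, and hyperarousal clusters.
CLUSTERS = [(0, 5, 1), (5, 12, 3), (12, 17, 2)]


def meets_criteria(ratings: List[Rating], rule: Callable[[int, int], bool]) -> bool:
    """Return True if enough symptoms are present in every cluster under the rule."""
    present = [rule(f, i) for f, i in ratings]
    return all(sum(present[lo:hi]) >= need for lo, hi, need in CLUSTERS)


def prevalence_by_rule(sample: List[List[Rating]]) -> Dict[str, float]:
    """Proportion of the sample meeting criteria under each scoring rule."""
    return {
        name: sum(meets_criteria(person, rule) for person in sample) / len(sample)
        for name, rule in RULES.items()
    }


# Hypothetical participants, each given the same rating on all 17 items.
sample = [[(2, 2)] * 17, [(2, 1)] * 17, [(1, 2)] * 17, [(0, 0)] * 17]
prevalence = prevalence_by_rule(sample)
print(prevalence)
```

In this toy sample the participant rated (2, 1) on every item meets criteria only under the Rule of 3, which shows how the choice of rule alone can move a prevalence estimate.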
Finally, we note that it is possible that some of the
diagnostic utility data cited for the CAPS in this sec-
tion, even though it is consistently high, might actu-
ally have been stronger had different scoring rules
been applied. In discussing the articles in this section,
we assumed that unless stated otherwise investigators
used the original F1/I2 rule to derive a PTSD diagno-
sis from CAPS scores. However, the Weathers et al.
[1999b] article, in particular, demonstrated that the
F1/I2 rule is a relatively liberal rule and may not be
optimal for differential diagnosis.
Discussion. Considering all the accumulated evi-
dence, the CAPS appears to have excellent psycho-
metric properties across a wide variety of clinical
research settings and trauma populations. Interrater
reliability for continuous CAPS scores was consis-
tently at the .90 level and above, with diagnostic
agreement at times reaching 100%. Test-retest reli-
ability, a more stringent measure of agreement, was
nearly as strong, although it was only evaluated in one
study and needs replication. These findings suggest
that trained and calibrated raters can achieve a high
degree of consistency in using the CAPS to diagnose
PTSD and rate PTSD symptom severity. In addition,
internal consistency was generally high, with alphas
typically in the .80 to .90 range for the three PTSD
symptom clusters and for the entire syndrome.
Although somewhat more variable and therefore
more difficult to easily summarize, evidence of validity
was also strong. Regarding convergent evidence, the
CAPS generally demonstrated correlations at the .70
level and above with self-report measures of PTSD
such as the Mississippi Scale, the PK scale, the IES,
the PCL, and the DTS, often reaching the .80 to .90
range. Diagnostic utility of the CAPS was evaluated in
five studies, and with one exception in which the crite-
rion was a clinical diagnosis based on an unstructured
interview, was quite robust, with sensitivities and
specificities above .80, and often above .90, and kappas
above .70. To date, however, psychometric studies of
the CAPS offer little in terms of discriminant evi-
dence. More data on this are needed. Because indi-
viduals with PTSD, especially chronic PTSD, often
have comorbid disorders and experience high levels of
distress, it may prove to be difficult to obtain un-
equivocal discriminant evidence, particularly with
measures of depression and anxiety, since these two
constructs overlap conceptually with PTSD. Weathers
et al. [1999a] tried to address this problem by includ-
ing measures of a construct conceptually unrelated to
PTSD (antisocial personality) and by partialing out
the effects of nonspecific distress. These two ap-
proaches appeared to be successful in providing dis-
criminant evidence, but more creative research on this
issue is needed.
We close this section with a brief discussion of some
fundamental questions regarding the psychometric in-
vestigation of the CAPS. First, regarding convergent
and discriminant evidence, there are no absolute standards
for what constitutes “good” evidence. How large
should convergent validity coefficients be? How small
should discriminant validity coefficients be? How
large a difference should there be between convergent
and discriminant coefficients? Reasonable answers to
these questions must be informed by a well-articulated
theoretical model and ultimately based on expert
judgment.
Second, is it appropriate to evaluate a putative “gold
standard” such as the CAPS against self-report mea-
sures? When a correlation between the CAPS and an-
other measure is lower than expected it is unclear if the
“problem” lies with the CAPS or with the alternative
measure, or a combination of both. This question is
particularly important with respect to self-report mea-
sures of PTSD, which are subject to misinterpretation
and to response biases such as social desirability, exag-
geration, minimization, and even random responding.
In addition, they vary significantly in format, including
their correspondence with DSM criteria for PTSD, the
dimension of symptom severity they emphasize (e.g.,
subjective distress, functional impairment, and fre-
quency), and the time frame they assess (past week and
past month). Finally, they vary in the quality of their
psychometric properties. Any of these characteristics,
alone or in combination with characteristics of different
samples, could affect their correlation with the CAPS.
In general, in PTSD research, as in other areas of psy-
chopathology, the diagnostic standard is a clinical inter-
view because interviewers can clarify as needed, ask for
examples, observe clinically relevant behaviors, and
evaluate potential response bias. Most importantly, with
an interview it is ultimately the clinician who makes the
final rating, not the participant.
This, then, raises a third question. What measure
should serve as the criterion for evaluating the diagnos-
tic utility of the CAPS? Part of the problem is that
there is no other single measure that has been widely
accepted as a criterion measure of PTSD. The SCID
PTSD module comes the closest, but there is evidence
that suggests that it may not be as reliable as the CAPS,
which sets an upper limit on how well the CAPS can
perform in predicting it. In fact, as Weathers et al.
[1999b] have argued, the CAPS appears to be more
strongly associated with the SCID PTSD module than
the SCID PTSD module is with itself. Another possi-
bility might be to use a multiple converging measures
approach, such as was used in the National Vietnam
Veterans Readjustment Study [NVVRS; Kulka et al.,
1990] or the so-called LEAD standard approach pro-
posed by Spitzer and colleagues. Both approaches could
readily be applied to the CAPS and would provide valu-
able new information.
TREATMENT OUTCOME STUDIES
Design and analysis issues. In this section, we de-
scribe pharmacological and psychosocial treatment
outcome studies that employed the CAPS as a primary
outcome measure. Our main focus in this section is on
the ability of the CAPS to detect genuine changes in
PTSD symptom severity in the context of a clinical
intervention. A key question addressed in this section
is “What empirical results would constitute evidence
supporting the claim that the CAPS is in fact sensitive
to change?” We hypothesize four results that we
would expect to occur in a treatment outcome study if
this claim is true. First, we would expect to find a re-
duction in CAPS scores from pre-treatment to post-
treatment. This should be true for virtually any
intervention, for any of the following reasons:
1. Possible placebo effects.
2. Possible statistical regression (i.e., participants
selected on the basis of extreme scores tend to show
less extreme scores on subsequent testing).
3. The fact that repeated assessment, particularly in-
terview-based assessment, may be considered an in-
tervention in and of itself since it includes many
putative active ingredients of psychotherapy, in-
cluding a) a safe, professional interpersonal context;
b) therapeutic exposure and emotional and cogni-
tive processing through disclosure of painful aspects
of the trauma and trauma-related symptoms; and c)
education about PTSD symptoms and self-moni-
toring.
Second, if a study includes one or more comparison
groups, there should be greater improvement in the
group or groups that receive a more potent treatment
or a treatment with more putative active ingredients
of therapy. Third, changes on the CAPS should paral-
lel changes in other measures of PTSD. Finally, if the
active therapy ingredient targets PTSD specifically,
then the CAPS should show greater reduction relative
to measures of other constructs such as depression,
anxiety, and global distress and impairment.
In reviewing these studies, we focused only on data
related specifically to the CAPS. It was not our intent
to address the effectiveness of pharmacological or psy-
chosocial treatments for PTSD per se or to rigorously
critique the research methodology of the various stud-
ies. Nonetheless, within this limited scope of our re-
view, we identified several issues with regard to the
reporting of CAPS data that required several decisions
about how to extract and summarize CAPS-related re-
sults and present them in a standard format. First,
studies varied considerably in terms of the outcome
measures they included and how the data were re-
ported and analyzed, differing on a) which CAPS
scores were included (e.g., frequency, intensity, or se-
verity scores for individual items, for the three symp-
tom clusters, or for the syndrome as a whole); b)
which additional measures were included; c) how
scores were presented (e.g., means and totals); d) how
change was quantified (e.g., change scores, percent
change, statistical significance, effect size, and graphic
presentation only); and e) how complete the data
analyses were. In general, in response to this variabil-
ity, we tried to extract the results most relevant to the
CAPS and present them as uniformly as possible. For
the purposes of this review, we used percent change as
the primary metric for comparing results across stud-
ies and across instruments within the same study. This
is a commonly reported metric, particularly in the
pharmacology literature. It is easily calculated when
not provided, readily comprehensible, and applicable
for any type of study, from case studies to large ran-
domized trials. Where possible, we identified or calcu-
lated percent change for the primary outcome
variables in each of the studies. In addition, we in-
cluded the results of statistical significance tests of key
comparisons when they were provided.
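As a minimal sketch of the metric, percent change can be computed from pre-treatment and post-treatment scores as 100 × (pre − post) / pre. The function name and the example values below are hypothetical and serve only to show the arithmetic. The sign convention is also an assumption: a positive result denotes a reduction, whereas Tables 1 and 2 mark increases with a plus sign.

```python
def percent_reduction(pre: float, post: float) -> float:
    """Percent reduction from pre- to post-treatment.

    Positive values mean the score went down; negative values mean it went up.
    """
    if pre == 0:
        raise ValueError("pre-treatment score must be nonzero")
    return 100.0 * (pre - post) / pre


# Hypothetical group means, chosen only to show the arithmetic.
print(percent_reduction(80.0, 52.0))  # 35.0
print(percent_reduction(50.0, 59.0))  # -18.0
```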
Second, studies varied in terms of how many mea-
surement points they included. All studies included as-
sessments at pre-treatment and post-treatment, but
others included assessments at screening, extended
baseline, pre-treatment, post-treatment, additional in-
tervals during treatment, and one or more long-term
follow-ups. To simplify our presentation, whenever
possible, we examined only pre-post changes for all
studies. These data were available for almost all stud-
ies and were sufficient as evidence of the sensitivity of
the CAPS to clinical change. Also, in the studies that
presented additional follow-up data, pre-post changes
were generally sustained and sometimes continued to
improve, so little would have been gained by examin-
ing additional assessment periods.
Third, there was some ambiguity with regard to the
terms investigators used to describe their study de-
signs. Terms such as open trial, uncontrolled trial, and
open label do not adequately characterize the essential
aspects of the research designs they were used to de-
scribe nor were they used consistently across studies.
The questions we used as a guide in depicting the
various research designs were the following.
1. Is the treatment condition known to the partici-
pant?
2. Is the treatment condition known to the assessor?
3. Is there at least one comparison condition?
4. Is the comparison condition within-subjects, as in
a crossover design, or between-subjects, as in a
randomized controlled trial?
Answers to these questions were not always stated
explicitly, although the investigators may have in-
tended to imply them by the labels they used to de-
scribe their studies. In particular, unless otherwise
specified, we assumed that assessments were not
blinded. Fourth, studies often did not explicitly iden-
tify which version of the CAPS was used. This could
sometimes be inferred, but in general, unless there
was some specific indication that the CAPS-2/CAPS-
SX was used, we assumed that the CAPS-1/CAPS-DX
was used. Finally, the final sample size often differed
from the initial one due to attrition and inclusion/ex-
clusion criteria. We report the sample size on which
the final data analyses were based.
Pharmacological and psychosocial treatment
studies. In this section, we review 10 pharmacological
and 19 psychosocial treatment studies that used the
CAPS as a primary outcome measure. These studies
and their key findings relevant to the CAPS are pre-
sented in Tables 1 and 2. We consider the results with
respect to the four issues outlined above with regard to
evidence of sensitivity to clinical change, including
within-groups effects (pre-post change), between-
groups effects (differential change due to nature of in-
tervention, e.g., drug versus placebo), change on the
CAPS relative to change on other measures of PTSD,
and change on the CAPS relative to measures of other
constructs (e.g., anxiety, depression, and global distress
and functional impairment). Whenever possible we
present the percent change values for each measure de-
scribed in Tables 1 and 2. However, some studies only
reported the results of significance tests and did not in-
clude actual values for one or more key measures. The
studies in Tables 1 and 2 are arranged chronologically
and numbered within each table. For ease of presenta-
tion in the following sections, we refer to studies by
number rather than by author(s) and year.
Within-groups effects. Among the pharmacological
studies, there was a significant reduction in CAPS to-
tal score in eight of the nine studies that reported in-
ferential statistics (Table 1, all but Study 5 reported
significance levels; all of those but Study 8 were sig-
nificant). Considering only participants who received
a drug, for the nine studies that reported actual CAPS
score values (all but Study 2), the reduction in CAPS
total score ranged from 10–63%, with a median of
33%. The psychosocial studies yielded similar find-
ings, with evidence of even greater improvement.
There was a significant reduction in CAPS total score
in 10 of the 13 studies that reported inferential statis-
tics (Table 2, Studies 1–4, 6, 8, 10–13, 15, 16, and 19
reported significance levels; all of those but Studies 1,
6, and 8 were significant). Considering the partici-
pants who received an active intervention and showed
the most improvement, for the studies that reported
actual CAPS score values (all but Studies 1, 8, and 16),
the reduction in CAPS total score ranged from 19–
100%, with a median of 50%.
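The range and median for the pharmacological studies can be reproduced from the CAPS total score reductions listed in Table 1. The short sketch below uses only those values (drug group where two groups are reported); the variable names are illustrative.

```python
from statistics import median

# Percent reduction in CAPS total score for drug-treated participants, keyed by
# study number in Table 1. Study 2 is omitted because it did not report values.
caps_reduction = {1: 34, 3: 48, 4: 33, 5: 15, 6: 50, 7: 63, 8: 10, 9: 32, 10: 18}

values = sorted(caps_reduction.values())
summary = (min(values), max(values), median(values))
print(summary)  # (10, 63, 33)
```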
Between-groups effects. Overall, there were relatively
few controlled trials. Of the ten pharmacological stud-
ies, only three were randomized, placebo-controlled
trials (Table 1, Studies 2–4). Two of these (Studies 2 and
3) found significantly greater reduction in CAPS scores
for the drug group relative to the placebo group. The
third study (Study 4) found slightly greater improve-
ment for drug versus placebo, although the effect was
not significant. Similarly, of the 19 psychosocial studies,
only 6 were randomized, controlled trials (Table 2,
Studies 1, 4, 12, 13, 16, and 19), although an additional
3 studies (2 crossover designs and 1 program evalua-
tion) included a comparison condition (Studies 8, 10,
and 17). Only two of the six randomized, controlled trials
(Studies 12 and 16) found a significant between-groups
effect, with significantly greater reduction in
CAPS scores for a more active, trauma-focused inter-
vention than for a control condition. Two of the other
four studies (Studies 4 and 13) found greater improve-
ment for active interventions relative to control condi-
tions, but the effects were not significant. Of the
remaining two studies, Study 19 included two active
interventions, which showed substantial, equivalent
improvement but no minimal intervention control con-
dition; Study 1 employed a very brief intervention and
found no within-groups or between-groups changes on
any measures. Finally, in Study 10, a quasi-experimental
program evaluation, between-groups differences were
found among three types of PTSD inpatient programs.
CAPS versus other PTSD measures. In general,
CAPS results matched the results for self-report PTSD
measures, particularly the IES. Among the pharmaco-
logical studies, the CAPS had comparable results to the
IES in three studies (Table 1, Studies 1, 4, and 6) and
to the DTS in two other studies (Studies 5 and 9), with
differences ranging from 0–7 percentage points. For
the psychosocial studies differences between the CAPS
and other PTSD measures were more variable and
somewhat larger. Eight studies (Table 2, Studies 2, 4, 5,
11, 12, 15, 16, and 19) found a greater reduction on the
CAPS relative to the IES, with differences ranging
from 1–24 percentage points. On the other hand, four
studies (Studies 3, 9, 14, and 18) found a greater reduc-
tion on the IES, with differences ranging from 7–26
percentage points. In addition, the CAPS showed a
comparable or greater reduction relative to the PK
scale (Study 1), the Mississippi Scale (Study 5), the Ci-
vilian Mississippi Scale (Study 15), the PSS (Study 9),
the MPSS-SR (Study 7), the PCL (Study 11), and the
Penn Inventory (Study 19).
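For the pharmacological studies, the differences between the CAPS and the IES or DTS can be checked directly against the percent reductions listed in Table 1. The sketch below is illustrative; it uses the drug-group CAPS value for Study 4, and the variable names are hypothetical.

```python
# (CAPS % reduction, comparison-measure % reduction) as listed in Table 1.
pairs = {
    "Study 1 (IES)": (34, 39),
    "Study 4 (IES)": (33, 26),
    "Study 6 (IES)": (50, 49),
    "Study 5 (DTS)": (15, 15),
    "Study 9 (DTS)": (32, 28),
}

# Absolute difference in percentage points between the two measures.
gaps = {study: abs(caps - other) for study, (caps, other) in pairs.items()}
print(min(gaps.values()), max(gaps.values()))  # 0 7
```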
CAPS versus measures of depression. All but two of
the pharmacological studies included a measure of de-
pression, primarily the HAM-D and MADRS. Four
studies (Table 1, Studies 1 and 8–10) found greater reduction
on the HAM-D relative to the CAPS, with
differences ranging from 2–13 percentage points. A
fifth study (Study 2) found significant within-groups
and between-groups effects for both the CAPS and
the HAM-D but did not report actual rating scale val-
ues. However, two studies (Studies 6 and 7) found
greater reduction on the CAPS relative to the MADRS,
with differences of 9 and 10 percentage points, respec-
tively. One study (Study 5) found slightly greater reduc-
tion on the CAPS relative to the BDI. The BDI was
also included in 12 of 19 psychosocial studies, with ten
(Table 2, Studies 2, 6, 7, 11–13, 15, 16, 18, and 19) find-
ing greater reduction on the CAPS and two (Studies 5
and 9) finding equivalent reduction on the two scales.
Except for a case study (Study 7), which found a 48%
reduction on the CAPS and an 18% increase on the
BDI, the greater reduction on the CAPS ranged from
2–18 percentage points.
CAPS versus measures of anxiety. Four pharmaco-
logical studies included the HAM-A. Three (Table 1,
TABLE 1. Summary of CAPS findings from pharmacological treatment studies of posttraumatic stress disorder*

1. Nagy et al. [1993]
   Participants: Male combat veterans (N=19)
   Design: 1. Non-blinded, uncontrolled; 2. CAPS-2
   Drug: Fluoxetine
   Duration: 10 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS-2 total score (34%)
      2. Comparable reduction on IES (39%)
      3. Somewhat larger reduction on HAM-D (47%) and HAM-A (41%)
      4. With response defined as 50% reduction in CAPS total score, a 2-point improvement on CAPS global severity rating, and consensus of two clinicians, 7 participants (37%) had good responses, 5 (26%) had partial response, and 7 (37%) did not respond

2. van der Kolk et al. [1994]
   Participants: 1. Civilian with mixed trauma (N=23, 12 male/23 female); 2. Combat veterans and civilians with mixed trauma (N=24, 23 male/1 female)
   Design: Double-blind, randomized, placebo-controlled
   Drug: Fluoxetine
   Duration: 5 weeks
   Key CAPS-related findings:
      1. Significantly greater reduction in total CAPS score for drug relative to placebo, after adjusting for initial CAPS score and site
      2. Greater reduction in CAPS total score for civilian sample relative to veteran sample
      3. Significant reduction in numbing and hyperarousal symptoms but not reexperiencing or avoidance
      4. Significantly greater reduction in HAM-D score for drug relative to placebo

3. Katz et al. [1994/1995]
   Participants: Combat veterans and civilians with mixed trauma (N=45, 34 male/11 female)
   Design: Double-blind, randomized, placebo-controlled, multi-center
   Drug: Brofaromine
   Duration: 14 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score for both groups (drug=48%, placebo=29%), with significant between-groups difference
      2. 55% of drug group and 26% of placebo group no longer met diagnostic criteria for PTSD
      3. On CGI, drug group had significantly greater mean improvement and more participants rated as very much improved

4. Baker et al. [1995]
   Participants: Combat veterans and civilians with mixed trauma (N=114, 92 male/22 female)
   Design: Double-blind, randomized, placebo-controlled, multi-center
   Drug: Brofaromine
   Duration: 10 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score for both groups (drug=33%, placebo=31%), but no between-groups difference
      2. Comparable results for IES, with somewhat smaller reduction in IES total score in both groups (26%) and no between-groups differences
      3. No between-groups difference on DTS or Physician’s Global Evaluation (within-groups analyses not presented)

5. Hertzberg et al. [1996]
   Participants: Male combat veterans (N=6)
   Design: Multiple baseline, open label but assessment blind
   Drug: Trazodone
   Duration: 4 months
   Key CAPS-related findings:
      1. Reduction in CAPS total score (15%)
      2. Comparable reduction on DTS (15%)
      3. Somewhat smaller reduction on BDI (10%), little change on STAI-S (+1%)
      4. Four of 6 participants rated as much improved on CGI, 2 rated as minimally improved

6. Neal et al. [1997]
   Participants: Military personnel and civilians with mixed trauma (N=20, 18 male/2 female)
   Design: 1. Non-blinded, uncontrolled; 2. Computerized CAPS, intensity scores only
   Drug: Moclobemide
   Duration: 12 weeks
   Key CAPS-related findings:
      1. Significant reduction in computerized CAPS total score (50%)
      2. Comparable reduction on IES (49%)
      3. Somewhat smaller reduction on MADRS (41%), HAM-A (44%), and CIS (39%)
      4. Computerized CAPS change score correlated .76 with IES change score, but only .31 with MADRS and .32 with HAM-A change scores

7. Bouwer and Stein [1998]
   Participants: Male torture victims (N=14)
   Design: Routine clinical care, non-blinded, uncontrolled
   Drug: Sertraline (n=9), Imipramine (n=2), Fluoxetine (n=2), Clomipramine (n=1)
   Duration: 8 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score (63%)
      2. Somewhat smaller reduction on MADRS (53%)
      3. 12 of 14 participants rated as very much or much improved on CGI

8. Cañive et al. [1998]
   Participants: Male combat veterans (N=14)
   Design: Routine clinical care, non-blinded, uncontrolled
   Drug: Bupropion
   Duration: 6 weeks
   Key CAPS-related findings:
      1. Trend for reduction in CAPS total score (10%), significant reduction (16%) in CAPS hyperarousal score, but not in reexperiencing (+1%) or avoidance/numbing (9%) scores
      2. Ten of 14 participants rated as very much or much improved on CGI
      3. Significant reduction on HAM-D (26%) but not HAM-A (12%)

9. Hertzberg et al. [1998]
   Participants: Male combat veterans (N=10)
   Design: Non-blinded, uncontrolled
   Drug: Nefazodone
   Duration: 12 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score (32%)
      2. Significant, somewhat smaller reduction on DTS (28%)
      3. Significant reduction on HAM-D (34%) but not BDI (7%)
      4. Ten of 10 participants rated as much improved or very much improved on CGI

10. Clark et al. [1999]
   Participants: Male combat veterans (N=13)
   Design: Open label but assessment (except CGI) blind, uncontrolled
   Drug: Divalproex
   Duration: 8 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS total (18%), reexperiencing (21%), and hyperarousal (29%) scores, nonsignificant reduction in avoidance/numbing score (7%)
      2. Significant, somewhat larger reduction on HAM-D (31%) and HAM-A (27%)
      3. Eleven of 13 participants rated as much improved or very much improved on CGI

*BDI, Beck Depression Inventory; CGI, Clinical Global Impressions; CIS, Clinician Impression of Severity; DTS, Davidson Trauma Scale; HAM-A, Hamilton Rating Scale for Anxiety; HAM-D, Hamilton Rating Scale for Depression; IES, Impact of Event Scale; MADRS, Montgomery-Asberg Depression Rating Scale; STAI, State-Trait Anxiety Inventory.
TABLE 2. Summary of CAPS findings from psychosocial treatment studies of posttraumatic stress disorder*

1. Boudewyns et al. [1993]
   Participants: Male combat veterans (N=20)
   Design: Randomized, controlled trial, assessments not blinded
   Intervention: 1. EMD; 2. exposure control; 3. routine clinical care (group therapy without exposure)
   Number of sessions/duration: Two 90-minute EMD or exposure sessions in 2 weeks
   Key CAPS-related findings:
      1. No significant reduction in any CAPS symptom or symptom cluster scores
      2. No significant reduction on Mississippi Scale or IES
      3. No significant reduction in psychophysiological responding

2. Busuttil et al. [1995]
   Participants: Military personnel, veterans and civilians with mixed trauma (N=34, 28 male/6 female)
   Design: Uncontrolled, assessments not blinded
   Intervention: Inpatient group therapy
   Number of sessions/duration: 12 days
   Key CAPS-related findings:
      1. Significant reduction in CAPS total intensity (54%), global improvement (59%), and global severity (55%) scores
      2. Significant, somewhat smaller reduction in IES (42%) and PK (48%)
      3. Significant, somewhat smaller reduction on SCL-90 (43%), and BDI (39%)
      4. 26 of 34 (76%) participants no longer met PTSD diagnostic criteria

3. Thompson et al. [1995]
   Participants: Civilians with mixed trauma (N=23, 17 male/6 female)
   Design: Uncontrolled, assessments not blinded
   Intervention: Multicomponent cognitive-behavioral protocol (imaginal and in vivo exposure, cognitive restructuring)
   Number of sessions/duration: 8 weekly sessions
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score (35%)
      2. Significant, somewhat larger reduction on IES (42%)
      3. Comparable reduction on SCL-90 (38%), larger reduction on GHQ (61%)

4. Boudewyns and Hyer [1996]
   Participants: Male combat veterans (N=61)
   Design: Randomized, controlled trial, assessments blinded
   Intervention: 1. EMDR; 2. exposure control; 3. routine clinical care (group therapy without exposure)
   Number of sessions/duration: 5-7 EMDR or exposure sessions in 6 weeks
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score for all three groups (EMDR=33%, exposure=21%, routine care=17%), but no significant between-groups differences
      2. No significant reduction on IES
      3. Significant between-groups differences on POMS anxiety scale and heart rate reactivity, with EMDR and exposure group showing reduction in scores and no-exposure control group showing slight increase

5. Carlson et al. [1996]
   Participants: Male combat veterans (N=4)
   Design: Single-subject replication series
   Intervention: EMDR
   Number of sessions/duration: 12 sessions, 2 sessions per week
   Key CAPS-related findings:
      1. At 3-month followup, reduction in CAPS total score across 4 participants ranged from 34–100%, with 3 of 4 showing > 80% improvement
      2. Comparable reduction on IES (34–88%), but smaller reduction on Mississippi Scale (6–46%)
      3. More variable outcome on BDI and STAI-S and STAI-T (1 participant showing slight increase on these scales, other 3 showing reduction of 50–100% on BDI and 8–41% on STAI)

6. Frueh et al. [1996]
   Participants: Male combat veterans (N=11)
   Design: Uncontrolled, assessments not blinded
   Intervention: Multicomponent cognitive-behavioral protocol (education, imaginal and in vivo exposure, social skills training, anger management)
   Number of sessions/duration: 29 sessions in 17 weeks
   Key CAPS-related findings:
      1. Trend for reduction in CAPS total score (21%)
      2. Significant reduction on HAM-A (31%), CGI (34%) and heart rate reactivity (14%)
      3. No significant reduction on BDI, SPAI, or STAXI

7. Hall and Henderson [1996]
   Participants: Female sexual abuse victim (N=1)
   Design: 1. Case study; 2. CAPS-2
   Intervention: Cognitive processing therapy
   Number of sessions/duration: 17 weekly sessions
   Key CAPS-related findings:
      1. Reduction in CAPS-2 total scores (48%)
      2. Smaller reduction on MPSS-SR (31%)
      3. Somewhat smaller reduction on SCL-90 (26%), and slight increase on BDI (+18%)
8. Pitman et al. [1996]
   Participants: Male combat veterans (N=17)
   Design: Crossover, assessments blinded
   Intervention: EMDR, with and without eye movement
   Number of sessions/duration: 12 weekly sessions (6 in each condition)
   Key CAPS-related findings:
      1. Little change in CAPS total score, with slight increase after eye movement condition and slight decrease after no eye movement condition
      2. Comparable result for Mississippi Scale, with slight increase after both conditions, and mixed results for IES, with significant reductions for intrusion or avoidance subscale depending on condition and trauma memory evaluated
      3. Significant reduction on SCL-90 in eye movement condition
      4. Therapy integrity ratings significantly correlated with CAPS change score in both conditions (.55, .62), but with SCL-90 in eye movement condition only (.69)

9. Thrasher et al. [1996]
   Participants: Male physical assault victims (N=2)
   Design: Single-subject replication series
   Intervention: Cognitive restructuring
   Number of sessions/duration: 10 sessions
   Key CAPS-related findings:
      1. Substantial reduction in CAPS total score for both participants (67–90%)
      2. Comparable reduction in IES (76–91%) and PSS (79–80%)
      3. Comparable reduction on BDI (65–92%)

10. Fontana and Rosenheck [1997]
   Participants: Male combat veterans (N=785)
   Design: Quasi-experimental program evaluation
   Intervention: 1. Long-stay PTSD program; 2. Short-stay PTSD program; 3. General psychiatric unit
   Number of sessions/duration: Variable (approximately 1–3 months)
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score for all three programs (long-stay=13%, short-stay=19%, psychiatric=16%)
      2. Significant between-groups effect, with veterans in short-stay PTSD programs and general psychiatric inpatient units showing greater improvement
      3. No significant reduction on Mississippi Scale (long-stay=0%, short-stay=3%, psychiatric=3%)
      4. Significant, larger reduction on ASI psychiatric score (long-stay=24%, short-stay=26%, psychiatric=26%) and significant, smaller reduction on BSI (long-stay=2%, short-stay=11%, psychiatric=12%), both with significant between-groups effects similar to those for the CAPS

11. Hickling and Blanchard [1997]
   Participants: Motor vehicle accident victims (N=10, 1 male/9 female)
   Design: Uncontrolled trial, nonblinded assessments
   Intervention: Multicomponent cognitive-behavioral protocol (education, relaxation, exposure, cognitive restructuring)
   Number of sessions/duration: 10 weekly sessions
   Key CAPS-related findings:
      1. Significant reduction in CAPS total score (68%)
      2. Comparable reduction on IES (66%), significant but smaller reduction on PCL (39%)
      3. Significant, somewhat smaller reduction on BDI (50%) and significant, smaller reduction on STAI-S (19%) and STAI-T (20%)
      4. Five of 8 participants with full PTSD and 1 of 2 with subsyndromal PTSD at pre-test no longer met diagnosis at post-test; 3 of 8 with full PTSD at pre-test were subsyndromal at post-test

12. Carlson et al. [1998]
   Participants: Male combat veterans (N=35)
   Design: Randomized, controlled trial, non-blinded assessments except at 9-month follow-up
   Intervention: 1. EMDR; 2. Biofeedback-assisted relaxation; 3. Routine clinical care
   Number of sessions/duration: 12 sessions in 6 weeks
   Key CAPS-related findings:
      1. Significant Group x Time interaction at 3-month followup, with EMDR group showing significantly greater reduction on CAPS total score (69%) compared to relaxation group (20%)
      2. Similar pattern with smaller reduction on IES (EMDR=45%, relaxation=14%)
      3. Similar pattern with smaller reduction on BDI (EMDR=57%, relaxation=22%), and substantially smaller reduction in both groups on STAI-S (EMDR=14%, relaxation=18%) and STAI-T (EMDR=22%, relaxation=11%)
      4. Of participants completing first follow-up, 7 of 9 (78%) in EMDR group versus 2 of 9 (22%) in relaxation group no longer met PTSD diagnostic criteria
TABLE 2. (Continued).
Number
of sessions/
Authors (year) Participants Design Intervention duration Key CAPS-related findings
13. Conlon et al. [1998]
Participants: Motor vehicle accident victims, 1 week post-accident (N=40, 19 male/21 female)
Design: Randomized, controlled trial, non-blinded assessments
Intervention: 1. Debriefing; 2. Monitoring (assessment-only control)
Number of sessions/duration: Single 30-minute debriefing session
Key CAPS-related findings:
1. Significant reduction in CAPS total score for total sample (53%; debriefing=70%, monitoring=36%), but no significant between-groups difference at follow-up
2. Interpretation of CAPS change scores is somewhat ambiguous because CAPS-2 used at baseline and CAPS-1 used at follow-up
3. Comparable reduction on IES (total sample=50%; debriefing=55%, monitoring=44%)
14. Lazrove et al. [1998]
Participants: Civilians with mixed trauma (N=8, 2 male/6 female)
Design: Uncontrolled trial, assessments conducted by non-treating research assistant
Intervention: EMDR
Number of sessions/duration: 3 weekly sessions
Key CAPS-related findings:
1. Substantial reduction in CAPS total score (70%)
2. Larger reduction on IES-R (87–96% for intrusion, avoidance, hyperarousal subscales)
3. Comparable reduction on BDI (68%), smaller reduction on SCL-90 (42%)
4. All of the participants who completed treatment no longer met diagnostic criteria for PTSD
15. Lubin et al. [1998]
Participants: Female victims of mixed civilian trauma
Design: Uncontrolled trial, assessments conducted by non-treating research assistants
Intervention: Trauma-focused, cognitive behavioral group therapy
Number of sessions/duration: 16 weekly sessions
Key CAPS-related findings:
1. Significant reduction in CAPS total score (39%)
2. Significant, smaller reduction on Civilian Mississippi Scale (9%) and IES (16%)
3. Significant, somewhat smaller reduction on BDI (33%) and smaller reduction on DES (21%) and SCL-90 (23%)
16. Marks et al. [1998]
Participants: Civilians with mixed trauma (N=87, 56 male/31 female)
Design: 1. Randomized, controlled trial, blinded assessments; 2. CAPS-2
Intervention: 1. Imaginal and in vivo exposure; 2. Cognitive restructuring; 3. Exposure plus cognitive restructuring; 4. Relaxation (placebo control)
Number of sessions/duration: 10 sessions in an average of 16 weeks
Key CAPS-related findings:
1. Significant reduction in CAPS-2 total score, with effect sizes ranging from 1.30 to 2.00 for three active intervention groups and .60 for relaxation group
2. Significant between-groups effect for CAPS-2 total score, with greater reduction for three active intervention groups pooled versus relaxation group
3. Similar within-groups and between-groups results for IES (within-groups effect sizes from 1.30 to 1.50 for active intervention groups, .08 for relaxation group)
4. Similar within-groups and between-groups results for BDI (within-groups effect sizes from 1.20 to 1.70 for active intervention groups, .07 for relaxation group)
5. With improvement defined as > 2 SDs, 47–53% of participants in active intervention groups showed improvement on CAPS-2 total score, versus 15% in relaxation group. Somewhat higher rates found on IES (50–60% for active intervention groups, 20% for relaxation group)
6. 63–75% of participants in active intervention groups versus 55% in relaxation group no longer met diagnostic criteria for PTSD
17. Pantalon and Motta [1998]
Participants: Male combat veterans (N=6)
Design: 1. Crossover, single-subject replication series, non-blinded assessments; 2. CAPS-2
Intervention: 1. Implosive therapy (imaginal exposure); 2. Anxiety management training
Number of sessions/duration: 12 weekly sessions
Key CAPS-related findings:
1. Reduction in CAPS-2 score (reexperiencing and avoidance only; hyperarousal scores not reported) ranged from 46–88% (M=71%) across the six participants
2. Lower but substantial reduction on PCL (8–100%, M=50% across the six participants; reexperiencing and avoidance only; hyperarousal scores not reported)
18. Rothbaum et al. [1999]
Participants: Male combat veterans (N=1)
Design: Case study
Intervention: Virtual reality exposure
Number of sessions/duration: 14 sessions, 2 sessions per week
Key CAPS-related findings:
1. Reduction in CAPS total score (34%)
2. Larger reduction on IES (45%)
3. Smaller reduction on BDI (19%) and STAXI-T (21%), substantially larger reduction on STAXI-S (63%)
19. Tarrier et al. [1999]
Participants: Civilians with mixed trauma (N=62, 36 male/26 female)
Design: Randomized, controlled trial, assessments blinded
Intervention: 1. Imaginal exposure; 2. Cognitive therapy
Number of sessions/duration: Average of 10–12 sessions over 6 months
Key CAPS-related findings:
1. Significant within-groups reduction in CAPS total score for both groups (32% for exposure group, 35% for cognitive therapy group), but no between-groups difference
2. Comparable within-groups reduction on IES (31–33% for intrusion, 25–34% for avoidance), somewhat smaller reduction on Penn Inventory (22–27%); no between-groups difference on either
3. Somewhat smaller within-groups reduction on BDI (27–31%) and BAI (23–25%); no between-groups difference
4. Comparable to somewhat larger within-groups reduction on GHQ (30–46%), but no between-groups difference
5. 59% of exposure group versus 42% of cognitive therapy group no longer met diagnostic criteria for PTSD
*ASI, Addiction Severity Index; BDI, Beck Depression Inventory; BSI, Brief Symptom Inventory; DES, Dissociative Experiences Scale; EMDR, Eye Movement Desensitization and Reprocessing; GHQ, General Health
Questionnaire; HAM-A, Hamilton Rating Scale for Anxiety; IES, Impact of Event Scale; MPSS-SR, Modified PTSD Symptom Scale, Self-Report; PCL, PTSD Checklist; PK, Keane MMPI PTSD scale; POMS,
Profile of Mood States; PSS, PTSD Symptom Scale; SCL-90, Symptom Checklist-90; SPAI, Social Phobia and Anxiety Inventory; STAI, State-Trait Anxiety Inventory; STAXI, State-Trait Anger Expression Inventory.
Studies 1, 8, and 10) found greater reduction on the
HAM-A relative to the CAPS, with differences rang-
ing from 2–9 percentage points. The fourth study
found a 5 percentage point greater reduction on the
CAPS. One pharmacological study (Study 5) found a
greater reduction on the CAPS relative to the STAI-S.
Three psychosocial studies (Table 2, Studies 5, 11, and
12) included the STAI-S, two included the STAI-T
(Studies 11 and 12), and one (Study 19) included the
BAI. In each case the CAPS showed greater reduction,
ranging from 2–59 percentage points.
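The comparisons above are stated as percent reductions and as differences between them in percentage points. A minimal sketch of that arithmetic follows, assuming percent reduction is computed as (pre − post)/pre × 100. The function names and the pre/post scores are hypothetical and are not taken from any study in Tables 1 or 2.

```python
def percent_reduction(pre: float, post: float) -> float:
    """Percent reduction from a pre-treatment to a post-treatment score."""
    if pre <= 0:
        raise ValueError("pre-treatment score must be positive")
    return (pre - post) * 100.0 / pre


def percentage_point_difference(reduction_a: float, reduction_b: float) -> float:
    """Difference between two percent reductions, in percentage points."""
    return reduction_a - reduction_b


# Hypothetical scores, for illustration only.
caps_reduction = percent_reduction(pre=80, post=40)   # 50.0
other_reduction = percent_reduction(pre=50, post=40)  # 20.0
print(percentage_point_difference(caps_reduction, other_reduction))  # 30.0
```

Note that a percentage-point difference compares two measures with different score ranges only in relative terms; it says nothing about whether either change is statistically significant.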
CAPS versus global measures of distress and impairment.
In the pharmacological studies, reductions in CAPS
scores were accompanied by improvement on global measures
of functioning, including the CGI in the six studies that
employed it (Table 1, Studies 3, 5, 7, and 8–10), the CIS
(Study 6), and a relatively stringent consensus definition
of treatment response (Study 1). Five psychosocial stud-
ies (Table 2, Studies 2, 3, 7, 13, and 15) included the
SCL-90 and one (Study 10) included the BSI. In each
case the CAPS showed greater reduction, ranging from
3–28 percentage points. In contrast, two studies (Studies
3 and 19) found greater reduction on the GHQ relative
to the CAPS, and one study (Study 10) found greater
reduction on the ASI psychiatric score.
Discussion. The 29 treatment outcome studies re-
viewed in this section provide ample evidence of the
sensitivity of the CAPS to clinical change. We summa-
rize the results by returning to the four hypothesized
results discussed at the outset of this section. First,
there was clear and consistent evidence of within-
groups effects in both the pharmacological and the psy-
chosocial treatment studies. Stronger within-groups
effects were found in the psychosocial studies. This
could be due to the fact that with one exception the
drugs used in the studies reviewed were all antidepres-
sants, and their efficacy for treating PTSD has not been
clearly established. The symptom relief they bring
about may be due more to their antidepressant effects
than to specific effects on PTSD symptoms such
as re-experiencing and effortful avoidance. In contrast,
all of the psychosocial interventions involved some type
of trauma-specific component, and most included some
form of direct therapeutic exposure or cognitive pro-
cessing, which have been shown to have specific effects
on PTSD symptoms. This finding could also be due to
the fact that in general the psychosocial interventions
involved considerably more patient-therapist contact
than did the pharmacological trials.
Second, there was some evidence of between-groups
effects, although relatively few studies included a com-
parison condition. Two of the three pharmacological
trials with a placebo control found greater reduction on
the CAPS in participants who received the drug. Re-
sults were more inconsistent for the psychosocial trials.
Only two of the six randomized trials, plus one quasi-
experimental program evaluation, found a significant
between-groups effect. However, two of the non-sig-
nificant trials employed quite limited interventions, and
a third trial compared two active interventions, expo-
sure and cognitive restructuring. Clearly, more ran-
domized, placebo-controlled trials are needed before
this issue can be resolved.
Third, reduction in CAPS scores was mirrored by
reduction in self-report measures of PTSD, particu-
larly the IES. The CAPS showed a slightly greater
reduction than the IES in 2 of 3 pharmacological
studies and 7 of 11 psychosocial studies, although
the margins, especially in the pharmacological tri-
als, were generally small. Fourth, there was some
evidence of greater reduction on the CAPS than on
measures of depression, anxiety, and global distress,
particularly on self-report measures such as the
BDI, STAI, and SCL-90.
Finally, in commenting about the populations
studied, although the CAPS was developed in a male
combat veteran population, and many of the early
studies focused exclusively on this population, the
CAPS has now been extended to increasingly diverse
samples that include females and victims of various
types of civilian trauma. Of the studies reviewed in
this section, 11 of 29 included at least some females
and 15 of 29 included at least some participants with
civilian trauma.
VALIDITY EVIDENCE FROM
CASE-CONTROL DESIGNS
In this section, we consider validity evidence from
studies in which participants were designated as
PTSD-positive (“cases”) or PTSD-negative (“controls”)
based on the CAPS and then compared on some
biological or psychological measure or
experimental task. Such case-control studies were too
numerous and diverse to summarize briefly. Instead,
we describe several representative examples from dif-
ferent research domains to illustrate that groups
formed on the basis of a CAPS diagnosis differ in con-
ceptually meaningful ways on a variety of characteris-
tics or behaviors.
The first example involves the psychophysiology
of PTSD. Physiological reactivity to reminders of
the trauma is a core symptom of PTSD, and a
growing number of studies have found that indi-
viduals with PTSD show greater reactivity than
those without PTSD in laboratory-based physi-
ological assessments. Much of the early work was
conducted with male combat veterans, but more re-
cent studies have examined male and female victims
of civilian trauma. Blanchard et al. [1996a] used the
CAPS to classify 105 male and female motor vehicle
accident victims as PTSD, subsyndromal PTSD,
and non-PTSD. They also included a control group
of 54 participants who had not experienced an acci-
dent. They found that compared to participants
without PTSD, those with PTSD showed a signifi-
cantly greater increase in heart rate in response to
brief audiotapes depicting each participant’s unique
traumatic experience. They also found that an in-
crease of two beats per minute had reasonable diag-
nostic utility, yielding 69% sensitivity and 78%
specificity among accident victims.
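Sensitivity and specificity of a cutoff such as the heart rate increase described above can be computed from a two-by-two table of test result against CAPS diagnosis. The following is a minimal sketch; the function name and the counts are hypothetical and do not reproduce the data of Blanchard et al. [1996a].

```python
def sensitivity_specificity(true_pos: int, false_neg: int,
                            true_neg: int, false_pos: int) -> tuple:
    """Return (sensitivity, specificity) for a dichotomous test.

    Sensitivity: proportion of CAPS-positive cases the test flags.
    Specificity: proportion of CAPS-negative cases the test clears.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(true_pos=30, false_neg=10,
                                     true_neg=40, false_pos=10)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
# sensitivity=75%, specificity=80%
```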
The second example comes from a more recent line
of research on auditory event-related potentials (ERP)
in PTSD. Several different investigators have docu-
mented abnormal ERPs in individuals with PTSD and
have suggested that such characteristic responses may
be associated with the attention and concentration dif-
ficulties often seen in PTSD. Metzger et al. [1997]
used the CAPS to classify male Vietnam combat veterans
into PTSD and non-PTSD groups and then further
divided the PTSD participants into medicated and
unmedicated groups. Administering a three-tone audi-
tory “oddball” task, they found significantly smaller
P3 amplitudes in the unmedicated PTSD group, rela-
tive to the medicated PTSD group and the non-
PTSD controls.
The third example involves research on the associa-
tion of chronic PTSD and physical health problems.
Beckham et al. [1998] used the CAPS to classify 276
male Vietnam combat veterans as PTSD or non-
PTSD, then assessed participants’ current health status
and reviewed their medical records. Health measures
included health complaints, current and lifetime physi-
cal conditions, number of physician-rated medical cat-
egories, and total number of physician-rated illnesses.
After controlling for a variety of potentially confound-
ing third variables, including age, socioeconomic status,
ethnicity, combat exposure, alcohol problems, and
smoking history, they found that veterans with PTSD
had significantly more health problems across all indi-
cators compared to veterans without PTSD.
The last example represents an effort to identify
potential risk factors for PTSD. Yehuda et al. [1995]
used the CAPS to classify a community sample of
72 Nazi concentration camp survivors as PTSD or
non-PTSD. They also included a comparison group
of 19 demographically matched participants who
had not experienced the Holocaust. The purpose of
the study was to examine the relationships among
lifetime trauma history, recent stressful life events,
and severity of current PTSD symptoms. As ex-
pected, Yehuda et al. [1995] found that Holocaust
survivors with PTSD had greater lifetime trauma
exposure and more recent stressful life events than
did survivors without PTSD or comparison partici-
pants. By using the CAPS as a continuous measure
of PTSD symptom severity, they found that lifetime
trauma was significantly associated with avoidance
and hyperarousal, but not with re-experiencing,
within a combined sample of all Holocaust survi-
vors. In a similar analysis, they found that recent
stressful life events were significantly associated
with all three CAPS symptom clusters.
These examples, and the other case-control studies
that we did not discuss, provide additional evidence that
the CAPS is a valid measure of PTSD diagnostic status
and symptom severity. They demonstrate that when the
CAPS is used to classify trauma-exposed individuals as
PTSD or non-PTSD, the resulting groups differ sig-
nificantly in a theoretically consistent way on key de-
pendent variables.
GENERAL DISCUSSION AND
RECOMMENDATIONS
In the 10 years since it was developed, the CAPS
has proven to be a psychometrically sound, practical,
and flexible structured interview that is well-suited for
a wide range of clinical and research applications in
the field of traumatic stress. Moreover, it has been
successfully used with many different traumatized
populations. It has excellent reliability, yielding con-
sistent scores across items, raters, and testing occa-
sions. There is also considerable validity evidence that
supports the use of the CAPS as a measure of PTSD
diagnostic status and symptom severity. Evidence of
content validity derives first from its direct correspon-
dence with the DSM-IV diagnostic criteria for PTSD
and second from the fact that it was developed by ex-
perts in the field of traumatic stress and revised based
on feedback from many clinicians and investigators
who used it in real-world settings. Evidence from a
growing number of psychometric investigations indicates
that it has strong convergent and discriminant validity,
strong diagnostic utility, and sensitivity to clinical
change. In addition, factor analyses, especially confirmatory
factor analyses, have shown that the factor
structure of the CAPS corresponds well to current
conceptualizations of PTSD. Finally, when the CAPS
is used in case-control designs, individuals designated
as PTSD differ from those without PTSD in predict-
able, theoretically meaningful ways. Clearly more re-
search on the CAPS is needed, but at this point the
CAPS is the most extensively investigated structured
interview for PTSD.
Criticism of the CAPS tends to focus on three con-
cerns. The first concern is that the CAPS is cumber-
some and lengthy. In response, the CAPS clearly is
longer on paper than other PTSD interviews, but it
does not necessarily take longer to administer. Most of
the CAPS questions are optional probes, only some of
which would likely be administered during a given in-
terview. A standard administration of the CAPS in-
volves asking the initial probe under frequency for
each item. With an articulate, motivated respondent
this single question may elicit all the information
necessary to rate both the frequency and intensity of a
given symptom. All other probes are to be used only
if: a) a response is incomplete, vague, confusing, or in
some way insufficient to make a rating, and therefore
needs to be clarified, or b) the respondent does not
understand what is being asked.
In our experience, even with ideal respondents,
some degree of clarification is inevitable. To enhance
uniformity of administration, we have included a
number of follow-up probes that address the most
common points of clarification. This reduces variabil-
ity due to idiosyncratic questioning across different in-
terviewers and provides a helpful structure for less
experienced interviewers. Furthermore, the CAPS was
designed as a comprehensive yet flexible instrument
that would meet the demand of almost any PTSD as-
sessment task, including diagnosis, evaluating symp-
tom severity, and conducting a functional analysis of
symptoms for case conceptualization and treatment
planning. Therefore, in some assessment contexts interviewers
may opt not to assess Criterion A, elicit descriptive
examples of symptoms, administer the global ratings
or the guilt and dissociation items, or rate lifetime
PTSD.
The second concern, closely related to the first, is
that the CAPS is too complicated and difficult to
learn. In response, our own experience, based on doz-
ens of training sessions, is that after a 2-hour orienta-
tion trainees naïve to the CAPS can make highly
reliable ratings of a role-played interview. With some
self-study and a few practice interviews, they can
achieve a uniform, clinically sensitive administration.
CAPS trainees, including those with little or no expe-
rience with structured interviews or assessing PTSD,
typically find that the CAPS is very straightforward to
learn. In fact, less experienced interviewers tend to
have the most favorable responses because they appre-
ciate the structure the CAPS provides.
The third concern centers on the question of whether
frequency and intensity ratings overlap to such an extent
as to be essentially redundant. Clearly, they appear to be
strongly correlated at the syndrome level and even at the
symptom cluster level. At the item level, however, the
correlations between frequency and intensity are moder-
ate, suggesting that they measure correlated but distinct
dimensions. We have several responses to this concern.
First, the separate assessment of frequency and intensity
explicitly defines what is meant by symptom severity,
thereby reducing variability in clinical judgment, espe-
cially among less experienced interviewers. Second, this
is a meaningful, theoretical distinction, employed suc-
cessfully for example in the substance abuse literature,
where typologies of drinkers are based on how often a
person drinks, as well as how much they consume at any
given sitting. Third, adding frequency and intensity together
yields a nine-point scale (0–8) that allows finer
gradations of severity. This increases variance attribut-
able to individual differences, thereby avoiding a restric-
tion of range that could lower estimates of reliability and
validity. Fourth, it allows the assessment of the differen-
tial impact of treatment on the frequency versus the in-
tensity of symptoms.
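The item-level severity score described above can be sketched as follows. The 0–4 range for each rating is inferred from the stated 0–8 range of the sum; the function name and the example ratings are hypothetical.

```python
def item_severity(frequency: int, intensity: int) -> int:
    """Severity of one CAPS item: frequency plus intensity (0-8)."""
    for name, value in (("frequency", frequency), ("intensity", intensity)):
        if not 0 <= value <= 4:
            raise ValueError(f"{name} must be between 0 and 4, got {value}")
    return frequency + intensity


# Two hypothetical symptom profiles with the same severity score.
frequent_but_mild = item_severity(frequency=4, intensity=1)
rare_but_intense = item_severity(frequency=1, intensity=4)
print(frequent_but_mild, rare_but_intense)  # 5 5
```

The two profiles receive the same severity score but differ in their components, which is why retaining the separate ratings permits the analysis of differential treatment effects on frequency versus intensity.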
Last, we close with some recommendations for the
use of the CAPS in clinical research and the presenta-
tion of CAPS data in empirical reports. First, for
newly initiated research, investigators should use what
is now the sole version of the CAPS, the combined
DSM-IV version, and explicitly identify it as such. For
research already underway or completed, investigators
should explicitly identify the version used, either the
CAPS-1 or CAPS-2 (DSM-III-R versions) or the
CAPS-DX or CAPS-SX (DSM-IV versions). Also, if
the CAPS is used as a diagnostic measure, investiga-
tors should specify the scoring rule used to obtain a
diagnosis. Second, investigators should briefly specify
the experience and training of CAPS interviewers,
both in terms of their general background in psycho-
pathology and structured interviewing, and in terms of
their specific experience with the CAPS. Also, when-
ever possible they should attempt to collect and report
reliability data on the interviewers and participants in-
volved. Even something as modest as inter-rater reli-
ability on a small number of audiotaped interviews is
helpful for documenting the quality of the CAPS data.
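One common index of inter-rater reliability for a dichotomous diagnosis is Cohen's kappa, which corrects observed agreement for agreement expected by chance. The following is a minimal sketch; the choice of kappa, the function name, and the ratings are our assumptions for illustration and are not prescribed by the CAPS.

```python
def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters' categorical ratings of the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal proportions.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    if expected == 1.0:
        return 1.0  # both raters used a single, identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses (1 = PTSD, 0 = non-PTSD) for ten audiotaped interviews.
interviewer = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
second_rater = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohens_kappa(interviewer, second_rater)
print(round(kappa, 2))  # 0.6
```

With so few interviews the estimate is imprecise, so it is best reported together with the number of interviews rated.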
Third, investigators should take greater advantage
of the flexibility of the CAPS in analyzing their data.
Some examples include a) using multiple CAPS scor-
ing rules and comparing the results for lenient, mod-
erate, and stringent rules; b) using the CAPS as both a
dichotomous and a continuous measure, reporting not
only diagnostic status but symptom severity scores,
which would be valuable for comparing findings
across studies; c) breaking out CAPS symptom sever-
ity scores into the three DSM-IV symptom clusters
and examining the results by cluster; d) examining the
symptom clusters further by separating Cluster C into
effortful avoidance (C1 and C2) and emotional numb-
ing; and e) dividing scores even further into frequency,
intensity, and severity scores for each of the symptom
clusters. Finally, although considerable progress has
been made in the development and evaluation of
PTSD assessment measures, including the CAPS, reli-
ance on a single instrument should be avoided. We ad-
vocate multimodal assessment of PTSD, an approach
that relies on converging evidence from multiple
sources, and we encourage investigators to include
multiple measures of PTSD and comorbid disorders
whenever possible.
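Several of the analyses recommended above (comparing scoring rules, reporting both diagnostic status and severity, and breaking scores out by cluster) can be sketched in a few lines. The sketch rests on assumptions not spelled out in this section: a 17-item DSM-IV layout (items 1–5 for Cluster B, 6–12 for Cluster C, 13–17 for Cluster D), symptom counts of one, three, and two for the respective clusters, and a rule that counts a symptom as present when frequency and intensity each reach a threshold. The threshold pairs labeled "lenient" and "stringent" are placeholders, not published CAPS scoring rules, and Criterion A and the duration and impairment criteria are ignored.

```python
from typing import Dict, Iterable, Tuple

Ratings = Dict[int, Tuple[int, int]]  # item number -> (frequency, intensity)

# Assumed DSM-IV item layout and required symptom counts per cluster.
CLUSTER_ITEMS = {"B": range(1, 6), "C": range(6, 13), "D": range(13, 18)}
REQUIRED_SYMPTOMS = {"B": 1, "C": 3, "D": 2}
# Cluster C split: effortful avoidance (C1, C2) versus emotional numbing.
AVOIDANCE_ITEMS = range(6, 8)
NUMBING_ITEMS = range(8, 13)


def frequency(ratings: Ratings, items: Iterable[int]) -> int:
    """Sum of frequency ratings over the given items."""
    return sum(ratings[i][0] for i in items)


def intensity(ratings: Ratings, items: Iterable[int]) -> int:
    """Sum of intensity ratings over the given items."""
    return sum(ratings[i][1] for i in items)


def severity(ratings: Ratings, items: Iterable[int]) -> int:
    """Sum of frequency plus intensity over the given items."""
    return sum(ratings[i][0] + ratings[i][1] for i in items)


def meets_symptom_criteria(ratings: Ratings, min_freq: int, min_inten: int) -> bool:
    """Dichotomous result for the B, C, and D symptom criteria only."""
    for cluster, items in CLUSTER_ITEMS.items():
        present = sum(1 for i in items
                      if ratings[i][0] >= min_freq and ratings[i][1] >= min_inten)
        if present < REQUIRED_SYMPTOMS[cluster]:
            return False
    return True


# Hypothetical respondent: every item rated frequency 2, intensity 2.
ratings = {item: (2, 2) for item in range(1, 18)}

# Placeholder rules: (minimum frequency, minimum intensity).
rules = {"lenient": (1, 2), "stringent": (2, 3)}
for name, (min_freq, min_inten) in rules.items():
    print(name, meets_symptom_criteria(ratings, min_freq, min_inten))

for label, items in (("B", CLUSTER_ITEMS["B"]), ("C avoidance", AVOIDANCE_ITEMS),
                     ("C numbing", NUMBING_ITEMS), ("D", CLUSTER_ITEMS["D"])):
    print(label, frequency(ratings, items), intensity(ratings, items),
          severity(ratings, items))
```

For this respondent the lenient placeholder rule yields a positive result and the stringent one a negative result, which illustrates why the scoring rule should be reported alongside any CAPS diagnosis.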
Acknowledgments. The authors wish to thank
Alethea Smith for her assistance with the literature
search and manuscript preparation.
REFERENCES
American Psychological Association. 1999. Standards for Educa-
tional and Psychological Testing. Washington DC: Author.
Baker DG, Diamond BI, Gillette GM, Hamner MB, Katzelnick D,
Keller TW, Mellman TA, Pontius EB, Rosenthal M, Tucker P,
Van der Kolk BA, Katz RJ. 1995. A double-blind, randomized,
placebo-controlled, multi-center study of brofaromine in the
treatment of post-traumatic stress disorder. Psychopharmacol-
ogy 122:386–389.
Beck AT, Epstein N, Brown G, Steer RA. 1988. An inventory for
measuring clinical anxiety: psychometric properties. J Consult
Clin Psychol 56:893–897.
Beckham JC, Moore SD, Feldman ME, Hertzberg MA, Kirby AC,
Fairbank JA. 1998. Health status, somatization, and severity of
posttraumatic stress disorder in Vietnam combat veterans with
posttraumatic stress disorder. Am J Psychiatry 155:1565–1569.
Blake DD, Weathers FW, Nagy LM, Kaloupek DG, Klauminzer
G, Charney DS, Keane TM. 1990. A clinician rating scale for
assessing current and lifetime PTSD: the CAPS-1. Behav Ther
13:187–188.
Blake DD, Weathers FW, Nagy LM, Kaloupek DG, Gusman FD,
Charney DS, Keane TM. 1995. The development of a Clinician-
Administered PTSD Scale. J Trauma Stress 8:75–90.
Blanchard EB, Hickling EJ, Taylor AE, Forneris CA, Loos WR,
Jaccard J. 1995a. Effects of varying scoring rules of the Clini-
cian-Administered PTSD Scale (CAPS) for the diagnosis of
post-traumatic stress disorder in motor vehicle accident victims.
Behav Res Ther 33:471–475.
Blanchard EB, Hickling EJ, Taylor AE, Loos WR. 1995b. Psychiat-
ric morbidity associated with motor vehicle accidents. J Nerv
Ment Dis 183:495–504.
Blanchard EB, Hickling EJ, Buckley TC, Taylor AE, Vollmer A,
Loos WR. 1996a. Psychophysiology of posttraumatic stress dis-
order related to motor vehicle accidents: Replication and exten-
sion. J Consult Clin Psychol 64:742–751.
Blanchard EB, Jones-Alexander J, Buckley TC, Forneris CA.
1996b. Psychometric properties of the PTSD Checklist (PCL).
Behav Res Ther 34:669–673.
Boudewyns PA, Hyer LA. 1996. Eye movement desensitization
and reprocessing (EMDR) as treatment for post-traumatic
stress disorder (PTSD). Clin Psychol Psychother 3:185–195.
Boudewyns PA, Stwertka SA, Hyer LA, Albrecht JW,
Sperr EV. 1993. Eye movement desensitization for PTSD of
combat: a treatment outcome pilot study. Behav Ther 16:29–33.
Bouwer C, Stein DJ. 1998. Survivors of torture presenting at an
anxiety disorders clinic: symptomatology and pharmacotherapy. J
Nerv Ment Dis 186:316–318.
Buckley TC, Blanchard EB, Hickling EJ. 1998. A confirmatory fac-
tor analysis of posttraumatic stress symptoms. Behav Res Ther
36:1091–1099.
Busuttil W, Turnbull GJ, Neal LA, Rollins JW, West AG, Blanch
N, Herepath R. 1995. Incorporating psychological debriefing
techniques within a brief group psychotherapy programme for
the treatment of post-traumatic stress disorder. Br J Psychiatry
167:495–502.
Cañive JM, Clark RD, Calais LA, Qualls CR, Tuason VB. 1998.
Bupropion treatment in veterans with posttraumatic stress disor-
der: an open study. J Clin Psychopharmacol 18:379–383.
Carlson JG, Chemtob CM, Rusnak K, Hedlund NL. 1996. Eye
movement desensitization and reprocessing treatment for com-
bat PTSD. Psychotherapy 33:104–113.
Carlson JG, Chemtob CM, Rusnak K, Hedlund NL, Muraoka MY.
1998. Eye movement desensitization and reprocessing (EMDR)
treatment for combat-related posttraumatic stress disorder. J
Trauma Stress 11:3–24.
Clark RD, Cañive JM, Calais LA, Qualls CR, Tuason VB. 1999.
Divalproex in posttraumatic stress disorder: an open-label clini-
cal trial. J Trauma Stress 12:395–401.
Conlon L, Fahy TJ, Conroy RM. 1999. PTSD in ambulant RTA
victims: a randomized controlled trial of debriefing. J Psychosom
Res 46:37–44.
Davidson JRT, Book SW, Colket JT, Tupler LA, Roth SH, David
D, Hertzberg MA, Mellman TA, Beckham JC, Smith RD,
Davison RM, Katz RJ, Feldman ME. 1997. Assessment of a new
self-rating scale for posttraumatic stress disorder. Psychol Med
27:153–160.
Derogatis LR. 1983. SCL-90-R administration, scoring, and proce-
dures manual-II for the revised version. Towson, MD: Clinical
Psychometric Research.
Fleming MP, Difede J. 1999. Effects of varying scoring rules of the
Clinician Administered PTSD Scale (CAPS) for the diagnosis of
PTSD after acute burn injury. J Trauma Stress 12:535–542.
Fontana A, Rosenheck RA. 1997. Effectiveness and cost of the in-
patient treatment of posttraumatic stress disorder: comparison of
three models of treatment. Am J Psychiatry 154:758–765.
Frueh BC, Turner SM, Beidel DC, Mirabella RF, Jones WJ. 1996.
Trauma Management Therapy: a preliminary evaluation of a
multicomponent behavioral treatment for chronic combat-re-
lated PTSD. Behav Res Ther 34:533–543.
Hall CA, Henderson CM. 1996. Cognitive processing therapy for
chronic PTSD from childhood sexual abuse: a case study. Coun-
sel Psychol Quart 9:359–371.
Hertzberg MA, Feldman ME, Beckham JC, Davidson JRT. 1996.
Trial of trazodone for posttraumatic stress disorder using a mul-
tiple baseline group design. J Clin Psychopharmacol 16:294–298.
Hamilton M. 1960. A rating scale for depression. J Neurol Neuro-
surg Psychiatry 23:56–62.
Hamilton M. 1969. Diagnosis and ratings of anxiety. Br J Psychia-
try 3:76–79.
Hertzberg MA, Feldman ME, Beckham JC, Moore SD, Davidson
JRT. 1998. Open trial of nefazodone for combat-related post-
traumatic stress disorder. J Clin Psychiatry 59:460–464.
Hickling EJ, Blanchard EB. 1997. The private practice psychologist
and manual-based treatments: post-traumatic stress disorder sec-
ondary to motor vehicle accidents. Behav Res Ther 35:191–203.
Hovens JEJM, Van der Ploeg HM, Bramsen I, Klaarenbeek MTA,
Schreuder BJN, Rivero VV. 1994. The development of the Self-
Rating Inventory for Posttraumatic Stress Disorder. Acta Psychiatr
Scand 90:172–183.
Hyer LA, Summers MN, Boyd S, Litaker M, Boudewyns
PA. 1996. Assessment of older combat veterans with the Clini-
cian-Administered PTSD Scale. J Trauma Stress 9:587–593.
Katz RJ, Lott MH, Arbus P, Crocq L, Herlobsen P, Lingjaerde O,
Lopez G, Loughrey GC, MacFarlane DJ, McIvor
R, Mehlum L, Nugent D, Turner SW, Weisath L, Yule W.
1994–1995. Pharmacotherapy of post-traumatic stress disorder
with a novel psychotropic. Anxiety 1:169–174.
Keane TM, Caddell JM, Taylor KL. 1988. Mississippi Scale for
Combat-Related Posttraumatic Stress Disorder: three studies in
reliability and validity. J Consult Clin Psychol 56:85–90.
Keane TM, Fairbank JA, Caddell JM, Zimering RT, Taylor KL,
Mora CA. 1989. Clinical evaluation of a measure to assess com-
bat exposure. Psychol Assess 1:53–55.
Keane TM, Malloy PF, Fairbank JA. 1984. Empirical development
of an MMPI subscale for the assessment of combat-related post-
traumatic stress disorder. J Consult Clin Psychol 52:888–891.
King DW, Leskin GA, King LA, Weathers FW. 1998. Confirma-
tory factor analysis of the Clinician-Administered PTSD Scale:
evidence for the dimensionality of posttraumatic stress disorder.
Psychol Assess 10:90–96.
Kulka RA, Schlenger WE, Fairbank JA, Hough RL, Jordan BK,
Marmar CR, Weiss DS. 1990. The National Vietnam Veterans
Readjustment Study: tables of findings and technical appendices.
New York: Brunner/Mazel.
Lazrove S, Triffleman EG, Kite L, McGlashan TH, Rounsaville B.
1998. An open trial of EMDR as treatment for chronic PTSD.
Am J Orthopsychiatry 68:601–608.
Lubin H, Loris M, Burt J, Johnson DR. 1998. Efficacy of psycho-
educational group therapy in reducing symptoms of posttrau-
matic stress disorder among multiply traumatized women. Am J
Psychiatry 155:1172–1177.
Marks IM, Lovell K, Noshirvani H, Livanou M, Thrasher S. 1998.
Treatment of posttraumatic stress disorder by exposure and/or
cognitive restructuring: a controlled study. Arch Gen Psychiatry
55:317–325.
Marmar CR, Weiss DS, Metzler TJ. 1997. The Peritraumatic
Dissociative Experiences Questionnaire. In: Wilson JP, Keane TM,
editors. Assessing psychological trauma and PTSD. New York:
Guilford Press. p 412–428.
Metzger LJ, Orr SP, Lasko NB, Pitman RK. 1997. Auditory event-
related potential to tone stimuli in combat-related posttraumatic
stress disorder. Biol Psychiatry 42:1006–1015.
Nagy LM, Morgan CA, Southwick SM, Charney DS. 1993. Open
prospective trial of fluoxetine for posttraumatic stress disorder. J
Clin Psychopharmacol 13:107–113.
Nagy LM, Blake DD, Schnurr P, Southwick SM, Charney D,
Weathers F, Horner B. 1999. The Clinician-Administered
PTSD Scale – Weekly Version (CAPS-2): Reliability and valid-
ity. Manuscript submitted.
Neal LA, Busuttil W, Herepath R, Strike PW. 1994. Development
and validation of the computerized Clinician Administered Post-
Traumatic Stress Disorder Scale-1-Revised. Psychol Med 24:
701–706.
Neal LA, Hill N, Hughes JC, Middleton A, Busuttil W. 1995. Con-
vergent validity of measures of PTSD in an elderly population of
former prisoners of war. Int J Geriatric Psychiatry 10:617–622.
Neal LA, Shapland W, Fox C. 1997. An open trial of moclobemide
in the treatment of post-traumatic stress disorder. Int Clin
Psychopharmacol 12:231–237.
Orr SP. 1997. Psychophysiologic reactivity to trauma-related imag-
ery in PTSD: diagnostic and theoretical implications of recent
findings. Ann NY Acad Sci 821:114–124.
Pantalon MV, Motta RW. 1998. Effectiveness of anxiety manage-
ment training in the treatment of posttraumatic stress disorder: a
preliminary report. J Behav Ther Exp Psychiatry 29:21–29.
Pitman RK, Orr SP, Altman B, Longpre RE, Poiré RE, Macklin
ML. 1996. Emotional processing during eye movement desensiti-
zation and reprocessing therapy of Vietnam veterans with chronic
posttraumatic stress disorder. Compr Psychiatry 37:419–429.
Radnitz CL, Schlein IS, Walczak S, Broderick CP, Binks TM,
Tirch DD, Willard J, Perez-Strumolo L, Festa J, Lillian LB,
Bockian N, Cytryn A, Green L. 1995. The prevalence of post-
traumatic stress disorder in veterans with spinal cord injury. SCI
Psychosocial Process 8:145–149.
Rothbaum BO, Hodges L, Alarcón RD, Ready DJ, Shahar F,
Graap K, Pair J, Hebert P, Gotz D, Wills B, Baltzell D. 1999.
Virtual reality exposure therapy for PTSD Vietnam veterans: a
case study. J Trauma Stress 12:263–271.
Shalev AY, Freedman SA, Peri T, Brandes D, Sahar T. 1997. Pre-
dicting PTSD in trauma survivors: prospective evaluation of self-
report and clinician-administered instruments. Br J Psychiatry
170:558–564.
Spielberger CD, Gorsuch RL, Lushene RE. 1970. STAI manual for
the State-Trait Anxiety Inventory. Palo Alto, CA: Consulting
Psychologists Press.
Standards for educational and psychological testing. 1999. Wash-
ington DC: American Educational Research Association. 194 p.
Tarrier N, Pilgrim H, Sommerfield C, Faragher B, Reynolds M,
Graham E, Barrowclough C. 1999. A randomized trial of cogni-
tive therapy and imaginal exposure in the treatment of chronic
posttraumatic stress disorder. J Consult Clin Psychol 67:13–18.
Taylor S, Kuch K, Koch WJ, Crockett DJ, Passey G. 1998. The
structure of posttraumatic stress symptoms. J Abnorm Psychol
107:154–160.
Thompson JA, Charlton PFC, Kerry R, Lee D, Turner SW. 1995.
An open trial of exposure therapy based on deconditioning for
post-traumatic stress disorder. Br J Clin Psychol 34:407–416.
Thrasher SM, Lovell K, Noshirvani M, Livanou M. 1996. Cognitive
restructuring in the treatment of post-traumatic stress disorder:
two single cases. Clin Psychol Psychother 3:137–148.
Van der Kolk BA, Dreyfuss D, Michaels MJ, Shera D, Berkowitz
R, Fisler RE, Saxe GN. 1994. Fluoxetine in posttraumatic stress
disorder. J Clin Psychiatry 55:517–522.
Weathers FW, Litz BT, Herman DS, Huska JA, Keane TM. 1993.
The PTSD Checklist (PCL): Reliability, validity, and diagnostic
utility. Paper presented at the 9th Annual Meeting of ISTSS.
Weathers FW, Blake DD, Krinsley KE, Haddad W, Ruscio AM,
Keane TM, Huska JA. 1999a. Reliability and validity of the
Clinician-Administered PTSD Scale. Manuscript submitted for
publication.
Weathers FW, Ruscio AM, Keane TM. 1999b. Psychometric prop-
erties of nine scoring rules for the Clinician-Administered Post-
traumatic Stress Disorder Scale. Psychol Assess 11:124–133.
Yehuda R, Kahan B, Schmeidler J, Southwick SM, Wilson S, Giller
EL. 1995. Impact of cumulative lifetime trauma and recent stress
on current posttraumatic stress disorder symptoms in Holocaust
survivors. Am J Psychiatry 152:1815–1818.
Zlotnick C, Davidson JRT, Shea MT, Pearlstein T. 1996. Valida-
tion of the Davidson Trauma Scale in a sample of survivors of
childhood sexual abuse. J Nerv Ment Dis 184:255–257.