JOURNAL OF PERSONALITY ASSESSMENT, 89(1), 41–55
Copyright © 2007, Lawrence Erlbaum Associates, Inc.
The Shedler–Westen Assessment Procedure (SWAP):
Making Personality Diagnosis Clinically Meaningful
Department of Psychiatry
University of Colorado Health Sciences Center
Department of Psychology and Department of Psychiatry and Behavioral Sciences
There is a schism between science and practice in understanding and assessing personality.
Approaches derived from the research laboratory often strike clinical practitioners as clinically
naïve and of dubious clinical relevance. Approaches derived from clinical observation and
theory often strike empirical researchers as fanciful speculation. In this article, we describe an
approach to personality designed to bridge the science–practice divide. The Shedler–Westen
Assessment Procedure (SWAP; Shedler & Westen, 2004a, 2004b; Westen & Shedler, 1999a,
of clinical case description. In this article, we describe its use in diagnosis, case conceptualiza-
tion, and treatment planning. We review evidence for reliability, validity, and clinical utility.
Finally, we present a system for personality diagnosis, as an alternative to Diagnostic
and Statistical Manual of Mental Disorders (4th ed., text rev.; American Psychiatric
Association, 2000) Axis II, that is empirically grounded, clinically relevant, and practical for
routine use in both clinical and research contexts.
One of the greatest challenges facing psychiatry and psy-
chology is the growing schism between science and practice.
The schism is especially pronounced in conceptualizing and
assessing personality. For most clinical practitioners, personality
diagnosis is a task requiring judgment and expertise.
Expert clinicians consider a wide range of psychological
data, attending not only to what patients say but also to how
they say it, and drawing complexly determined inferences
from patients’ accounts of their lives and relationships, from
their manner of interacting with the clinician, and from their
own emotional reactions to the patient (Westen & Arkowitz-
For example, clinicians tend not to assess lack of empa-
(PD), by administering self-report questionnaires or asking
patients direct questions (Westen, 1997). (Not only are nar-
cissistic patients unlikely to endorse such items, they may
well describe themselves as caring people and wonderful
friends.) An initial sign of lack of empathy on the part of
the patient is often a subtle sense on the part of the clinician
of being interchangeable or replaceable, of being treated as
a sounding board rather than as a fellow human being (for
empirical evidence, see Betan, Heim, Conklin, & Westen,
2005; for a clinical discussion, see McWilliams, 1994). The
clinician might go on to consider whether she consistently
feels this way with this particular patient and whether such
feelings are characteristic for her in her role as therapist. The
clinician might then become aware that the patient tends to
describe others more in terms of the functions they serve or
the needs they meet than in terms of who they are as peo-
dovetail with the facts the patient has provided about his life,
with the problems that led him to seek treatment, with in-
formation gleaned from family members or other collateral
contacts, and so on.
It is just such clinical judgment and inference that many
personality researchers eschew. As successive editions of
the Diagnostic and Statistical Manual of Mental Disorders
(DSM) have minimized the role of clinical inference, inves-
tigators have increasingly treated personality diagnosis as a
technical task of tabulating signs and symptoms with rela-
tively little consideration for how they fit together, the psy-
chological functions they serve, their meanings, the develop-
mental trajectory that gave rise to them, or the present-day
SHEDLER AND WESTEN
ment methods are designed to achieve interrater reliability
by minimizing the role of clinical judgment and substitut-
ing standardized questions and decision rules. Indeed, the
interviews are typically administered by research assistants
or trainees, not by experienced clinicians.
DSM and structured assessment procedures evolved as
they have for good reason. Prior to DSM–III, psychiatric
diagnosis was unsystematic, overly subjective, and of ques-
tionable scientific merit. It sometimes revealed more about
the clinician’s background and theoretical predilections than
it did about the patient’s personality dispositions. Structured
assessment methods evolved in the service of science and
in reaction against the unsystematic diagnostic methods of
the past. In the evolution of personality diagnosis from a
largely subjective, clinical enterprise to a largely technical,
research-driven enterprise, much has been gained and much
has been lost. The solution to the science–practice schism
cannot be to turn back the clock and abandon the scientific
advances of the past decades. Nor can it be to disregard the
cumulative insights of generations of clinical observers. The
solution, rather, may be a marriage of the best aspects of
clinical observation and empirical rigor.
In this article, we describe the Shedler–Westen Assess-
ment Procedure (SWAP; Shedler & Westen, 1998, 2004a,
2004b; Westen & Shedler, 1999a, 1999b), an approach to
and inference rather than eliminate it and combine the best
features of the clinical and empirical traditions in personality
assessment. It provides a means of assessing personality that
is both clinically relevant and empirically rigorous.
In this article, we (a) review problems with the DSM di-
agnostic system for PDs, (b) discuss the challenges of using
clinical observation and inference in research, (c) describe
the development of the SWAP as a method for systematiz-
ing clinical observation, (d) illustrate its use for diagnosis
and clinical case conceptualization, (e) review evidence for
reliability and validity, and (f) discuss recommendations for
revising Axis II for DSM-V.
WHY REVISE AXIS II?
The approach to PD diagnosis codified by DSM now finds
little favor with either clinicians or researchers. There is con-
sensus that DSM Axis II requires reconfiguration. Some of
the problems with Axis II include the following (see also
Clark, 1992; Grove & Tellegen, 1991; Jackson & Livesley,
1995; Livesley, 1995; Livesley & Jackson, 1992; Westen &
Shedler, 1999a, 2000; Widiger & Frances, 1985):
1. The diagnostic categories do not rest on a sound empir-
ical foundation and often disagree with findings from
cluster and factor analyses (Blais & Norman, 1997;
Clark, 1992; Harkness, 1992; Livesley & Jackson, 1992;
2. DSM Axis II commits arbitrarily to a categorical diag-
nostic system. It may be more useful to conceptualize
borderline pathology, for example, on a continuum from
none through moderate to severe rather than classifying
borderline PD as present/absent (Widiger, 1993). This
same consideration applies to individual diagnostic cri-
teria. For example, just how little empathy constitutes
“lack of empathy”?
3. DSM Axis II lacks the capacity to weight criteria that
differ in diagnostic importance (Davis, Blashfield, &
4. Comorbidity between PD diagnoses is unacceptably
high. Patients who meet criteria for any PD often meet
criteria for four to six PDs (Blais & Norman, 1997;
Grilo, Sanislow, & McGlashan, 2002; Oldham et al.,
1992; Pilkonis et al., 1995; Watson & Sinha, 1998). This
suggests lack of discriminant validity of the diagnostic
constructs, assessment methods, or both.
5. In attempting to reduce comorbidity, DSM work groups
have gerrymandered diagnostic categories and criteria,
sometimes in ways faithful neither to clinical observa-
tion nor empirical data. For example, they excluded lack
of empathy and grandiosity from the diagnostic crite-
ria for antisocial PD to minimize comorbidity with nar-
cissistic PD, even though the traits apply to both PDs
(Westen & Shedler, 1999a, 1999b; Widiger & Corbitt,
criterion sets over time, progressively eroding the dis-
tinction between PDs (multifaceted syndromes encom-
passing cognition, affectivity, motivation, interpersonal
diagnostic criteria for paranoid PD, for example, are es-
ciousness. The diagnostic criteria no longer describe the
multifaceted personality syndrome recognized by most
experienced clinicians (Millon, 1990; Millon & Davis,
7. DSM Axis II does not consider personality strengths that
of noting whether the patient has such positive qualities
as the capacity to love and sustain meaningful relation-
ships characterized by mutual caring and understanding.
8. DSM Axis II does not encompass the spectrum of
personality pathology clinicians see in practice. Among
patients receiving treatment for personality pathology,
fewer than 40% can be diagnosed on Axis II (Westen &
9. DSM Axis II diagnoses are not as clinically useful as
they might be. For example, knowing whether a patient
SHEDLER–WESTEN ASSESSMENT PROCEDURE (SWAP)
meets criteria for avoidant PD or dependent PD tells
one little about the function of the person’s symptoms,
which personality processes to target for treatment, or
how to treat them.
10. The algorithm used for diagnostic decisions (counting
symptoms) diverges from the methods clinicians use—
or could plausibly be expected to use—in real-world
practice. Cognitive research suggests that clinicians do
not make diagnoses by tabulating symptoms. Rather,
they gauge the overall “match” between a patient and a
cognitive template or prototype of the disorder (i.e., they
consider the features of a disorder as a configuration or
gestalt), or they apply causal theories that make sense of
the interrelations between symptoms (Blashfield, 1985;
Cantor & Genero, 1986; Kim & Ahn, 2002; Westen,
Heim, Morrison, Patterson, & Campbell, 2002).
11. PD assessment instruments do not meet standards for
reliability and validity normally expected in psycholog-
ical research. Questionnaires and structured interviews
show relatively weak convergence with one another and
with the longitudinal evaluation using all available data
(LEAD) standard (Perry, 1992; Pilkonis et al., 1995;
Skodol, Oldham, Rosnick, Kellman, & Hyler, 1991;
Spitzer, 1983; Westen, 1997). They also show poor
test–retest reliability at intervals greater than 6 weeks
(First et al., 1995; Zimmerman, 1994). Poor test–retest
reliability is especially problematic given that PDs are
by definition enduring and stable over time.1
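
The contrast drawn in items 2 and 10, between tabulating symptoms against a cutoff and gauging a graded match to a prototype, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the threshold, and the example data are hypothetical, and the use of a Pearson correlation as the match index is an assumption made here for concreteness, not something the passage above specifies.

```python
from math import sqrt


def count_based_diagnosis(criteria_met, threshold):
    """Categorical rule: the disorder is 'present' if enough criteria are met."""
    return sum(criteria_met) >= threshold


def prototype_match(patient, prototype):
    """Graded rule: Pearson correlation between a patient profile and a prototype.

    Assumption: both profiles are equal-length lists of numeric item scores.
    """
    n = len(patient)
    if n != len(prototype) or n < 2:
        raise ValueError("profiles must have the same length (at least 2 items)")
    mean_p = sum(patient) / n
    mean_q = sum(prototype) / n
    cov = sum((p - mean_p) * (q - mean_q) for p, q in zip(patient, prototype))
    var_p = sum((p - mean_p) ** 2 for p in patient)
    var_q = sum((q - mean_q) ** 2 for q in prototype)
    if var_p == 0 or var_q == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_p * var_q)


# Hypothetical data: five present/absent criteria with a made-up cutoff of 3.
print(count_based_diagnosis([True, True, False, True, False], threshold=3))

# Hypothetical four-item profiles scored on a graded scale.
print(prototype_match([0, 1, 2, 3], [0, 2, 4, 6]))
```

The first function returns only present or absent, so a patient just below the cutoff and a patient meeting no criteria look the same. The second returns a continuous value, which is the kind of dimensional information item 2 argues a categorical system discards.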
Most of the proposed solutions to these problems share
the assumption that progress lies in further minimizing
the role of the clinician, either by developing increasingly
behavioral and less inferential diagnostic criteria or by by-
passing the clinician altogether through the use of self-report
instruments. These attempted solutions may, however, be
part of the problem. By eliminating clinical observation and
inference, investigators may inadvertently be eliminating
(Cousineau & Shedler, 2006; Shedler, Mayman, & Manis,
1993). An alternative to eliminating clinical inference is to
harness it for scientific use.
1Poor test–retest reliability has led some researchers to suggest that PDs
are less stable than previously believed. Such an interpretation of the data
seems inconsistent with the observations of virtually all clinical theorists.
A more viable hypothesis may be that the assessment instruments do not
capture core features of personality that are salient to clinicians who treat
patients with PDs and know them well. Specifically, the instruments may
overemphasize transient behavioral symptoms (such as self-cutting and sui-
cidality in borderline patients, which may emerge only when an attachment
relationship is threatened) and underemphasize underlying personality pro-
cesses that endure over time (such as affect dysregulation and feelings of
emptiness and self-loathing in borderline patients).
THE CHALLENGE OF CLINICAL DATA
The problem with clinical observation and inference is not
that it is inherently unreliable, as some investigators have as-
sumed (for a discussion and literature review, see Westen &
Weinberger, 2004). The problem is that it tends to come
in a form that is difficult to study systematically. Rulers
measure in inches and scales measure in pounds, but what
metric do psychotherapists share? Imagine three clinicians
reviewing the same case material. One might describe the
patient in terms of schemas and belief systems, another may
speak of conditioning history, and the third of conflicts and
It is not readily apparent whether the hypothetical clin-
icians can or cannot make similar observations. There are
three possibilities: (1) They may be observing the same thing
but using different language and metaphor systems to de-
scribe it, (2) they may be attending to different aspects of
the clinical material, as in the parable of the elephant and the
blind men, and (3) they may not be able to make the same
observations at all. To determine whether the clinicians can
make the same observations and inferences, one must ensure
that they speak the same language and attend to the same
spectrum of clinical phenomena.
A STANDARD VOCABULARY FOR CASE DESCRIPTION
The SWAP is an assessment instrument designed to provide
clinicians of all theoretical orientations with a standard “vo-
cabulary” for case description. The vocabulary consists of
200 statements, each of which may describe a given patient
very well, somewhat, or not at all. The clinician describes
a patient by ranking or ordering the statements into eight
categories, from those that are most descriptive (assigned a
value of 7) to those that are not descriptive (assigned a value
of 0). Thus, the SWAP yields a score from 0 to 7 for each of
200 personality-descriptive variables. (A Web-based version
of the SWAP can be previewed at www.SWAPassessment.
The “standard vocabulary” of the SWAP allows clinicians
to provide in-depth psychological descriptions of patients in
a systematic and quantifiable form and ensures that all clini-
cians attend to the same spectrum of clinical phenomena (cf.
close to the data (e.g., “Tends to get into power struggles,”
or “Is capable of sustaining meaningful relationships char-
acterized by genuine intimacy and caring”), and statements
that require inference about internal processes are written in
clear, unambiguous language (e.g., “Tends to see own unac-
ceptable feelings or impulses in other people instead of in
him/herself”). Writing items in this jargon-free manner min-
imizes unreliable interpretive leaps and makes the item set
useful to clinicians of all theoretical perspectives.
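
The scoring scheme described above (200 statements, each placed in one of eight categories valued 0 to 7) can be represented as a simple data structure. The following is a minimal sketch, not the SWAP software itself: the function names and the example profile are hypothetical, and the only facts taken from the text are the item count and the 0 to 7 range.

```python
from collections import Counter

NUM_ITEMS = 200              # size of the SWAP item set, as stated in the text
MIN_SCORE, MAX_SCORE = 0, 7  # eight ordered categories, 0 = not descriptive, 7 = most descriptive


def validate_profile(scores):
    """Return True if `scores` is a well-formed profile; raise ValueError otherwise."""
    if len(scores) != NUM_ITEMS:
        raise ValueError(f"expected {NUM_ITEMS} scores, got {len(scores)}")
    for index, score in enumerate(scores):
        if not isinstance(score, int) or not MIN_SCORE <= score <= MAX_SCORE:
            raise ValueError(
                f"item {index + 1}: score {score!r} is outside {MIN_SCORE}-{MAX_SCORE}"
            )
    return True


def category_counts(scores):
    """Count how many items fall in each category, ordered from 0 to 7."""
    counts = Counter(scores)
    return [counts[value] for value in range(MIN_SCORE, MAX_SCORE + 1)]


def most_descriptive(scores, category=MAX_SCORE):
    """Return the 1-based item numbers placed in the given category."""
    return [index + 1 for index, score in enumerate(scores) if score == category]


# Hypothetical profile: items 1-8 rated most descriptive, all others not descriptive.
example = [7] * 8 + [0] * 192
validate_profile(example)
print(category_counts(example))
print(most_descriptive(example))
```

Because every clinician produces a vector of the same length on the same scale, profiles from different raters or different patients can be compared item by item, which is what makes the shared vocabulary quantifiable.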
Kernberg, O. (1975). Borderline conditions and pathological narcissism.
New York: Aronson.
Kernberg, O. (1984). Severe personality disorders. New Haven, CT: Yale
Kim, N. S., & Ahn, W. (2002). Clinical psychologists’ theory-based rep-
resentations of mental disorders predict their diagnostic reasoning and
memory. Journal of Experimental Psychology, 131, 451–476.
of personality disorder: Relation to self-reports and future research direc-
tions. Clinical Psychology: Science & Practice, 9, 300–311.
Kohut, H. (1971). The analysis of the self. New York: International Univer-
Krueger, R. F. (2002). The structure of common mental disorders. Archives
of General Psychiatry, 59, 570–571.
J. F., et al. (2006). Change in attachment patterns and reflective function
in a randomized control trial of transference-focused psychotherapy for
borderline personality disorder. Journal of Consulting and Clinical Psy-
chology, 74, 1027–1040.
Linehan, M. M. (1993). Cognitive-behavioral treatment of borderline per-
sonality disorder. New York: Guilford.
in psychotherapy with the SWAP–200: A case study. Journal of Person-
ality Assessment, 86, 23–32.
Livesley, W. J., & Jackson, D. N. (1992). Guidelines for developing, evalu-
ating, and revising the classification of personality disorders. Journal of
Nervous and Mental Disease, 180, 609–618.
Main, M., Kaplan, N., & Cassidy, J. (1985). Security in infancy, child-
hood, and adulthood: A move to the level of representation. Monographs
of the Society for Research in Child Development, 50(1–2, Serial No.
reliable and useful? Journal of Criminal Behaviour and Mental Health,
McCrae, R., & Costa, P. (1990). Personality in adulthood. New York:
McWilliams, N. (1994). Psychoanalytic diagnosis: Understanding person-
ality structure in the clinical process. New York: Guilford.
Millon, T. (1990). Toward a new psychology. New York: Wiley.
and standards. Journal of Abnormal Psychology, 100, 245–261.
Millon, T., & Davis, R. D. (1997). The place of assessment in clinical sci-
ence. In T. Millon (Ed.), The Millon inventories: Clinical and personality
assessment (pp. 3–20). New York: Guilford.
Morey, L. C. (1988). Personality disorders in DSM–III and DSM–III–R:
Convergence, coverage, and internal consistency. American Journal of
Psychiatry, 145, 573–577.
Morey, L. C. (1991). The Personality Assessment Inventory: Professional
manual. Odessa, FL: Psychological Assessment Resources.
(1992). Diagnosis of DSM–III–R personality disorders by two semistruc-
Perry, J. C. (1992). Problems and considerations in the valid assessment
of personality disorders. American Journal of Psychiatry, 149, 1645–
Perry, J. C., & Cooper, S. H. (1987). Empirical studies of psychological
defense mechanisms. In J. Cavenar & R. Michels (Eds.), Psychiatry.
Pilkonis, P. A., Heape, C. L., Proietti, J. M., Clark, S. W., McDavid, J. D., &
interviews for personality disorders. Archives of General Psychiatry, 52,
Pilkonis, P. A., Heape, C. L., Ruddy, J., & Serrao, P. (1991). Validity in
the diagnosis of personality disorders: The use of the LEAD standard.
Psychological Assessment, 31, 46–54.
Robins, E., & Guze, S. (1970). The establishment of diagnostic validity in
psychiatric illness: Its application to schizophrenia. American Journal of
Psychiatry, 126, 983–987.
Sawyer, J. (1966). Measurement and prediction, clinical and statistical. Psy-
chological Bulletin, 66, 178–200.
Shedler, J., Mayman, M., & Manis, M. (1993). The illusion of mental health.
American Psychologist, 48, 1117–1131.
Shedler, J., & Westen, D. (1998). Refining the measurement of Axis II:
A Q-sort procedure for assessing personality pathology. Assessment, 5,
Shedler, J., & Westen, D. (2004a). Dimensions of personality pathology: An
alternative to the five factor model. American Journal of Psychiatry, 161,
Shedler, J., & Westen, D. (2004b). Refining DSM–IV personality disorder
diagnosis: Integrating science and practice. American Journal of Psychi-
atry, 161, 1350–1365.
Drs. Shedler and Westen reply. American Journal of Psychiatry, 162,
Shedler, J., & Westen, D. (2006). Personality diagnosis with the Shedler–
Westen Assessment Procedure (SWAP): Bridging the gulf between sci-
ence and practice. In Alliance Task Force (Ed.), Psychodynamic diag-
nostic manual (PDM) (pp. 573–613). Silver Spring, MD: Alliance of
Skodol, A., Oldham, J., Rosnick, L., Kellman, D., & Hyler, S. (1991).
Diagnosis of DSM–III–R personality disorders: A comparison of two
structured interviews. International Journal of Methods in Psychiatric
Research, 1, 13–26.
Spitzer, R. L. (1983). Psychiatric diagnosis: Are clinicians still necessary?
Comprehensive Psychiatry, 24, 399–411.
Spitzer, R. L., First, M. B., Shedler, J., Westen, D., & Skodol, A. (2006).
Clinical utility of five dimensional systems for personality diagnosis: A
“consumer preference” study. Manuscript submitted for publication.
and researchers. Washington, DC: American Psychiatric Press.
Watson, D., & Sinha, B. K. (1998). Comorbidity of DSM–IV personality
disorders in a nonclinical sample. Journal of Clinical Psychology, 54,
Westen, D. (1991). Social cognition and object relations. Psychological
Bulletin, 109, 429–455.
Westen, D. (1997). Divergences between clinical and research methods
for assessing personality disorders: Implications for research and the
evolution of Axis II. American Journal of Psychiatry, 154, 895–903.
Westen, D. (2002). Clinical Diagnostic Interview. Unpublished manual,
Emory University. Retrieved June 4, 2007, from www.psychsystems.
Westen, D., & Arkowitz-Westen, L. (1998). Limitations of Axis II in di-
agnosing personality pathology in clinical practice. American Journal of
Psychiatry, 155, 1767–1771.
pathology. British Journal of Psychiatry, 186, 227–238.
structure as a context for psychopathology. In R. F. Krueger & J. L.
Tackett (Eds.), Personality and psychopathology (pp. 335–384). New
Westen, D., Heim, A. K., Morrison, K., Patterson, M., & Campbell, L.
ing approach. In L. E. Beutler & M. Malik (Eds.), Rethinking the DSM:
A psychological perspective (pp. 221–250). Washington, DC: American
Psychological Association Press.
and social cognition in borderlines, major depressives, and normals: A
TAT analysis. Psychological Assessment: A Journal of Consulting and
Clinical Psychology, 2, 355–364.
Westen, D., & Muderrisoglu, S. (2003). Reliability and validity of personal-
ity disorder assessment using a systematic clinical interview: Evaluating
an alternative to structured interviews. Journal of Personality Disorders,
Westen, D., & Muderrisoglu, S. (in press). Clinical assessment of patholog-
ical personality traits. American Journal of Psychiatry.
Westen, D., Muderrisoglu, S., Fowler, C., Shedler, J., & Koren, D. (1997).
Affect regulation and affective experience: Individual differences, group
differences, and measurement using a Q-sort procedure. Journal of Con-
sulting and Clinical Psychology, 65, 429–439.
Westen, D., & Shedler, J. (1999a). Revising and assessing Axis II: I. De-
veloping a clinically and empirically valid assessment method. American
Journal of Psychiatry, 156, 258–272.
an empirically based and clinically useful classification of personality
disorders. American Journal of Psychiatry, 156, 273–285.
Westen, D., & Shedler, J. (2000). A prototype matching approach to diagnosing
personality disorders: Toward DSM–V. Journal of Personality
Disorders, 14, 109–126.
Westen, D., Shedler, J., & Bradley, R. (2006). A prototype approach to
personality diagnosis. American Journal of Psychiatry, 163, 846–856.
Westen, D., Shedler, J., Durrett, C., Glass, S., & Martens, A. (2003). Personality
diagnosis in adolescence: DSM–IV Axis II diagnoses and an
empirically derived alternative. American Journal of Psychiatry, 160,
Westen, D. W., Waller, N., Shedler, J., Blagov, P., & Bradley, R. (2006).
The structure of personality traits using the SWAP–II: Data from a large
normative sample. Under review.
Westen, D., & Weinberger, J. (2004). When clinical description
becomes statistical prediction. American Psychologist, 59, 595–
Widiger, T. A. (1993). The DSM–III–R categorical personality disorder
diagnoses: A critique and an alternative. Psychological Inquiry, 4, 75–90.
Widiger, T. A., & Corbitt, E. M. (1995). Antisocial personality disor-
der. In J. W. Livesley (Ed.), The DSM–IV personality disorders: Di-
agnosis and treatment of mental disorders (pp. 103–126). New York:
Widiger, T., & Frances, A. (1985). The DSM–III personality disorders:
Perspectives from psychology. Archives of General Psychiatry, 42, 615–
Widiger, T. A., & Samuel, D. B. (2005). Evidence-based assess-
ment of personality disorders. Psychological Assessment, 17, 278–
Widiger, T. A., & Simonsen, E. S. (2005). Alternative dimensional models
of personality disorders: Finding common ground. Journal of Personality
Disorders, 19, 110–130.
Zimmerman, M. (1994). Diagnosing personality disorders: A review of
issues and research methods. Archives of General Psychiatry, 51, 225–
Department of Psychiatry
University of Colorado Health Sciences Center
4455 E. 12th Avenue
Denver, CO 80220
Received March 28, 2006
Revised February 20, 2007